Message ID | 20181119101535.16538-1-carenas@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | t5562: skip if NO_CURL is enabled | expand |
On Mon, Nov 19 2018, Carlo Marcelo Arenas Belón wrote: > 6c213e863a ("http-backend: respect CONTENT_LENGTH for receive-pack", 2018-07-27) > introduced all tests but without a check for CURL support from git. > > Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> > --- > t/t5562-http-backend-content-length.sh | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh > index b24d8b05a4..7594899471 100755 > --- a/t/t5562-http-backend-content-length.sh > +++ b/t/t5562-http-backend-content-length.sh > @@ -3,6 +3,12 @@ > test_description='test git-http-backend respects CONTENT_LENGTH' > . ./test-lib.sh This seems like the wrong fix for whatever bug you're encountering. I just built with NO_CURL and: $ ./t5561-http-backend.sh 1..0 # SKIP skipping test, git built without http support $ ./t5562-http-backend-content-length.sh ok 1 - setup ok 2 - setup, compression related ok 3 - fetch plain ok 4 - fetch plain truncated ok 5 - fetch plain empty ok 6 - fetch gzipped ok 7 - fetch gzipped truncated ok 8 - fetch gzipped empty ok 9 - push plain ok 10 - push plain truncated ok 11 - push plain empty ok 12 - push gzipped ok 13 - push gzipped truncated ok 14 - push gzipped empty ok 15 - CONTENT_LENGTH overflow ssite_t ok 16 - empty CONTENT_LENGTH # passed all 16 test(s) 1..16 So all these test pass. Of courses I still have curl on my system, but I don't see the curl(1) utility used in the test, and my git at this point can't operate on https?:// URLs, so what error are you getting? Can you paste the test output with -x -v? > +if test -n "$NO_CURL" > +then > + skip_all='skipping test, git built without http support' > + test_done > +fi > + > test_lazy_prereq GZIP 'gzip --version' > > verify_http_result() { If we do end up needing this after all it seems better to do something like: diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index a8729f8232..adad654277 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -30,11 +30,7 @@ # Copyright (c) 2008 Clemens Buchacher <drizzd@aon.at> # -if test -n "$NO_CURL" -then - skip_all='skipping test, git built without http support' - test_done -fi +. "$TEST_DIRECTORY"/lib-no-curl.sh if test -n "$NO_EXPAT" && test -n "$LIB_HTTPD_DAV" then diff --git a/t/lib-no-curl.sh b/t/lib-no-curl.sh new file mode 100644 index 0000000000..014947aa2d --- /dev/null +++ b/t/lib-no-curl.sh @@ -0,0 +1,5 @@ +if test -n "$NO_CURL" +then + skip_all='skipping test, git built without http support' + test_done +fi diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh index b24d8b05a4..cffb460673 100755 --- a/t/t5562-http-backend-content-length.sh +++ b/t/t5562-http-backend-content-length.sh @@ -2,6 +2,7 @@ test_description='test git-http-backend respects CONTENT_LENGTH' . ./test-lib.sh +. ./lib-no-curl.sh test_lazy_prereq GZIP 'gzip --version' Not really a problem with your patch, we have lots of this copy/pasting all over the place already. I.e. stuff like: if test -n "$X" then skip_all="$Y" test_done fi or: if ! test_have_prereq "$X" then skip_all="$Y" test_done fi Maybe we should make more use of test_lazy_prereq and factor all that into a new helper like: test_have_prereq_or_skip_all "$X" "$Y" Which could be put at the top of these various tests...
On Mon, Nov 19, 2018 at 02:15:35AM -0800, Carlo Marcelo Arenas Belón wrote: > 6c213e863a ("http-backend: respect CONTENT_LENGTH for receive-pack", 2018-07-27) > introduced all tests but without a check for CURL support from git. The tests should not be using curl, they just pipe data to http-backend's standard input.
On Mon, Nov 19, 2018 at 10:40 AM Max Kirillov <max@max630.net> wrote: > > On Mon, Nov 19, 2018 at 02:15:35AM -0800, Carlo Marcelo Arenas Belón wrote: > > 6c213e863a ("http-backend: respect CONTENT_LENGTH for receive-pack", 2018-07-27) > > introduced all tests but without a check for CURL support from git. > > The tests should not be using curl, they just pipe data to > http-backend's standard input. NO_CURL reflects the build setting (for http support); CURL checks for the curl binary, but as Ævar points out the requirements must be from somewhere else since a NO_CURL=1 build (tested in macOS) still passes the test, but not in NetBSD. tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in t5562/invoke-with-content-length.pl, while I seem to be getting some sporadic errors in 9 with the following output : ++ env CONTENT_TYPE=application/x-git-receive-pack-request QUERY_STRING=/repo.git/git-receive-pack 'PATH_TRANSLATED=/home/carenas/src/git/t/trash directory.t5562-http-backend-content-length/.git/git-receive-pack' GIT_HTTP_EXPORT_ALL=TRUE REQUEST_METHOD=POST /home/carenas/src/git/t/t5562/invoke-with-content-length.pl push_body git http-backend ++ verify_http_result '200 OK' ++ grep fatal: act.err Binary file act.err matches ++ return 1 error: last command exited with $?=1 not ok 9 - push plain and the following output in act.err (with a 200 in act) fatal: the remote end hung up unexpectedly Carlo
On Mon, Nov 19, 2018 at 11:36:08AM -0800, Carlo Arenas wrote: > NO_CURL reflects the build setting (for http support); CURL checks for > the curl binary, but as Ævar points out the requirements must be from > somewhere else since a NO_CURL=1 build (tested in macOS) still passes > the test, but not in NetBSD. > > tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in > t5562/invoke-with-content-length.pl, I see. In other perl files I can see either '#!/usr/bin/perl' or '#!/ust/bin/env perl'. The second one should be more portable. Does the latter work on the NetBSD? To all: what is supposed to be done about it? > while I seem to be getting some > sporadic errors in 9 with the following output : This is more complicated. Does it happen often? Does test 12 ("push gzipped") ever fail? So far I can imagine either a buffering issue or some mistake in length calculation.
On Mon, Nov 19, 2018 at 11:26:03PM +0200, Max Kirillov wrote: > On Mon, Nov 19, 2018 at 11:36:08AM -0800, Carlo Arenas wrote: > > NO_CURL reflects the build setting (for http support); CURL checks for > > the curl binary, but as Ævar points out the requirements must be from > > somewhere else since a NO_CURL=1 build (tested in macOS) still passes > > the test, but not in NetBSD. > > > > tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in > > t5562/invoke-with-content-length.pl, > > I see. > > In other perl files I can see either '#!/usr/bin/perl' or > '#!/ust/bin/env perl'. The second one should be more > portable. Does the latter work on the NetBSD? > > To all: what is supposed to be done about it? You should swap this out for $PERL_PATH. You can use write_script() to help if you're copying the script around anyway. Though it looks like you just run it from the one function. So maybe just: diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh index b24d8b05a4..90d890d02f 100755 --- a/t/t5562-http-backend-content-length.sh +++ b/t/t5562-http-backend-content-length.sh @@ -31,6 +31,7 @@ test_http_env() { PATH_TRANSLATED="$PWD/.git/git-$handler_type-pack" \ GIT_HTTP_EXPORT_ALL=TRUE \ REQUEST_METHOD=POST \ + "$PERL_PATH" \ "$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl \ "$request_body" git http-backend >act.out 2>act.err } (note that it's normally OK to just run "perl", because we use a shell-function wrapper that respects $PERL_PATH, but here we're actually passing it to "env"). You could also lose the executable bit on the script at that point. It doesn't matter much, but it would catch an erroneous call relying on the shebang line. -Peff
On Mon, Nov 19, 2018 at 11:36:08AM -0800, Carlo Arenas wrote: > tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in > t5562/invoke-with-content-length.pl, while I seem to be getting some > sporadic errors in 9 with the following output : > > ++ env CONTENT_TYPE=application/x-git-receive-pack-request > QUERY_STRING=/repo.git/git-receive-pack > 'PATH_TRANSLATED=/home/carenas/src/git/t/trash > directory.t5562-http-backend-content-length/.git/git-receive-pack' > GIT_HTTP_EXPORT_ALL=TRUE REQUEST_METHOD=POST > /home/carenas/src/git/t/t5562/invoke-with-content-length.pl push_body > git http-backend > ++ verify_http_result '200 OK' > ++ grep fatal: act.err > Binary file act.err matches > ++ return 1 > error: last command exited with $?=1 > not ok 9 - push plain > > and the following output in act.err (with a 200 in act) > > fatal: the remote end hung up unexpectedly This bit me today, too, and I can reproduce it by running under my stress-testing script. Curiously, the act.err file also has 54 NUL bytes before the "fatal:" message. I tried adding an "strace" to see who was producing that output, but I can't seem to get it to fail when running under strace (presumably because the timing is quite different, and this likely some kind of pipe race). -Peff
FWIW the issue goes away when more than 1 CPU is used in NetBSD 8,0 (32-bit) and for some tracing, it would seem that it gets 0 when trying to read 4 bytes from what I think is a pipe that connects to a child that has been gone already for a while. Carlo
On Wed, Nov 21, 2018 at 04:02:04AM -0800, Carlo Arenas wrote: > for some tracing, it would seem that it gets 0 when > trying to read 4 bytes from what I think is a pipe that connects to a > child that has been gone already for a while. Could you clarify it? I'm afraid I don't understand. Meanwhile, I've been staring at code and so far don't have any assumption where it could fail. Except basic things like something is wrong with forking or reading/writing pipes, but then it would have bigger consequences. Also, I tried to look at it with NetBSD but cannot get past error, while running tests: > ./test-lib.sh: 327: Syntax error: Bad substitution There is the following code there: ----- if test -z "$test_untraceable" || { test -n "$BASH_VERSION" && { test ${BASH_VERSINFO[0]} -gt 4 || { # line 327 test ${BASH_VERSINFO[0]} -eq 4 && test ${BASH_VERSINFO[1]} -ge 1 ----- Should I install bash for it to work? I cannot say I understand what the message is about.
On Wed, Nov 21, 2018 at 2:49 PM Max Kirillov <max@max630.net> wrote: > > Should I install bash for it to work? I cannot say I understand what the message is about. yes, you need to install bash and use SHELL_PATH=/usr/pkg/bin/bash; PERL_PATH=/usr/pkg/bin/perl for the perl script Carlo
On Wed, Nov 21, 2018 at 2:49 PM Max Kirillov <max@max630.net> wrote: > > On Wed, Nov 21, 2018 at 04:02:04AM -0800, Carlo Arenas wrote: > > for some tracing, it would seem that it gets 0 when > > trying to read 4 bytes from what I think is a pipe that connects to a > > child that has been gone already for a while. > > Could you clarify it? I'm afraid I don't understand. the error that gets eventually to stderr in the caller comes from get_packet_data, who is trying to read 4 bytes and gets 0. when looking at the trace (obtained with ktrace) I see there is no longer any other process running, the last child of it is long gone with an error as shown by : 9255 1 git-http-backend CALL close(1) 9255 1 git-http-backend RET close 0 9255 1 git-http-backend CALL read(0,0xbfb2bb14,0) 9255 1 git-http-backend GIO fd 0 read 0 bytes "" 9255 1 git-http-backend RET read 0 9255 1 git-http-backend CALL write(2,0xbfb2a604,0x36) 9255 1 git-http-backend GIO fd 2 wrote 54 bytes "fatal: request ended in the middle of the gzip stream\n" 9255 1 git-http-backend RET write 54/0x36 9255 1 git-http-backend CALL write(1,0xb781f0e0,0x94) 9255 1 git-http-backend RET write -1 errno 9 Bad file descriptor not sure how it got into that state, though Carlo
On Wed, Nov 21, 2018 at 05:04:25PM -0800, Carlo Arenas wrote: > the error that gets eventually to stderr in the caller comes from > get_packet_data, who is trying to read 4 bytes and gets 0. > when looking at the trace (obtained with ktrace) Yes too early close of the input data is the thing which triggers the "remote end hung up unexpectedly" message. > I see there is no > longer any other process running, do you mean git receive-pack? This is strange, all its parents should be waiting for it to exit. > the last child of it is long gone with an error as shown by : > > 9255 1 git-http-backend CALL close(1) ... > 9255 1 git-http-backend CALL write(2,0xbfb2a604,0x36) > 9255 1 git-http-backend GIO fd 2 wrote 54 bytes > "fatal: request ended in the middle of the gzip stream\n" This should be some other test than push_plain, some of the gzip related ones. Are there other tests failing? > 9255 1 git-http-backend RET write 54/0x36 > 9255 1 git-http-backend CALL write(1,0xb781f0e0,0x94) > 9255 1 git-http-backend RET write -1 errno 9 Bad file descriptor This is interesting. http-backend for some reason closes its stdout. Here it then tries to write there something. I have not seen it in my push_plain run. Maybe it worth redirecting instead to stderr, to avoid losing some diagnostics? > > not sure how it got into that state, though > > Carlo
On Wed, Nov 21, 2018 at 10:37 PM Max Kirillov <max@max630.net> wrote: > > On Wed, Nov 21, 2018 at 05:04:25PM -0800, Carlo Arenas wrote: > > the last child of its children long gone with an error as shown by : > > > > 9255 1 git-http-backend CALL close(1) > ... > > 9255 1 git-http-backend CALL write(2,0xbfb2a604,0x36) > > 9255 1 git-http-backend GIO fd 2 wrote 54 bytes > > "fatal: request ended in the middle of the gzip stream\n" > > This should be some other test than push_plain, some of the > gzip related ones. Are there other tests failing? it should, but I should note that for test 9 to fail, then either (or both) tests 7 and 8 should first succeed; not that I'd seen any other test fail (after I locally patched the perl path, of course) even when reordering them and while making sure tests 1 and 2 run first to create the dependencies for the rest Peff, could you elaborate on your "load testing" setup? which could give us any hints on what to look for?, FWIW I hadn't been able to reproduce the problem anywhere else (and not for a lack of trying) > > 9255 1 git-http-backend RET write 54/0x36 > > 9255 1 git-http-backend CALL write(1,0xb781f0e0,0x94) > > 9255 1 git-http-backend RET write -1 errno 9 Bad file descriptor > > This is interesting. http-backend for some reason closes its > stdout. Here it then tries to write there something. I have > not seen it in my push_plain run. Maybe it worth redirecting instead > to stderr, to avoid losing some diagnostics? that should help with the garbled output from stderr, AFAIK the process API allows creating a pipe specifically for that with would be better than redirecting stderr into stdout. the fact we got EBADF means that there is a problem somewhere though in the way the previous failure that closed stdout got handled (which should had been most likely in the call to die) Carlo PS. upstreaming the PERL_PATH fix is likely to be good to do soonish as I presume at least all BSD might be affected, let me know if you would rather me do that instead as I suspect we might be deadlocked otherwise ;)
On Thu, Nov 22, 2018 at 02:17:01AM -0800, Carlo Arenas wrote: > Peff, could you elaborate on your "load testing" setup? which could > give us any hints > on what to look for?, FWIW I hadn't been able to reproduce the problem anywhere > else (and not for a lack of trying) The script I use is at: https://github.com/peff/git/blob/meta/stress which you invoke like "/path/to/stress t5562" from the top-level of a git.git checkout. It basically just runs a loop of twice as many simultaneous invocations of the test script as you have CPUs, and waits for one to fail. The load created by all of the runs tends to flush out timing effects after a while. It fails for me on t5562 within 30 seconds or so (but note that in this particular case it sometimes takes a while to produce the final output because invoke-with-content-length misses the expected SIGCLD and sleeps the full 60 seconds). You'll probably need to tweak the variables at the top of the script for your system. > PS. upstreaming the PERL_PATH fix is likely to be good to do soonish > as I presume at least all BSD might be affected, let me know if you > would rather me do that instead as I suspect we might be deadlocked > otherwise ;) Yeah, the $PERL_PATH thing is totally orthogonal, and should graduate separately. -Peff
On Thu, Nov 22, 2018 at 11:17:22AM -0500, Jeff King wrote: > The script I use is at: > > https://github.com/peff/git/blob/meta/stress > > which you invoke like "/path/to/stress t5562" from the top-level of a > git.git checkout. It basically just runs a loop of twice as many > simultaneous invocations of the test script as you have CPUs, and waits > for one to fail. The load created by all of the runs tends to flush out > timing effects after a while. > > It fails for me on t5562 within 30 seconds or so (but note that in this > particular case it sometimes takes a while to produce the final output > because invoke-with-content-length misses the expected SIGCLD and sleeps > the full 60 seconds). I have observed it caught failure at the very first run. However I could not fail t again. I tried running up to 20 instances, with 1 or 2 active cores (that's all I have here), also edited the test to include only push_plain case, and repeat it several times, to avoid running irrelevant cases, the failure never happened again. The first failure was a bit unusual, in the ouput actually all tests were marked as passed, but it still failed somehow. Unfortunately, I did not save the output. I submitted the perl patch
On Thu, Nov 22, 2018 at 3:43 PM Max Kirillov <max@max630.net> wrote: > also edited the test to include only push_plain case, > and repeat it several times, to avoid running irrelevant > cases, the failure never happened again. as I explained previously[1] and as odd as it might seem the push_plain case ONLY fails if your run them together with the other tests that return errors with compressed input frankly I don't understand how one could affect the other as they should be running in independent processes but it happens fairly consistently in NetBSD (tested 7.1, 7.2, 8) with only one CPU (tested i386 and amd64) Carlo [1] https://public-inbox.org/git/20181119213924.GA2318@sigill.intra.peff.net/T/#m041e9703432c39dcb04fe10e86fc53d5254474b4
On Tue, Nov 20, 2018 at 04:11:08AM -0500, Jeff King wrote: > On Mon, Nov 19, 2018 at 11:36:08AM -0800, Carlo Arenas wrote: > > > tests 3-8 seem to fail because perl is hardcoded to /urs/bin/perl in > > t5562/invoke-with-content-length.pl, while I seem to be getting some > > sporadic errors in 9 with the following output : > > > > ++ env CONTENT_TYPE=application/x-git-receive-pack-request > > QUERY_STRING=/repo.git/git-receive-pack > > 'PATH_TRANSLATED=/home/carenas/src/git/t/trash > > directory.t5562-http-backend-content-length/.git/git-receive-pack' > > GIT_HTTP_EXPORT_ALL=TRUE REQUEST_METHOD=POST > > /home/carenas/src/git/t/t5562/invoke-with-content-length.pl push_body > > git http-backend > > ++ verify_http_result '200 OK' > > ++ grep fatal: act.err > > Binary file act.err matches > > ++ return 1 > > error: last command exited with $?=1 > > not ok 9 - push plain > > > > and the following output in act.err (with a 200 in act) > > > > fatal: the remote end hung up unexpectedly > > This bit me today, too, and I can reproduce it by running under my > stress-testing script. I saw this a few times on Travis CI as well. > Curiously, the act.err file also has 54 NUL bytes before the "fatal:" > message. I think those NUL bytes come from the file system. The contents of 'act.err' from the previous test ('fetch gzipped empty') is usually: fatal: request ended in the middle of the gzip stream fatal: the remote end hung up unexpectedly Notice that the length of the first line is 54 bytes (including the trailing newline). So I suspect that the following is happening: - http-backend in the previous test writes the first line, - that test finishes and this one starts, - this test truncates 'act.err', - and then the still-running http-backend from the previous test finally writes the second line. So at this point 'act.err' is empty, but the offset of the fd of the redirection still open from the previous test is at 54, so the file system fills those bytes with NULs. > I tried adding an "strace" to see who was producing that > output, but I can't seem to get it to fail when running under strace > (presumably because the timing is quite different, and this likely some > kind of pipe race). > > -Peff
On Thu, Nov 22 2018, Jeff King wrote: > On Thu, Nov 22, 2018 at 02:17:01AM -0800, Carlo Arenas wrote: >> PS. upstreaming the PERL_PATH fix is likely to be good to do soonish >> as I presume at least all BSD might be affected, let me know if you >> would rather me do that instead as I suspect we might be deadlocked >> otherwise ;) > > Yeah, the $PERL_PATH thing is totally orthogonal, and should graduate > separately. On the subject of orthagonal things: This test fails on AIX with /bin/sh (but not /bin/bash) due to some interaction of ssize_b100dots and the build_option function. On that system: $ ./git version --build-options git version 2.20.0-rc1 cpu: 00FA74164C00 no commit associated with this build sizeof-long: 4 sizeof-size_t: 4 But it somehow ends up in the 'die' condition in that case statement. I dug around briefly but couldn't find the cause, probably some limitation in the shell constructs it supports. Just leaving a note about this...
On Wed, Nov 28, 2018 at 02:27:08PM +0100, SZEDER Gábor wrote: > > Curiously, the act.err file also has 54 NUL bytes before the "fatal:" > > message. > > I think those NUL bytes come from the file system. > > The contents of 'act.err' from the previous test ('fetch gzipped > empty') is usually: > > fatal: request ended in the middle of the gzip stream > fatal: the remote end hung up unexpectedly > > Notice that the length of the first line is 54 bytes (including the > trailing newline). So I suspect that the following is happening: > > - http-backend in the previous test writes the first line, > - that test finishes and this one starts, > - this test truncates 'act.err', > - and then the still-running http-backend from the previous test > finally writes the second line. > > So at this point 'act.err' is empty, but the offset of the fd of the > redirection still open from the previous test is at 54, so the file > system fills those bytes with NULs. Right, good thinking. Thanks for the explanation! -Peff
On Wed, Nov 28, 2018 at 03:56:29PM +0100, Ævar Arnfjörð Bjarmason wrote: > > On Thu, Nov 22 2018, Jeff King wrote: > > > On Thu, Nov 22, 2018 at 02:17:01AM -0800, Carlo Arenas wrote: > >> PS. upstreaming the PERL_PATH fix is likely to be good to do soonish > >> as I presume at least all BSD might be affected, let me know if you > >> would rather me do that instead as I suspect we might be deadlocked > >> otherwise ;) > > > > Yeah, the $PERL_PATH thing is totally orthogonal, and should graduate > > separately. > > On the subject of orthagonal things: This test fails on AIX with /bin/sh > (but not /bin/bash) due to some interaction of ssize_b100dots and the > build_option function. On that system: > > $ ./git version --build-options > git version 2.20.0-rc1 > cpu: 00FA74164C00 > no commit associated with this build > sizeof-long: 4 > sizeof-size_t: 4 > > But it somehow ends up in the 'die' condition in that case statement. I > dug around briefly but couldn't find the cause, probably some limitation > in the shell constructs it supports. Just leaving a note about this... That's weird. The functions involved are pretty vanilla. I'd suspect something funny with the sed invocation: build_option () { git version --build-options | sed -ne "s/^$1: //p" } but that's the one thing that shouldn't be dependent on the shell in use. Can you manually replicate the shell commands to see where it goes wrong? -Peff
diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh index b24d8b05a4..7594899471 100755 --- a/t/t5562-http-backend-content-length.sh +++ b/t/t5562-http-backend-content-length.sh @@ -3,6 +3,12 @@ test_description='test git-http-backend respects CONTENT_LENGTH' . ./test-lib.sh +if test -n "$NO_CURL" +then + skip_all='skipping test, git built without http support' + test_done +fi + test_lazy_prereq GZIP 'gzip --version' verify_http_result() {
6c213e863a ("http-backend: respect CONTENT_LENGTH for receive-pack", 2018-07-27) introduced all tests but without a check for CURL support from git. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> --- t/t5562-http-backend-content-length.sh | 6 ++++++ 1 file changed, 6 insertions(+)