diff mbox series

[v3,5/5] archive-tar: use internal gzip by default

Message ID d9e75b24-c351-e226-011d-5a5cc2e1c858@web.de (mailing list archive)
State Superseded
Headers show
Series Avoid spawning gzip in git archive | expand

Commit Message

René Scharfe June 12, 2022, 6:19 a.m. UTC
Drop the dependency on gzip(1) and use our internal implementation to
create tar.gz and tgz files.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 Documentation/git-archive.txt |  4 ++--
 archive-tar.c                 |  4 ++--
 t/t5000-tar-tree.sh           | 32 ++++++++++++++++----------------
 3 files changed, 20 insertions(+), 20 deletions(-)

--
2.36.1

Comments

Junio C Hamano June 13, 2022, 9:55 p.m. UTC | #1
René Scharfe <l.s.r@web.de> writes:

> -test_expect_success GZIP 'git archive --format=tar.gz' '
> +test_expect_success 'git archive --format=tar.gz' '
>  	git archive --format=tar.gz HEAD >j1.tar.gz &&
>  	test_cmp_bin j.tgz j1.tar.gz
>  '

Curiously, this breaks for me.  It is understandable if we are not
producing byte-for-byte identical output with internal gzip.

With the following hack I can force the step pass, so it seems that
the two invocations of internal gzip are not emitting identical
result for the tar stream taken out of HEAD^{tree} object?

diff --git c/t/t5000-tar-tree.sh w/t/t5000-tar-tree.sh
index 1a68e89a55..c0a2cb92d4 100755
--- c/t/t5000-tar-tree.sh
+++ w/t/t5000-tar-tree.sh
@@ -340,14 +340,16 @@ test_expect_success 'only enabled filters are available remotely' '
 '
 
 test_expect_success 'git archive --format=tgz' '
-	git archive --format=tgz HEAD >j.tgz
+	git -c tar.tgz.command="gzip -cn" archive --format=tgz HEAD >j.tgz
 '
 
 test_expect_success 'git archive --format=tar.gz' '
-	git archive --format=tar.gz HEAD >j1.tar.gz &&
+	git -c tar.tar.gz.command="gzip -cn" archive --format=tar.gz HEAD >j1.tar.gz &&
 	test_cmp_bin j.tgz j1.tar.gz
 '
 
+exit
+
 test_expect_success 'infer tgz from .tgz filename' '
 	git archive --output=j2.tgz HEAD &&
 	test_cmp_bin j.tgz j2.tgz
Johannes Schindelin June 14, 2022, 11:27 a.m. UTC | #2
Hi Junio,

On Mon, 13 Jun 2022, Junio C Hamano wrote:

> René Scharfe <l.s.r@web.de> writes:
>
> > -test_expect_success GZIP 'git archive --format=tar.gz' '
> > +test_expect_success 'git archive --format=tar.gz' '
> >  	git archive --format=tar.gz HEAD >j1.tar.gz &&
> >  	test_cmp_bin j.tgz j1.tar.gz
> >  '
>
> Curiously, this breaks for me.  It is understandable if we are not
> producing byte-for-byte identical output with internal gzip.

Indeed, I can reproduce this, too. In particular, `j.tgz` and `j1.tar.gz`
differ like this in my test run:

-00000000  1f 8b 08 1a 00 2e ca 09  00 03 04 00 89 45 fc 83 |.............E..|
+00000000  1f 8b 08 1a 00 35 2a 10  00 03 04 00 89 45 fc 83 |.....5*......E..|

and

-00000010  7d fc 00 f1 d0 ec b7 63  8c 30 cc 9b e6 db b6 6d |}......c.0.....m|
+00000010  7d fc 00 54 ff ec b7 63  8c 30 cc 9b e6 db b6 6d |}..T...c.0.....m|

According to https://datatracker.ietf.org/doc/html/rfc1952#page-5, the
difference in the first line is the mtime. For reference, this is the
version with `git -c tar.tgz.command="gzip -cn" archive --format=tgz
HEAD`:

00000000  1f 8b 08 00 00 00 00 00  00 03 ec b7 63 8c 30 cc |............c.0.|

In other words, `gzip` forces the `mtim` member to all zeros, which makes
sense.

The recorded mtimes are a bit funny, according to
https://wolf-tungsten.github.io/gzip-analyzer/, they are 1975-03-17
00:36:32 and 1978-08-05 22:45:36, respectively...

And the mtime actually changes all the time.

What's even more funny: if I comment out the `deflateSetHeader()`, the
mtime header field is left at all-zeros. This is on Ubuntu 18.04 with
zlib1g 1:1.2.11.dfsg-0ubuntu2.

So I dug in a bit deeper and what do you know, the `deflateHeader()`
function is implemented like this
(https://github.com/madler/zlib/blob/21767c654d31/deflate.c#L557-L565):

	int ZEXPORT deflateSetHeader (strm, head)
	    z_streamp strm;
	    gz_headerp head;
	{
	    if (deflateStateCheck(strm) || strm->state->wrap != 2)
		return Z_STREAM_ERROR;
	    strm->state->gzhead = head;
	    return Z_OK;
	}

Now, the caller is implemented like this:

	static void tgz_set_os(git_zstream *strm, int os)
	{
	#if ZLIB_VERNUM >= 0x1221
		struct gz_header_s gzhead = { .os = os };
		deflateSetHeader(&strm->z, &gzhead);
	#endif
	}

The biggest problem is not that the return value of `deflateSetHeader()`
is ignored. The biggest problem is that it passes the address of a heap
variable to the `deflateSetHeader()` function, which then stores it away
in another struct that lives beyond the point when we return from
`tgz_set_os()`.

In other words, this is the very issue I pointed out as GCC not catching:
https://lore.kernel.org/git/nycvar.QRO.7.76.6.2205272235220.349@tvgsbejvaqbjf.bet/

The solution is to move the heap variable back into a scope that matches
the lifetime of the compression:

-- snip --
diff --git a/archive-tar.c b/archive-tar.c
index 60669eb7b9c..3d77e0f7509 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -460,17 +460,12 @@ static void tgz_write_block(const void *data)

 static const char internal_gzip_command[] = "git archive gzip";

-static void tgz_set_os(git_zstream *strm, int os)
-{
-#if ZLIB_VERNUM >= 0x1221
-	struct gz_header_s gzhead = { .os = os };
-	deflateSetHeader(&strm->z, &gzhead);
-#endif
-}
-
 static int write_tar_filter_archive(const struct archiver *ar,
 				    struct archiver_args *args)
 {
+#if ZLIB_VERNUM >= 0x1221
+	struct gz_header_s gzhead = { .os = 3 }; /* Unix, for reproducibility */
+#endif
 	struct strbuf cmd = STRBUF_INIT;
 	struct child_process filter = CHILD_PROCESS_INIT;
 	int r;
@@ -481,7 +476,10 @@ static int write_tar_filter_archive(const struct archiver *ar,
 	if (!strcmp(ar->filter_command, internal_gzip_command)) {
 		write_block = tgz_write_block;
 		git_deflate_init_gzip(&gzstream, args->compression_level);
-		tgz_set_os(&gzstream, 3); /* Unix, for reproducibility */
+#if ZLIB_VERNUM >= 0x1221
+		if (deflateSetHeader(&gzstream.z, &gzhead) != Z_OK)
+			BUG("deflateSetHeader() called too late");
+#endif
 		gzstream.next_out = outbuf;
 		gzstream.avail_out = sizeof(outbuf);

-- snap --

With this, the test passes for me.

René, would you mind squashing this into your patch series?

Thank you,
Dscho
René Scharfe June 14, 2022, 3:47 p.m. UTC | #3
Am 14.06.22 um 13:27 schrieb Johannes Schindelin:
> Hi Junio,
>
> On Mon, 13 Jun 2022, Junio C Hamano wrote:
>
>> René Scharfe <l.s.r@web.de> writes:
>>
>>> -test_expect_success GZIP 'git archive --format=tar.gz' '
>>> +test_expect_success 'git archive --format=tar.gz' '
>>>  	git archive --format=tar.gz HEAD >j1.tar.gz &&
>>>  	test_cmp_bin j.tgz j1.tar.gz
>>>  '
>>
>> Curiously, this breaks for me.  It is understandable if we are not
>> producing byte-for-byte identical output with internal gzip.

Makes sense in retrospect, there's no reason the output of gzip(1) and
zlib would have to be the same exactly.  It just happened to be so on my
platform, so the tests deceptively passed for me.  I think we simply
have to drop those that try to compare compressed files made by
different tools -- we can still check if their content can be extracted
and matches.

> Indeed, I can reproduce this, too. In particular, `j.tgz` and `j1.tar.gz`
> differ like this in my test run:
>
> -00000000  1f 8b 08 1a 00 2e ca 09  00 03 04 00 89 45 fc 83 |.............E..|
> +00000000  1f 8b 08 1a 00 35 2a 10  00 03 04 00 89 45 fc 83 |.....5*......E..|
>
> and
>
> -00000010  7d fc 00 f1 d0 ec b7 63  8c 30 cc 9b e6 db b6 6d |}......c.0.....m|
> +00000010  7d fc 00 54 ff ec b7 63  8c 30 cc 9b e6 db b6 6d |}..T...c.0.....m|
>
> According to https://datatracker.ietf.org/doc/html/rfc1952#page-5, the
> difference in the first line is the mtime. For reference, this is the
> version with `git -c tar.tgz.command="gzip -cn" archive --format=tgz
> HEAD`:
>
> 00000000  1f 8b 08 00 00 00 00 00  00 03 ec b7 63 8c 30 cc |............c.0.|
>
> In other words, `gzip` forces the `mtim` member to all zeros, which makes
> sense.

And that's what zlib does as well for me:

   $ ./git-archive --format=tgz HEAD | hexdump -C | head -1
   00000000  1f 8b 08 00 00 00 00 00  00 03 ec bd 59 73 1c 49  |..........?Ys.I|

>
> The recorded mtimes are a bit funny, according to
> https://wolf-tungsten.github.io/gzip-analyzer/, they are 1975-03-17
> 00:36:32 and 1978-08-05 22:45:36, respectively...
>
> And the mtime actually changes all the time.
>
> What's even more funny: if I comment out the `deflateSetHeader()`, the
> mtime header field is left at all-zeros. This is on Ubuntu 18.04 with
> zlib1g 1:1.2.11.dfsg-0ubuntu2.
>
> So I dug in a bit deeper and what do you know, the `deflateHeader()`
> function is implemented like this
> (https://github.com/madler/zlib/blob/21767c654d31/deflate.c#L557-L565):
>
> 	int ZEXPORT deflateSetHeader (strm, head)
> 	    z_streamp strm;
> 	    gz_headerp head;
> 	{
> 	    if (deflateStateCheck(strm) || strm->state->wrap != 2)
> 		return Z_STREAM_ERROR;
> 	    strm->state->gzhead = head;
> 	    return Z_OK;
> 	}
>
> Now, the caller is implemented like this:
>
> 	static void tgz_set_os(git_zstream *strm, int os)
> 	{
> 	#if ZLIB_VERNUM >= 0x1221
> 		struct gz_header_s gzhead = { .os = os };
> 		deflateSetHeader(&strm->z, &gzhead);
> 	#endif
> 	}
>
> The biggest problem is not that the return value of `deflateSetHeader()`
> is ignored. The biggest problem is that it passes the address of a heap
> variable to the `deflateSetHeader()` function, which then stores it away
> in another struct that lives beyond the point when we return from
> `tgz_set_os()`.

Ah, you mean the address of an automatic variable on the stack, but I
get it.  D'oh!

>
> In other words, this is the very issue I pointed out as GCC not catching:
> https://lore.kernel.org/git/nycvar.QRO.7.76.6.2205272235220.349@tvgsbejvaqbjf.bet/
>
> The solution is to move the heap variable back into a scope that matches
> the lifetime of the compression:
>
> -- snip --
> diff --git a/archive-tar.c b/archive-tar.c
> index 60669eb7b9c..3d77e0f7509 100644
> --- a/archive-tar.c
> +++ b/archive-tar.c
> @@ -460,17 +460,12 @@ static void tgz_write_block(const void *data)
>
>  static const char internal_gzip_command[] = "git archive gzip";
>
> -static void tgz_set_os(git_zstream *strm, int os)
> -{
> -#if ZLIB_VERNUM >= 0x1221
> -	struct gz_header_s gzhead = { .os = os };
> -	deflateSetHeader(&strm->z, &gzhead);
> -#endif
> -}
> -
>  static int write_tar_filter_archive(const struct archiver *ar,
>  				    struct archiver_args *args)
>  {
> +#if ZLIB_VERNUM >= 0x1221
> +	struct gz_header_s gzhead = { .os = 3 }; /* Unix, for reproducibility */
> +#endif
>  	struct strbuf cmd = STRBUF_INIT;
>  	struct child_process filter = CHILD_PROCESS_INIT;
>  	int r;
> @@ -481,7 +476,10 @@ static int write_tar_filter_archive(const struct archiver *ar,
>  	if (!strcmp(ar->filter_command, internal_gzip_command)) {
>  		write_block = tgz_write_block;
>  		git_deflate_init_gzip(&gzstream, args->compression_level);
> -		tgz_set_os(&gzstream, 3); /* Unix, for reproducibility */
> +#if ZLIB_VERNUM >= 0x1221
> +		if (deflateSetHeader(&gzstream.z, &gzhead) != Z_OK)
> +			BUG("deflateSetHeader() called too late");
> +#endif
>  		gzstream.next_out = outbuf;
>  		gzstream.avail_out = sizeof(outbuf);
>
> -- snap --

Good find, thank you!  A shorter solution would be to make gzhead static.

>
> With this, the test passes for me.
>
> René, would you mind squashing this into your patch series?
>
> Thank you,
> Dscho
René Scharfe June 14, 2022, 3:56 p.m. UTC | #4
Am 14.06.22 um 17:47 schrieb René Scharfe:
> Am 14.06.22 um 13:27 schrieb Johannes Schindelin:
>> Hi Junio,
>>
>> On Mon, 13 Jun 2022, Junio C Hamano wrote:
>>
>>> René Scharfe <l.s.r@web.de> writes:
>>>
>>>> -test_expect_success GZIP 'git archive --format=tar.gz' '
>>>> +test_expect_success 'git archive --format=tar.gz' '
>>>>  	git archive --format=tar.gz HEAD >j1.tar.gz &&
>>>>  	test_cmp_bin j.tgz j1.tar.gz
>>>>  '
>>>
>>> Curiously, this breaks for me.  It is understandable if we are not
>>> producing byte-for-byte identical output with internal gzip.
>
> Makes sense in retrospect, there's no reason the output of gzip(1) and
> zlib would have to be the same exactly.  It just happened to be so on my
> platform, so the tests deceptively passed for me.  I think we simply
> have to drop those that try to compare compressed files made by
> different tools -- we can still check if their content can be extracted
> and matches.

I have to take that back, I was confused -- the tests are fine.  There are no
comparisons between gzip-generated and zlib-generated files.  There's just
the gzhead use-after-return error that Dscho discovered.

René
Johannes Schindelin June 14, 2022, 4:29 p.m. UTC | #5
Hi René,

On Tue, 14 Jun 2022, René Scharfe wrote:

> Am 14.06.22 um 13:27 schrieb Johannes Schindelin:
>
> > The solution is to move the heap variable back into a scope that matches
> > the lifetime of the compression:
> >
> > -- snip --
> > diff --git a/archive-tar.c b/archive-tar.c
> > index 60669eb7b9c..3d77e0f7509 100644
> > --- a/archive-tar.c
> > +++ b/archive-tar.c
> > @@ -460,17 +460,12 @@ static void tgz_write_block(const void *data)
> >
> >  static const char internal_gzip_command[] = "git archive gzip";
> >
> > -static void tgz_set_os(git_zstream *strm, int os)
> > -{
> > -#if ZLIB_VERNUM >= 0x1221
> > -	struct gz_header_s gzhead = { .os = os };
> > -	deflateSetHeader(&strm->z, &gzhead);
> > -#endif
> > -}
> > -
> >  static int write_tar_filter_archive(const struct archiver *ar,
> >  				    struct archiver_args *args)
> >  {
> > +#if ZLIB_VERNUM >= 0x1221
> > +	struct gz_header_s gzhead = { .os = 3 }; /* Unix, for reproducibility */
> > +#endif
> >  	struct strbuf cmd = STRBUF_INIT;
> >  	struct child_process filter = CHILD_PROCESS_INIT;
> >  	int r;
> > @@ -481,7 +476,10 @@ static int write_tar_filter_archive(const struct archiver *ar,
> >  	if (!strcmp(ar->filter_command, internal_gzip_command)) {
> >  		write_block = tgz_write_block;
> >  		git_deflate_init_gzip(&gzstream, args->compression_level);
> > -		tgz_set_os(&gzstream, 3); /* Unix, for reproducibility */
> > +#if ZLIB_VERNUM >= 0x1221
> > +		if (deflateSetHeader(&gzstream.z, &gzhead) != Z_OK)
> > +			BUG("deflateSetHeader() called too late");
> > +#endif
> >  		gzstream.next_out = outbuf;
> >  		gzstream.avail_out = sizeof(outbuf);
> >
> > -- snap --
>
> Good find, thank you!  A shorter solution would be to make gzhead static.

I should have said that I had considered this, but decided against it
because it would introduce yet another issue: it would render the code
needlessly un-multi-threadable. And that can be avoided _really_ easily.

Ciao,
Dscho
René Scharfe June 14, 2022, 8:04 p.m. UTC | #6
Am 14.06.22 um 18:29 schrieb Johannes Schindelin:
> Hi René,
>
> On Tue, 14 Jun 2022, René Scharfe wrote:
>
>> Am 14.06.22 um 13:27 schrieb Johannes Schindelin:
>>
>>> The solution is to move the heap variable back into a scope that matches
>>> the lifetime of the compression:
>>>
>>> -- snip --
>>> diff --git a/archive-tar.c b/archive-tar.c
>>> index 60669eb7b9c..3d77e0f7509 100644
>>> --- a/archive-tar.c
>>> +++ b/archive-tar.c
>>> @@ -460,17 +460,12 @@ static void tgz_write_block(const void *data)
>>>
>>>  static const char internal_gzip_command[] = "git archive gzip";
>>>
>>> -static void tgz_set_os(git_zstream *strm, int os)
>>> -{
>>> -#if ZLIB_VERNUM >= 0x1221
>>> -	struct gz_header_s gzhead = { .os = os };
>>> -	deflateSetHeader(&strm->z, &gzhead);
>>> -#endif
>>> -}
>>> -
>>>  static int write_tar_filter_archive(const struct archiver *ar,
>>>  				    struct archiver_args *args)
>>>  {
>>> +#if ZLIB_VERNUM >= 0x1221
>>> +	struct gz_header_s gzhead = { .os = 3 }; /* Unix, for reproducibility */
>>> +#endif
>>>  	struct strbuf cmd = STRBUF_INIT;
>>>  	struct child_process filter = CHILD_PROCESS_INIT;
>>>  	int r;
>>> @@ -481,7 +476,10 @@ static int write_tar_filter_archive(const struct archiver *ar,
>>>  	if (!strcmp(ar->filter_command, internal_gzip_command)) {
>>>  		write_block = tgz_write_block;
>>>  		git_deflate_init_gzip(&gzstream, args->compression_level);
>>> -		tgz_set_os(&gzstream, 3); /* Unix, for reproducibility */
>>> +#if ZLIB_VERNUM >= 0x1221
>>> +		if (deflateSetHeader(&gzstream.z, &gzhead) != Z_OK)
>>> +			BUG("deflateSetHeader() called too late");
>>> +#endif
>>>  		gzstream.next_out = outbuf;
>>>  		gzstream.avail_out = sizeof(outbuf);
>>>
>>> -- snap --
>>
>> Good find, thank you!  A shorter solution would be to make gzhead static.
>
> I should have said that I had considered this, but decided against it
> because it would introduce yet another issue: it would render the code
> needlessly un-multi-threadable. And that can be avoided _really_ easily.

archive-tar.c (and archive-zip.c) use other static variables, so a
static gzhead won't break or block anything in this regard.  There was
no interest in running it in parallel threads so far AFAIK, and it's
hard for me to imagine the usefulness of creating multiple .tgz files at
the same time.

The doubled ZLIB_VERNUM is unsightly and I'm not sure the BUG check is
useful -- I omitted error checking because there is no recurse for us if
deflateSetHeader() doesn't work, and on ancient zlib versions we
silently continue anyway.

But that's all just minor quibbling -- I'll include your changes in the
next version, they look fine overall.

René
Junio C Hamano June 15, 2022, 4:41 p.m. UTC | #7
René Scharfe <l.s.r@web.de> writes:

> archive-tar.c (and archive-zip.c) use other static variables, so a
> static gzhead won't break or block anything in this regard.  There was
> no interest in running it in parallel threads so far AFAIK, and it's
> hard for me to imagine the usefulness of creating multiple .tgz files at
> the same time.

;-)  FWIW I had exactly the same reaction.  

If this were a lot isolated piece of helper function, perhaps, but
no, reentrancy for this helper function specifically written for
this code path is not a very good argument.

The code structure the (not very good) argument tried to suport,
however, is a good thing to have regardless, and if the code is
written in such a way from the beginning, there is no reason to
reject it.  If it opens the door to a unified way to deal with all
the other static global variables (e.g. have a "archiver_state"
structure that collects all of them, with this one included, and
pass it around, or something), that would be great.

Short of that, I do not care too much either way.  As this topic is
not even in 'next', "once written in one way, it is not worth the
code churn to rewrite it in the other way, when these two ways are
both reasonable" does not yet apply here, so ...

> But that's all just minor quibbling -- I'll include your changes in the
> next version, they look fine overall.

... that's fine, too.

Thanks, both.
diff mbox series

Patch

diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index 5b017c2bdc..9de12896fc 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -148,8 +148,8 @@  tar.<format>.command::
 	format is given.
 +
 The "tar.gz" and "tgz" formats are defined automatically and default to
-`gzip -cn`. You may override them with custom commands. An internal gzip
-implementation can be used by specifying the value `git archive gzip`.
+the magic value `git archive gzip`, which invokes an internal
+implementation of gzip. You may override them with custom commands.

 tar.<format>.remote::
 	If true, enable `<format>` for use by remote clients via
diff --git a/archive-tar.c b/archive-tar.c
index bf7e321e0e..60669eb7b9 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -528,9 +528,9 @@  void init_tar_archiver(void)
 	int i;
 	register_archiver(&tar_archiver);

-	tar_filter_config("tar.tgz.command", "gzip -cn", NULL);
+	tar_filter_config("tar.tgz.command", internal_gzip_command, NULL);
 	tar_filter_config("tar.tgz.remote", "true", NULL);
-	tar_filter_config("tar.tar.gz.command", "gzip -cn", NULL);
+	tar_filter_config("tar.tar.gz.command", internal_gzip_command, NULL);
 	tar_filter_config("tar.tar.gz.remote", "true", NULL);
 	git_config(git_tar_config, NULL);
 	for (i = 0; i < nr_tar_filters; i++) {
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 9ac0ec67fe..1a68e89a55 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -339,21 +339,21 @@  test_expect_success 'only enabled filters are available remotely' '
 	test_cmp_bin remote.bar config.bar
 '

-test_expect_success GZIP 'git archive --format=tgz' '
+test_expect_success 'git archive --format=tgz' '
 	git archive --format=tgz HEAD >j.tgz
 '

-test_expect_success GZIP 'git archive --format=tar.gz' '
+test_expect_success 'git archive --format=tar.gz' '
 	git archive --format=tar.gz HEAD >j1.tar.gz &&
 	test_cmp_bin j.tgz j1.tar.gz
 '

-test_expect_success GZIP 'infer tgz from .tgz filename' '
+test_expect_success 'infer tgz from .tgz filename' '
 	git archive --output=j2.tgz HEAD &&
 	test_cmp_bin j.tgz j2.tgz
 '

-test_expect_success GZIP 'infer tgz from .tar.gz filename' '
+test_expect_success 'infer tgz from .tar.gz filename' '
 	git archive --output=j3.tar.gz HEAD &&
 	test_cmp_bin j.tgz j3.tar.gz
 '
@@ -363,31 +363,31 @@  test_expect_success GZIP 'extract tgz file' '
 	test_cmp_bin b.tar j.tar
 '

-test_expect_success GZIP 'remote tar.gz is allowed by default' '
+test_expect_success 'remote tar.gz is allowed by default' '
 	git archive --remote=. --format=tar.gz HEAD >remote.tar.gz &&
 	test_cmp_bin j.tgz remote.tar.gz
 '

-test_expect_success GZIP 'remote tar.gz can be disabled' '
+test_expect_success 'remote tar.gz can be disabled' '
 	git config tar.tar.gz.remote false &&
 	test_must_fail git archive --remote=. --format=tar.gz HEAD \
 		>remote.tar.gz
 '

-test_expect_success 'git archive --format=tgz (internal gzip)' '
-	test_config tar.tgz.command "git archive gzip" &&
-	git archive --format=tgz HEAD >internal_gzip.tgz
+test_expect_success GZIP 'git archive --format=tgz (external gzip)' '
+	test_config tar.tgz.command "gzip -cn" &&
+	git archive --format=tgz HEAD >external_gzip.tgz
 '

-test_expect_success 'git archive --format=tar.gz (internal gzip)' '
-	test_config tar.tar.gz.command "git archive gzip" &&
-	git archive --format=tar.gz HEAD >internal_gzip.tar.gz &&
-	test_cmp_bin internal_gzip.tgz internal_gzip.tar.gz
+test_expect_success GZIP 'git archive --format=tar.gz (external gzip)' '
+	test_config tar.tar.gz.command "gzip -cn" &&
+	git archive --format=tar.gz HEAD >external_gzip.tar.gz &&
+	test_cmp_bin external_gzip.tgz external_gzip.tar.gz
 '

-test_expect_success GZIP 'extract tgz file (internal gzip)' '
-	gzip -d -c <internal_gzip.tgz >internal_gzip.tar &&
-	test_cmp_bin b.tar internal_gzip.tar
+test_expect_success GZIP 'extract tgz file (external gzip)' '
+	gzip -d -c <external_gzip.tgz >external_gzip.tar &&
+	test_cmp_bin b.tar external_gzip.tar
 '

 test_expect_success 'archive and :(glob)' '