[v2] archive: make --add-virtual-file honor --prefix

Message ID pull.1719.v2.git.git.1715967267420.gitgitgadget@gmail.com (mailing list archive)
State New, archived

Commit Message

Tom Scogland May 17, 2024, 5:34 p.m. UTC
From: Tom Scogland <scogland1@llnl.gov>

The documentation for archive describes the `--add-virtual-file` option
thusly:

  The path of the file in the archive is built by concatenating the
  value of the last `--prefix` moption (if any) before this
  `--add-virtual-file` and <path>.

The `--add-file` documentation is similar:

  The path of the file in the archive is built by concatenating the
  value of the last --prefix option (if any) before this --add-file and
  the basename of <file>.

Notably both explicitly state that they honor the last `--prefix` option
before the `--add` option in question.  The implementation of
`--add-file` seems to have always honored prefix, but the implementation
of `--add-virtual-file` does not.  Also note that `--add-virtual-file`
explicitly states it will use the full path given, while `--add-file`
uses the basename of the path it is given.
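
To make the difference concrete, here is a minimal C sketch of the three
path rules described above. It is not the code in archive.c: the function
names (last_component, add_file_path, virtual_path_documented,
virtual_path_current) are hypothetical and exist only for illustration,
and fixed-size buffers stand in for git's strbuf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: final path component, without modifying the input. */
const char *last_component(const char *path)
{
	const char *slash = strrchr(path, '/');
	return slash ? slash + 1 : path;
}

/* --add-file: last --prefix (if any) followed by the basename of <file>. */
void add_file_path(char *out, size_t n, const char *prefix, const char *file)
{
	int len = snprintf(out, n, "%s%s", prefix ? prefix : "",
			   last_component(file));
	assert(len >= 0 && (size_t)len < n);
}

/* --add-virtual-file as documented: last --prefix followed by the full <path>. */
void virtual_path_documented(char *out, size_t n, const char *prefix,
			     const char *path)
{
	int len = snprintf(out, n, "%s%s", prefix ? prefix : "", path);
	assert(len >= 0 && (size_t)len < n);
}

/* --add-virtual-file as implemented before this patch: the prefix is ignored. */
void virtual_path_current(char *out, size_t n, const char *prefix,
			  const char *path)
{
	int len;

	(void)prefix; /* deliberately unused */
	len = snprintf(out, n, "%s", path);
	assert(len >= 0 && (size_t)len < n);
}
```

With a prefix of "subdir/" and a path of "with/dirprefix", the documented
rule yields "subdir/with/dirprefix", while the pre-patch rule yields
"with/dirprefix".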

Modify archive.c to include the prefix in the path used by
`--add-virtual-file` and add checks into
the existing add-virtual-file test to verify:

* that `--prefix` is honored
* that leading path components are preserved
* that both work together and separately

Changes since v1:
- Revised the commit message style
- Added tests for basename/non-basename behavior
- Fixed archive.c to use full path for virtual and basename for add-file

Signed-off-by: Tom Scogland <scogland1@llnl.gov>
---
    archive: make --add-virtual-file honor --prefix

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1719%2Ftrws%2Fhonor-prefix-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1719/trws/honor-prefix-v2
Pull-Request: https://github.com/git/git/pull/1719

Range-diff vs v1:

 1:  1b685d8ca1a ! 1:  b9a0ef282a3 archive: make --add-virtual-file honor --prefix
     @@ Metadata
       ## Commit message ##
          archive: make --add-virtual-file honor --prefix
      
     -    The documentation for archive states:
     +    The documentation for archive describes the `--add-virtual-file` option
     +    thusly:
      
            The path of the file in the archive is built by concatenating the
            value of the last `--prefix` moption (if any) before this
            `--add-virtual-file` and <path>.
      
     -    This matches the documentation for --add-file and the behavior works for
     -    that option, but --prefix is ignored for --add-virtual-file.
     +    The `--add-file` documentation is similar:
      
     -    This commit modifies archive.c to include the prefix in the path and
     -    adds a check into the existing add-virtual-file test to ensure that it
     -    honors both the most recent prefix before the flag.
     +      The path of the file in the archive is built by concatenating the
     +      value of the last --prefix option (if any) before this --add-file and
     +      the basename of <file>.
     +
     +    Notably both explicitly state that they honor the last `--prefix` option
     +    before the `--add` option in question.  The implementation of
     +    `--add-file` seems to have always honored prefix, but the implementation
     +    of `--add-virtual-file` does not.  Also note that `--add-virtual-file`
     +    explicitly states it will use the full path given, while `--add-file`
     +    uses the basename of the path it is given.
     +
     +    Modify archive.c to include the prefix in the path used by
     +    `--add-virtual-file` and add checks into
     +    the existing add-virtual-file test to verify:
     +
     +    * that `--prefix` is honored
     +    * that leading path components are preserved
     +    * that both work together and separately
      
     -    In looking for others with this issue, I found message
     -    a143e25a70b44b82b4ee6fa3bb2bcda4@atlas-elektronik.com on the mailing
     -    list, where Stefan proposed a basically identical patch to archive.c
     -    back in February, so the main addition here is the test along with that
     -    patch.
     +    Changes since v1:
     +    - Revised the commit message style
     +    - Added tests for basename/non-basename behavior
     +    - Fixed archive.c to use full path for virtual and basename for add-file
      
          Signed-off-by: Tom Scogland <scogland1@llnl.gov>
      
     @@ archive.c: int write_archive_entries(struct archiver_args *args,
      +		strbuf_reset(&path_in_archive);
      +		if (info->base)
      +			strbuf_addstr(&path_in_archive, info->base);
     -+		strbuf_addstr(&path_in_archive, basename(path));
       		if (!info->content) {
      -			strbuf_reset(&path_in_archive);
      -			if (info->base)
      -				strbuf_addstr(&path_in_archive, info->base);
     --			strbuf_addstr(&path_in_archive, basename(path));
     + 			strbuf_addstr(&path_in_archive, basename(path));
      -
       			strbuf_reset(&content);
       			if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
       				err = error_errno(_("cannot read '%s'"), path);
      @@ archive.c: int write_archive_entries(struct archiver_args *args,
     + 						  canon_mode(info->stat.st_mode),
       						  content.buf, content.len);
       		} else {
     ++			strbuf_addstr(&path_in_archive, path);
       			err = write_entry(args, &fake_oid,
      -					  path, strlen(path),
      +					  path_in_archive.buf, path_in_archive.len,
     @@ t/t5003-archive-zip.sh: test_expect_success UNZIP 'git archive --format=zip --ad
       		--add-virtual-file=\""$PATHNAME"\": \
      -		--add-virtual-file=hello:world $EMPTY_TREE &&
      +		--add-virtual-file=hello:world \
     -+		--prefix=subdir/ --add-virtual-file=hello:world \
     ++		--add-virtual-file=with/dir/noprefix:withdirnopre \
     ++		--prefix=subdir/ --add-virtual-file=with/dirprefix:withdirprefix \
     ++		--prefix=subdir2/ --add-virtual-file=withoutdir:withoutdir \
      +		--prefix= $EMPTY_TREE &&
       	test_when_finished "rm -rf tmp-unpack" &&
       	mkdir tmp-unpack && (
     @@ t/t5003-archive-zip.sh: test_expect_success UNZIP 'git archive --format=zip --ad
       		test_path_is_file "$PATHNAME" &&
      -		test world = $(cat hello)
      +		test world = $(cat hello) &&
     -+		test_path_is_file subdir/hello &&
     -+		test world = $(cat subdir/hello)
     ++		test_path_is_file with/dir/noprefix &&
     ++		test withdirnopre = $(cat with/dir/noprefix) &&
     ++		test_path_is_file subdir/with/dirprefix &&
     ++		test withdirprefix = $(cat subdir/with/dirprefix) &&
     ++		test_path_is_file subdir2/withoutdir &&
     ++		test withoutdir = $(cat subdir2/withoutdir)
       	)
       '
       


 archive.c              | 10 +++++-----
 t/t5003-archive-zip.sh | 14 ++++++++++++--
 2 files changed, 17 insertions(+), 7 deletions(-)


base-commit: 786a3e4b8d754d2b14b1208b98eeb0a554ef19a8

Comments

Junio C Hamano May 17, 2024, 11:33 p.m. UTC | #1
"Tom Scogland via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Notably both explicitly state that they honor the last `--prefix` option
> before the `--add` option in question.  The implementation of
> `--add-file` seems to have always honored prefix, but the implementation
> of `--add-virtual-file` does not.

The above is misleading.

    The implementation of `--add-file` has always honored the prefix,
    while the implementation of `--add-virtual-file` has always ignored
    the prefix.

would make it easier to assess how long existing users may have been
relying on the current behaviour.

> Also note that `--add-virtual-file`
> explicitly states it will use the full path given, while `--add-file`
> uses the basename of the path it is given.

Yes, this is a very good thing to mention.  It is probably the
reason why the implementation decided not to add prefix to the "full
path" that already can have the leading directories.

> Modify archive.c to include the prefix in the path used by
> `--add-virtual-file` and add checks into
> the existing add-virtual-file test to verify:
>
> * that `--prefix` is honored
> * that leading path components are preserved
> * that both work together and separately

Very nice job explaining the chosen design clearly (even though I do
not necessarily agree with the direction this patch is going).

Also, given that this option was introduced for an explicit purpose
of using it to write out diagnostics archive file, we should mention
that this change does not break it in the proposed log message, at
least.  Of course, we should do so after verifying that is indeed
the case, and better yet, after verifying that it will be hard for
future changes to diagnose.c to trigger an unexpected behaviour
caused by this change [*].

> Changes since v1:
> - Revised the commit message style
> - Added tests for basename/non-basename behavior
> - Fixed archive.c to use full path for virtual and basename for add-file

The "changes since v1" section does not belong to the log message
proper, as v1 never happened as long as readers of "git log" are
concerned.  It is a very good thing to help reviewers to have below
the three-dash lines that comes after your sign-off, though.

> Signed-off-by: Tom Scogland <scogland1@llnl.gov>
> ---

>  archive.c              | 10 +++++-----
>  t/t5003-archive-zip.sh | 14 ++++++++++++--
>  2 files changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/archive.c b/archive.c
> index 5287fcdd8e0..64777a9870d 100644
> --- a/archive.c
> +++ b/archive.c
> @@ -365,12 +365,11 @@ int write_archive_entries(struct archiver_args *args,
>  
>  		put_be64(fake_oid.hash, i + 1);
>  
> +		strbuf_reset(&path_in_archive);
> +		if (info->base)
> +			strbuf_addstr(&path_in_archive, info->base);
>  		if (!info->content) {
> -			strbuf_reset(&path_in_archive);
> -			if (info->base)
> -				strbuf_addstr(&path_in_archive, info->base);
>  			strbuf_addstr(&path_in_archive, basename(path));
> -
>  			strbuf_reset(&content);
>  			if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
>  				err = error_errno(_("cannot read '%s'"), path);
> @@ -380,8 +379,9 @@ int write_archive_entries(struct archiver_args *args,
>  						  canon_mode(info->stat.st_mode),
>  						  content.buf, content.len);
>  		} else {
> +			strbuf_addstr(&path_in_archive, path);
>  			err = write_entry(args, &fake_oid,
> -					  path, strlen(path),
> +					  path_in_archive.buf, path_in_archive.len,
>  					  canon_mode(info->stat.st_mode),
>  					  info->content, info->stat.st_size);
>  		}
> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
> index 961c6aac256..0cf3aef8ace 100755
> --- a/t/t5003-archive-zip.sh
> +++ b/t/t5003-archive-zip.sh
> @@ -218,14 +218,24 @@ test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
>  	fi &&
>  	git archive --format=zip >with_file_with_content.zip \
>  		--add-virtual-file=\""$PATHNAME"\": \
> -		--add-virtual-file=hello:world $EMPTY_TREE &&
> +		--add-virtual-file=hello:world \
> +		--add-virtual-file=with/dir/noprefix:withdirnopre \
> +		--prefix=subdir/ --add-virtual-file=with/dirprefix:withdirprefix \
> +		--prefix=subdir2/ --add-virtual-file=withoutdir:withoutdir \
> +		--prefix= $EMPTY_TREE &&
>  	test_when_finished "rm -rf tmp-unpack" &&
>  	mkdir tmp-unpack && (
>  		cd tmp-unpack &&
>  		"$GIT_UNZIP" ../with_file_with_content.zip &&
>  		test_path_is_file hello &&
>  		test_path_is_file "$PATHNAME" &&
> -		test world = $(cat hello)
> +		test world = $(cat hello) &&
> +		test_path_is_file with/dir/noprefix &&
> +		test withdirnopre = $(cat with/dir/noprefix) &&
> +		test_path_is_file subdir/with/dirprefix &&
> +		test withdirprefix = $(cat subdir/with/dirprefix) &&
> +		test_path_is_file subdir2/withoutdir &&
> +		test withoutdir = $(cat subdir2/withoutdir)

OK.  With different payload at different paths, it is easier than
the previous round to see where things are expected to go in the
result.


[Footnote]

 * I got curious and did this part for you.  After calling
   add_directory_to_archiver() that uses "--prefix" to move the
   target directory around in the output hierarchy, the code clears
   it with "--prefix=", so even if a future change to diagnose.c adds
   more uses of --add-virtual-file= after that happens, they will not
   go too deep in the directory hierarchy where the last use of
   "--prefix" happened to be pointing at.
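
The "last --prefix before this option" rule that the footnote relies on
can be sketched as a left-to-right scan over the options. This is only an
illustration under stated assumptions, not git's option parser:
resolve_virtual_paths and its fixed 128-byte slots are hypothetical, the
sketch splits at the first colon and ignores quoted paths, and it shows
the paths that would result if the prefix were honored as in this patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: walk the options in order, remember the most
 * recent --prefix= value, and record the archive path each
 * --add-virtual-file=<path>:<content> would get if the prefix were
 * honored.  Returns the number of paths written to out.
 */
int resolve_virtual_paths(int argc, const char **argv,
			  char out[][128], int max_out)
{
	const char *prefix = "";
	int n = 0;

	for (int i = 0; i < argc; i++) {
		const char *arg = argv[i];

		if (!strncmp(arg, "--prefix=", 9)) {
			/* An empty value ("--prefix=") clears the prefix. */
			prefix = arg + 9;
		} else if (!strncmp(arg, "--add-virtual-file=", 19)) {
			const char *path = arg + 19;
			const char *colon = strchr(path, ':');
			int len;

			if (!colon || n >= max_out)
				continue; /* malformed or no room: skip */
			len = snprintf(out[n], 128, "%s%.*s", prefix,
				       (int)(colon - path), path);
			assert(len >= 0 && len < 128);
			n++;
		}
	}
	return n;
}
```

A virtual file that appears after "--prefix=" gets an empty prefix in this
sketch, which is the property the footnote depends on.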
Tom Scogland May 18, 2024, 12:26 a.m. UTC | #2
On 17 May 2024, at 16:33, Junio C Hamano wrote:

> "Tom Scogland via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> Notably both explicitly state that they honor the last `--prefix` option
>> before the `--add` option in question.  The implementation of
>> `--add-file` seems to have always honored prefix, but the implementation
>> of `--add-virtual-file` does not.
>
> The above is misleading.
>
>     The implementation of `--add-file` has always honored the prefix,
>     while the implementation of `--add-virtual-file` has always ignored
>     the prefix.
>
> would make it easier to assess how long existing users may have been
> relying on the current behaviour.

Fair, I had no intention to mislead and will reword.

>> Modify archive.c to include the prefix in the path used by
>> `--add-virtual-file` and add checks into
>> the existing add-virtual-file test to verify:
>>
>> * that `--prefix` is honored
>> * that leading path components are preserved
>> * that both work together and separately
>
> Very nice job explaining the chosen design clearly (even though I do
> not necessarily agree with the direction this patch is going).

Thanks for that.  As to the direction, I mentioned earlier adding a different flag, or perhaps marking the filename in some fashion to express that the prefix should be honored, would you prefer that? It would, as you said, be much safer in that there's no reason for it to be a breaking change. If there's a design you prefer that would result in having an opt-in way to get the prefix behavior I wouldn't mind implementing it.

> Also, given that this option was introduced for an explicit purpose
> of using it to write out diagnostics archive file, we should mention
> that this change does not break it in the proposed log message, at
> least.  Of course, we should do so after verifying that is indeed
> the case, and better yet, after verifying that it will be hard for
> future changes to diagnose.c to trigger an unexpected behaviour
> caused by this change [*].

That's a very good point, and thank you for digging into it.

>> Changes since v1:
>> - Revised the commit message style
>> - Added tests for basename/non-basename behavior
>> - Fixed archive.c to use full path for virtual and basename for add-file
>
> The "changes since v1" section does not belong to the log message
> proper, as v1 never happened as long as readers of "git log" are
> concerned.  It is a very good thing to help reviewers to have below
> the three-dash lines that comes after your sign-off, though.

My apologies, this is my unfamiliarity with GitGitGadget, I'll put information like this in the PR description next time, which I think will do that.
René Scharfe May 19, 2024, 1:25 p.m. UTC | #3
Am 17.05.24 um 19:34 schrieb Tom Scogland via GitGitGadget:
> From: Tom Scogland <scogland1@llnl.gov>
>
> The documentation for archive describes the `--add-virtual-file` option
> thusly:
>
>   The path of the file in the archive is built by concatenating the
>   value of the last `--prefix` moption (if any) before this
>   `--add-virtual-file` and <path>.

The documentation does not actually misspell "option" as "moption".

> The `--add-file` documentation is similar:
>
>   The path of the file in the archive is built by concatenating the
>   value of the last --prefix option (if any) before this --add-file and
>   the basename of <file>.
>
> Notably both explicitly state that they honor the last `--prefix` option
> before the `--add` option in question.  The implementation of
> `--add-file` seems to have always honored prefix, but the implementation
> of `--add-virtual-file` does not.  Also note that `--add-virtual-file`
> explicitly states it will use the full path given, while `--add-file`
> uses the basename of the path it is given.
>
> Modify archive.c to include the prefix in the path used by
> `--add-virtual-file`

Aligning code and docs is a good idea.  Have you considered keeping the
code as is and changing the documentation instead, though?

The two options are related in that they both add untracked files, but
they necessarily have different arguments:

   --add-file=<file>
   --add-virtual-file=<path>:<content>

You can already specify any path you want with --add-virtual-file.
What's the advantage of honoring --prefix as well?

René
Tom Scogland May 20, 2024, 4:10 p.m. UTC | #4
On 19 May 2024, at 6:25, René Scharfe wrote:

>
> Aligning code and docs is a good idea.  Have you considered keeping the
> code as is and changing the documentation instead, though?
>
> The two options are related in that they both add untracked files, but
> they necessarily have different arguments:
>
>    --add-file=<file>
>    --add-virtual-file=<path>:<content>
>
> You can already specify any path you want with --add-virtual-file.
> What's the advantage of honoring --prefix as well?

I came into this after trying to translate an --add-file to an --add-virtual-file and being surprised that the prefix wasn't applied. anything else you add, in the repo or not, it gets the prefix. I can go back and explicitly add the prefix, but it makes the options less naturally composable in my opinion.  The specific case was packaging some software that uses git metadata to set version and some other things when in a repo, and files in a tarball. The tarball's prefix is set in one part of the packaging code, files and exclusions in another, and passing that through was something we didn't have to think about using add-file, but do with add-virtual-file.

That said, it sounds like both you and Junio prefer updating the docs rather than the code, which makes me think I'm in the minority in that opinion.  If that's the case, I can certainly update the docs, and I imagine we can backport that easily wherever it makes sense.  I would really like to have the option to have the prefix apply though, either adding a new flag or an option to the existing one that would be invalid given current syntax or similar to provide the option.
René Scharfe May 20, 2024, 5:07 p.m. UTC | #5
Am 20.05.24 um 18:10 schrieb Tom Scogland:
>
> On 19 May 2024, at 6:25, René Scharfe wrote:
>
>> You can already specify any path you want with --add-virtual-file.
>> What's the advantage of honoring --prefix as well?
>
> I came into this after trying to translate an --add-file to an
> --add-virtual-file and being surprised that the prefix wasn't
> applied.

Understandable, the documentation promised otherwise and the options
have very similar names.

> anything else you add, in the repo or not, it gets the prefix.
> I can go back and explicitly add the prefix, but it makes the
> options less naturally composable in my opinion.

True, applying the prefix to all items would be simpler overall.

Speaking of simpler: --add-virtual-file could have been implemented to
only take a single argument -- the content -- and rely on --prefix to
provide the full path.  That's more consistent with other options, as
most of them only take single-valued arguments (or none). :]

> That said, it sounds like both you and Junio prefer updating the docs
> rather than the code, which makes me think I'm in the minority in
> that opinion.

I'm not sure I have an opinion on that topic, yet.  Fixing the
documentation is certainly easier.  Adding the prefix to the path of
virtual files as well is a breaking change.  I feel that the easier
route should at least be mentioned in the commit message and why it
was not taken.

> If that's the case, I can certainly update the docs, and I imagine we
> can backport that easily wherever it makes sense.  I would really
> like to have the option to have the prefix apply though, either
> adding a new flag or an option to the existing one that would be
> invalid given current syntax or similar to provide the option.

You mean like replacing a leading colon in the path with the prefix?

René
Junio C Hamano May 20, 2024, 5:55 p.m. UTC | #6
Tom Scogland <scogland1@llnl.gov> writes:

> That said, it sounds like both you and Junio prefer updating the
> docs rather than the code, which makes me think I'm in the
> minority in that opinion.  If that's the case, I can certainly
> update the docs, and I imagine we can backport that easily
> wherever it makes sense.  I would really like to have the option
> to have the prefix apply though, either adding a new flag or an
> option to the existing one that would be invalid given current
> syntax or similar to provide the option.

[jc: wrapped overly long lines]

Wouldn't

    #!/bin/sh
    prefix=foo/bar/
    git archive --prefix="$prefix" --add-file=x --add-file=y \
        --add-virtual-file="${prefix}path:contents"

be an option enough?  You only have to define $prefix once.
Junio C Hamano June 14, 2024, 6:07 p.m. UTC | #7
René Scharfe <l.s.r@web.de> writes:

> I'm not sure I have an opinion on that topic, yet.  Fixing the
> documentation is certainly easier.  Adding the prefix to the path of
> virtual files as well is a breaking change.  I feel that the easier
> route should at least be mentioned in the commit message and why it
> was not taken.

It has been a few weeks since this discussion stalled.  Let me make
an executive decision on the direction here---let's keep the behaviour
and align the documentation so that we won't break existing users.

Thanks.

Patch

diff --git a/archive.c b/archive.c
index 5287fcdd8e0..64777a9870d 100644
--- a/archive.c
+++ b/archive.c
@@ -365,12 +365,11 @@  int write_archive_entries(struct archiver_args *args,
 
 		put_be64(fake_oid.hash, i + 1);
 
+		strbuf_reset(&path_in_archive);
+		if (info->base)
+			strbuf_addstr(&path_in_archive, info->base);
 		if (!info->content) {
-			strbuf_reset(&path_in_archive);
-			if (info->base)
-				strbuf_addstr(&path_in_archive, info->base);
 			strbuf_addstr(&path_in_archive, basename(path));
-
 			strbuf_reset(&content);
 			if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
 				err = error_errno(_("cannot read '%s'"), path);
@@ -380,8 +379,9 @@  int write_archive_entries(struct archiver_args *args,
 						  canon_mode(info->stat.st_mode),
 						  content.buf, content.len);
 		} else {
+			strbuf_addstr(&path_in_archive, path);
 			err = write_entry(args, &fake_oid,
-					  path, strlen(path),
+					  path_in_archive.buf, path_in_archive.len,
 					  canon_mode(info->stat.st_mode),
 					  info->content, info->stat.st_size);
 		}
diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index 961c6aac256..0cf3aef8ace 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -218,14 +218,24 @@  test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
 	fi &&
 	git archive --format=zip >with_file_with_content.zip \
 		--add-virtual-file=\""$PATHNAME"\": \
-		--add-virtual-file=hello:world $EMPTY_TREE &&
+		--add-virtual-file=hello:world \
+		--add-virtual-file=with/dir/noprefix:withdirnopre \
+		--prefix=subdir/ --add-virtual-file=with/dirprefix:withdirprefix \
+		--prefix=subdir2/ --add-virtual-file=withoutdir:withoutdir \
+		--prefix= $EMPTY_TREE &&
 	test_when_finished "rm -rf tmp-unpack" &&
 	mkdir tmp-unpack && (
 		cd tmp-unpack &&
 		"$GIT_UNZIP" ../with_file_with_content.zip &&
 		test_path_is_file hello &&
 		test_path_is_file "$PATHNAME" &&
-		test world = $(cat hello)
+		test world = $(cat hello) &&
+		test_path_is_file with/dir/noprefix &&
+		test withdirnopre = $(cat with/dir/noprefix) &&
+		test_path_is_file subdir/with/dirprefix &&
+		test withdirprefix = $(cat subdir/with/dirprefix) &&
+		test_path_is_file subdir2/withoutdir &&
+		test withoutdir = $(cat subdir2/withoutdir)
 	)
 '