diff mbox series

[v2,2/4] http-fetch: allow custom index-pack args

Message ID 57220ceb841056aade08705ca0ac73ccc69f05ab.1614021093.git.jonathantanmy@google.com (mailing list archive)
State Accepted
Commit 27e35ba6c6b532ad6fc88b554d437b6892b4e915
Headers show
Series Check .gitmodules when using packfile URIs | expand

Commit Message

Jonathan Tan Feb. 22, 2021, 7:20 p.m. UTC
This is the next step in teaching fetch-pack to pass its index-pack
arguments when processing packfiles referenced by URIs.

The "--keep" in fetch-pack.c will be replaced with a full message in a
subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-http-fetch.txt | 10 ++++++++--
 fetch-pack.c                     |  3 +++
 http-fetch.c                     | 20 +++++++++++++++-----
 t/t5550-http-fetch-dumb.sh       |  5 ++++-
 4 files changed, 30 insertions(+), 8 deletions(-)

Comments

Ævar Arnfjörð Bjarmason Feb. 23, 2021, 1:17 p.m. UTC | #1
On Mon, Feb 22 2021, Jonathan Tan wrote:

> diff --git a/Documentation/git-http-fetch.txt b/Documentation/git-http-fetch.txt
> index 4deb4893f5..9fa17b60e4 100644
> --- a/Documentation/git-http-fetch.txt
> +++ b/Documentation/git-http-fetch.txt
> @@ -41,11 +41,17 @@ commit-id::
>  		<commit-id>['\t'<filename-as-in--w>]
>  
>  --packfile=<hash>::
> -	Instead of a commit id on the command line (which is not expected in
> +	For internal use only. Instead of a commit id on the command
> +	line (which is not expected in
>  	this case), 'git http-fetch' fetches the packfile directly at the given
>  	URL and uses index-pack to generate corresponding .idx and .keep files.
>  	The hash is used to determine the name of the temporary file and is
> -	arbitrary. The output of index-pack is printed to stdout.
> +	arbitrary. The output of index-pack is printed to stdout. Requires
> +	--index-pack-args.
> +
> +--index-pack-args=<args>::
> +	For internal use only. The command to run on the contents of the
> +	downloaded pack. Arguments are URL-encoded separated by spaces.
>  
>  --recover::
>  	Verify that everything reachable from target is fetched.  Used after
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 876f90c759..aeac010b0b 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1645,6 +1645,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
>  		strvec_pushf(&cmd.args, "--packfile=%.*s",
>  			     (int) the_hash_algo->hexsz,
>  			     packfile_uris.items[i].string);
> +		strvec_push(&cmd.args, "--index-pack-arg=index-pack");
> +		strvec_push(&cmd.args, "--index-pack-arg=--stdin");
> +		strvec_push(&cmd.args, "--index-pack-arg=--keep");

The docs say --*-args, but the code checks --*arg, that seems like a
mistake that should be fixed to make the code/tests use the plural form,
no?
Jonathan Tan Feb. 23, 2021, 4:51 p.m. UTC | #2
> > diff --git a/Documentation/git-http-fetch.txt b/Documentation/git-http-fetch.txt
> > index 4deb4893f5..9fa17b60e4 100644
> > --- a/Documentation/git-http-fetch.txt
> > +++ b/Documentation/git-http-fetch.txt
> > @@ -41,11 +41,17 @@ commit-id::
> >  		<commit-id>['\t'<filename-as-in--w>]
> >  
> >  --packfile=<hash>::
> > -	Instead of a commit id on the command line (which is not expected in
> > +	For internal use only. Instead of a commit id on the command
> > +	line (which is not expected in
> >  	this case), 'git http-fetch' fetches the packfile directly at the given
> >  	URL and uses index-pack to generate corresponding .idx and .keep files.
> >  	The hash is used to determine the name of the temporary file and is
> > -	arbitrary. The output of index-pack is printed to stdout.
> > +	arbitrary. The output of index-pack is printed to stdout. Requires
> > +	--index-pack-args.
> > +
> > +--index-pack-args=<args>::
> > +	For internal use only. The command to run on the contents of the
> > +	downloaded pack. Arguments are URL-encoded separated by spaces.
> >  
> >  --recover::
> >  	Verify that everything reachable from target is fetched.  Used after
> > diff --git a/fetch-pack.c b/fetch-pack.c
> > index 876f90c759..aeac010b0b 100644
> > --- a/fetch-pack.c
> > +++ b/fetch-pack.c
> > @@ -1645,6 +1645,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> >  		strvec_pushf(&cmd.args, "--packfile=%.*s",
> >  			     (int) the_hash_algo->hexsz,
> >  			     packfile_uris.items[i].string);
> > +		strvec_push(&cmd.args, "--index-pack-arg=index-pack");
> > +		strvec_push(&cmd.args, "--index-pack-arg=--stdin");
> > +		strvec_push(&cmd.args, "--index-pack-arg=--keep");
> 
> The docs say --*-args, but the code checks --*arg, that seems like a
> mistake that should be fixed to make the code/tests use the plural form,
> no?

Thanks for catching that. Originally it was plural since this single
argument would give multiple arguments to index-pack, but now each
argument gives only a single argument, so "arg" is correct. I'll update
it in the next version.
Jonathan Nieder March 5, 2021, 12:19 a.m. UTC | #3
Hi Jonathan,

Jonathan Tan wrote:

> This is the next step in teaching fetch-pack to pass its index-pack
> arguments when processing packfiles referenced by URIs.
>
> The "--keep" in fetch-pack.c will be replaced with a full message in a
> subsequent commit.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  Documentation/git-http-fetch.txt | 10 ++++++++--
>  fetch-pack.c                     |  3 +++
>  http-fetch.c                     | 20 +++++++++++++++-----
>  t/t5550-http-fetch-dumb.sh       |  5 ++++-
>  4 files changed, 30 insertions(+), 8 deletions(-)

This is producing an interesting symptom for me:

 git init repro
 cd repro
 git config fetch.uriprotocols https
 git config remote.origin.url https://fuchsia.googlesource.com/fuchsia
 git config remote.origin.fetch +refs/heads/*:refs/remotes/origin/*
 git fetch -p origin

Expected result: fetches

Actual result:

 fatal: pack has bad object at offset 12: unknown object type 5
 fatal: finish_http_pack_request gave result -1
 fatal: fetch-pack: expected keep then TAB at start of http-fetch output

Thanks to Nathan Mulcahey (cc-ed) for a clear report.

Bisects to b664e9ffa153189dae9b88f32d1c5fedcf85056a, which is part of
"next" and 2.31.0-rc1.  Another report of the same is at
https://crbug.com/1184814.

Known problem?

Thanks,
Jonathan
diff mbox series

Patch

diff --git a/Documentation/git-http-fetch.txt b/Documentation/git-http-fetch.txt
index 4deb4893f5..9fa17b60e4 100644
--- a/Documentation/git-http-fetch.txt
+++ b/Documentation/git-http-fetch.txt
@@ -41,11 +41,17 @@  commit-id::
 		<commit-id>['\t'<filename-as-in--w>]
 
 --packfile=<hash>::
-	Instead of a commit id on the command line (which is not expected in
+	For internal use only. Instead of a commit id on the command
+	line (which is not expected in
 	this case), 'git http-fetch' fetches the packfile directly at the given
 	URL and uses index-pack to generate corresponding .idx and .keep files.
 	The hash is used to determine the name of the temporary file and is
-	arbitrary. The output of index-pack is printed to stdout.
+	arbitrary. The output of index-pack is printed to stdout. Requires
+	--index-pack-args.
+
+--index-pack-args=<args>::
+	For internal use only. The command to run on the contents of the
+	downloaded pack. Arguments are URL-encoded separated by spaces.
 
 --recover::
 	Verify that everything reachable from target is fetched.  Used after
diff --git a/fetch-pack.c b/fetch-pack.c
index 876f90c759..aeac010b0b 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1645,6 +1645,9 @@  static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 		strvec_pushf(&cmd.args, "--packfile=%.*s",
 			     (int) the_hash_algo->hexsz,
 			     packfile_uris.items[i].string);
+		strvec_push(&cmd.args, "--index-pack-arg=index-pack");
+		strvec_push(&cmd.args, "--index-pack-arg=--stdin");
+		strvec_push(&cmd.args, "--index-pack-arg=--keep");
 		strvec_push(&cmd.args, uri);
 		cmd.git_cmd = 1;
 		cmd.no_stdin = 1;
diff --git a/http-fetch.c b/http-fetch.c
index 2d1d9d054f..fa642462a9 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -3,6 +3,7 @@ 
 #include "exec-cmd.h"
 #include "http.h"
 #include "walker.h"
+#include "strvec.h"
 
 static const char http_fetch_usage[] = "git http-fetch "
 "[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile=hash | commit-id] url";
@@ -43,11 +44,9 @@  static int fetch_using_walker(const char *raw_url, int get_verbosely,
 	return rc;
 }
 
-static const char *index_pack_args[] =
-	{"index-pack", "--stdin", "--keep", NULL};
-
 static void fetch_single_packfile(struct object_id *packfile_hash,
-				  const char *url) {
+				  const char *url,
+				  const char **index_pack_args) {
 	struct http_pack_request *preq;
 	struct slot_results results;
 	int ret;
@@ -90,6 +89,7 @@  int cmd_main(int argc, const char **argv)
 	int packfile = 0;
 	int nongit;
 	struct object_id packfile_hash;
+	struct strvec index_pack_args = STRVEC_INIT;
 
 	setup_git_directory_gently(&nongit);
 
@@ -116,6 +116,8 @@  int cmd_main(int argc, const char **argv)
 			packfile = 1;
 			if (parse_oid_hex(p, &packfile_hash, &end) || *end)
 				die(_("argument to --packfile must be a valid hash (got '%s')"), p);
+		} else if (skip_prefix(argv[arg], "--index-pack-arg=", &p)) {
+			strvec_push(&index_pack_args, p);
 		}
 		arg++;
 	}
@@ -128,10 +130,18 @@  int cmd_main(int argc, const char **argv)
 	git_config(git_default_config, NULL);
 
 	if (packfile) {
-		fetch_single_packfile(&packfile_hash, argv[arg]);
+		if (!index_pack_args.nr)
+			die(_("--packfile requires --index-pack-args"));
+
+		fetch_single_packfile(&packfile_hash, argv[arg],
+				      index_pack_args.v);
+
 		return 0;
 	}
 
+	if (index_pack_args.nr)
+		die(_("--index-pack-args can only be used with --packfile"));
+
 	if (commits_on_stdin) {
 		commits = walker_targets_stdin(&commit_id, &write_ref);
 	} else {
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 483578b2d7..358b322e05 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -224,7 +224,10 @@  test_expect_success 'http-fetch --packfile' '
 
 	git init packfileclient &&
 	p=$(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git && ls objects/pack/pack-*.pack) &&
-	git -C packfileclient http-fetch --packfile=$ARBITRARY "$HTTPD_URL"/dumb/repo_pack.git/$p >out &&
+	git -C packfileclient http-fetch --packfile=$ARBITRARY \
+		--index-pack-arg=index-pack --index-pack-arg=--stdin \
+		--index-pack-arg=--keep \
+		"$HTTPD_URL"/dumb/repo_pack.git/$p >out &&
 
 	grep "^keep.[0-9a-f]\{16,\}$" out &&
 	cut -c6- out >packhash &&