[v5,3/3] clone: respect remote unborn HEAD

Message ID 922e8c229c359c15f1265876e6def87d7a18b763.1611686656.git.jonathantanmy@google.com (mailing list archive)
State Superseded
Series Cloning with remote unborn HEAD

Commit Message

Jonathan Tan Jan. 26, 2021, 6:55 p.m. UTC
Teach Git to use the "unborn" feature introduced in a previous patch as
follows: Git will always send the "unborn" argument if it is supported
by the server. During "git clone", if cloning an empty repository, Git
will use the new information to determine the local branch to create. In
all other cases, Git will ignore it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/init.txt |  2 +-
 builtin/clone.c               | 16 ++++++++++++++--
 connect.c                     | 28 ++++++++++++++++++++++++++--
 t/t5606-clone-options.sh      |  8 +++++---
 t/t5702-protocol-v2.sh        | 25 +++++++++++++++++++++++++
 transport.h                   |  8 ++++++++
 6 files changed, 79 insertions(+), 8 deletions(-)

Comments

Junio C Hamano Jan. 26, 2021, 10:24 p.m. UTC | #1
Jonathan Tan <jonathantanmy@google.com> writes:

>  init.defaultBranch::
>  	Allows overriding the default branch name e.g. when initializing
> -	a new repository or when cloning an empty repository.
> +	a new repository.

Looking good.

> diff --git a/builtin/clone.c b/builtin/clone.c
> index 211d4f54b0..77fdc61f4d 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1330,10 +1330,21 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  		remote_head = NULL;
>  		option_no_checkout = 1;
>  		if (!option_bare) {
> -			const char *branch = git_default_branch_name();
> -			char *ref = xstrfmt("refs/heads/%s", branch);
> +			const char *branch;
> +			char *ref;
> +
> +			if (transport_ls_refs_options.unborn_head_target &&
> +			    skip_prefix(transport_ls_refs_options.unborn_head_target,
> +					"refs/heads/", &branch)) {
> +				ref = transport_ls_refs_options.unborn_head_target;
> +				transport_ls_refs_options.unborn_head_target = NULL;
> +			} else {
> +				branch = git_default_branch_name();
> +				ref = xstrfmt("refs/heads/%s", branch);
> +			}
>  
>  			install_branch_config(0, branch, remote_name, ref);
> +			create_symref("HEAD", ref, "");
>  			free(ref);

OK, we used to say "point our HEAD always to the local default
name", and the code is still there in the else clause.  But when the
transport found what name the other side uses, we use that name
instead.

I presume that clearing transport_ls_refs_options.unborn_head_target
is to take ownership of this piece of memory ourselves?

We didn't call create_symref() in the original code, but now we do.
Is this a valid bugfix even if we did not have this "learn remote
symref even for unborn HEAD" feature?  Or has the original codepath
now somehow been broken by an extra create_symref() call that we
did not use to make?

> @@ -1385,5 +1396,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	junk_mode = JUNK_LEAVE_ALL;
>  
>  	strvec_clear(&transport_ls_refs_options.ref_prefixes);
> +	free(transport_ls_refs_options.unborn_head_target);
>  	return err;
>  }
> diff --git a/connect.c b/connect.c
> index 328c279250..879669df93 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
>  }
>  
>  /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
> -static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
> +static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
> +			  char **unborn_head_target)
>  {
>  	int ret = 1;
>  	int i = 0;
> @@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
>  		goto out;
>  	}
>  
> +	if (!strcmp("unborn", line_sections.items[i].string)) {
> +		i++;
> +		if (unborn_head_target &&
> +		    !strcmp("HEAD", line_sections.items[i++].string)) {
> +			/*
> +			 * Look for the symref target (if any). If found,
> +			 * return it to the caller.
> +			 */
> +			for (; i < line_sections.nr; i++) {
> +				const char *arg = line_sections.items[i].string;
> +
> +				if (skip_prefix(arg, "symref-target:", &arg)) {
> +					*unborn_head_target = xstrdup(arg);
> +					break;
> +				}
> +			}
> +		}
> +		goto out;
> +	}

We split the line and notice that the first token is "unborn"; if
the caller is not interested in the unborn head, we just skip the
rest, but otherwise, if it is about HEAD (i.e. we do not care if a
dangling symref that is not HEAD is reported), we record the symref
target in unborn_head_target.

OK.  We already saw how this is used in cmd_clone().
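
To make the parsing step above concrete, here is a minimal standalone sketch of it, written outside Git's own helpers. It assumes the ls-refs line has already had its trailing newline stripped and looks like "unborn HEAD symref-target:<ref>". The names dup_string() and parse_unborn_head() are hypothetical and exist only for this illustration; the patch itself uses string_list splitting, skip_prefix() and xstrdup().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap-allocated copy of `s`, or NULL on failure. */
static char *dup_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/*
 * Hypothetical stand-in for the parsing step in the patch.
 * Returns a heap-allocated symref target if `line` has the form
 * "unborn HEAD ... symref-target:<ref> ...", and NULL otherwise.
 * The caller frees the result.
 */
static char *parse_unborn_head(const char *line)
{
	static const char prefix[] = "symref-target:";
	char *copy, *tok, *target = NULL;

	assert(line != NULL);
	copy = dup_string(line);
	if (!copy)
		return NULL;

	tok = strtok(copy, " ");
	if (!tok || strcmp(tok, "unborn"))
		goto out;	/* not an "unborn" line at all */
	tok = strtok(NULL, " ");
	if (!tok || strcmp(tok, "HEAD"))
		goto out;	/* unborn, but not about HEAD: ignored */

	/* Scan the remaining attributes for the symref target. */
	while ((tok = strtok(NULL, " ")) != NULL) {
		if (!strncmp(tok, prefix, strlen(prefix))) {
			target = dup_string(tok + strlen(prefix));
			break;
		}
	}
out:
	free(copy);
	return target;
}
```

As in the patch, an unborn line that is not about HEAD, or that carries no symref-target attribute, yields nothing for the caller.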

> @@ -461,6 +481,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  	const char *hash_name;
>  	struct strvec *ref_prefixes = transport_options ?
>  		&transport_options->ref_prefixes : NULL;
> +	char **unborn_head_target = transport_options ?
> +		&transport_options->unborn_head_target : NULL;

So any caller that passes transport_options will get the unborn head
information for free?  The other callers are in fetch-pack.c and
transport.c, which presumably are about fetching and not cloning.

I recall discussions on filling a missing refs/remotes/X/HEAD when
we fetch from X and learn where X points at.  Such an extension can
be done on top of this mechanism to pass transport_options from the
fetch codepath, I presume?


Thanks.  I tried to follow the thought in the patches aloud, and it
was mostly a pleasant read.

Jonathan Tan Jan. 30, 2021, 4:27 a.m. UTC | #2
> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> >  init.defaultBranch::
> >  	Allows overriding the default branch name e.g. when initializing
> > -	a new repository or when cloning an empty repository.
> > +	a new repository.
> 
> Looking good.
> 
> > diff --git a/builtin/clone.c b/builtin/clone.c
> > index 211d4f54b0..77fdc61f4d 100644
> > --- a/builtin/clone.c
> > +++ b/builtin/clone.c
> > @@ -1330,10 +1330,21 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
> >  		remote_head = NULL;
> >  		option_no_checkout = 1;
> >  		if (!option_bare) {
> > -			const char *branch = git_default_branch_name();
> > -			char *ref = xstrfmt("refs/heads/%s", branch);
> > +			const char *branch;
> > +			char *ref;
> > +
> > +			if (transport_ls_refs_options.unborn_head_target &&
> > +			    skip_prefix(transport_ls_refs_options.unborn_head_target,
> > +					"refs/heads/", &branch)) {
> > +				ref = transport_ls_refs_options.unborn_head_target;
> > +				transport_ls_refs_options.unborn_head_target = NULL;
> > +			} else {
> > +				branch = git_default_branch_name();
> > +				ref = xstrfmt("refs/heads/%s", branch);
> > +			}
> >  
> >  			install_branch_config(0, branch, remote_name, ref);
> > +			create_symref("HEAD", ref, "");
> >  			free(ref);
> 
> OK, we used to say "point our HEAD always to the local default
> name", and the code is still there in the else clause.  But when the
> transport found what name the other side uses, we use that name
> instead.
> 
> I presume that clearing transport_ls_refs_options.unborn_head_target
> is to take ownership of this piece of memory ourselves?

Yes - just to be consistent with the other branch where "ref" needs to
be freed.
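
For readers following along, the ownership pattern can be sketched in isolation roughly as follows. This is only an illustration under stated assumptions: struct ls_refs_options_sketch and pick_head_ref() are hypothetical names, plain malloc() stands in for xstrfmt(), and, unlike the patch, the short branch name always points into the returned buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced version of struct transport_ls_refs_options. */
struct ls_refs_options_sketch {
	char *unborn_head_target;	/* heap-allocated, or NULL */
};

/*
 * Returns a heap-allocated full ref name that the caller must free,
 * and points *branch at the short branch name inside it.
 */
static char *pick_head_ref(struct ls_refs_options_sketch *opts,
			   const char *default_branch,
			   const char **branch)
{
	static const char heads[] = "refs/heads/";
	size_t heads_len = strlen(heads);
	char *ref = opts->unborn_head_target;

	assert(default_branch != NULL && branch != NULL);

	if (ref && !strncmp(ref, heads, heads_len)) {
		/* Steal the pointer so it has exactly one owner. */
		opts->unborn_head_target = NULL;
	} else {
		/* Fall back to the local default, as the old code did. */
		ref = malloc(heads_len + strlen(default_branch) + 1);
		if (!ref)
			return NULL;
		memcpy(ref, heads, heads_len);
		strcpy(ref + heads_len, default_branch);
	}
	*branch = ref + heads_len;
	return ref;
}
```

Because the field is reset to NULL once its value has been handed over, a later unconditional free() of the field is harmless, and "ref" can be freed the same way on both paths.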

> We didn't call create_symref() in the original code, but now we do.
> Is this a valid bugfix even if we did not have this "learn remote
> symref even for unborn HEAD" feature?  Or has the original codepath
> now somehow been broken by an extra create_symref() call that we
> did not use to make?

Ah... now I think I see what you and Peff [1] were saying. Yes, I think
the symref creation is not necessary when we use the default branch name
(as we currently do). I'll verify and write back with my findings in
the next version.

[1] https://lore.kernel.org/git/YBCf8SI3fK+rDyox@coredump.intra.peff.net/

> 
> > @@ -1385,5 +1396,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
> >  	junk_mode = JUNK_LEAVE_ALL;
> >  
> >  	strvec_clear(&transport_ls_refs_options.ref_prefixes);
> > +	free(transport_ls_refs_options.unborn_head_target);
> >  	return err;
> >  }
> > diff --git a/connect.c b/connect.c
> > index 328c279250..879669df93 100644
> > --- a/connect.c
> > +++ b/connect.c
> > @@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
> >  }
> >  
> >  /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
> > -static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
> > +static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
> > +			  char **unborn_head_target)
> >  {
> >  	int ret = 1;
> >  	int i = 0;
> > @@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
> >  		goto out;
> >  	}
> >  
> > +	if (!strcmp("unborn", line_sections.items[i].string)) {
> > +		i++;
> > +		if (unborn_head_target &&
> > +		    !strcmp("HEAD", line_sections.items[i++].string)) {
> > +			/*
> > +			 * Look for the symref target (if any). If found,
> > +			 * return it to the caller.
> > +			 */
> > +			for (; i < line_sections.nr; i++) {
> > +				const char *arg = line_sections.items[i].string;
> > +
> > +				if (skip_prefix(arg, "symref-target:", &arg)) {
> > +					*unborn_head_target = xstrdup(arg);
> > +					break;
> > +				}
> > +			}
> > +		}
> > +		goto out;
> > +	}
> 
> We split the line and notice that the first token is "unborn"; if
> the caller is not interested in the unborn head, we just skip the
> rest, but otherwise, if it is about HEAD (i.e. we do not care if a
> dangling symref that is not HEAD is reported), we record the symref
> target in unborn_head_target.
> 
> OK.  We already saw how this is used in cmd_clone().
> 
> > @@ -461,6 +481,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> >  	const char *hash_name;
> >  	struct strvec *ref_prefixes = transport_options ?
> >  		&transport_options->ref_prefixes : NULL;
> > +	char **unborn_head_target = transport_options ?
> > +		&transport_options->unborn_head_target : NULL;
> 
> So any caller that passes transport_options will get the unborn head
> information for free?  The other callers are in fetch-pack.c and
> transport.c, which presumably are about fetching and not cloning.
> 
> I recall discussions on filling a missing refs/remotes/X/HEAD when
> we fetch from X and learn where X points at.  Such an extension can
> be done on top of this mechanism to pass transport_options from the
> fetch codepath, I presume?

I don't recall those discussions, but I think that we can do that (as
long as HEAD points to a branch that is part of the refspec we're
fetching, because the ref-prefix check still applies).

> Thanks.  I tried to follow the thought in the patches aloud, and it
> was mostly a pleasant read.

Thanks.
Patch

diff --git a/Documentation/config/init.txt b/Documentation/config/init.txt
index dc77f8c844..79c79d6617 100644
--- a/Documentation/config/init.txt
+++ b/Documentation/config/init.txt
@@ -4,4 +4,4 @@  init.templateDir::
 
 init.defaultBranch::
 	Allows overriding the default branch name e.g. when initializing
-	a new repository or when cloning an empty repository.
+	a new repository.
diff --git a/builtin/clone.c b/builtin/clone.c
index 211d4f54b0..77fdc61f4d 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1330,10 +1330,21 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 		remote_head = NULL;
 		option_no_checkout = 1;
 		if (!option_bare) {
-			const char *branch = git_default_branch_name();
-			char *ref = xstrfmt("refs/heads/%s", branch);
+			const char *branch;
+			char *ref;
+
+			if (transport_ls_refs_options.unborn_head_target &&
+			    skip_prefix(transport_ls_refs_options.unborn_head_target,
+					"refs/heads/", &branch)) {
+				ref = transport_ls_refs_options.unborn_head_target;
+				transport_ls_refs_options.unborn_head_target = NULL;
+			} else {
+				branch = git_default_branch_name();
+				ref = xstrfmt("refs/heads/%s", branch);
+			}
 
 			install_branch_config(0, branch, remote_name, ref);
+			create_symref("HEAD", ref, "");
 			free(ref);
 		}
 	}
@@ -1385,5 +1396,6 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 	junk_mode = JUNK_LEAVE_ALL;
 
 	strvec_clear(&transport_ls_refs_options.ref_prefixes);
+	free(transport_ls_refs_options.unborn_head_target);
 	return err;
 }
diff --git a/connect.c b/connect.c
index 328c279250..879669df93 100644
--- a/connect.c
+++ b/connect.c
@@ -376,7 +376,8 @@  struct ref **get_remote_heads(struct packet_reader *reader,
 }
 
 /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
-static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
+static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
+			  char **unborn_head_target)
 {
 	int ret = 1;
 	int i = 0;
@@ -397,6 +398,25 @@  static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
 		goto out;
 	}
 
+	if (!strcmp("unborn", line_sections.items[i].string)) {
+		i++;
+		if (unborn_head_target &&
+		    !strcmp("HEAD", line_sections.items[i++].string)) {
+			/*
+			 * Look for the symref target (if any). If found,
+			 * return it to the caller.
+			 */
+			for (; i < line_sections.nr; i++) {
+				const char *arg = line_sections.items[i].string;
+
+				if (skip_prefix(arg, "symref-target:", &arg)) {
+					*unborn_head_target = xstrdup(arg);
+					break;
+				}
+			}
+		}
+		goto out;
+	}
 	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
 	    *end) {
 		ret = 0;
@@ -461,6 +481,8 @@  struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	const char *hash_name;
 	struct strvec *ref_prefixes = transport_options ?
 		&transport_options->ref_prefixes : NULL;
+	char **unborn_head_target = transport_options ?
+		&transport_options->unborn_head_target : NULL;
 	*list = NULL;
 
 	if (server_supports_v2("ls-refs", 1))
@@ -490,6 +512,8 @@  struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	if (!for_push)
 		packet_write_fmt(fd_out, "peel\n");
 	packet_write_fmt(fd_out, "symrefs\n");
+	if (server_supports_feature("ls-refs", "unborn", 0))
+		packet_write_fmt(fd_out, "unborn\n");
 	for (i = 0; ref_prefixes && i < ref_prefixes->nr; i++) {
 		packet_write_fmt(fd_out, "ref-prefix %s\n",
 				 ref_prefixes->v[i]);
@@ -498,7 +522,7 @@  struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
-		if (!process_ref_v2(reader, &list))
+		if (!process_ref_v2(reader, &list, unborn_head_target))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
 
diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
index 7f082fb23b..0111d4e8bd 100755
--- a/t/t5606-clone-options.sh
+++ b/t/t5606-clone-options.sh
@@ -102,11 +102,13 @@  test_expect_success 'redirected clone -v does show progress' '
 '
 
 test_expect_success 'chooses correct default initial branch name' '
-	git init --bare empty &&
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=foo init --bare empty &&
+	test_config -C empty lsrefs.allowUnborn true &&
 	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
 	git -c init.defaultBranch=up clone empty whats-up &&
-	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
-	test refs/heads/up = $(git -C whats-up config branch.up.merge)
+	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
+	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
 '
 
 test_expect_success 'guesses initial branch name correctly' '
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d5b17909b..a8ef92b644 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -209,6 +209,31 @@  test_expect_success 'clone with file:// using protocol v2' '
 	grep "ref-prefix refs/tags/" log
 '
 
+test_expect_success 'clone of empty repo propagates name of default branch' '
+	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
+test_expect_success '...but not if explicitly forbidden by config' '
+	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+	test_config -C file_empty_parent lsrefs.allowUnborn false &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	! grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
 test_expect_success 'fetch with file:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
diff --git a/transport.h b/transport.h
index 1f5b60e4d3..24e15799e7 100644
--- a/transport.h
+++ b/transport.h
@@ -243,6 +243,14 @@  struct transport_ls_refs_options {
 	 * provided ref_prefixes.
 	 */
 	struct strvec ref_prefixes;
+
+	/*
+	 * If unborn_head_target is not NULL, and the remote reports HEAD as
+	 * pointing to an unborn branch, transport_get_remote_refs() stores the
+	 * unborn branch in unborn_head_target. It should be freed by the
+	 * caller.
+	 */
+	char *unborn_head_target;
 };
 #define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }