[v5,0/3] Cloning with remote unborn HEAD

Message ID	cover.1611686656.git.jonathantanmy@google.com (mailing list archive)
Headers	show Return-Path: <git-owner@kernel.org> Sender: "jonathantanmy via sendgmr" <jonathantanmy@twelve4.c.googlers.com> Date: Tue, 26 Jan 2021 10:55:54 -0800 In-Reply-To: <20201208013121.677494-1-jonathantanmy@google.com> Message-Id: <cover.1611686656.git.jonathantanmy@google.com> Mime-Version: 1.0 References: <20201208013121.677494-1-jonathantanmy@google.com> Subject: [PATCH v5 0/3] Cloning with remote unborn HEAD From: Jonathan Tan <jonathantanmy@google.com> To: git@vger.kernel.org Cc: Jonathan Tan <jonathantanmy@google.com> Content-Type: text/plain; charset="UTF-8" Precedence: bulk
Series	Cloning with remote unborn HEAD \| expand [v5,0/3] Cloning with remote unborn HEAD [v5,1/3] ls-refs: report unborn targets of symrefs [v5,2/3] connect, transport: encapsulate arg in struct [v5,3/3] clone: respect remote unborn HEAD

Jonathan Tan Jan. 26, 2021, 6:55 p.m. UTC

Thanks, Peff, for your review. I have addressed your comments (through
replies to your emails and here in this v5 patch set).

Jonathan Tan (3):
  ls-refs: report unborn targets of symrefs
  connect, transport: encapsulate arg in struct
  clone: respect remote unborn HEAD

 Documentation/config.txt                |  2 +
 Documentation/config/init.txt           |  2 +-
 Documentation/config/lsrefs.txt         |  3 ++
 Documentation/technical/protocol-v2.txt | 10 ++++-
 builtin/clone.c                         | 34 +++++++++++-----
 builtin/fetch-pack.c                    |  3 +-
 builtin/fetch.c                         | 18 +++++----
 builtin/ls-remote.c                     |  9 +++--
 connect.c                               | 32 +++++++++++++--
 ls-refs.c                               | 53 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 remote.h                                |  4 +-
 serve.c                                 |  2 +-
 t/t5606-clone-options.sh                |  8 ++--
 t/t5701-git-serve.sh                    |  2 +-
 t/t5702-protocol-v2.sh                  | 25 ++++++++++++
 transport-helper.c                      |  5 ++-
 transport-internal.h                    |  9 +----
 transport.c                             | 23 ++++++-----
 transport.h                             | 29 ++++++++++----
 20 files changed, 210 insertions(+), 64 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt

Range-diff against v4:
1:  d7d2ba597e ! 1:  32e16dfdbd ls-refs: report unborn targets of symrefs
    @@ Commit message
     
         Currently, symrefs that have unborn targets (such as in this case) are
         not communicated by the protocol. Teach Git to advertise and support the
    -    "unborn" feature in "ls-refs" (guarded by the lsrefs.allowunborn
    +    "unborn" feature in "ls-refs" (by default, this is advertised, but
    +    server administrators may turn this off through the lsrefs.allowunborn
         config). This feature indicates that "ls-refs" supports the "unborn"
         argument; when it is specified, "ls-refs" will send the HEAD symref with
         the name of its unborn target.
    @@ ls-refs.c: static int send_ref(const char *refname, const struct object_id *oid,
     +	int flag;
     +	int oid_is_null;
     +
    -+	memset(&oid, 0, sizeof(oid));
     +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
    -+	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
    ++	if (!resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag))
    ++		return; /* bad ref */
     +	oid_is_null = is_null_oid(&oid);
     +	if (!oid_is_null ||
     +	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
    @@ ls-refs.c: int ls_refs(struct repository *r, struct strvec *keys,
      	memset(&data, 0, sizeof(data));
      
     -	git_config(ls_refs_config, NULL);
    ++	data.allow_unborn = 1;
     +	git_config(ls_refs_config, &data);
      
      	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
    @@ ls-refs.c: int ls_refs(struct repository *r, struct strvec *keys,
     +	if (value) {
     +		int allow_unborn_value;
     +
    -+		if (!repo_config_get_bool(the_repository,
    ++		if (repo_config_get_bool(the_repository,
     +					 "lsrefs.allowunborn",
    -+					 &allow_unborn_value) &&
    ++					 &allow_unborn_value) ||
     +		    allow_unborn_value)
     +			strbuf_addstr(value, "unborn");
     +	}
    @@ serve.c: struct protocol_capability {
      	{ "fetch", upload_pack_advertise, upload_pack_v2 },
      	{ "server-option", always_advertise, NULL },
      	{ "object-format", object_format_advertise, NULL },
    +
    + ## t/t5701-git-serve.sh ##
    +@@ t/t5701-git-serve.sh: test_expect_success 'test capability advertisement' '
    + 	cat >expect <<-EOF &&
    + 	version 2
    + 	agent=git/$(git version | cut -d" " -f3)
    +-	ls-refs
    ++	ls-refs=unborn
    + 	fetch=shallow
    + 	server-option
    + 	object-format=$(test_oid algo)
2:  51d8a359c7 < -:  ---------- connect, transport: add no-op arg for future patch
-:  ---------- > 2:  4eec551668 connect, transport: encapsulate arg in struct
3:  896be550f1 ! 3:  922e8c229c clone: respect remote unborn HEAD
    @@ Documentation/config/init.txt: init.templateDir::
     +	a new repository.
     
      ## builtin/clone.c ##
    -@@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
    - 	int submodule_progress;
    - 
    - 	struct strvec ref_prefixes = STRVEC_INIT;
    -+	char *unborn_head_target = NULL;
    - 
    - 	packet_trace_identity("clone");
    - 
    -@@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
    - 	if (!option_no_tags)
    - 		strvec_push(&ref_prefixes, "refs/tags/");
    - 
    --	refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
    -+	refs = transport_get_remote_refs(transport, &ref_prefixes,
    -+					 &unborn_head_target);
    - 
    - 	if (refs) {
    - 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
     @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
      		remote_head = NULL;
      		option_no_checkout = 1;
    @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
     +			const char *branch;
     +			char *ref;
     +
    -+			if (unborn_head_target &&
    -+			    skip_prefix(unborn_head_target, "refs/heads/", &branch)) {
    -+				ref = unborn_head_target;
    -+				unborn_head_target = NULL;
    ++			if (transport_ls_refs_options.unborn_head_target &&
    ++			    skip_prefix(transport_ls_refs_options.unborn_head_target,
    ++					"refs/heads/", &branch)) {
    ++				ref = transport_ls_refs_options.unborn_head_target;
    ++				transport_ls_refs_options.unborn_head_target = NULL;
     +			} else {
     +				branch = git_default_branch_name();
     +				ref = xstrfmt("refs/heads/%s", branch);
    @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
      		}
      	}
     @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
    - 	strbuf_release(&key);
      	junk_mode = JUNK_LEAVE_ALL;
      
    -+	free(unborn_head_target);
    - 	strvec_clear(&ref_prefixes);
    + 	strvec_clear(&transport_ls_refs_options.ref_prefixes);
    ++	free(transport_ls_refs_options.unborn_head_target);
      	return err;
      }
     
    @@ connect.c: static int process_ref_v2(struct packet_reader *reader, struct ref **
      	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
      	    *end) {
      		ret = 0;
    +@@ connect.c: struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
    + 	const char *hash_name;
    + 	struct strvec *ref_prefixes = transport_options ?
    + 		&transport_options->ref_prefixes : NULL;
    ++	char **unborn_head_target = transport_options ?
    ++		&transport_options->unborn_head_target : NULL;
    + 	*list = NULL;
    + 
    + 	if (server_supports_v2("ls-refs", 1))
     @@ connect.c: struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
      	if (!for_push)
      		packet_write_fmt(fd_out, "peel\n");
    @@ connect.c: struct ref **get_remote_refs(int fd_out, struct packet_reader *reader
      
      	/* Process response from server */
      	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
    --		if (unborn_head_target)
    --			BUG("NEEDSWORK: provide unborn HEAD target to caller while reading refs");
     -		if (!process_ref_v2(reader, &list))
     +		if (!process_ref_v2(reader, &list, unborn_head_target))
      			die(_("invalid ls-refs response: %s"), reader->line);
    @@ t/t5702-protocol-v2.sh: test_expect_success 'clone with file:// using protocol v
      '
      
     +test_expect_success 'clone of empty repo propagates name of default branch' '
    ++	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
    ++
     +	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
     +	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
    -+	test_config -C file_empty_parent lsrefs.allowUnborn true &&
     +
     +	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
     +	git -c init.defaultBranch=main -c protocol.version=2 \
     +		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
     +	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
     +'
    ++
    ++test_expect_success '...but not if explicitly forbidden by config' '
    ++	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
    ++
    ++	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
    ++	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
    ++	test_config -C file_empty_parent lsrefs.allowUnborn false &&
    ++
    ++	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
    ++	git -c init.defaultBranch=main -c protocol.version=2 \
    ++		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
    ++	! grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
    ++'
     +
      test_expect_success 'fetch with file:// using protocol v2' '
      	test_when_finished "rm -f log" &&
      
    +
    + ## transport.h ##
    +@@ transport.h: struct transport_ls_refs_options {
    + 	 * provided ref_prefixes.
    + 	 */
    + 	struct strvec ref_prefixes;
    ++
    ++	/*
    ++	 * If unborn_head_target is not NULL, and the remote reports HEAD as
    ++	 * pointing to an unborn branch, transport_get_remote_refs() stores the
    ++	 * unborn branch in unborn_head_target. It should be freed by the
    ++	 * caller.
    ++	 */
    ++	char *unborn_head_target;
    + };
    + #define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }
    +

Junio C Hamano Jan. 27, 2021, 1:11 a.m. UTC | #1

Jonathan Tan <jonathantanmy@google.com> writes:

> Thanks, Peff, for your review. I have addressed your comments (through
> replies to your emails and here in this v5 patch set).
>
> Jonathan Tan (3):
>   ls-refs: report unborn targets of symrefs
>   connect, transport: encapsulate arg in struct
>   clone: respect remote unborn HEAD

Applying this alone to 'master' seems to pass all tests, but
the topic seems to have funny interactions with another topic
in flight, jk/peel-iterated-oid

There is textual conflict whose resolution seems trivial, but with
that resolved ...

diff --cc builtin/clone.c
index e335734b4c,77fdc61f4d..0000000000
--- i/builtin/clone.c
+++ w/builtin/clone.c
@@@ -1326,10 -1330,21 +1330,21 @@@ int cmd_clone(int argc, const char **ar
  		remote_head = NULL;
  		option_no_checkout = 1;
  		if (!option_bare) {
- 			const char *branch = git_default_branch_name(0);
- 			char *ref = xstrfmt("refs/heads/%s", branch);
+ 			const char *branch;
+ 			char *ref;
+ 
+ 			if (transport_ls_refs_options.unborn_head_target &&
+ 			    skip_prefix(transport_ls_refs_options.unborn_head_target,
+ 					"refs/heads/", &branch)) {
+ 				ref = transport_ls_refs_options.unborn_head_target;
+ 				transport_ls_refs_options.unborn_head_target = NULL;
+ 			} else {
 -				branch = git_default_branch_name();
++				branch = git_default_branch_name(0);
+ 				ref = xstrfmt("refs/heads/%s", branch);
+ 			}
  
  			install_branch_config(0, branch, remote_name, ref);
+ 			create_symref("HEAD", ref, "");
  			free(ref);
  		}
  	}


... numerous tests fail.

For example, t5702 dies like so:

expecting success of 5702.15 'clone of empty repo propagates name of default branch':
        test_when_finished "rm -rf file_empty_parent file_empty_child" &&

        GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
        git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&

        GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
        git -c init.defaultBranch=main -c protocol.version=2 \
                clone "file://$(pwd)/file_empty_parent" file_empty_child &&
        grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD

Initialized empty Git repository in /usr/local/google/home/jch/w/git.git/t/trash directory.t5702-protocol-v2/file_empty_parent/.git/
Cloning into 'file_empty_child'...
fatal: expected flush after ref listing
not ok 15 - clone of empty repo propagates name of default branch

Ævar Arnfjörð Bjarmason Jan. 27, 2021, 1:41 a.m. UTC | #2

On Tue, Jan 26 2021, Jonathan Tan wrote:

[For some reason the patches didn't reach my mailbox, but I see them in
the list archive, so I'm replying to the cover-letter]

>  Documentation/config.txt                |  2 +
>  Documentation/config/init.txt           |  2 +-

Good, now we have init.defaultBranch docs, but they say:

     init.defaultBranch::
            Allows overriding the default branch name e.g. when initializing
    -       a new repository or when cloning an empty repository.
    +       a new repository.

So this still only applies to file:// and other "protocol" clones, but
not "git clone /some/path"?

Re my reply to v1, do we consider that a bug, feature, something just
left unimplemented?

I really don't care much, but this really needs a corresponding
documentation update. I.e. something like:

    init.defaultBranch::
        Allows overriding the default branch name e.g. when initializing a
        new repository or when cloning an empty repository.

        When cloning a repository over protocol v2 (i.e. ssh://, https://,
        file://, but not a /some/path), and if that repository has
        init.defaultBranch configured, the server will advertise its
        preferred default branch name, and we'll take its configuration over
        ours.

Which, just in terms of implementation makes me think it would make more
sense if the server just had:

    uploadPack.sendConfig = "init.defaultBranch=xyz"

The client:

    receivePack.acceptConfig = "init.defaultBranch"

And in terms of things on the wire we'd say:

    "set-config init.defaultBranch=main"

You could have many such lines, but we'd just harcode only accepting
"init.defaultBranch" by default for now.

I.e. we set "init.defaultBranch" on the server, and the client ends up
interpreting things as if though "init.defaultBranch" was set to exactly
that value. So why not just ... send a line saying "you should set your
init.defaultBranch config to this".

Makes it future-extensible pretty much for free, and I think also much
easier to explain to users. I.e. instead of init.defaultBranch somehow
being magical when talking with a remote server we can talk about a
remote server being one source of config per git-config's documented
config order, for a very narrow whitelist of config keys.

Or (not clear to me, should have waited with my other E-Mail) are we
ever expecting to send more than one of:

    "unborn <refname> symref-target:<target>"

Or is the reason closer to us being able to shoehorn this into the
existing ls-refs response, as opposed to some general "here's config for
you" response we don't have?

Jeff King Jan. 27, 2021, 4:25 a.m. UTC | #3

On Tue, Jan 26, 2021 at 05:11:42PM -0800, Junio C Hamano wrote:

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > Thanks, Peff, for your review. I have addressed your comments (through
> > replies to your emails and here in this v5 patch set).
> >
> > Jonathan Tan (3):
> >   ls-refs: report unborn targets of symrefs
> >   connect, transport: encapsulate arg in struct
> >   clone: respect remote unborn HEAD
> 
> Applying this alone to 'master' seems to pass all tests, but
> the topic seems to have funny interactions with another topic
> in flight, jk/peel-iterated-oid

I was worried at first I really screwed up something subtle, but it is
indeed just a funny local interaction.

Here's a fix which can be applied on top of jt/clone-unborn-head. It
could equally well be applied as part of the merge (with a minor
adjustment in the context), but I think it ought to be squashed into
Jonathan's patch 1 anyway.

The conflict you had to resolve was a red herring (it wasn't part of
jk/peel-iterated-oid at all, but rather other commits that got pulled in
because my topic is based on a more recent master).

-- >8 --
Subject: [PATCH] ls-refs: don't peel NULL oid

When the "unborn" feature is enabled, upload-pack serving an ls-refs
command will pass a NULL oid into send_ref(). In this case, there is no
point trying to peel the ref, since we know it points to nothing.

For now this is a harmless waste of cycles (we re-resolve HEAD and find
out that indeed, it points to nothing). But after merging with another
topic that contains 36a317929b (refs: switch peel_ref() to
peel_iterated_oid(), 2021-01-20), we'd actually end up passing NULL to
peel_object(), which segfaults!

Signed-off-by: Jeff King <peff@peff.net>
---
 ls-refs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ls-refs.c b/ls-refs.c
index 4077adeb6a..bc91f03653 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -66,7 +66,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 			    strip_namespace(symref_target));
 	}

-	if (data->peel) {
+	if (data->peel && oid) {
 		struct object_id peeled;
 		if (!peel_ref(refname, &peeled))
 			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));

Junio C Hamano Jan. 27, 2021, 6:14 a.m. UTC | #4

Jeff King <peff@peff.net> writes:

> Here's a fix which can be applied on top of jt/clone-unborn-head. It
> could equally well be applied as part of the merge (with a minor
> adjustment in the context), but I think it ought to be squashed into
> Jonathan's patch 1 anyway.

Will queue but we are not merging the topic to 'next' yet, so I'll
ask Jonathan to remember making it a part of the series if it needs
to be updated later.

Thanks.

>
> -- >8 --
> Subject: [PATCH] ls-refs: don't peel NULL oid
>
> When the "unborn" feature is enabled, upload-pack serving an ls-refs
> command will pass a NULL oid into send_ref(). In this case, there is no
> point trying to peel the ref, since we know it points to nothing.
>
> For now this is a harmless waste of cycles (we re-resolve HEAD and find
> out that indeed, it points to nothing). But after merging with another
> topic that contains 36a317929b (refs: switch peel_ref() to
> peel_iterated_oid(), 2021-01-20), we'd actually end up passing NULL to
> peel_object(), which segfaults!
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  ls-refs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ls-refs.c b/ls-refs.c
> index 4077adeb6a..bc91f03653 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -66,7 +66,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  			    strip_namespace(symref_target));
>  	}
>  
> -	if (data->peel) {
> +	if (data->peel && oid) {
>  		struct object_id peeled;
>  		if (!peel_ref(refname, &peeled))
>  			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));

Jonathan Tan Jan. 30, 2021, 4:41 a.m. UTC | #5

> On Tue, Jan 26 2021, Jonathan Tan wrote:
> 
> [For some reason the patches didn't reach my mailbox, but I see them in
> the list archive, so I'm replying to the cover-letter]
> 
> >  Documentation/config.txt                |  2 +
> >  Documentation/config/init.txt           |  2 +-
> 
> Good, now we have init.defaultBranch docs, but they say:
>     
>      init.defaultBranch::
>             Allows overriding the default branch name e.g. when initializing
>     -       a new repository or when cloning an empty repository.
>     +       a new repository.
> 
> So this still only applies to file:// and other "protocol" clones, but
> not "git clone /some/path"?

Ah...that's true.

> Re my reply to v1, do we consider that a bug, feature, something just
> left unimplemented?
> 
> I really don't care much, but this really needs a corresponding
> documentation update. I.e. something like:
> 
>     init.defaultBranch::
>         Allows overriding the default branch name e.g. when initializing a
>         new repository or when cloning an empty repository.
>     
>         When cloning a repository over protocol v2 (i.e. ssh://, https://,
>         file://, but not a /some/path), and if that repository has
>         init.defaultBranch configured, the server will advertise its
>         preferred default branch name, and we'll take its configuration over
>         ours.

Thanks - I'll use some of your wording, but I think it's best to leave
open the possibility that cloning using protocol v0 or the disk clone
(/some/path) copies over the current HEAD as well.

> Which, just in terms of implementation makes me think it would make more
> sense if the server just had:
> 
>     uploadPack.sendConfig = "init.defaultBranch=xyz"
> 
> The client:
> 
>     receivePack.acceptConfig = "init.defaultBranch"
> 
> And in terms of things on the wire we'd say:
> 
>     "set-config init.defaultBranch=main"
> 
> You could have many such lines, but we'd just harcode only accepting
> "init.defaultBranch" by default for now.
> 
> I.e. we set "init.defaultBranch" on the server, and the client ends up
> interpreting things as if though "init.defaultBranch" was set to exactly
> that value. So why not just ... send a line saying "you should set your
> init.defaultBranch config to this".
> 
> Makes it future-extensible pretty much for free, and I think also much
> easier to explain to users. I.e. instead of init.defaultBranch somehow
> being magical when talking with a remote server we can talk about a
> remote server being one source of config per git-config's documented
> config order, for a very narrow whitelist of config keys.
>
> Or (not clear to me, should have waited with my other E-Mail) are we
> ever expecting to send more than one of:
> 
>     "unborn <refname> symref-target:<target>"
> 
> Or is the reason closer to us being able to shoehorn this into the
> existing ls-refs response, as opposed to some general "here's config for
> you" response we don't have?

It's not the same - from what I understand, what you're suggesting is
setting a config in the repo that has just been cloned, but this patch
set does not set any such config. Also, it may be strange for the server
to be able to change the config of a currently running command - I would
expect such a thing to only take effect on future runs of Git on that
repo.

Ævar Arnfjörð Bjarmason Jan. 30, 2021, 11:13 a.m. UTC | #6

On Sat, Jan 30 2021, Jonathan Tan wrote:

>> On Tue, Jan 26 2021, Jonathan Tan wrote:
>> 
>> [For some reason the patches didn't reach my mailbox, but I see them in
>> the list archive, so I'm replying to the cover-letter]
>> 
>> >  Documentation/config.txt                |  2 +
>> >  Documentation/config/init.txt           |  2 +-
>> 
>> Good, now we have init.defaultBranch docs, but they say:
>>     
>>      init.defaultBranch::
>>             Allows overriding the default branch name e.g. when initializing
>>     -       a new repository or when cloning an empty repository.
>>     +       a new repository.
>> 
>> So this still only applies to file:// and other "protocol" clones, but
>> not "git clone /some/path"?
>
> Ah...that's true.
>
>> Re my reply to v1, do we consider that a bug, feature, something just
>> left unimplemented?
>> 
>> I really don't care much, but this really needs a corresponding
>> documentation update. I.e. something like:
>> 
>>     init.defaultBranch::
>>         Allows overriding the default branch name e.g. when initializing a
>>         new repository or when cloning an empty repository.
>>     
>>         When cloning a repository over protocol v2 (i.e. ssh://, https://,
>>         file://, but not a /some/path), and if that repository has
>>         init.defaultBranch configured, the server will advertise its
>>         preferred default branch name, and we'll take its configuration over
>>         ours.
>
> Thanks - I'll use some of your wording, but I think it's best to leave
> open the possibility that cloning using protocol v0 or the disk clone
> (/some/path) copies over the current HEAD as well.

Sure, and maybe a test_expect_failure for those cases? I.e. to
explicitly say in the current docs/tests what does / doesn't work, and
if we consider that intentional or not.

>> Which, just in terms of implementation makes me think it would make more
>> sense if the server just had:
>> 
>>     uploadPack.sendConfig = "init.defaultBranch=xyz"
>> 
>> The client:
>> 
>>     receivePack.acceptConfig = "init.defaultBranch"
>> 
>> And in terms of things on the wire we'd say:
>> 
>>     "set-config init.defaultBranch=main"
>> 
>> You could have many such lines, but we'd just harcode only accepting
>> "init.defaultBranch" by default for now.
>> 
>> I.e. we set "init.defaultBranch" on the server, and the client ends up
>> interpreting things as if though "init.defaultBranch" was set to exactly
>> that value. So why not just ... send a line saying "you should set your
>> init.defaultBranch config to this".
>> 
>> Makes it future-extensible pretty much for free, and I think also much
>> easier to explain to users. I.e. instead of init.defaultBranch somehow
>> being magical when talking with a remote server we can talk about a
>> remote server being one source of config per git-config's documented
>> config order, for a very narrow whitelist of config keys.
>>
>> Or (not clear to me, should have waited with my other E-Mail) are we
>> ever expecting to send more than one of:
>> 
>>     "unborn <refname> symref-target:<target>"
>> 
>> Or is the reason closer to us being able to shoehorn this into the
>> existing ls-refs response, as opposed to some general "here's config for
>> you" response we don't have?
>
> It's not the same - from what I understand, what you're suggesting is
> setting a config in the repo that has just been cloned[...]

No, not to set config, i.e. during/after clone doing "git config
init.defaultBranch <remote>" wouldn't make any sense. Since that would
set config in .git/config, and that would (also?) apply /after/ the
clone, e.g. if you did "git init /tmp/somewhere/else" afterwards.

> [...]but this patch set does not set any such config[...].

It does, within the scope of the runtime of the process. I.e. just like
"git -c" or whatever. In builtin/clone.c you set "branch" from local
init.defaultBranch only if the remote did not provide us a value for it,
i.e. remote config for that config key overrides local config.

> Also, it may be strange for the server to be able to change the config
> of a currently running command - I would expect such a thing to only
> take effect on future runs of Git on that repo.

Yes, as I noted on v1 I think the semantics of this whole thing are a
bit strange :)

But if we're keeping the "strangeness" all I'm saying is that I think
it's more obvious to a user if we just declare the remote to be a
limited config source in tems of explaining this special-case.

And that once we're doing that it's also more obvious IMO to have that
be what's happening on the protocol level, if we're not expecting more
than one of these values.

I.e. if you ignore your current implementation internal and just view
git as a black box, then the functionality of this thing is
indistinguishable from the remote being a (limited) source of config.

So isn't in simpler to explain it to the user in those terms?

Jonathan Tan Feb. 2, 2021, 2:22 a.m. UTC | #7

> > I really don't care much, but this really needs a corresponding
> > documentation update. I.e. something like:
> > 
> >     init.defaultBranch::
> >         Allows overriding the default branch name e.g. when initializing a
> >         new repository or when cloning an empty repository.
> >     
> >         When cloning a repository over protocol v2 (i.e. ssh://, https://,
> >         file://, but not a /some/path), and if that repository has
> >         init.defaultBranch configured, the server will advertise its
> >         preferred default branch name, and we'll take its configuration over
> >         ours.
> 
> Thanks - I'll use some of your wording, but I think it's best to leave
> open the possibility that cloning using protocol v0 or the disk clone
> (/some/path) copies over the current HEAD as well.

Looking back on this, I think that it's natural to think that both an
empty repository and a non-empty one have a HEAD that points somewhere,
and "git clone" would behave the same way in both cases. So I'll hold
off on the documentation change.

Ævar Arnfjörð Bjarmason Feb. 3, 2021, 2:23 p.m. UTC | #8

On Tue, Feb 02 2021, Jonathan Tan wrote:

>> > I really don't care much, but this really needs a corresponding
>> > documentation update. I.e. something like:
>> > 
>> >     init.defaultBranch::
>> >         Allows overriding the default branch name e.g. when initializing a
>> >         new repository or when cloning an empty repository.
>> >     
>> >         When cloning a repository over protocol v2 (i.e. ssh://, https://,
>> >         file://, but not a /some/path), and if that repository has
>> >         init.defaultBranch configured, the server will advertise its
>> >         preferred default branch name, and we'll take its configuration over
>> >         ours.
>> 
>> Thanks - I'll use some of your wording, but I think it's best to leave
>> open the possibility that cloning using protocol v0 or the disk clone
>> (/some/path) copies over the current HEAD as well.
>
> Looking back on this, I think that it's natural to think that both an
> empty repository and a non-empty one have a HEAD that points somewhere,
> and "git clone" would behave the same way in both cases. So I'll hold
> off on the documentation change.

You mean for a v6 it'll do the same thing in the local clone case too
and thus we won't need to document the exception? Sounds good.

I was mainly pointing out the need to document the current divergent
behavior.

Documenting that something isn't consistent shouldn't be seen as a
blessing that the divergence is a good idea, it's an aid to our users so
they can understand why their git version does X when they might be
expecting Y.

Junio C Hamano Feb. 5, 2021, 10:28 p.m. UTC | #9

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Tue, Jan 26 2021, Jonathan Tan wrote:
>
> [For some reason the patches didn't reach my mailbox, but I see them in
> the list archive, so I'm replying to the cover-letter]
>
>>  Documentation/config.txt                |  2 +
>>  Documentation/config/init.txt           |  2 +-
>
> Good, now we have init.defaultBranch docs, but they say:
>     
>      init.defaultBranch::
>             Allows overriding the default branch name e.g. when initializing
>     -       a new repository or when cloning an empty repository.
>     +       a new repository.
>
> So this still only applies to file:// and other "protocol" clones, but
> not "git clone /some/path"?

I agree with you that the new "unborn HEAD will also follow what the
upstream has" should be done for --local transport.  It is a bug
waiting to be complained about by users.

>     init.defaultBranch::
>         Allows overriding the default branch name e.g. when initializing a
>         new repository or when cloning an empty repository.
>     
>         When cloning a repository over protocol v2 (i.e. ssh://, https://,
>         file://, but not a /some/path), and if that repository has
>         init.defaultBranch configured, the server will advertise its
>         preferred default branch name, and we'll take its configuration over
>         ours.

I actually do not think that is what is going on.  What the other
side advertises is *NOT* their preferred default branch name and it
does not matter if they have init.defaultBranch configured or not.

What the new protocol extension gives us is that we can learn what
the other side is actually using (not their preferred default) as
their primary branch.  We've always done so since very early days of
"git clone" (even back when we failed to clone an empty repository),
by trying to guess which branch their HEAD points at.

The only thing that is new with this topic is that it now gives us
a reliable way to learn what their HEAD points at, even when it is
pointing at an unborn branch.

In general we do not let other side _dictate_ what our configuration
should look like, as that can have security implications, and this
is not sending any configuration at all.

Their HEAD may be pointing at a specific branch (which may or may
not be unborn) because that is what they configured their
init.defaultBranch to, or their version of Git created the branch
and they haven't changed it since repository creation, or the user
using that repository just started working with that branch with
"git checkout [--orphan]" (the repository being cloned does not have
to be a bare repository).  It does not matter how their HEAD ended
up pointing at a specific branch---we just try to mimic their
current status---it is because it would make it easier to give our
changes back to them if everybody involved used the same name for
the primary integration branch, and the repositories the people
clone from are most often have their primary integration branch
pointed at by their HEAD.

And I do not consider it transfering any configuration.

So while I agree that the logic to choose the branch that gets
checked out in a new repository created by "git clone" needs to be
documented well, it has very little to do with "init.defaultBranch".

[v5,0/3] Cloning with remote unborn HEAD

Message

Comments