Message ID | cover.1611686656.git.jonathantanmy@google.com (mailing list archive) |
---|---|
Series | Cloning with remote unborn HEAD |
Jonathan Tan <jonathantanmy@google.com> writes:

> Thanks, Peff, for your review. I have addressed your comments (through
> replies to your emails and here in this v5 patch set).
>
> Jonathan Tan (3):
>   ls-refs: report unborn targets of symrefs
>   connect, transport: encapsulate arg in struct
>   clone: respect remote unborn HEAD

Applying this alone to 'master' seems to pass all tests, but
the topic seems to have funny interactions with another topic
in flight, jk/peel-iterated-oid

There is textual conflict whose resolution seems trivial, but with
that resolved ...

diff --cc builtin/clone.c
index e335734b4c,77fdc61f4d..0000000000
--- i/builtin/clone.c
+++ w/builtin/clone.c
@@@ -1326,10 -1330,21 +1330,21 @@@ int cmd_clone(int argc, const char **ar
  		remote_head = NULL;
  		option_no_checkout = 1;
  		if (!option_bare) {
- 			const char *branch = git_default_branch_name(0);
- 			char *ref = xstrfmt("refs/heads/%s", branch);
+ 			const char *branch;
+ 			char *ref;
+ 
+ 			if (transport_ls_refs_options.unborn_head_target &&
+ 			    skip_prefix(transport_ls_refs_options.unborn_head_target,
+ 					"refs/heads/", &branch)) {
+ 				ref = transport_ls_refs_options.unborn_head_target;
+ 				transport_ls_refs_options.unborn_head_target = NULL;
+ 			} else {
 -				branch = git_default_branch_name();
++				branch = git_default_branch_name(0);
+ 				ref = xstrfmt("refs/heads/%s", branch);
+ 			}
  			install_branch_config(0, branch, remote_name, ref);
+ 			create_symref("HEAD", ref, "");
  			free(ref);
  		}
  	}

... numerous tests fail.  For example, t5702 dies like so:

expecting success of 5702.15 'clone of empty repo propagates name of default branch':
	test_when_finished "rm -rf file_empty_parent file_empty_child" &&

	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
	git -c init.defaultBranch=main -c protocol.version=2 \
		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD

Initialized empty Git repository in /usr/local/google/home/jch/w/git.git/t/trash directory.t5702-protocol-v2/file_empty_parent/.git/
Cloning into 'file_empty_child'...
fatal: expected flush after ref listing
not ok 15 - clone of empty repo propagates name of default branch

On Tue, Jan 26 2021, Jonathan Tan wrote:

[For some reason the patches didn't reach my mailbox, but I see them in
the list archive, so I'm replying to the cover-letter]

>  Documentation/config.txt | 2 +
>  Documentation/config/init.txt | 2 +-

Good, now we have init.defaultBranch docs, but they say:

     init.defaultBranch::
            Allows overriding the default branch name e.g. when initializing
    -       a new repository or when cloning an empty repository.
    +       a new repository.

So this still only applies to file:// and other "protocol" clones, but
not "git clone /some/path"?

Re my reply to v1, do we consider that a bug, feature, something just
left unimplemented?

I really don't care much, but this really needs a corresponding
documentation update. I.e. something like:

    init.defaultBranch::
        Allows overriding the default branch name e.g. when initializing a
        new repository or when cloning an empty repository.

        When cloning a repository over protocol v2 (i.e. ssh://, https://,
        file://, but not a /some/path), and if that repository has
        init.defaultBranch configured, the server will advertise its
        preferred default branch name, and we'll take its configuration over
        ours.

Which, just in terms of implementation makes me think it would make more
sense if the server just had:

    uploadPack.sendConfig = "init.defaultBranch=xyz"

The client:

    receivePack.acceptConfig = "init.defaultBranch"

And in terms of things on the wire we'd say:

    "set-config init.defaultBranch=main"

You could have many such lines, but we'd just hardcode only accepting
"init.defaultBranch" by default for now.

I.e. we set "init.defaultBranch" on the server, and the client ends up
interpreting things as though "init.defaultBranch" was set to exactly
that value. So why not just ... send a line saying "you should set your
init.defaultBranch config to this".

Makes it future-extensible pretty much for free, and I think also much
easier to explain to users. I.e. instead of init.defaultBranch somehow
being magical when talking with a remote server we can talk about a
remote server being one source of config per git-config's documented
config order, for a very narrow whitelist of config keys.

Or (not clear to me, should have waited with my other E-Mail) are we
ever expecting to send more than one of:

    "unborn <refname> symref-target:<target>"

Or is the reason closer to us being able to shoehorn this into the
existing ls-refs response, as opposed to some general "here's config for
you" response we don't have?

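For readers following along, here is a rough sketch of how a client might pull the target out of the "unborn <refname> symref-target:<target>" line quoted above. The function name parse_unborn_target() is made up for illustration and is not git's parser; pkt-line framing and any other ref attributes are ignored, and it assumes a single symref-target attribute per line.

```c
#include <string.h>

/*
 * Hypothetical helper, not part of git: if `line` has the form
 * "unborn <refname> symref-target:<target>", copy <target> into `out`
 * (a buffer of `outlen` bytes) and return 0. Otherwise return -1.
 */
static int parse_unborn_target(const char *line, char *out, size_t outlen)
{
	static const char unborn[] = "unborn ";
	static const char attr[] = " symref-target:";
	const char *p, *target;
	size_t n;

	/* The line must start with the literal "unborn " keyword. */
	if (strncmp(line, unborn, sizeof(unborn) - 1))
		return -1;

	/* Look for the attribute after the refname. */
	p = strstr(line + sizeof(unborn) - 1, attr);
	if (!p)
		return -1;

	/* The target runs up to the next space, newline, or end of string. */
	target = p + sizeof(attr) - 1;
	n = strcspn(target, " \n");
	if (!n || n >= outlen)
		return -1;

	memcpy(out, target, n);
	out[n] = '\0';
	return 0;
}
```

Fed the line "unborn HEAD symref-target:refs/heads/mydefaultbranch", the helper would hand back "refs/heads/mydefaultbranch"; a regular "<oid> <refname>" line is rejected.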
On Tue, Jan 26, 2021 at 05:11:42PM -0800, Junio C Hamano wrote:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
> > Thanks, Peff, for your review. I have addressed your comments (through
> > replies to your emails and here in this v5 patch set).
> >
> > Jonathan Tan (3):
> >   ls-refs: report unborn targets of symrefs
> >   connect, transport: encapsulate arg in struct
> >   clone: respect remote unborn HEAD
>
> Applying this alone to 'master' seems to pass all tests, but
> the topic seems to have funny interactions with another topic
> in flight, jk/peel-iterated-oid

I was worried at first I really screwed up something subtle, but it is
indeed just a funny local interaction.

Here's a fix which can be applied on top of jt/clone-unborn-head. It
could equally well be applied as part of the merge (with a minor
adjustment in the context), but I think it ought to be squashed into
Jonathan's patch 1 anyway.

The conflict you had to resolve was a red herring (it wasn't part of
jk/peel-iterated-oid at all, but rather other commits that got pulled in
because my topic is based on a more recent master).

-- >8 --
Subject: [PATCH] ls-refs: don't peel NULL oid

When the "unborn" feature is enabled, upload-pack serving an ls-refs
command will pass a NULL oid into send_ref(). In this case, there is no
point trying to peel the ref, since we know it points to nothing.

For now this is a harmless waste of cycles (we re-resolve HEAD and find
out that indeed, it points to nothing). But after merging with another
topic that contains 36a317929b (refs: switch peel_ref() to
peel_iterated_oid(), 2021-01-20), we'd actually end up passing NULL to
peel_object(), which segfaults!

Signed-off-by: Jeff King <peff@peff.net>
---
 ls-refs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ls-refs.c b/ls-refs.c
index 4077adeb6a..bc91f03653 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -66,7 +66,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 			    strip_namespace(symref_target));
 	}
 
-	if (data->peel) {
+	if (data->peel && oid) {
 		struct object_id peeled;
 		if (!peel_ref(refname, &peeled))
 			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));

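To make the shape of the fix concrete outside of git's codebase, here is a small standalone sketch of the same guard. Everything in it (struct demo_oid, demo_peel(), demo_send_ref()) is a hypothetical stand-in, not git's API, and the peel step is simplified to always emit a "peeled:" attribute. The only point is that a callee which dereferences its oid argument must not be reached when the oid is NULL, i.e. when the ref is unborn.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for git's struct object_id (not the real layout). */
struct demo_oid {
	char hex[41];
};

/*
 * Hypothetical peel helper. Like the callee in the crash described above,
 * it dereferences its argument unconditionally, so NULL must not reach it.
 */
static int demo_peel(const struct demo_oid *oid, struct demo_oid *peeled)
{
	*peeled = *oid;
	return 0;
}

/*
 * Format one ref line into buf. oid == NULL models an unborn ref.
 * Returns 0 on success, -1 if the line does not fit.
 */
static int demo_send_ref(char *buf, size_t len, const char *refname,
			 const struct demo_oid *oid, int peel)
{
	struct demo_oid peeled;
	int n;

	if (oid)
		n = snprintf(buf, len, "%s %s", oid->hex, refname);
	else
		n = snprintf(buf, len, "unborn %s", refname);
	if (n < 0 || (size_t)n >= len)
		return -1;

	/* Same guard as the patch: only peel when there is an object. */
	if (peel && oid && !demo_peel(oid, &peeled)) {
		int m = snprintf(buf + n, len - n, " peeled:%s", peeled.hex);
		if (m < 0 || (size_t)m >= len - n)
			return -1;
	}
	return 0;
}
```

Dropping the "&& oid" condition in this sketch would let demo_peel() dereference NULL for the unborn case, which is the analogue of the segfault in the commit message.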
Jeff King <peff@peff.net> writes:

> Here's a fix which can be applied on top of jt/clone-unborn-head. It
> could equally well be applied as part of the merge (with a minor
> adjustment in the context), but I think it ought to be squashed into
> Jonathan's patch 1 anyway.

Will queue but we are not merging the topic to 'next' yet, so I'll
ask Jonathan to remember making it a part of the series if it needs
to be updated later.

Thanks.

>
> -- >8 --
> Subject: [PATCH] ls-refs: don't peel NULL oid
>
> When the "unborn" feature is enabled, upload-pack serving an ls-refs
> command will pass a NULL oid into send_ref(). In this case, there is no
> point trying to peel the ref, since we know it points to nothing.
>
> For now this is a harmless waste of cycles (we re-resolve HEAD and find
> out that indeed, it points to nothing). But after merging with another
> topic that contains 36a317929b (refs: switch peel_ref() to
> peel_iterated_oid(), 2021-01-20), we'd actually end up passing NULL to
> peel_object(), which segfaults!
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  ls-refs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ls-refs.c b/ls-refs.c
> index 4077adeb6a..bc91f03653 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -66,7 +66,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  			    strip_namespace(symref_target));
>  	}
>  
> -	if (data->peel) {
> +	if (data->peel && oid) {
>  		struct object_id peeled;
>  		if (!peel_ref(refname, &peeled))
>  			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));

> On Tue, Jan 26 2021, Jonathan Tan wrote:
>
> [For some reason the patches didn't reach my mailbox, but I see them in
> the list archive, so I'm replying to the cover-letter]
>
> >  Documentation/config.txt | 2 +
> >  Documentation/config/init.txt | 2 +-
>
> Good, now we have init.defaultBranch docs, but they say:
>
>      init.defaultBranch::
>             Allows overriding the default branch name e.g. when initializing
>     -       a new repository or when cloning an empty repository.
>     +       a new repository.
>
> So this still only applies to file:// and other "protocol" clones, but
> not "git clone /some/path"?

Ah...that's true.

> Re my reply to v1, do we consider that a bug, feature, something just
> left unimplemented?
>
> I really don't care much, but this really needs a corresponding
> documentation update. I.e. something like:
>
>     init.defaultBranch::
>         Allows overriding the default branch name e.g. when initializing a
>         new repository or when cloning an empty repository.
>
>         When cloning a repository over protocol v2 (i.e. ssh://, https://,
>         file://, but not a /some/path), and if that repository has
>         init.defaultBranch configured, the server will advertise its
>         preferred default branch name, and we'll take its configuration over
>         ours.

Thanks - I'll use some of your wording, but I think it's best to leave
open the possibility that cloning using protocol v0 or the disk clone
(/some/path) copies over the current HEAD as well.

> Which, just in terms of implementation makes me think it would make more
> sense if the server just had:
>
>     uploadPack.sendConfig = "init.defaultBranch=xyz"
>
> The client:
>
>     receivePack.acceptConfig = "init.defaultBranch"
>
> And in terms of things on the wire we'd say:
>
>     "set-config init.defaultBranch=main"
>
> You could have many such lines, but we'd just hardcode only accepting
> "init.defaultBranch" by default for now.
>
> I.e. we set "init.defaultBranch" on the server, and the client ends up
> interpreting things as though "init.defaultBranch" was set to exactly
> that value. So why not just ... send a line saying "you should set your
> init.defaultBranch config to this".
>
> Makes it future-extensible pretty much for free, and I think also much
> easier to explain to users. I.e. instead of init.defaultBranch somehow
> being magical when talking with a remote server we can talk about a
> remote server being one source of config per git-config's documented
> config order, for a very narrow whitelist of config keys.
>
> Or (not clear to me, should have waited with my other E-Mail) are we
> ever expecting to send more than one of:
>
>     "unborn <refname> symref-target:<target>"
>
> Or is the reason closer to us being able to shoehorn this into the
> existing ls-refs response, as opposed to some general "here's config for
> you" response we don't have?

It's not the same - from what I understand, what you're suggesting is
setting a config in the repo that has just been cloned, but this patch
set does not set any such config. Also, it may be strange for the server
to be able to change the config of a currently running command - I would
expect such a thing to only take effect on future runs of Git on that
repo.

On Sat, Jan 30 2021, Jonathan Tan wrote:

>> On Tue, Jan 26 2021, Jonathan Tan wrote:
>>
>> [For some reason the patches didn't reach my mailbox, but I see them in
>> the list archive, so I'm replying to the cover-letter]
>>
>> >  Documentation/config.txt | 2 +
>> >  Documentation/config/init.txt | 2 +-
>>
>> Good, now we have init.defaultBranch docs, but they say:
>>
>>      init.defaultBranch::
>>             Allows overriding the default branch name e.g. when initializing
>>     -       a new repository or when cloning an empty repository.
>>     +       a new repository.
>>
>> So this still only applies to file:// and other "protocol" clones, but
>> not "git clone /some/path"?
>
> Ah...that's true.
>
>> Re my reply to v1, do we consider that a bug, feature, something just
>> left unimplemented?
>>
>> I really don't care much, but this really needs a corresponding
>> documentation update. I.e. something like:
>>
>>     init.defaultBranch::
>>         Allows overriding the default branch name e.g. when initializing a
>>         new repository or when cloning an empty repository.
>>
>>         When cloning a repository over protocol v2 (i.e. ssh://, https://,
>>         file://, but not a /some/path), and if that repository has
>>         init.defaultBranch configured, the server will advertise its
>>         preferred default branch name, and we'll take its configuration over
>>         ours.
>
> Thanks - I'll use some of your wording, but I think it's best to leave
> open the possibility that cloning using protocol v0 or the disk clone
> (/some/path) copies over the current HEAD as well.

Sure, and maybe a test_expect_failure for those cases? I.e. to
explicitly say in the current docs/tests what does / doesn't work, and
if we consider that intentional or not.

>> Which, just in terms of implementation makes me think it would make more
>> sense if the server just had:
>>
>>     uploadPack.sendConfig = "init.defaultBranch=xyz"
>>
>> The client:
>>
>>     receivePack.acceptConfig = "init.defaultBranch"
>>
>> And in terms of things on the wire we'd say:
>>
>>     "set-config init.defaultBranch=main"
>>
>> You could have many such lines, but we'd just hardcode only accepting
>> "init.defaultBranch" by default for now.
>>
>> I.e. we set "init.defaultBranch" on the server, and the client ends up
>> interpreting things as though "init.defaultBranch" was set to exactly
>> that value. So why not just ... send a line saying "you should set your
>> init.defaultBranch config to this".
>>
>> Makes it future-extensible pretty much for free, and I think also much
>> easier to explain to users. I.e. instead of init.defaultBranch somehow
>> being magical when talking with a remote server we can talk about a
>> remote server being one source of config per git-config's documented
>> config order, for a very narrow whitelist of config keys.
>>
>> Or (not clear to me, should have waited with my other E-Mail) are we
>> ever expecting to send more than one of:
>>
>>     "unborn <refname> symref-target:<target>"
>>
>> Or is the reason closer to us being able to shoehorn this into the
>> existing ls-refs response, as opposed to some general "here's config for
>> you" response we don't have?
>
> It's not the same - from what I understand, what you're suggesting is
> setting a config in the repo that has just been cloned[...]

No, not to set config, i.e. during/after clone doing "git config
init.defaultBranch <remote>" wouldn't make any sense. Since that would
set config in .git/config, and that would (also?) apply /after/ the
clone, e.g. if you did "git init /tmp/somewhere/else" afterwards.

> [...]but this patch set does not set any such config[...].

It does, within the scope of the runtime of the process. I.e. just like
"git -c" or whatever. In builtin/clone.c you set "branch" from local
init.defaultBranch only if the remote did not provide us a value for it,
i.e. remote config for that config key overrides local config.

> Also, it may be strange for the server to be able to change the config
> of a currently running command - I would expect such a thing to only
> take effect on future runs of Git on that repo.

Yes, as I noted on v1 I think the semantics of this whole thing are a
bit strange :)

But if we're keeping the "strangeness" all I'm saying is that I think
it's more obvious to a user if we just declare the remote to be a
limited config source in terms of explaining this special-case.

And that once we're doing that it's also more obvious IMO to have that
be what's happening on the protocol level, if we're not expecting more
than one of these values.

I.e. if you ignore your current implementation internals and just view
git as a black box, then the functionality of this thing is
indistinguishable from the remote being a (limited) source of config. So
isn't it simpler to explain it to the user in those terms?

> > I really don't care much, but this really needs a corresponding
> > documentation update. I.e. something like:
> >
> >     init.defaultBranch::
> >         Allows overriding the default branch name e.g. when initializing a
> >         new repository or when cloning an empty repository.
> >
> >         When cloning a repository over protocol v2 (i.e. ssh://, https://,
> >         file://, but not a /some/path), and if that repository has
> >         init.defaultBranch configured, the server will advertise its
> >         preferred default branch name, and we'll take its configuration over
> >         ours.
>
> Thanks - I'll use some of your wording, but I think it's best to leave
> open the possibility that cloning using protocol v0 or the disk clone
> (/some/path) copies over the current HEAD as well.

Looking back on this, I think that it's natural to think that both an
empty repository and a non-empty one have a HEAD that points somewhere,
and "git clone" would behave the same way in both cases. So I'll hold
off on the documentation change.

On Tue, Feb 02 2021, Jonathan Tan wrote:

>> > I really don't care much, but this really needs a corresponding
>> > documentation update. I.e. something like:
>> >
>> >     init.defaultBranch::
>> >         Allows overriding the default branch name e.g. when initializing a
>> >         new repository or when cloning an empty repository.
>> >
>> >         When cloning a repository over protocol v2 (i.e. ssh://, https://,
>> >         file://, but not a /some/path), and if that repository has
>> >         init.defaultBranch configured, the server will advertise its
>> >         preferred default branch name, and we'll take its configuration over
>> >         ours.
>>
>> Thanks - I'll use some of your wording, but I think it's best to leave
>> open the possibility that cloning using protocol v0 or the disk clone
>> (/some/path) copies over the current HEAD as well.
>
> Looking back on this, I think that it's natural to think that both an
> empty repository and a non-empty one have a HEAD that points somewhere,
> and "git clone" would behave the same way in both cases. So I'll hold
> off on the documentation change.

You mean for a v6 it'll do the same thing in the local clone case too
and thus we won't need to document the exception? Sounds good.

I was mainly pointing out the need to document the current divergent
behavior. Documenting that something isn't consistent shouldn't be seen
as a blessing that the divergence is a good idea, it's an aid to our
users so they can understand why their git version does X when they
might be expecting Y.

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Tue, Jan 26 2021, Jonathan Tan wrote:
>
> [For some reason the patches didn't reach my mailbox, but I see them in
> the list archive, so I'm replying to the cover-letter]
>
>>  Documentation/config.txt | 2 +
>>  Documentation/config/init.txt | 2 +-
>
> Good, now we have init.defaultBranch docs, but they say:
>
>      init.defaultBranch::
>             Allows overriding the default branch name e.g. when initializing
>     -       a new repository or when cloning an empty repository.
>     +       a new repository.
>
> So this still only applies to file:// and other "protocol" clones, but
> not "git clone /some/path"?

I agree with you that the new "unborn HEAD will also follow what the
upstream has" should be done for --local transport.  It is a bug
waiting to be complained about by users.

>     init.defaultBranch::
>         Allows overriding the default branch name e.g. when initializing a
>         new repository or when cloning an empty repository.
>
>         When cloning a repository over protocol v2 (i.e. ssh://, https://,
>         file://, but not a /some/path), and if that repository has
>         init.defaultBranch configured, the server will advertise its
>         preferred default branch name, and we'll take its configuration over
>         ours.

I actually do not think that is what is going on.  What the other
side advertises is *NOT* their preferred default branch name and it
does not matter if they have init.defaultBranch configured or not.

What the new protocol extension gives us is that we can learn what
the other side is actually using (not their preferred default) as
their primary branch.  We've always done so since very early days of
"git clone" (even back when we failed to clone an empty repository),
by trying to guess which branch their HEAD points at.  The only
thing that is new with this topic is that it now gives us a reliable
way to learn what their HEAD points at, even when it is pointing at
an unborn branch.

In general we do not let the other side _dictate_ what our configuration
should look like, as that can have security implications, and this
is not sending any configuration at all.  Their HEAD may be pointing
at a specific branch (which may or may not be unborn) because that
is what they configured their init.defaultBranch to, or their
version of Git created the branch and they haven't changed it since
repository creation, or the user using that repository just started
working with that branch with "git checkout [--orphan]" (the
repository being cloned does not have to be a bare repository).

It does not matter how their HEAD ended up pointing at a specific
branch---we just try to mimic their current status---it is because
it would make it easier to give our changes back to them if
everybody involved used the same name for the primary integration
branch, and the repositories people clone from most often have
their primary integration branch pointed at by their HEAD.  And I do
not consider it transferring any configuration.

So while I agree that the logic to choose the branch that gets
checked out in a new repository created by "git clone" needs to be
documented well, it has very little to do with "init.defaultBranch".
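
As an illustration of the selection logic discussed in this thread (mimic the remote's HEAD when it names a branch, otherwise fall back to the local default), here is a standalone sketch modeled on the builtin/clone.c hunk quoted earlier. The function choose_head_ref() and its parameters are invented for this example and are not git's API; the local default is passed in as a plain string rather than looked up from configuration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: pick the ref that the new
 * clone's HEAD should point at.
 *
 * unborn_head_target: what the remote reported for its unborn HEAD,
 *                     or NULL if it reported nothing.
 * local_default:      local fallback branch name (assumed here to come
 *                     from something like init.defaultBranch).
 *
 * Writes the chosen ref into out; returns 0, or -1 if it does not fit.
 */
static int choose_head_ref(const char *unborn_head_target,
			   const char *local_default,
			   char *out, size_t outlen)
{
	static const char prefix[] = "refs/heads/";
	int n;

	/* Mimic the remote only if its HEAD points into refs/heads/. */
	if (unborn_head_target &&
	    !strncmp(unborn_head_target, prefix, sizeof(prefix) - 1))
		n = snprintf(out, outlen, "%s", unborn_head_target);
	else
		n = snprintf(out, outlen, "%s%s", prefix, local_default);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With this sketch, a remote reporting "refs/heads/mydefaultbranch" wins over a local default of "main", which matches what test 5702.15 quoted earlier greps for; a NULL or non-branch target falls back to "refs/heads/main".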