Message ID | 20200513004058.34456-1-jonathantanmy@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | wt-status: expand, not dwim, a "detached from" ref | expand |
Jonathan Tan <jonathantanmy@google.com> writes: > When a user checks out the upstream branch of HEAD and then runs "git > branch", like this: > > git checkout @{u} > git branch > > an error message is printed: > > fatal: HEAD does not point to a branch I had to spend unreasonable amount of time reproducing this. $ git branch --show-current master $ git checkout -b -t naster master $ git branch --show-current naster $ git checkout @{u} $ git branch --show-current master $ git branch * master naster What was missing was that the current branch must be forked from a remote-tracking branch; with a fork from a local branch, the last step, i.e. "git branch", works just fine. > (This error message when running "git branch" persists even after > checking out other things - it only stops after checking out a branch.) Correct. And it is also worth pointing out that this is not "git branch"; the users would see it primarily as a problem with "git status", which would die the same way. > This is because "git branch" attempts to DWIM "@{u}" when determining > the "HEAD detached" message, but that doesn't work because HEAD no > longer points to a branch. > > Therefore, when calculating the status of a worktree, only expand such a > string, not DWIM it. Such strings would not match a ref, and "git > branch" would report "HEAD detached at <hash>" instead. OK. "Detached at <hash>" is somewhat unsatisfactory (you most likely have detached from refs/remotes/origin/<something>), but it is much better than "fatal" X-<. > One alternative is to plumb down a flag from dwim_ref() to > interpret_branch_mark() that indicates that (1) don't read HEAD when > processing, and (2) when encountering a ref of the form "<optional > ref>@{u}", don't die when the optional ref doesn't exist, is not a > branch, or does not have an upstream. (We cannot change the > functionality outright because other parts of Git rely on such a message > being printed.) Thanks for a good analysis. That likely is a good direction for a longer term. @{upstream} is "tell me the upstream of the branch that is currently checked out", right? So it is reasonable to expect that it has no good answer in a detached HEAD state. And when on a branch, that branch may not have an upstream, and we should prepare for such a case, too. > diff --git a/wt-status.c b/wt-status.c > index 98dfa6f73f..f84ebc3e2c 100644 > --- a/wt-status.c > +++ b/wt-status.c > @@ -1562,7 +1562,7 @@ static void wt_status_get_detached_from(struct repository *r, > return; > } > > - if (dwim_ref(cb.buf.buf, cb.buf.len, &oid, &ref) == 1 && > + if (expand_ref(the_repository, cb.buf.buf, cb.buf.len, &oid, &ref) == 1 && > /* sha1 is a commit? match without further lookup */ > (oideq(&cb.noid, &oid) || > /* perhaps sha1 is a tag, try to dereference to a commit */ Hmph, I just did this: $ git branch -b -t next origin/next $ git checkout next@{upstream} $ git status It used to say "HEAD detached at origin/next" without this change, but now it says "HEAD detached at 046d49d455", which smells like a regression.
Junio C Hamano <gitster@pobox.com> writes: > OK. "Detached at <hash>" is somewhat unsatisfactory (you most > likely have detached from refs/remotes/origin/<something>), but > it is much better than "fatal" X-<. > >> One alternative is to plumb down a flag from dwim_ref() to >> interpret_branch_mark() that indicates that (1) don't read HEAD when >> processing, and (2) when encountering a ref of the form "<optional >> ref>@{u}", don't die when the optional ref doesn't exist, is not a >> branch, or does not have an upstream. (We cannot change the >> functionality outright because other parts of Git rely on such a message >> being printed.) > > Thanks for a good analysis. That likely is a good direction for a > longer term. @{upstream} is "tell me the upstream of the branch > that is currently checked out", right? So it is reasonable to > expect that it has no good answer in a detached HEAD state. And > when on a branch, that branch may not have an upstream, and we > should prepare for such a case, too. Well, I have to take this back. The real cause of the problem is that we record in reflog that we switched to @{u}, but when get_detached_from() tries to "dwim" it, the "current branch" that is implicitly there to be applied to @{u} is different one from what used to compute which branch to check out, isn't it? And it probably is not limited to @{upstream} but other @{<stuff>} that implicitly apply to the "current branch", e.g. @{push}. This causes a few potential problems: - The "current" branch may be different from the one when such a reflog entry that refers to @{<stuff>} when we later yank @{<stuff>} out of the reflog and try to use it. This is the problem the patch under discussion tries to hide, but the patch also breaks cases that are working fine. - Even if the user gave the branch name (i.e. 'next@{upstream}' in the example this patch would have introduced a regression) or if we updated get_detached_from() to correctly infer the branch that was current when the reflog entry in which we found @{upstream} was recorded, the upstream for the branch may have been reconfigured between the time when the reflog entry was written and get_detached_from() is called. 'next@{upstream}' may have been 'origin/next' back then but it may be a different remote-tracking branch now. This issue is not solved by the patch---the issue is unfixable unless we log the changes to the branch configuration so that we can figure out what was the upstream of 'next' back then, which we won't do. Assuming that is the root cause, I think the right solution likely lies along these three lines: (1) record not @{<stuff>} (or <branch>@{<stuff>} for that matter), but the actual starting point in the reflog (e.g. in the example this patch would have introduced a regression, i.e. next@{u}, we should record 'origin/next'. In the example this patch would have used degraded output to prevent dying, i.e. @{u}, we should also record 'origin/next')---this also will fix the "the branch's upstream may be different now" problem. (2) a patch like yours to use expand_ref() as a fallback, but only for the "@{<stuff>}" notation that depends on what the "then current" branch was---this is a mere band-aid, like the patch under discussion is, but at least it would not regress the cases that are "working". "The upstream may be different now" problem is still there. (3) when we have to interpret @{<stuff>} found in the reflog, go back to see which branch was current, and compute @{<stuff>} relative to that branch---this would "fix" more cases than (2) would, but it won't fix "the upstream can change" problem, and I think the trade-off is not all that great. I think the combination of (1) and (2) is likely to be the best way to go. (1) will remove the root cause, and (2) will allow us to cope with reflog entries before (1) is applied.
> (1) record not @{<stuff>} (or <branch>@{<stuff>} for that matter), > but the actual starting point in the reflog (e.g. in the > example this patch would have introduced a regression, > i.e. next@{u}, we should record 'origin/next'. In the example > this patch would have used degraded output to prevent dying, > i.e. @{u}, we should also record 'origin/next')---this also > will fix the "the branch's upstream may be different now" > problem. This sounds reasonable. I took a look at this. The part that converts the user-given refname (e.g. "@{u}") into an OID is the invocation of get_oid_mb() in parse_branchname_arg() in builtin/checkout.c, and get_oid_mb() eventually calls repo_dwim_ref() which has access to the absolute branch name ("origin/master"). I did not try plumbing it all the way, but I tried overriding "arg" with "refs/remotes/origin/master" after the call to get_oid_mb() and it worked. For reference, the stack between get_oid_mb() and repo_dwim_ref() is as follows (the line numbers may not be accurate because of some debug statements I added): repo_dwim_ref (refs.c:597) get_oid_basic (sha1-name.c:875) get_oid_1 (sha1-name.c:1195) get_oid_with_context_1 (sha1-name.c:1812) get_oid_with_context (sha1-name.c:1959) repo_get_oid (sha1-name.c:1610) repo_get_oid_mb (sha1-name.c:1382) Besides the increase in complicatedness of all the listed functions that we would need in order to plumb the absolute branch name through, I haven't checked if the absolute branch name is the one that we should use whenever we write to the reflog, or if there are some times that we still want to use the user-specified name. I'll take a further look, but any ideas are welcome.
Hi, Jonathan Tan wrote: > When a user checks out the upstream branch of HEAD and then runs "git > branch", like this: > > git checkout @{u} > git branch > > an error message is printed: > > fatal: HEAD does not point to a branch Interesting. Even worse, it happens when I run "git status". [...] > This is because "git branch" attempts to DWIM "@{u}" when determining > the "HEAD detached" message, but that doesn't work because HEAD no > longer points to a branch. > > Therefore, when calculating the status of a worktree, only expand such a > string, not DWIM it. Such strings would not match a ref, and "git > branch" would report "HEAD detached at <hash>" instead. [...] > - if (dwim_ref(cb.buf.buf, cb.buf.len, &oid, &ref) == 1 && > + if (expand_ref(the_repository, cb.buf.buf, cb.buf.len, &oid, &ref) == 1 && > /* sha1 is a commit? match without further lookup */ Alas, as Junio mentions downthread, this patch produces a regression in behavior: before, $ git checkout --quiet master@{u} $ git status | head -1 HEAD detached at origin/master after, $ git checkout --quiet master@{u} $ git status | head -1 HEAD detached at 675a4aaf3b2 This ends up being a bit subtle. The current branch can be in one of three states: I can be on a branch (HEAD symref pointing to a ref refs/heads/branchname), detached (HEAD psuedoref pointing to a commit), or on a branch yet to be born. __git_ps1 in contrib/completion/git-prompt.sh, for example, is able to show these three states. Since b397ea4863a (status: show more info than "currently not on any branch", 2013-03-13), "git status", on the other hand, gives richer information. In the detached case, naming the commit that HEAD points to doesn't really give a user enough information to orient themself, so we look at the reflog (!) to find out what the user had intended to check out. Reflog entries look like checkout: moving from master to v1.0 and the "v1.0" can tell us that v1.0 is the tag that we detached HEAD on. This strikes me as always having been strange in a few ways: On one hand, this is using the reflog. reflog entries are historically meant to be human-readable. They are a raw string, not structured data. They can be translated to a different human language. Other tools interacting with git repositories are not guaranteed to use the same string. Changes such as the introduction of "git switch" could be expected to lead to writing "switch:" instead of "checkout:" some day. Beyond that, it involves guesswork. As b397ea4863a explains, When it cannot figure out the original ref, it shows an abbreviated SHA-1. The reflog entry contains the parameter passed to "git checkout" in the form the user provided --- it is not a full ref name or string meant to be appended to refs/heads/ but it can be arbitrary syntax supported by "git checkout". At the time that the checkout happened, we know what *commit* that name resolved to, but we do not know what ref it resolved to: - refs in the search path can have been created or deleted since then - with @{u} syntax, the named branch's upstream can have been reconfigured - and so on If we want to know what ref HEAD was detached at, wouldn't we want some reliable source of that information that records it at the time it was known? Another example is if I used "git checkout -" syntax. In that case, the reflog records the commit id that I am checking out, instead of recording "-". That's because (after "-" is replaced by "@{-1}") the "magic branch" handler strbuf_branchname replaces @{-1} with with the branch name being checked out. That branch name *also* comes from the reflog, this time from the <old> side of the checkout: moving from <old> to <new> line, and <old> is populated as a simple commit id or branch name read from HEAD. So this creates a bit of an inconsistency: if I run "git status", "git checkout -", "git checkout -" again, and then "git status" again, the first "git status" tells me what ref I had used to detach HEAD but the second does not. Okay, so much for background. The motivating error producing fatal: HEAD does not point to a branch is a special case of the "we do not know what ref it resolved to" problem described above: the string @{upstream} represents the upstream of the branch that HEAD pointed to *at the time of the checkout*, but since then HEAD has been detached. [...] > One alternative is to plumb down a flag from dwim_ref() to > interpret_branch_mark() that indicates that (1) don't read HEAD when > processing, and (2) when encountering a ref of the form "<optional > ref>@{u}", don't die when the optional ref doesn't exist, is not a > branch, or does not have an upstream. Given the context above, two possibilities seem appealing: A. Like you're hinting, could dwim_ref get a variant that returns -1 instead of die()ing on failure? That way, we could fulfill the intent described in b397ea48: When it cannot figure out the original ref, it shows an abbreviated SHA-1. B. Alternatively, in the @{u} case could we switch together the <old> and <new> pieces of the context from the checkout: moving from master to @{upstream} reflog line to make "master@{upstream}"? It's still possible for the upstream to have changed since then, but at least in the common case this would match the lookup that happened at checkout time. Thoughts? Thanks, Jonathan
Jonathan Nieder <jrnieder@gmail.com> writes: > Given the context above, two possibilities seem appealing: > > A. Like you're hinting, could dwim_ref get a variant that returns -1 > instead of die()ing on failure? That way, we could fulfill the > intent described in b397ea48: > > When it cannot figure out the original ref, it shows an abbreviated > SHA-1. > > B. Alternatively, in the @{u} case could we switch together the <old> > and <new> pieces of the context from the > > checkout: moving from master to @{upstream} > > reflog line to make "master@{upstream}"? It's still possible for > the upstream to have changed since then, but at least in the > common case this would match the lookup that happened at checkout > time. Ah, blast from the past. Thanks for resurrecting. If we are allowed to change what goes to reflog, can we do even better than recording master@{upstream} at the time "checkout" records the HEAD movement? "checkout: moving from next to master" would be far better than "moving from next to next@{upstream}" or "moving from next to @{upstream}". Can we even change the phrasing altogether, like "checkout: moving from next to detached e9b77c..."? That would produce even more precise report.
diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh index 71818b90f0..c5d3d739e5 100755 --- a/t/t3203-branch-output.sh +++ b/t/t3203-branch-output.sh @@ -209,6 +209,16 @@ EOF test_i18ncmp expect actual ' +test_expect_success 'git branch shows detached HEAD properly after checking out upstream branch' ' + git init upstream && + test_commit -C upstream foo && + + git clone upstream downstream && + git -C downstream checkout @{u} && + git -C downstream branch >actual && + test_i18ngrep "HEAD detached at [0-9a-f]\\+" actual +' + test_expect_success 'git branch `--sort` option' ' cat >expect <<-\EOF && * (HEAD detached from fromtag) diff --git a/wt-status.c b/wt-status.c index 98dfa6f73f..f84ebc3e2c 100644 --- a/wt-status.c +++ b/wt-status.c @@ -1562,7 +1562,7 @@ static void wt_status_get_detached_from(struct repository *r, return; } - if (dwim_ref(cb.buf.buf, cb.buf.len, &oid, &ref) == 1 && + if (expand_ref(the_repository, cb.buf.buf, cb.buf.len, &oid, &ref) == 1 && /* sha1 is a commit? match without further lookup */ (oideq(&cb.noid, &oid) || /* perhaps sha1 is a tag, try to dereference to a commit */
When a user checks out the upstream branch of HEAD and then runs "git branch", like this: git checkout @{u} git branch an error message is printed: fatal: HEAD does not point to a branch (This error message when running "git branch" persists even after checking out other things - it only stops after checking out a branch.) This is because "git branch" attempts to DWIM "@{u}" when determining the "HEAD detached" message, but that doesn't work because HEAD no longer points to a branch. Therefore, when calculating the status of a worktree, only expand such a string, not DWIM it. Such strings would not match a ref, and "git branch" would report "HEAD detached at <hash>" instead. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- I tried to track down other uses of wt_status_get_detached_from() (called from wt_status_get_state() with get_detached_from=1) but there seemed to be quite a few, so I stopped. One alternative is to plumb down a flag from dwim_ref() to interpret_branch_mark() that indicates that (1) don't read HEAD when processing, and (2) when encountering a ref of the form "<optional ref>@{u}", don't die when the optional ref doesn't exist, is not a branch, or does not have an upstream. (We cannot change the functionality outright because other parts of Git rely on such a message being printed.) But this seems too complicated just for diagnosis. --- t/t3203-branch-output.sh | 10 ++++++++++ wt-status.c | 2 +- 2 files changed, 11 insertions(+), 1 deletion(-)