Message ID | 20241020202507.2596990-1-bence@ferdinandy.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [RFC] object-name: add @{upstreamhead} shorthand | expand |
Good evening On Sun, Oct 20, 2024, at 22:24, Bence Ferdinandy wrote: > The HEAD of the remote is useful in many situations, but currently one > would need to know the name of the remote to perform something like > "git log origin/HEAD..", which makes writing remote agnostic aliases > complicated. Introduce the new shorthand "@{upstreamhead}" which returns > <remote>/HEAD for the same <remote> "@{upstream}" would yield. > > Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> > --- > > Notes: > RFC v1: Testing and documentation is completely missing, I'll add those > in a v2 if people think the patch has merit. Do you have some concrete examples? I’m not well versed in using remote HEAD.
On Sun Oct 20, 2024 at 22:40, Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com> wrote: > Good evening > > On Sun, Oct 20, 2024, at 22:24, Bence Ferdinandy wrote: >> The HEAD of the remote is useful in many situations, but currently one >> would need to know the name of the remote to perform something like >> "git log origin/HEAD..", which makes writing remote agnostic aliases >> complicated. Introduce the new shorthand "@{upstreamhead}" which returns >> <remote>/HEAD for the same <remote> "@{upstream}" would yield. >> >> Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> >> --- >> >> Notes: >> RFC v1: Testing and documentation is completely missing, I'll add those >> in a v2 if people think the patch has merit. > > Do you have some concrete examples? I’m not well versed in using > remote HEAD. N.b. I was intending to write s/many situations/some situations. I basically use it for two things: - variations of `git log remote/HEAD..` for which I currently have an alias with "origin" hardcoded. E.g. I'm on a feature branch I'm reviewing and I want to know what commits are new compared to origin/(master|main|trunk), but I use HEAD, because I never know (and don't really want to pay attention to) what project uses what. And although "origin" is usually ok, but not always if there are forks in play, so @{upstreamhead} would make it agnostic to the remote's name. - I also use remote/HEAD in CICD, i.e. with `git rev-list origin/HEAD..` you can run checks on a commit-by-commit bases instead of the end result of a patch series or pull request. It's really useful to check have basic checks for commit messages for example. In a CICD of course for a _specific_ project you know what HEAD is, but still, using HEAD makes a step portable across repos. And again of course, I think in CICD you almost certainly will always end up with the remote being called "origin", so this change might not be quite so useful there. But so the long story short here is that for (origin|upstream)/(master|main|trunk) we can already have agnostic code with HEAD for the second part and with a patch like this we could have agnostic code for the whole thing. Best, Bence
On Sun, Oct 20, 2024 at 11:42:38PM +0200, Bence Ferdinandy wrote: > I basically use it for two things: > > - variations of `git log remote/HEAD..` for which I currently have an alias > with "origin" hardcoded. E.g. I'm on a feature branch I'm reviewing and > I want to know what commits are new compared to origin/(master|main|trunk), > but I use HEAD, because I never know (and don't really want to pay attention > to) what project uses what. And although "origin" is usually ok, but not > always if there are forks in play, so @{upstreamhead} would make it agnostic > to the remote's name. I'm a little skeptical that this is useful. If a local branch has a particular remote branch configured as its upstream, then shouldn't your search for new commits be against that configured upstream branch, not whatever that remote's HEAD happens to be? In many cases, of course, I'd expect that HEAD to also be the upstream branch. But then you could just use @{upstream}. And in some cases, you really want to compare against a known base point, regardless of the configured upstream. But then you should use the full name of that base point, rather than the remote half of the upstream config. It sounds more like a band-aid for scripts that are expected to be used across repos that may use other names for what is effectively "origin". In which case I question whether we really want new lookup syntax, versus having those scripts learn to query the remote name. E.g., I think you could do: upstream=$(git rev-parse --symbolic-full-name @{upstream}) git log ${upstream%/*}/HEAD.. And possibly we could make it easier to just grab the remote name with a single command. -Peff
On Sun, Oct 20, 2024 at 11:42:38PM +0200, Bence Ferdinandy wrote: > But so the long story short here is that for > (origin|upstream)/(master|main|trunk) we can already have agnostic code with > HEAD for the second part and with a patch like this we could have agnostic code > for the whole thing. I'm hesitant to pick this up because of what is said in this paragraph. When you write "(master|main|trunk)", I think you're really spelling "HEAD". And it's fine to write HEAD in a script when you want to resolve something to master/main/trunk/etc. without caring which and instead delegating that to whatever the remote HEAD is. But determining the upstream of a branch is already easy to do as Peff points out downthread. So this seems like a band-aid for scripts that do not care to perform such a resolution themselves. Thanks, Taylor
On Mon Oct 21, 2024 at 21:14, Jeff King <peff@peff.net> wrote: > On Sun, Oct 20, 2024 at 11:42:38PM +0200, Bence Ferdinandy wrote: > >> I basically use it for two things: >> >> - variations of `git log remote/HEAD..` for which I currently have an alias >> with "origin" hardcoded. E.g. I'm on a feature branch I'm reviewing and >> I want to know what commits are new compared to origin/(master|main|trunk), >> but I use HEAD, because I never know (and don't really want to pay attention >> to) what project uses what. And although "origin" is usually ok, but not >> always if there are forks in play, so @{upstreamhead} would make it agnostic >> to the remote's name. > > I'm a little skeptical that this is useful. If a local branch has a > particular remote branch configured as its upstream, then shouldn't your > search for new commits be against that configured upstream branch, not > whatever that remote's HEAD happens to be? > > In many cases, of course, I'd expect that HEAD to also be the upstream > branch. But then you could just use @{upstream}. > > And in some cases, you really want to compare against a known base > point, regardless of the configured upstream. But then you should use > the full name of that base point, rather than the remote half of the > upstream config. > > It sounds more like a band-aid for scripts that are expected to be used > across repos that may use other names for what is effectively "origin". > In which case I question whether we really want new lookup syntax, > versus having those scripts learn to query the remote name. > > E.g., I think you could do: > > upstream=$(git rev-parse --symbolic-full-name @{upstream}) > git log ${upstream%/*}/HEAD.. That particular one will break if you have something like refs/remotes/origin/foo/bar, but I get your point. > > And possibly we could make it easier to just grab the remote name with a > single command. As I was running this patch through my head yesterday I sort of distilled my argument in favour to "writing remote agnostic scripts are unnecessarily complicated", but I do agree, that if there were a git command that could return the remote for a branch without any extra scripting hacks would easily get you the same result, and may even be useful elsewhere. I'm not sure where this would be the best. Maybe: git branch --show-current-remote ? Thanks for the feedback! Best, Bence
On Mon Oct 21, 2024 at 21:45, Taylor Blau <me@ttaylorr.com> wrote: > On Sun, Oct 20, 2024 at 11:42:38PM +0200, Bence Ferdinandy wrote: >> But so the long story short here is that for >> (origin|upstream)/(master|main|trunk) we can already have agnostic code with >> HEAD for the second part and with a patch like this we could have agnostic code >> for the whole thing. > > I'm hesitant to pick this up because of what is said in this paragraph. > When you write "(master|main|trunk)", I think you're really spelling > "HEAD". And it's fine to write HEAD in a script when you want to resolve > something to master/main/trunk/etc. without caring which and instead > delegating that to whatever the remote HEAD is. > > But determining the upstream of a branch is already easy to do as Peff > points out downthread. So this seems like a band-aid for scripts that do > not care to perform such a resolution themselves. Agreed, Peff's idea to make querying remote easier is a much better way. Best, Bence
On Mon, Oct 21, 2024 at 10:09:38PM +0200, Bence Ferdinandy wrote: > > E.g., I think you could do: > > > > upstream=$(git rev-parse --symbolic-full-name @{upstream}) > > git log ${upstream%/*}/HEAD.. > > That particular one will break if you have something like > refs/remotes/origin/foo/bar, but I get your point. Ugh. Having / characters in remote names feels like a mistake to me ;-). Thanks, Taylor
On Mon, Oct 21, 2024 at 10:09:38PM +0200, Bence Ferdinandy wrote: > > And possibly we could make it easier to just grab the remote name with a > > single command. > > As I was running this patch through my head yesterday I sort of distilled my > argument in favour to "writing remote agnostic scripts are unnecessarily > complicated", but I do agree, that if there were a git command that could > return the remote for a branch without any extra scripting hacks would easily > get you the same result, and may even be useful elsewhere. > > I'm not sure where this would be the best. Maybe: > git branch --show-current-remote > ? I've been giving some thought to this. You could argue it's about querying remotes, so "git remote". Although we don't really want to know anything about the remote except its name, so it's a little weird. Or as you note, we're querying info about a branch. So "git branch" makes sense. But "--show-current-remote" feels kind of narrow there. Shouldn't we be able to ask about the configured remote for any branch? In which case it is really just a single "git config" lookup away: git config branch.$branch.remote You have to look up the current branch, of course. You can do that with symbolic-ref like: git config "branch.$(git symbolic-ref --short HEAD).remote" You might get an error from symbolic-ref if we're on a detached HEAD, of course. You can either ignore that (in which case the lookup of "branch..remote" would show nothing), or a script can actually distinguish the two cases ("not on a branch" versus "there is no configured remote"). There's also another wrinkle we hadn't discussed: we have the concept of both an upstream remote for fetching and a push remote. And this would naturally extend there (you'd ask for .pushremote instead). And finally, there's yet another way to access this information. ;) The for-each-ref formatter (which is also used for "branch --format") knows how to show remote names (and much more). So: git branch --list --format='%(upstream:remotename)' $branch also gets you what you want. I don't think there's a good way to ask that command to show just the branch pointed to by HEAD, though. We recently added --include-root-refs to for-each-ref, but that's not quite what you want (you want just HEAD, and you really want to dereference it to show details of the branch it points to). So I think rather than "branch --show-current-remote", we'd want some option to make "branch --list" show only the currently checked out branch, and then you could apply --format to it to get whatever information you wanted. Something like: git branch --list --is-head --format='%(upstream:remotename)' -Peff
On Wed Oct 23, 2024 at 23:56, Jeff King <peff@peff.net> wrote: > On Mon, Oct 21, 2024 at 10:09:38PM +0200, Bence Ferdinandy wrote: > >> > And possibly we could make it easier to just grab the remote name with a >> > single command. >> >> As I was running this patch through my head yesterday I sort of distilled my >> argument in favour to "writing remote agnostic scripts are unnecessarily >> complicated", but I do agree, that if there were a git command that could >> return the remote for a branch without any extra scripting hacks would easily >> get you the same result, and may even be useful elsewhere. >> >> I'm not sure where this would be the best. Maybe: >> git branch --show-current-remote >> ? > > I've been giving some thought to this. > > You could argue it's about querying remotes, so "git remote". Although > we don't really want to know anything about the remote except its name, > so it's a little weird. > > Or as you note, we're querying info about a branch. So "git branch" > makes sense. But "--show-current-remote" feels kind of narrow there. > Shouldn't we be able to ask about the configured remote for any branch? > > In which case it is really just a single "git config" lookup away: > > git config branch.$branch.remote > > You have to look up the current branch, of course. You can do that with > symbolic-ref like: > > git config "branch.$(git symbolic-ref --short HEAD).remote" > > You might get an error from symbolic-ref if we're on a detached HEAD, of > course. You can either ignore that (in which case the lookup of > "branch..remote" would show nothing), or a script can actually > distinguish the two cases ("not on a branch" versus "there is no > configured remote"). > > There's also another wrinkle we hadn't discussed: we have the concept of > both an upstream remote for fetching and a push remote. And this would > naturally extend there (you'd ask for .pushremote instead). > > And finally, there's yet another way to access this information. ;) The > for-each-ref formatter (which is also used for "branch --format") knows > how to show remote names (and much more). So: > > git branch --list --format='%(upstream:remotename)' $branch > > also gets you what you want. I don't think there's a good way to ask > that command to show just the branch pointed to by HEAD, though. We > recently added --include-root-refs to for-each-ref, but that's not quite > what you want (you want just HEAD, and you really want to dereference it > to show details of the branch it points to). > > So I think rather than "branch --show-current-remote", we'd want > some option to make "branch --list" show only the currently checked out > branch, and then you could apply --format to it to get whatever > information you wanted. Something like: > > git branch --list --is-head --format='%(upstream:remotename)' Thanks for running through this in such detail! This would be more widely useful for sure. I'd probably call the flag something like "--current", "--current-only" rather than "--is-head" though. "--is-head" sounds as if it would filter --list but not necessarily end up with a single entry. Anyway, thanks for the idea, I'll probably pick this up sooner or later. Best, Bence
On Thu, Oct 24, 2024 at 08:48:29PM +0200, Bence Ferdinandy wrote: > > So I think rather than "branch --show-current-remote", we'd want > > some option to make "branch --list" show only the currently checked out > > branch, and then you could apply --format to it to get whatever > > information you wanted. Something like: > > > > git branch --list --is-head --format='%(upstream:remotename)' > > Thanks for running through this in such detail! This would be more widely > useful for sure. > > I'd probably call the flag something like "--current", "--current-only" rather > than "--is-head" though. "--is-head" sounds as if it would filter --list but > not necessarily end up with a single entry. Yeah, I think --current would be fine. -Peff
On Fri Oct 25, 2024 at 08:24, Jeff King <peff@peff.net> wrote: > On Thu, Oct 24, 2024 at 08:48:29PM +0200, Bence Ferdinandy wrote: > >> > So I think rather than "branch --show-current-remote", we'd want >> > some option to make "branch --list" show only the currently checked out >> > branch, and then you could apply --format to it to get whatever >> > information you wanted. Something like: >> > >> > git branch --list --is-head --format='%(upstream:remotename)' >> >> Thanks for running through this in such detail! This would be more widely >> useful for sure. >> >> I'd probably call the flag something like "--current", "--current-only" rather >> than "--is-head" though. "--is-head" sounds as if it would filter --list but >> not necessarily end up with a single entry. > > Yeah, I think --current would be fine. I was looking through git branch and there is a --show-current option. I was wondering, would it not be better to teach --show-current to also obey --format? It would avoid having a "--current" that only works with "--list" besides having a "--show-current".
On Sun, Oct 27, 2024 at 11:07:07PM +0100, Bence Ferdinandy wrote: > > On Fri Oct 25, 2024 at 08:24, Jeff King <peff@peff.net> wrote: > > On Thu, Oct 24, 2024 at 08:48:29PM +0200, Bence Ferdinandy wrote: > > > >> > So I think rather than "branch --show-current-remote", we'd want > >> > some option to make "branch --list" show only the currently checked out > >> > branch, and then you could apply --format to it to get whatever > >> > information you wanted. Something like: > >> > > >> > git branch --list --is-head --format='%(upstream:remotename)' > >> > >> Thanks for running through this in such detail! This would be more widely > >> useful for sure. > >> > >> I'd probably call the flag something like "--current", "--current-only" rather > >> than "--is-head" though. "--is-head" sounds as if it would filter --list but > >> not necessarily end up with a single entry. > > > > Yeah, I think --current would be fine. > > I was looking through git branch and there is a --show-current option. I was > wondering, would it not be better to teach --show-current to also obey > --format? It would avoid having a "--current" that only works with "--list" > besides having a "--show-current". Yeah, I think that supporting '--format' specifiers via 'git branch --show-current' makes sense. In the interim you could do something gross like: git branch --list --format='%(upstream:remotename)' \ --end-of-options "$(git branch --show-current)" , but... yuck :-). I think the right thing to do would be to teach 'git branch --show-current' to support the full range of --format specifiers. And I think the way to do that would be to treat --show-current as a special case of --list. In the existing implementation, we special-case handling the current branch with --show-current via a separate code path in builtin/branch.c::show_current_branch_name(). It would be nice to change the implementation there to pretend as if the current branch as the pattern given to --list instead of handling printing it out separately. I think that would be a nice small-ish project for anybody looking to get their hands dirty in the 'branch' builtin's implementation. Thanks, Taylor
On Sun, Oct 27, 2024 at 11:07:07PM +0100, Bence Ferdinandy wrote: > >> I'd probably call the flag something like "--current", "--current-only" rather > >> than "--is-head" though. "--is-head" sounds as if it would filter --list but > >> not necessarily end up with a single entry. > > > > Yeah, I think --current would be fine. > > I was looking through git branch and there is a --show-current option. I was > wondering, would it not be better to teach --show-current to also obey > --format? It would avoid having a "--current" that only works with "--list" > besides having a "--show-current". Yeah, that's perfect. I had almost suggested "--list-head" originally, but I didn't want to introduce yet another major-mode to git-branch. But if we already have it, that's not a problem. :) And the patch should be quite short, I'd think. -Peff
diff --git a/object-name.c b/object-name.c index c892fbe80a..f40a226a57 100644 --- a/object-name.c +++ b/object-name.c @@ -936,6 +936,12 @@ static inline int push_mark(const char *string, int len) return at_mark(string, len, suffix, ARRAY_SIZE(suffix)); } +static inline int upstream_head_mark(const char *string, int len) +{ + const char *suffix[] = { "@{upstreamhead}", "@{uh}" }; + return at_mark(string, len, suffix, ARRAY_SIZE(suffix)); +} + static enum get_oid_result get_oid_1(struct repository *r, const char *name, int len, struct object_id *oid, unsigned lookup_flags); static int interpret_nth_prior_checkout(struct repository *r, const char *name, int namelen, struct strbuf *buf); @@ -985,7 +991,8 @@ static int get_oid_basic(struct repository *r, const char *str, int len, continue; } if (!upstream_mark(str + at, len - at) && - !push_mark(str + at, len - at)) { + !push_mark(str + at, len - at) && + !upstream_head_mark(str + at, len - at)) { reflog_len = (len-1) - (at+2); len = at; } @@ -1729,6 +1736,12 @@ int repo_interpret_branch_name(struct repository *r, options); if (len > 0) return len; + + len = interpret_branch_mark(r, name, namelen, at - name, buf, + upstream_head_mark, branch_get_upstream_head, + options); + if (len > 0) + return len; } return -1; diff --git a/remote.c b/remote.c index 10104d11e3..302f013a25 100644 --- a/remote.c +++ b/remote.c @@ -1980,6 +1980,42 @@ const char *branch_get_upstream(struct branch *branch, struct strbuf *err) return branch->merge[0]->dst; } +const char *branch_get_upstream_head(struct branch *branch, struct strbuf *err) +{ + struct strbuf retval = STRBUF_INIT, refstring = STRBUF_INIT; + struct string_list l = STRING_LIST_INIT_DUP; + + if (!branch) + return error_buf(err, _("HEAD does not point to a branch")); + + if (!branch->merge || !branch->merge[0]) { + /* + * no merge config; is it because the user didn't define any, + * or because it is not a real branch, and get_branch + * auto-vivified it? + */ + if (!refs_ref_exists(get_main_ref_store(the_repository), branch->refname)) + return error_buf(err, _("no such branch: '%s'"), + branch->name); + return error_buf(err, + _("no upstream configured for branch '%s'"), + branch->name); + } + + if (!branch->merge[0]->dst) + return error_buf(err, + _("upstream branch '%s' not stored as a remote-tracking branch"), + branch->merge[0]->src); + + string_list_split(&l, branch->merge[0]->dst, '/', -1); + strbuf_addf(&refstring, "refs/remotes/%s/HEAD", l.items[2].string); + + if (refs_read_symbolic_ref(get_main_ref_store(the_repository), refstring.buf, &retval)) + return error_buf(err, _("%s does not exist"), refstring.buf); + + return retval.buf; +} + static const char *tracking_for_push_dest(struct remote *remote, const char *refname, struct strbuf *err) diff --git a/remote.h b/remote.h index a7e5c4e07c..a1d0f44297 100644 --- a/remote.h +++ b/remote.h @@ -360,6 +360,14 @@ const char *branch_get_upstream(struct branch *branch, struct strbuf *err); */ const char *branch_get_push(struct branch *branch, struct strbuf *err); +/** + * Return the fully-qualified refname of the HEAD branch for the same remote + * that "branch@{upstream}" is on. + * + * The return value and `err` conventions match those of `branch_get_upstream`. + */ +const char *branch_get_upstream_head(struct branch *branch, struct strbuf *err); + /* Flags to match_refs. */ enum match_refs_flags { MATCH_REFS_NONE = 0,
The HEAD of the remote is useful in many situations, but currently one would need to know the name of the remote to perform something like "git log origin/HEAD..", which makes writing remote agnostic aliases complicated. Introduce the new shorthand "@{upstreamhead}" which returns <remote>/HEAD for the same <remote> "@{upstream}" would yield. Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com> --- Notes: RFC v1: Testing and documentation is completely missing, I'll add those in a v2 if people think the patch has merit. object-name.c | 15 ++++++++++++++- remote.c | 36 ++++++++++++++++++++++++++++++++++++ remote.h | 8 ++++++++ 3 files changed, 58 insertions(+), 1 deletion(-)