Message ID | 20190122232301.95971-2-nbelakovski@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | | expand |
nbelakovski@gmail.com writes: > From: Nickolai Belakovski <nbelakovski@gmail.com> > > Add an atom providing the path of the linked worktree where this ref is > checked out, if it is checked out in any linked worktrees, and empty > string otherwise. > --- Missing sign-off? > +static int ref_to_worktree_map_cmpfnc(const void *unused_lookupdata, const void *existing_hashmap_entry_to_test, > + const void *key, const void *keydata_aka_refname) > +{ > + const struct ref_to_worktree_entry *e = existing_hashmap_entry_to_test; > + const struct ref_to_worktree_entry *k = key; > + return strcmp(e->wt->head_ref, keydata_aka_refname ? keydata_aka_refname : k->wt->head_ref); Overlong line. > +} > + > +static struct hashmap ref_to_worktree_map; > +static struct worktree **worktrees = NULL; No need to initialize static vars to 0/NULL. > @@ -438,6 +456,34 @@ static int head_atom_parser(const struct ref_format *format, struct used_atom *a > return 0; > } > > +static int worktree_atom_parser(const struct ref_format *format, > + struct used_atom *atom, > + const char *arg, > + struct strbuf *unused_err) > +{ > + int i; > + > + if (worktrees) > + return 0; > + > + worktrees = get_worktrees(0); > + > + hashmap_init(&ref_to_worktree_map, ref_to_worktree_map_cmpfnc, NULL, 0); > + > + for (i = 0; worktrees[i]; i++) { > + if (worktrees[i]->head_ref) { > + struct ref_to_worktree_entry *entry; > + entry = xmalloc(sizeof(*entry)); > + entry->wt = worktrees[i]; > + hashmap_entry_init(entry, strhash(worktrees[i]->head_ref)); > + > + hashmap_add(&ref_to_worktree_map, entry); > + } > + } > + > + return 0; > +} It is kind of interesting that a function for parsing an "atom" in "format" does not look at none of its arguments at all ;-) The fact that "%(worktreepath)" atom got noticed by verify_ref_format() alone is of interest for this function. The helper iterates over all worktrees, registers them in a hashmap ref_to_worktree_map, indexed by the head reference. OK. > +static char *get_worktree_path(const struct used_atom *atom, const struct ref_array_item *ref) > +{ > + struct strbuf val = STRBUF_INIT; > + struct hashmap_entry entry; > + struct ref_to_worktree_entry *lookup_result; > + > + hashmap_entry_init(&entry, strhash(ref->refname)); > + lookup_result = hashmap_get(&ref_to_worktree_map, &entry, ref->refname); > + > + if (lookup_result) > + strbuf_addstr(&val, lookup_result->wt->path); > + > + return strbuf_detach(&val, NULL); > +} We do not need a strbuf to do the above, do we? hashmap_entry_init(...); lookup_result = hashmap_get(...); if (lookup_result) return xstrdup(lookup_result->wt->path); else return xstrdup(""); or something like that, perhaps? > @@ -1562,6 +1624,13 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err) > > if (starts_with(name, "refname")) > refname = get_refname(atom, ref); > + else if (starts_with(name, "worktreepath")) { > + if (ref->kind == FILTER_REFS_BRANCHES) > + v->s = get_worktree_path(atom, ref); > + else > + v->s = xstrdup(""); > + continue; > + } I am wondering if get_worktree_path() being called should be the triggering event for lazy initialization worktree_atom_parser() is doing in this patch, instead of verify_ref_format() seeing the "%(worktreepath)" atom. Is there any codepath that wants to make sure the lazy initialization is done without/before populate_value() sees a use of the "%(worktreepath)" atom? If so, such a plan would not work, but otherwise, we can do the following: - rename worktree_atom_parser() to lazy_init_worktrees() or something, and remove all of its unused parameters. - remove parser callback for "worktreepath" from valid_atom[]. - call lazy_inti_worktrees() at the beginning of get_worktree_path(). Hmm?
On Wed, Jan 23, 2019 at 10:57 AM Junio C Hamano <gitster@pobox.com> wrote: > > Missing sign-off? > My mistake, forgot -s on format-patch. Will remember to add it next go-around. > > +static int ref_to_worktree_map_cmpfnc(const void *unused_lookupdata, const void *existing_hashmap_entry_to_test, > > + const void *key, const void *keydata_aka_refname) > > +{ > > + const struct ref_to_worktree_entry *e = existing_hashmap_entry_to_test; > > + const struct ref_to_worktree_entry *k = key; > > + return strcmp(e->wt->head_ref, keydata_aka_refname ? keydata_aka_refname : k->wt->head_ref); > > Overlong line. > > > +} > OK, should I shorten the function signature as well? Is it too long because it's 101 characters and you're trying to stick to 100? > > > + > > +static struct hashmap ref_to_worktree_map; > > +static struct worktree **worktrees = NULL; > > No need to initialize static vars to 0/NULL. OK > > We do not need a strbuf to do the above, do we? > > hashmap_entry_init(...); > lookup_result = hashmap_get(...); > if (lookup_result) > return xstrdup(lookup_result->wt->path); > else > return xstrdup(""); > > or something like that, perhaps? > Sure. > > @@ -1562,6 +1624,13 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err) > > > > if (starts_with(name, "refname")) > > refname = get_refname(atom, ref); > > + else if (starts_with(name, "worktreepath")) { > > + if (ref->kind == FILTER_REFS_BRANCHES) > > + v->s = get_worktree_path(atom, ref); > > + else > > + v->s = xstrdup(""); > > + continue; > > + } > > I am wondering if get_worktree_path() being called should be the > triggering event for lazy initialization worktree_atom_parser() is > doing in this patch, instead of verify_ref_format() seeing the > "%(worktreepath)" atom. Is there any codepath that wants to make > sure the lazy initialization is done without/before populate_value() > sees a use of the "%(worktreepath)" atom? If so, such a plan would > not work, but otherwise, we can do the following: > > - rename worktree_atom_parser() to lazy_init_worktrees() or > something, and remove all of its unused parameters. > > - remove parser callback for "worktreepath" from valid_atom[]. > > - call lazy_inti_worktrees() at the beginning of > get_worktree_path(). > > Hmm? Yes, the parser used the atom argument in an earlier version of this patch, but we since moved the map out of the atom since it only needs to exist once globally. Even though we have a caching mechanism for atoms it still seemed like a logical move to explicitly keep one instance of the map globally. Will make the change, thanks for taking a look at it through fresh eyes.
Nickolai Belakovski <nbelakovski@gmail.com> writes: > Yes, the parser used the atom argument in an earlier version of this > patch, but we since moved the map out of the atom since it only needs > to exist once globally. Even though we have a caching mechanism for > atoms it still seemed like a logical move to explicitly keep one > instance of the map globally. I think that is a mistaken move from an earlier version to this one. The worktree-related stuff only becomes necessary and belongs to the %(worktreepath) atom, so unless there is a compelling reason not to, we should hang it there, instead of introducing a global. When you have --format='%(worktreepath) %(worktreepath)', you'd have only one shared instance of it in used_atom[] to ensure that we have a singleton instance. The object_info support in the file, which is relatively new, may be what you borrowed the wrong idea of preferring globals from; I think it should be taken as an anti-pattern.
On Thu, Jan 24, 2019 at 10:26:18AM -0800, Junio C Hamano wrote: > Nickolai Belakovski <nbelakovski@gmail.com> writes: > > > Yes, the parser used the atom argument in an earlier version of this > > patch, but we since moved the map out of the atom since it only needs > > to exist once globally. Even though we have a caching mechanism for > > atoms it still seemed like a logical move to explicitly keep one > > instance of the map globally. > > I think that is a mistaken move from an earlier version to this > one. The worktree-related stuff only becomes necessary and belongs > to the %(worktreepath) atom, so unless there is a compelling reason > not to, we should hang it there, instead of introducing a global. > When you have --format='%(worktreepath) %(worktreepath)', you'd have > only one shared instance of it in used_atom[] to ensure that we have > a singleton instance. What if you have other atoms that need worktrees? E.g., does %(worktreepath:foo) use the same used_atom slot? What if we have another worktree-related atom? If anything, I'd suggest going in the opposite direction, and teaching the worktree code a function for looking up a working tree by ref. And then it can handle its own cache to implement that reverse-mapping efficiently. > The object_info support in the file, which is relatively new, may be > what you borrowed the wrong idea of preferring globals from; I think > it should be taken as an anti-pattern. And that one is a good example where we _do_ need the global, because we already have multiple atoms pulling from it. I think the right level is "global to the ref-filter context", but we do not have that concept yet (hence why used_atoms, etc, are all file-static globals). -Peff
Jeff King <peff@peff.net> writes: > If anything, I'd suggest going in the opposite direction, and teaching > the worktree code a function for looking up a working tree by ref. And > then it can handle its own cache to implement that reverse-mapping > efficiently. Yeah, that's a thought. Then "give me a worktree that checks out this ref" can be asked outside the context of for-each-ref and friends, which is a big plus.
Jeff King <peff@peff.net> writes: > What if you have other atoms that need worktrees? E.g., does > %(worktreepath:foo) use the same used_atom slot? What if we have another > worktree-related atom? > ... > And that one is a good example where we _do_ need the global, because we > already have multiple atoms pulling from it. I guess that we broke the original atom design by mistake when we added ":<modifiers>" support. There should have been one layer of indirection that binds the instances of the same atom with different modifiers together---I agree with you that we cannot avoid globals without fixing that mistake first. Thanks.
On Thu, Jan 24, 2019 at 11:27:36AM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > If anything, I'd suggest going in the opposite direction, and teaching > > the worktree code a function for looking up a working tree by ref. And > > then it can handle its own cache to implement that reverse-mapping > > efficiently. > > Yeah, that's a thought. Then "give me a worktree that checks out > this ref" can be asked outside the context of for-each-ref and > friends, which is a big plus. One tricky thing is that we do not store a list of "struct worktree", but rather generate it on the fly when get_worktrees() is called. So having an API to ask for one ref's worktree at a time is slightly awkward. It's only by having (and caching) this full list in ref-filter.c that we can easily make the reverse-indexed hashmap. -Peff
On Thu, Jan 24, 2019 at 11:30:20AM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > What if you have other atoms that need worktrees? E.g., does > > %(worktreepath:foo) use the same used_atom slot? What if we have another > > worktree-related atom? > > ... > > And that one is a good example where we _do_ need the global, because we > > already have multiple atoms pulling from it. > > I guess that we broke the original atom design by mistake when we > added ":<modifiers>" support. There should have been one layer of > indirection that binds the instances of the same atom with different > modifiers together---I agree with you that we cannot avoid globals > without fixing that mistake first. Yes, that's one way to fix it. I actually think the biggest mistake is having that used_atoms list in the first place, as we iterate over it several times asking "can we fill this in yet?". The way pretty.c does it is just to incrementally parse the commit, saving intermediate results. And in cat-file.c, we figure out what we need ahead of time in a single pass, and then just fill it in for each object (which sort of works out the same, but doesn't require that the parsing needed for item X is a strict superset of item Y). So I'd much rather see us parse the format into a real tree of nodes, and figure out (once) which properties of each object are required to fulfill that. Then for each object, we grab those properties, and then walk the tree to generate the output string. -Peff
So where does that leave us for this series? We could move hashmap back into used_atom, but if a user entered --format="%(worktreepath)%(worktreepath:)" we'd end up freeing worktrees twice. Not that that should stop us - that scenario is one where user input isn't sensible and personally I don't think it's necessary to protect against such things (unless the user was reasonably confused, but I don't see that as the case here). I agree with Jeff that a ref-filter "context" would help. And in more ways than one, it could help us decide ahead of time whether to check if a ref is a branch or a tag before doing a hashmap lookup or just skip the check (i.e. if there are no tags within the context, the check would only add cost). But I do believe that that would be outside the scope of this series. I think leaving it as globals is a tiny bit safer and also makes it easier to pack it into a context if/when we decide to do that work, but as always I'm open to other interpretations. On Thu, Jan 24, 2019 at 1:26 PM Jeff King <peff@peff.net> wrote: > > On Thu, Jan 24, 2019 at 11:30:20AM -0800, Junio C Hamano wrote: > > > Jeff King <peff@peff.net> writes: > > > > > What if you have other atoms that need worktrees? E.g., does > > > %(worktreepath:foo) use the same used_atom slot? What if we have another > > > worktree-related atom? > > > ... > > > And that one is a good example where we _do_ need the global, because we > > > already have multiple atoms pulling from it. > > > > I guess that we broke the original atom design by mistake when we > > added ":<modifiers>" support. There should have been one layer of > > indirection that binds the instances of the same atom with different > > modifiers together---I agree with you that we cannot avoid globals > > without fixing that mistake first. > > Yes, that's one way to fix it. > > I actually think the biggest mistake is having that used_atoms list in > the first place, as we iterate over it several times asking "can we fill > this in yet?". The way pretty.c does it is just to incrementally parse > the commit, saving intermediate results. And in cat-file.c, we figure > out what we need ahead of time in a single pass, and then just fill it > in for each object (which sort of works out the same, but doesn't > require that the parsing needed for item X is a strict superset of item > Y). > > So I'd much rather see us parse the format into a real tree of nodes, > and figure out (once) which properties of each object are required to > fulfill that. Then for each object, we grab those properties, and then > walk the tree to generate the output string. > > -Peff
Jeff King <peff@peff.net> writes: > On Thu, Jan 24, 2019 at 11:30:20AM -0800, Junio C Hamano wrote: > >> Jeff King <peff@peff.net> writes: >> >> > What if you have other atoms that need worktrees? E.g., does >> > %(worktreepath:foo) use the same used_atom slot? What if we have another >> > worktree-related atom? >> > ... >> > And that one is a good example where we _do_ need the global, because we >> > already have multiple atoms pulling from it. >> >> I guess that we broke the original atom design by mistake when we >> added ":<modifiers>" support. There should have been one layer of >> indirection that binds the instances of the same atom with different >> modifiers together---I agree with you that we cannot avoid globals >> without fixing that mistake first. > > Yes, that's one way to fix it. > > I actually think the biggest mistake is having that used_atoms list in > the first place, as we iterate over it several times asking "can we fill > this in yet?". The way pretty.c does it is just to incrementally parse > the commit, saving intermediate results. And in cat-file.c, we figure > out what we need ahead of time in a single pass, and then just fill it > in for each object (which sort of works out the same, but doesn't > require that the parsing needed for item X is a strict superset of item > Y). > > So I'd much rather see us parse the format into a real tree of nodes, > and figure out (once) which properties of each object are required to > fulfill that. Then for each object, we grab those properties, and then > walk the tree to generate the output string. That sounds like a sensible longer-term strategy. Let's however leave it outside the scope of this change.
On Thu, Jan 31, 2019 at 01:42:14PM -0800, Junio C Hamano wrote: > > So I'd much rather see us parse the format into a real tree of nodes, > > and figure out (once) which properties of each object are required to > > fulfill that. Then for each object, we grab those properties, and then > > walk the tree to generate the output string. > > That sounds like a sensible longer-term strategy. Let's however > leave it outside the scope of this change. Yeah, sorry if I got us too far afield. It's definitely out of scope for this series. The takeaway I was trying to get at is that storing the worktree map as a static global is actually pretty reasonable given the current design. -Peff
On Thu, Jan 31, 2019 at 12:53:24PM -0800, Nickolai Belakovski wrote: > So where does that leave us for this series? We could move hashmap > back into used_atom, but if a user entered > --format="%(worktreepath)%(worktreepath:)" we'd end up freeing > worktrees twice. Not that that should stop us - that scenario is one > where user input isn't sensible and personally I don't think it's > necessary to protect against such things (unless the user was > reasonably confused, but I don't see that as the case here). > > I agree with Jeff that a ref-filter "context" would help. And in more > ways than one, it could help us decide ahead of time whether to check > if a ref is a branch or a tag before doing a hashmap lookup or just > skip the check (i.e. if there are no tags within the context, the > check would only add cost). But I do believe that that would be > outside the scope of this series. > > I think leaving it as globals is a tiny bit safer and also makes it > easier to pack it into a context if/when we decide to do that work, > but as always I'm open to other interpretations. Yeah, I agree with this: global for now, and then easily moved into a context struct later (along with all the other existing globals). -Peff
diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt index 774cecc7ed..6dcd39f6f6 100644 --- a/Documentation/git-for-each-ref.txt +++ b/Documentation/git-for-each-ref.txt @@ -214,6 +214,11 @@ symref:: `:lstrip` and `:rstrip` options in the same way as `refname` above. +worktreepath:: + The absolute path to the worktree in which the ref is checked + out, if it is checked out in any linked worktree. Empty string + otherwise. + In addition to the above, for commit and tag objects, the header field names (`tree`, `parent`, `object`, `type`, and `tag`) can be used to specify the value in the header field. diff --git a/ref-filter.c b/ref-filter.c index 422a9c9ae3..3c056cc148 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -20,6 +20,8 @@ #include "commit-slab.h" #include "commit-graph.h" #include "commit-reach.h" +#include "worktree.h" +#include "hashmap.h" static struct ref_msg { const char *gone; @@ -75,6 +77,22 @@ static struct expand_data { struct object_info info; } oi, oi_deref; +struct ref_to_worktree_entry { + struct hashmap_entry ent; /* must be the first member! */ + struct worktree *wt; /* key is wt->head_ref */ +}; + +static int ref_to_worktree_map_cmpfnc(const void *unused_lookupdata, const void *existing_hashmap_entry_to_test, + const void *key, const void *keydata_aka_refname) +{ + const struct ref_to_worktree_entry *e = existing_hashmap_entry_to_test; + const struct ref_to_worktree_entry *k = key; + return strcmp(e->wt->head_ref, keydata_aka_refname ? keydata_aka_refname : k->wt->head_ref); +} + +static struct hashmap ref_to_worktree_map; +static struct worktree **worktrees = NULL; + /* * An atom is a valid field atom listed below, possibly prefixed with * a "*" to denote deref_tag(). @@ -438,6 +456,34 @@ static int head_atom_parser(const struct ref_format *format, struct used_atom *a return 0; } +static int worktree_atom_parser(const struct ref_format *format, + struct used_atom *atom, + const char *arg, + struct strbuf *unused_err) +{ + int i; + + if (worktrees) + return 0; + + worktrees = get_worktrees(0); + + hashmap_init(&ref_to_worktree_map, ref_to_worktree_map_cmpfnc, NULL, 0); + + for (i = 0; worktrees[i]; i++) { + if (worktrees[i]->head_ref) { + struct ref_to_worktree_entry *entry; + entry = xmalloc(sizeof(*entry)); + entry->wt = worktrees[i]; + hashmap_entry_init(entry, strhash(worktrees[i]->head_ref)); + + hashmap_add(&ref_to_worktree_map, entry); + } + } + + return 0; +} + static struct { const char *name; info_source source; @@ -480,6 +526,7 @@ static struct { { "flag", SOURCE_NONE }, { "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser }, { "color", SOURCE_NONE, FIELD_STR, color_atom_parser }, + { "worktreepath", SOURCE_NONE, FIELD_STR, worktree_atom_parser }, { "align", SOURCE_NONE, FIELD_STR, align_atom_parser }, { "end", SOURCE_NONE }, { "if", SOURCE_NONE, FIELD_STR, if_atom_parser }, @@ -1525,6 +1572,21 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj return 0; } +static char *get_worktree_path(const struct used_atom *atom, const struct ref_array_item *ref) +{ + struct strbuf val = STRBUF_INIT; + struct hashmap_entry entry; + struct ref_to_worktree_entry *lookup_result; + + hashmap_entry_init(&entry, strhash(ref->refname)); + lookup_result = hashmap_get(&ref_to_worktree_map, &entry, ref->refname); + + if (lookup_result) + strbuf_addstr(&val, lookup_result->wt->path); + + return strbuf_detach(&val, NULL); +} + /* * Parse the object referred by ref, and grab needed value. */ @@ -1562,6 +1624,13 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err) if (starts_with(name, "refname")) refname = get_refname(atom, ref); + else if (starts_with(name, "worktreepath")) { + if (ref->kind == FILTER_REFS_BRANCHES) + v->s = get_worktree_path(atom, ref); + else + v->s = xstrdup(""); + continue; + } else if (starts_with(name, "symref")) refname = get_symref(atom, ref); else if (starts_with(name, "upstream")) { @@ -2045,6 +2114,11 @@ void ref_array_clear(struct ref_array *array) free_array_item(array->items[i]); FREE_AND_NULL(array->items); array->nr = array->alloc = 0; + if (worktrees) { + hashmap_free(&ref_to_worktree_map, 1); + free_worktrees(worktrees); + worktrees = NULL; + } } static void do_merge_filter(struct ref_filter_cbdata *ref_cbdata) diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh index fc067ed672..87e0222ea1 100755 --- a/t/t6302-for-each-ref-filter.sh +++ b/t/t6302-for-each-ref-filter.sh @@ -441,4 +441,19 @@ test_expect_success '--merged is incompatible with --no-merged' ' test_must_fail git for-each-ref --merged HEAD --no-merged HEAD ' +test_expect_success '"add" a worktree' ' + mkdir worktree_dir && + git worktree add -b master_worktree worktree_dir master +' + +test_expect_success 'validate worktree atom' ' + cat >expect <<-EOF && + master: $(pwd) + master_worktree: $(pwd)/worktree_dir + side: not checked out + EOF + git for-each-ref --format="%(refname:short): %(if)%(worktreepath)%(then)%(worktreepath)%(else)not checked out%(end)" refs/heads/ >actual && + test_cmp expect actual +' + test_done
From: Nickolai Belakovski <nbelakovski@gmail.com> Add an atom providing the path of the linked worktree where this ref is checked out, if it is checked out in any linked worktrees, and empty string otherwise. --- Documentation/git-for-each-ref.txt | 5 +++ ref-filter.c | 74 ++++++++++++++++++++++++++++++++++++++ t/t6302-for-each-ref-filter.sh | 15 ++++++++ 3 files changed, 94 insertions(+)