Message ID | 20181216215759.24011-2-nbelakovski@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3,1/3] ref-filter: add worktreepath atom | expand |
On Sun, Dec 16, 2018 at 01:57:57PM -0800, nbelakovski@gmail.com wrote: > From: Nickolai Belakovski <nbelakovski@gmail.com> > > Add an atom proving the path of the linked worktree where this ref is > checked out, if it is checked out in any linked worktrees, and empty > string otherwise. I stumbled over the word "proving" here. Maybe "showing" would be more clear? > diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt > index 901faef1bf..9590f7beab 100644 > --- a/Documentation/git-for-each-ref.txt > +++ b/Documentation/git-for-each-ref.txt > @@ -209,6 +209,10 @@ symref:: > `:lstrip` and `:rstrip` options in the same way as `refname` > above. > > +worktreepath:: > + The absolute path to the worktree in which the ref is checked > + out, if it is checked out in any linked worktree. ' ' otherwise. > + Normally single-quotes are used in asciidoc to emphasize text, and the quotes aren't passed through. Asciidoc (and asciidoctor) do seem to render the literal quotes here, which is good. I wonder if it would be more clear to just write it out, though, like: ...any linked worktree. Otherwise, replaced with a single space. Also, why are we replacing it with a single space? Wouldn't the empty string be more customary (and work with the other "if empty, then do this" formatting options)? > @@ -34,6 +36,8 @@ static struct ref_msg { > "ahead %d, behind %d" > }; > > +static struct worktree ** worktrees; Minor style nit: we put the "*" in a pointer declaration next to the variable name, without intervening whitespace. Like: static struct worktree **worktrees; > @@ -75,6 +79,12 @@ static struct expand_data { > struct object_info info; > } oi, oi_deref; > > +struct reftoworktreeinfo_entry { > + struct hashmap_entry ent; // must be the first member! > + char * ref; // key into map > + struct worktree * wt; > +}; A few style nits: - the "*" space thing from above (it's in other places below, too, but I won't point out each) - we prefer "/* */" comments, even for single-liners - since we do all-lowercase identifiers, use more underscores to break things up. E.g., ref_to_worktree_entry. Here we store the refname as a separate variable, but then point to the worktree itself to access wt->path. Why do we treat these differently? I.e., I'd expect to see either: 1. Each entry holding a single worktree object, and using its head_ref and path fields, like: struct ref_to_worktree_entry { struct hashmap_entry ent; /* must be first */ struct worktree *wt; }; .... entry = xmalloc(sizeof(*entry)); entry->wt = wt; hashmap_entry_init(entry, strhash(wt->head_ref)); ... strbuf_addstr(&out, result->wt->path); 2. Each entry containing just the bits it needs, like: struct ref_to_worktree_entry { struct hashmap_entry ent; /* must be first */ char *ref; char *path; }; ... /* * We could use FLEXPTR_ALLOC_STR() here, but it doesn't actually * support holding _two_ strings. Separate allocations probably * aren't a huge deal here, since there are only a handful of * worktrees. */ entry = xmalloc(sizeof(*entry)); entry->ref = wt->head_ref; entry->path = wt->path; hashmap_entry_init(entry, strhash(entry->ref)); ... strbuf_addstr(&out, result->path); I think the first one is strictly preferable unless we're worried about the lifetime of the "struct worktree" going away. I don't think that's an issue, though; they are ours until we call free_worktrees(). > @@ -114,6 +124,7 @@ static struct used_atom { > } objectname; > struct refname_atom refname; > char *head; > + struct hashmap reftoworktreeinfo_map; > } u; > } *used_atom; This uses one map for each %(worktree) we use. But won't they all be the same? It would ideally be associated with the ref-filter. There's no ref-filter context struct to hold this kind of data, just static globals in ref-filter.c (including this used_atom struct!). That's something we'll probably need to fix in the long run, but I think it would be reasonable to just have: static struct hashmap ref_to_worktree_map; next to the declaration of used_atom_cnt, need_symref, etc. And then those can all eventually get moved into a struct together. > @@ -461,6 +497,7 @@ static struct { > { "flag", SOURCE_NONE }, > { "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser }, > { "color", SOURCE_NONE, FIELD_STR, color_atom_parser }, > + { "worktreepath", SOURCE_NONE, FIELD_STR, worktree_atom_parser }, > { "align", SOURCE_NONE, FIELD_STR, align_atom_parser }, > { "end", SOURCE_NONE }, > { "if", SOURCE_NONE, FIELD_STR, if_atom_parser }, Marking as SOURCE_NONE makes sense. > +static const char * get_worktree_info(const struct used_atom *atom, const struct ref_array_item *ref) > +{ > + struct strbuf val = STRBUF_INIT; > + struct reftoworktreeinfo_entry * entry; > + struct reftoworktreeinfo_entry * lookup_result; > + > + FLEXPTR_ALLOC_STR(entry, ref, ref->refname); > + hashmap_entry_init(entry, strhash(entry->ref)); > + lookup_result = hashmap_get(&(atom->u.reftoworktreeinfo_map), entry, NULL); > + free(entry); We shouldn't need to do an allocation just for a lookup. That's what the extra "keydata" parameter is for in the comparison function. And I guess this is what led you to have "char *ref" in the struct, rather than reusing wt->head_ref (because you don't have a "struct worktree" here). You should be able to do it like this: struct hashmap_entry entry; struct ref_to_worktree_entry *result; hashmap_entry_init(entry, strhash(ref->refname)); result = hashmap_get(&ref_to_worktree_map, &entry, ref->refname)); ... and then your comparison function would look like this: int ref_to_worktree_hashcmp(const void *data, const void *entry, const void *entry_or_key, const void *keydata) { const struct ref_to_worktree_entry *a = entry; const struct ref_to_worktree_entry *b = entry; if (keydata) return strcmp(a->wt->head_ref, keydata); else return strcmp(a->wt->head_ref, b->wt->head_ref); } If you're thinking that this API is totally confusing and hard to figure out, I agree. It's optimized to avoid extra allocations. I wish we had a better one for simple cases (especially string->string mappings like this). Speaking of comparison functions, I didn't see one in your patch. Don't you need to pass one to hashmap_init? > + if (lookup_result) > + { > + if (!strncmp(atom->name, "worktreepath", strlen(atom->name))) > + strbuf_addstr(&val, lookup_result->wt->path); > + } > + else > + strbuf_addstr(&val, " "); What's this extra strncmp about? If we're _not_ a worktreepath atom, we'd still do the lookup only to put nothing in the string? I think we'd only call this function when populate_value() sees a worktreepath atom, though: > @@ -1537,6 +1596,10 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err) > > if (starts_with(name, "refname")) > refname = get_refname(atom, ref); > + else if (starts_with(name, "worktreepath")) { > + v->s = get_worktree_info(atom, ref); > + continue; > + } So it would be OK to drop the check of atom->name again inside get_worktree_info(). > @@ -2013,7 +2076,14 @@ void ref_array_clear(struct ref_array *array) > int i; > > for (i = 0; i < used_atom_cnt; i++) > + { > + if (!strncmp(used_atom[i].name, "worktreepath", strlen("worktreepath"))) > + { > + hashmap_free(&(used_atom[i].u.reftoworktreeinfo_map), 1); > + free_worktrees(worktrees); > + } And if we move the mapping out to a static global, then this only has to be done once, not once per atom. In fact, I think this could double-free "worktrees" with your current patch if you have two "%(worktree)" placeholders, since "worktrees" already is a global. > diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh > index fc067ed672..add70a4c3e 100755 > --- a/t/t6302-for-each-ref-filter.sh > +++ b/t/t6302-for-each-ref-filter.sh > @@ -441,4 +441,19 @@ test_expect_success '--merged is incompatible with --no-merged' ' > test_must_fail git for-each-ref --merged HEAD --no-merged HEAD > ' > > +test_expect_success '"add" a worktree' ' > + mkdir worktree_dir && > + git worktree add -b master_worktree worktree_dir master > +' > + > +test_expect_success 'validate worktree atom' ' > + cat >expect <<-\EOF && > + master: checked out in a worktree > + master_worktree: checked out in a worktree > + side: not checked out in a worktree > + EOF > + git for-each-ref --format="%(refname:short): %(if)%(worktreepath)%(then)checked out in a worktree%(else)not checked out in a worktree%(end)" refs/heads/ >actual && > + test_cmp expect actual > +' It's probably worth testing that the path we get is actually sane, too. I.e., expect something more like: cat >expect <<-\EOF master: $PWD master: $PWD/worktree side: not checked out EOF git for-each-ref \ --format="%(refname:short): %(if)%(worktreepath)%(then)%(worktreepath)%(else)not checked %out%(end) (I wish there was a way to avoid that really long line, but I don't think there is). -Peff
On Tue, Dec 18, 2018 at 9:22 AM Jeff King <peff@peff.net> wrote: > > On Sun, Dec 16, 2018 at 01:57:57PM -0800, nbelakovski@gmail.com wrote: > > > From: Nickolai Belakovski <nbelakovski@gmail.com> > > > > Add an atom proving the path of the linked worktree where this ref is > > checked out, if it is checked out in any linked worktrees, and empty > > string otherwise. > > I stumbled over the word "proving" here. Maybe "showing" would be more > clear? Oops, providing > > +worktreepath:: > > + The absolute path to the worktree in which the ref is checked > > + out, if it is checked out in any linked worktree. ' ' otherwise. > > + > > Also, why are we replacing it with a single space? Wouldn't the empty > string be more customary (and work with the other "if empty, then do > this" formatting options)? I was just following what was done for HEAD, but overall I agree that empty is preferable to single space, will change. > Minor style nit: we put the "*" in a pointer declaration next to the > variable name, without intervening whitespace. Like: > > static struct worktree **worktrees; Gotcha, will do, thanks for pointing it out. To sum up the hashmap comments: -I hadn't thought to re-use the head_ref of worktree as the key. That's clever. I like the readability of having separate entries for key and value, but I can see the benefit of not having to do an extra allocation. I can make up for the readability hit with a comment. -Actually, for any valid use case there will only be one instance of the map since the entries of used_atom are cached, but regardless it makes sense to keep per-atom info in used_atom and global context somewhere else, so I'll make that change to make it a static variable outside of used_atom. -Will change the lookup logic to remove the extra allocation. Since I'm letting the hashmap use its internal comparison function on the hash, I don't need to provide a comparison function. > What's this extra strncmp about? If we're _not_ a worktreepath atom, > we'd still do the lookup only to put nothing in the string? Leftover from an earlier iteration where I was going to support getting more info out of the worktree struct. I decided to limit scope to just the info I really needed for the branch change. I left it like this because I thought it would make the code more readable for someone who wanted to come in and add that extra info, but I think you're right that it ends up just reading kind of awkwardly. > > > @@ -2013,7 +2076,14 @@ void ref_array_clear(struct ref_array *array) > > int i; > > > > for (i = 0; i < used_atom_cnt; i++) > > + { > > + if (!strncmp(used_atom[i].name, "worktreepath", strlen("worktreepath"))) > > + { > > + hashmap_free(&(used_atom[i].u.reftoworktreeinfo_map), 1); > > + free_worktrees(worktrees); > > + } > > And if we move the mapping out to a static global, then this only has to > be done once, not once per atom. In fact, I think this could double-free > "worktrees" with your current patch if you have two "%(worktree)" > placeholders, since "worktrees" already is a global. Only if someone put a colon on one of the %(worktree) atoms, otherwise they're all cached, but as you say moot point anyway if the map is moved outside the used_atom structure. > > It's probably worth testing that the path we get is actually sane, too. > I.e., expect something more like: > > cat >expect <<-\EOF > master: $PWD > master: $PWD/worktree > side: not checked out > EOF > git for-each-ref \ > --format="%(refname:short): %(if)%(worktreepath)%(then)%(worktreepath)%(else)not checked %out%(end) > > (I wish there was a way to avoid that really long line, but I don't > think there is). > Yea good call, can do. Thanks for all the feedback, will try to turn these around quickly.
On Wed, Dec 19, 2018 at 11:09:59PM -0800, Nickolai Belakovski wrote: > > Also, why are we replacing it with a single space? Wouldn't the empty > > string be more customary (and work with the other "if empty, then do > > this" formatting options)? > > I was just following what was done for HEAD, but overall I agree that > empty is preferable to single space, will change. Ah, right, that makes more sense. I do still think for %(HEAD) it's a little different because it is "+" or a single space, so always one character. Here we have some value or not, and in the "not" case for such things we usually give an empty string (e.g., for %(push), %(upstream), etc). > To sum up the hashmap comments: > -I hadn't thought to re-use the head_ref of worktree as the key. > That's clever. I like the readability of having separate entries for > key and value, but I can see the benefit of not having to do an extra > allocation. I can make up for the readability hit with a comment. Thanks, that makes sense. > -Actually, for any valid use case there will only be one instance of > the map since the entries of used_atom are cached, but regardless it > makes sense to keep per-atom info in used_atom and global context > somewhere else, so I'll make that change to make it a static variable > outside of used_atom. Ah, right, I forgot there was some magic around used_atom. I do still agree that the separate static global makes things a little simpler. > -Will change the lookup logic to remove the extra allocation. Since > I'm letting the hashmap use its internal comparison function on the > hash, I don't need to provide a comparison function. I don't think that works. The default function is always_equal(), which will treat two entries equal if they have the same hash value. I.e., any collisions would be considered a match. > Thanks for all the feedback, will try to turn these around quickly. Great, thanks! I'll be on vacation for the next two weeks, so I may be very slow to look at the next iteration. :) -Peff
From: Nickolai Belakovski <nbelakovski@gmail.com> > I don't think that works. The default function is always_equal(), which > will treat two entries equal if they have the same hash value. I.e., any > collisions would be considered a match. You're absolutely right. I've added a compare function, but I left out the functionality for it to work with an entry passed in as a key. Doing so would mean the user would have to allocate a worktree struct, which just seems silly when the ref is all that's needed (and also defeats the purpose of avoiding extra allocations). And while most of the hashmap API seems OK, yea, this is definitely awful. It feels like it should just be able to take a key and return either an entry or NULL, and do away with entry_or_key and equals_function_data. Travis-CI results: https://travis-ci.org/nbelakovski/git/builds/471787317 Nickolai Belakovski (3): ref-filter: add worktreepath atom branch: Mark and color a branch differently if it is checked out in a linked worktree branch: Add an extra verbose output displaying worktree path for refs checked out in a linked worktree Documentation/git-for-each-ref.txt | 5 +++ builtin/branch.c | 16 ++++++--- ref-filter.c | 72 +++++++++++++++++++++++++++++++++++++- t/t3200-branch.sh | 8 ++--- t/t3203-branch-output.sh | 21 +++++++++++ t/t6302-for-each-ref-filter.sh | 15 ++++++++ 6 files changed, 128 insertions(+), 9 deletions(-)
On Mon, Dec 24, 2018 at 12:47:53AM -0800, nbelakovski@gmail.com wrote: > From: Nickolai Belakovski <nbelakovski@gmail.com> > > > I don't think that works. The default function is always_equal(), which > > will treat two entries equal if they have the same hash value. I.e., any > > collisions would be considered a match. > > You're absolutely right. I've added a compare function, but I left out the > functionality for it to work with an entry passed in as a key. Doing so would > mean the user would have to allocate a worktree struct, which just seems silly > when the ref is all that's needed (and also defeats the purpose of avoiding > extra allocations). Unfortunately, that doesn't quite work. :) Your compare function has to handle _both_ cases: two keys, or one key and a keydata. The former may be called when the hashmap has to compare two entries internally (e.g., when it has to re-bucket all of the entries after a resize). Your tests likely wouldn't run into this case in practice, since you'd only have a handful of worktrees. But if you added, say, thousands of worktrees and we had to grow the hash midway through the process, it would segfault. So you do need to handle the case when your keydata is NULL. In theory it would also be used for comparisons if we used a more clever data structure to hold entries within a bucket. But since we just use a linked list and linear search for now, we don't. > And while most of the hashmap API seems OK, yea, this is definitely awful. It > feels like it should just be able to take a key and return either an entry or > NULL, and do away with entry_or_key and equals_function_data. In your case, yeah, equals_function_data is not used at all (but there are a few call-sites which need it to avoid relying on global data). But the entry_or_key (coupled with keydata) is the magic that lets the same function be used for both lookups as well as internal entry comparisons. I think having two separate comparison functions would make this a lot more clear, though likely at the cost of having more boilerplate. -Peff
diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt index 901faef1bf..9590f7beab 100644 --- a/Documentation/git-for-each-ref.txt +++ b/Documentation/git-for-each-ref.txt @@ -209,6 +209,10 @@ symref:: `:lstrip` and `:rstrip` options in the same way as `refname` above. +worktreepath:: + The absolute path to the worktree in which the ref is checked + out, if it is checked out in any linked worktree. ' ' otherwise. + In addition to the above, for commit and tag objects, the header field names (`tree`, `parent`, `object`, `type`, and `tag`) can be used to specify the value in the header field. diff --git a/ref-filter.c b/ref-filter.c index 5de616befe..e8713484a1 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -20,6 +20,8 @@ #include "commit-slab.h" #include "commit-graph.h" #include "commit-reach.h" +#include "worktree.h" +#include "hashmap.h" static struct ref_msg { const char *gone; @@ -34,6 +36,8 @@ static struct ref_msg { "ahead %d, behind %d" }; +static struct worktree ** worktrees; + void setup_ref_filter_porcelain_msg(void) { msgs.gone = _("gone"); @@ -75,6 +79,12 @@ static struct expand_data { struct object_info info; } oi, oi_deref; +struct reftoworktreeinfo_entry { + struct hashmap_entry ent; // must be the first member! + char * ref; // key into map + struct worktree * wt; +}; + /* * An atom is a valid field atom listed below, possibly prefixed with * a "*" to denote deref_tag(). @@ -114,6 +124,7 @@ static struct used_atom { } objectname; struct refname_atom refname; char *head; + struct hashmap reftoworktreeinfo_map; } u; } *used_atom; static int used_atom_cnt, need_tagged, need_symref; @@ -420,6 +431,31 @@ static int head_atom_parser(const struct ref_format *format, struct used_atom *a return 0; } +static int worktree_atom_parser(const struct ref_format *format, + struct used_atom *atom, + const char *arg, + struct strbuf *unused_err) +{ + int i; + worktrees = get_worktrees(0); + + hashmap_init(&(atom->u.reftoworktreeinfo_map), NULL, NULL, 0); + + for (i = 0; worktrees[i]; i++) { + if (worktrees[i]->head_ref) { + struct reftoworktreeinfo_entry *entry; + FLEXPTR_ALLOC_STR(entry, ref, worktrees[i]->head_ref); + hashmap_entry_init(entry, strhash(entry->ref)); + + entry->wt = worktrees[i]; + + hashmap_add(&(atom->u.reftoworktreeinfo_map), entry); + } + } + + return 0; +} + static struct { const char *name; info_source source; @@ -461,6 +497,7 @@ static struct { { "flag", SOURCE_NONE }, { "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser }, { "color", SOURCE_NONE, FIELD_STR, color_atom_parser }, + { "worktreepath", SOURCE_NONE, FIELD_STR, worktree_atom_parser }, { "align", SOURCE_NONE, FIELD_STR, align_atom_parser }, { "end", SOURCE_NONE }, { "if", SOURCE_NONE, FIELD_STR, if_atom_parser }, @@ -1500,6 +1537,28 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj return 0; } +static const char * get_worktree_info(const struct used_atom *atom, const struct ref_array_item *ref) +{ + struct strbuf val = STRBUF_INIT; + struct reftoworktreeinfo_entry * entry; + struct reftoworktreeinfo_entry * lookup_result; + + FLEXPTR_ALLOC_STR(entry, ref, ref->refname); + hashmap_entry_init(entry, strhash(entry->ref)); + lookup_result = hashmap_get(&(atom->u.reftoworktreeinfo_map), entry, NULL); + free(entry); + + if (lookup_result) + { + if (!strncmp(atom->name, "worktreepath", strlen(atom->name))) + strbuf_addstr(&val, lookup_result->wt->path); + } + else + strbuf_addstr(&val, " "); + + return strbuf_detach(&val, NULL); +} + /* * Parse the object referred by ref, and grab needed value. */ @@ -1537,6 +1596,10 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err) if (starts_with(name, "refname")) refname = get_refname(atom, ref); + else if (starts_with(name, "worktreepath")) { + v->s = get_worktree_info(atom, ref); + continue; + } else if (starts_with(name, "symref")) refname = get_symref(atom, ref); else if (starts_with(name, "upstream")) { @@ -2013,7 +2076,14 @@ void ref_array_clear(struct ref_array *array) int i; for (i = 0; i < used_atom_cnt; i++) + { + if (!strncmp(used_atom[i].name, "worktreepath", strlen("worktreepath"))) + { + hashmap_free(&(used_atom[i].u.reftoworktreeinfo_map), 1); + free_worktrees(worktrees); + } free((char *)used_atom[i].name); + } FREE_AND_NULL(used_atom); used_atom_cnt = 0; for (i = 0; i < array->nr; i++) diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh index fc067ed672..add70a4c3e 100755 --- a/t/t6302-for-each-ref-filter.sh +++ b/t/t6302-for-each-ref-filter.sh @@ -441,4 +441,19 @@ test_expect_success '--merged is incompatible with --no-merged' ' test_must_fail git for-each-ref --merged HEAD --no-merged HEAD ' +test_expect_success '"add" a worktree' ' + mkdir worktree_dir && + git worktree add -b master_worktree worktree_dir master +' + +test_expect_success 'validate worktree atom' ' + cat >expect <<-\EOF && + master: checked out in a worktree + master_worktree: checked out in a worktree + side: not checked out in a worktree + EOF + git for-each-ref --format="%(refname:short): %(if)%(worktreepath)%(then)checked out in a worktree%(else)not checked out in a worktree%(end)" refs/heads/ >actual && + test_cmp expect actual +' + test_done
From: Nickolai Belakovski <nbelakovski@gmail.com> Add an atom proving the path of the linked worktree where this ref is checked out, if it is checked out in any linked worktrees, and empty string otherwise. --- Documentation/git-for-each-ref.txt | 4 +++ ref-filter.c | 70 ++++++++++++++++++++++++++++++++++++++ t/t6302-for-each-ref-filter.sh | 15 ++++++++ 3 files changed, 89 insertions(+)