Message ID | f3fb6833bd71d559a3076d9617a235614ad9a5f8.1678468864.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | ref-filter: ahead/behind counting, faster --merged option | expand |
On Fri, Mar 10 2023, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee <derrickstolee@github.com> > +{ > + size_t i; Ditto the decl suggestion in an earlier commit, i.e... > + struct commit_and_index *commits; > + unsigned int min_generation_index = 0; > + timestamp_t min_generation; > + struct commit_list *stack = NULL; > + > + if (!bases || !tips || !tips_nr) > + return; > + > + /* > + * Do a depth-first search starting at 'bases' to search for the > + * tips. Stop at the lowest (un-found) generation number. When > + * finding the lowest commit, increase the minimum generation > + * number to the next lowest (un-found) generation number. > + */ > + > + CALLOC_ARRAY(commits, tips_nr); > + > + for (i = 0; i < tips_nr; i++) { ...move this here? > + commits[i].commit = tips[i]; > + commits[i].index = i; > + commits[i].generation = commit_graph_generation(tips[i]); > + } > + > + /* Sort with generation number ascending. */ > + QSORT(commits, tips_nr, compare_commit_and_index_by_generation); > + min_generation = commits[0].generation; > + > + while (bases) { > + parse_commit(bases->item); > + commit_list_insert(bases->item, &stack); > + bases = bases->next; > + } > + > + while (stack) { > + unsigned int j; ...ditto... > + int explored_all_parents = 1; > + struct commit_list *p; > + struct commit *c = stack->item; > + timestamp_t c_gen = commit_graph_generation(c); > + > + /* Does it match any of our tips? */ > + for (j = min_generation_index; j < tips_nr; j++) { ...to here... > + if (c_gen < commits[j].generation) > + break; > + > + if (commits[j].commit == c) { > + tips[commits[j].index]->object.flags |= mark; > + > + if (j == min_generation_index) { > + unsigned int k = j + 1; > + while (k < tips_nr && > + (tips[commits[k].index]->object.flags & mark)) > + k++; > + > + /* Terminate early if all found. */ > + if (k >= tips_nr) > + goto done; > + > + min_generation_index = k; > + min_generation = commits[k].generation; > + } > + } > + } > + > + for (p = c->parents; p; p = p->next) { > + parse_commit(p->item); > + > + /* Have we already explored this parent? */ > + if (p->item->object.flags & SEEN) > + continue; > + > + /* Is it below the current minimum generation? */ > + if (commit_graph_generation(p->item) < min_generation) > + continue; > + > + /* Ok, we will explore from here on. */ > + p->item->object.flags |= SEEN; > + explored_all_parents = 0; > + commit_list_insert(p->item, &stack); > + break; > + } > + > + if (explored_all_parents) > + pop_commit(&stack); > + } > + > +done: > + free(commits); > + repo_clear_commit_marks(the_repository, SEEN); I didn't see this in my earlier suggestion for passing "struct repository", but I think we should do the same here, i.e. have this function take a "r" argument. > [...] > @@ -2390,33 +2390,21 @@ static void reach_filter(struct ref_array *array, > struct commit_list *check_reachable, > int include_reached) > { > - struct rev_info revs; > int i, old_nr; > struct commit **to_clear; > - struct commit_list *cr; > > if (!check_reachable) > return; > > CALLOC_ARRAY(to_clear, array->nr); > - > - repo_init_revisions(the_repository, &revs, NULL); > - > for (i = 0; i < array->nr; i++) { > struct ref_array_item *item = array->items[i]; > - add_pending_object(&revs, &item->commit->object, item->refname); > to_clear[i] = item->commit; > } > > - for (cr = check_reachable; cr; cr = cr->next) { > - struct commit *merge_commit = cr->item; > - merge_commit->object.flags |= UNINTERESTING; > - add_pending_object(&revs, &merge_commit->object, ""); > - } > - > - revs.limited = 1; > - if (prepare_revision_walk(&revs)) > - die(_("revision walk setup failed")); > + tips_reachable_from_bases(check_reachable, > + to_clear, array->nr, > + UNINTERESTING); I.e. it's not ideal, but we had a the_repository in this function before (should probably have passed it from further up, but whatever), so we could pass that to the new tips_reachable_from_bases() still. > -test_perf 'ahead-behind counts: git rev-list' ' > - for r in $(cat refs) > - do > - git rev-list --count "HEAD..$r" || return 1 > - done Why does this change require deleting the old perf test? Your commit 7/8 notes this test, but here we're deleting it, let's keep it and instead note if the results changed, or stayed the same? More generally, your commit message says: > Add extra tests for this behavior in t6600-test-reach.sh as the > interesting data shape of that repository can sometimes demonstrate > corner case bugs. And here for a supposed optimization commit you're adding new tests, but when I try them with the C code at 7/8 they pass. So it seems we should add them earlier, and this is a pure-optimization commit, but one that's a bit confused about what goes where? :)
On 3/15/2023 10:13 AM, Ævar Arnfjörð Bjarmason wrote: > > On Fri, Mar 10 2023, Derrick Stolee via GitGitGadget wrote: > >> From: Derrick Stolee <derrickstolee@github.com> (omitting the_repository stuff which will be reflected in v3) >> -test_perf 'ahead-behind counts: git rev-list' ' >> - for r in $(cat refs) >> - do >> - git rev-list --count "HEAD..$r" || return 1 >> - done > > Why does this change require deleting the old perf test? Your commit 7/8 > notes this test, but here we're deleting it, let's keep it and instead > note if the results changed, or stayed the same? > > More generally, your commit message says: > >> Add extra tests for this behavior in t6600-test-reach.sh as the >> interesting data shape of that repository can sometimes demonstrate >> corner case bugs. This note is about t6600-test-reach.sh, not p1500-graph-walks.sh. Not only does the previous message note this perf test, it has this to say: The 'git rev-list' test exists in this change as a demonstration, but it will be removed in the next change to avoid wasting time on this comparison. > And here for a supposed optimization commit you're adding new tests, but > when I try them with the C code at 7/8 they pass. > > So it seems we should add them earlier, and this is a pure-optimization > commit, but one that's a bit confused about what goes where? :) I can make it more clear why we are removing it in this commit. Thanks, -Stolee
On 3/15/2023 12:17 PM, Derrick Stolee wrote: > On 3/15/2023 10:13 AM, Ævar Arnfjörð Bjarmason wrote: >> >> On Fri, Mar 10 2023, Derrick Stolee via GitGitGadget wrote: >> >>> From: Derrick Stolee <derrickstolee@github.com> > Not only does the previous message note this perf test, it has this to say: > > The 'git rev-list' test exists in this change as a demonstration, but it > will be removed in the next change to avoid wasting time on this > comparison. > >> And here for a supposed optimization commit you're adding new tests, but >> when I try them with the C code at 7/8 they pass. >> >> So it seems we should add them earlier, and this is a pure-optimization >> commit, but one that's a bit confused about what goes where? :) > > I can make it more clear why we are removing it in this commit. Oh wait, I did: >> (Note that we remove the iterative 'git rev-list' test from p1500 >> because it no longer makes sense as a comparison to 'git for-each-ref' >> and would just waste time running it for these comparisons.) Thanks, -Stolee
diff --git a/commit-reach.c b/commit-reach.c index 338ca8084b2..f6c4a3c93c7 100644 --- a/commit-reach.c +++ b/commit-reach.c @@ -1037,3 +1037,117 @@ void ahead_behind(struct commit **commits, size_t commits_nr, clear_bit_arrays(&bit_arrays); clear_prio_queue(&queue); } + +struct commit_and_index { + struct commit *commit; + unsigned int index; + timestamp_t generation; +}; + +static int compare_commit_and_index_by_generation(const void *va, const void *vb) +{ + const struct commit_and_index *a = (const struct commit_and_index *)va; + const struct commit_and_index *b = (const struct commit_and_index *)vb; + + if (a->generation > b->generation) + return 1; + if (a->generation < b->generation) + return -1; + return 0; +} + +void tips_reachable_from_bases(struct commit_list *bases, + struct commit **tips, size_t tips_nr, + int mark) +{ + size_t i; + struct commit_and_index *commits; + unsigned int min_generation_index = 0; + timestamp_t min_generation; + struct commit_list *stack = NULL; + + if (!bases || !tips || !tips_nr) + return; + + /* + * Do a depth-first search starting at 'bases' to search for the + * tips. Stop at the lowest (un-found) generation number. When + * finding the lowest commit, increase the minimum generation + * number to the next lowest (un-found) generation number. + */ + + CALLOC_ARRAY(commits, tips_nr); + + for (i = 0; i < tips_nr; i++) { + commits[i].commit = tips[i]; + commits[i].index = i; + commits[i].generation = commit_graph_generation(tips[i]); + } + + /* Sort with generation number ascending. */ + QSORT(commits, tips_nr, compare_commit_and_index_by_generation); + min_generation = commits[0].generation; + + while (bases) { + parse_commit(bases->item); + commit_list_insert(bases->item, &stack); + bases = bases->next; + } + + while (stack) { + unsigned int j; + int explored_all_parents = 1; + struct commit_list *p; + struct commit *c = stack->item; + timestamp_t c_gen = commit_graph_generation(c); + + /* Does it match any of our tips? */ + for (j = min_generation_index; j < tips_nr; j++) { + if (c_gen < commits[j].generation) + break; + + if (commits[j].commit == c) { + tips[commits[j].index]->object.flags |= mark; + + if (j == min_generation_index) { + unsigned int k = j + 1; + while (k < tips_nr && + (tips[commits[k].index]->object.flags & mark)) + k++; + + /* Terminate early if all found. */ + if (k >= tips_nr) + goto done; + + min_generation_index = k; + min_generation = commits[k].generation; + } + } + } + + for (p = c->parents; p; p = p->next) { + parse_commit(p->item); + + /* Have we already explored this parent? */ + if (p->item->object.flags & SEEN) + continue; + + /* Is it below the current minimum generation? */ + if (commit_graph_generation(p->item) < min_generation) + continue; + + /* Ok, we will explore from here on. */ + p->item->object.flags |= SEEN; + explored_all_parents = 0; + commit_list_insert(p->item, &stack); + break; + } + + if (explored_all_parents) + pop_commit(&stack); + } + +done: + free(commits); + repo_clear_commit_marks(the_repository, SEEN); +} diff --git a/commit-reach.h b/commit-reach.h index f871b5dcce9..14043ed8562 100644 --- a/commit-reach.h +++ b/commit-reach.h @@ -134,4 +134,12 @@ struct ahead_behind_count { void ahead_behind(struct commit **commits, size_t commits_nr, struct ahead_behind_count *counts, size_t counts_nr); +/* + * For all tip commits, add 'mark' to their flags if and only if they + * are reachable from one of the commits in 'bases'. + */ +void tips_reachable_from_bases(struct commit_list *bases, + struct commit **tips, size_t tips_nr, + int mark); + #endif diff --git a/ref-filter.c b/ref-filter.c index 896bf703f59..ece77d7e8ba 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -2390,33 +2390,21 @@ static void reach_filter(struct ref_array *array, struct commit_list *check_reachable, int include_reached) { - struct rev_info revs; int i, old_nr; struct commit **to_clear; - struct commit_list *cr; if (!check_reachable) return; CALLOC_ARRAY(to_clear, array->nr); - - repo_init_revisions(the_repository, &revs, NULL); - for (i = 0; i < array->nr; i++) { struct ref_array_item *item = array->items[i]; - add_pending_object(&revs, &item->commit->object, item->refname); to_clear[i] = item->commit; } - for (cr = check_reachable; cr; cr = cr->next) { - struct commit *merge_commit = cr->item; - merge_commit->object.flags |= UNINTERESTING; - add_pending_object(&revs, &merge_commit->object, ""); - } - - revs.limited = 1; - if (prepare_revision_walk(&revs)) - die(_("revision walk setup failed")); + tips_reachable_from_bases(check_reachable, + to_clear, array->nr, + UNINTERESTING); old_nr = array->nr; array->nr = 0; @@ -2440,7 +2428,6 @@ static void reach_filter(struct ref_array *array, clear_commit_marks(merge_commit, ALL_REV_FLAGS); } - release_revisions(&revs); free(to_clear); } diff --git a/t/perf/p1500-graph-walks.sh b/t/perf/p1500-graph-walks.sh index 439a448c2e6..e14e7620cce 100755 --- a/t/perf/p1500-graph-walks.sh +++ b/t/perf/p1500-graph-walks.sh @@ -35,11 +35,16 @@ test_perf 'ahead-behind counts: git tag' ' xargs git tag -l --format="%(ahead-behind:HEAD)" <tags ' -test_perf 'ahead-behind counts: git rev-list' ' - for r in $(cat refs) - do - git rev-list --count "HEAD..$r" || return 1 - done +test_perf 'contains: git for-each-ref --merged' ' + git for-each-ref --merged=HEAD --stdin <refs +' + +test_perf 'contains: git branch --merged' ' + xargs git branch --merged=HEAD <branches +' + +test_perf 'contains: git tag --merged' ' + xargs git tag --merged=HEAD <tags ' test_done diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh index 0cb50797ef7..b330945f497 100755 --- a/t/t6600-test-reach.sh +++ b/t/t6600-test-reach.sh @@ -529,4 +529,87 @@ test_expect_success 'for-each-ref ahead-behind:none' ' --format="%(refname) %(ahead-behind:commit-8-4)" --stdin ' +test_expect_success 'for-each-ref merged:linear' ' + cat >input <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-1-3 + refs/heads/commit-1-5 + refs/heads/commit-1-8 + refs/heads/commit-2-1 + refs/heads/commit-5-1 + refs/heads/commit-9-1 + EOF + cat >expect <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-1-3 + refs/heads/commit-1-5 + refs/heads/commit-1-8 + EOF + run_all_modes git for-each-ref --merged=commit-1-9 \ + --format="%(refname)" --stdin +' + +test_expect_success 'for-each-ref merged:all' ' + cat >input <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-2-4 + refs/heads/commit-4-2 + refs/heads/commit-4-4 + EOF + cat >expect <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-2-4 + refs/heads/commit-4-2 + refs/heads/commit-4-4 + EOF + run_all_modes git for-each-ref --merged=commit-5-5 \ + --format="%(refname)" --stdin +' + +test_expect_success 'for-each-ref ahead-behind:some' ' + cat >input <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-5-3 + refs/heads/commit-4-8 + refs/heads/commit-9-9 + EOF + cat >expect <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-5-3 + EOF + run_all_modes git for-each-ref --merged=commit-9-6 \ + --format="%(refname)" --stdin +' + +test_expect_success 'for-each-ref merged:some, multibase' ' + cat >input <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-5-3 + refs/heads/commit-7-8 + refs/heads/commit-4-8 + refs/heads/commit-9-9 + EOF + cat >expect <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-4-8 + refs/heads/commit-5-3 + EOF + run_all_modes git for-each-ref \ + --merged=commit-5-8 \ + --merged=commit-8-5 \ + --format="%(refname)" \ + --stdin +' + +test_expect_success 'for-each-ref merged:none' ' + cat >input <<-\EOF && + refs/heads/commit-7-5 + refs/heads/commit-4-8 + refs/heads/commit-9-9 + EOF + >expect && + run_all_modes git for-each-ref --merged=commit-8-4 \ + --format="%(refname)" --stdin +' + test_done