Message ID | 5cec04b65d350ca8b482ca14260ef118341e4039.1686178917.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | shortlog: introduce --email-only, --group-filter options | expand |
On 6/7/2023 7:02 PM, Taylor Blau wrote: > This means that you could easily view the hashes of all commits you > either wrote or co-authored with something like: > > $ git shortlog -n --group=author --group=trailer:Co-authored-by \ > --group-filter="$(git config user.name)" > > When filtering just by trailers, it is tempting to want to introduce a > new grep mode for matching a given trailer, like `--author=<pattern>` > for matching the author header. But this would not be suitable for the > above, since we want commits which match either the author or the > Co-authored-by trailer, not ones which match both. One thing that is not immediately obvious from reading the patch, but becomes clearer in patch 2, is that your --group-filter is an exact string match. This differs from the --author filter in 'git log' and similar, which is actually a case-insensitive substring match. > +static int want_shortlog_group(struct shortlog *log, const char *group) > +{ > + if (!log->group_filter.nr) > + return 1; > + return string_list_has_string(&log->group_filter, group); > +} > + This is the critical piece of code for this issue. Replacing it with static int want_shortlog_group(struct shortlog *log, const char *group) { struct string_list_item *item; if (!log->group_filter.nr) return 1; for_each_string_list_item(item, &log->group_filter) { if (strcasestr(group, item->string)) return 1; } return 0; } Results in the case-insensitive substring search that I would expect from this parameter. This would also solve the problem from Patch 2 where we want to search by email address. Using '-e --group-filter="my@email.com"' works, though it will catch users with 'tammy@email.com' emails, as well. Something to consider at a high level before committing to this CLI. Thanks, -Stolee
On Thu, Jun 08, 2023 at 10:34:02AM -0400, Derrick Stolee wrote: > On 6/7/2023 7:02 PM, Taylor Blau wrote: > > > This means that you could easily view the hashes of all commits you > > either wrote or co-authored with something like: > > > > $ git shortlog -n --group=author --group=trailer:Co-authored-by \ > > --group-filter="$(git config user.name)" > > > > When filtering just by trailers, it is tempting to want to introduce a > > new grep mode for matching a given trailer, like `--author=<pattern>` > > for matching the author header. But this would not be suitable for the > > above, since we want commits which match either the author or the > > Co-authored-by trailer, not ones which match both. > > One thing that is not immediately obvious from reading the patch, but > becomes clearer in patch 2, is that your --group-filter is an exact > string match. This differs from the --author filter in 'git log' and > similar, which is actually a case-insensitive substring match. Yeah, funnily enough I originally thought about adding a `--trailer` flag to the revision machinery, but realized that it was a non-starter since `--author` is documented as only showing commits matching that author. So that doesn't solve for the case of finding a commit which has the identity you want buried in some trailer, but is written by a different author. shortlog needs to see all commits along the traversal, so I went with the filtering approach instead. Thanks for the pointer on beefing up the documentation to indicate that this is an exact search. > This is the critical piece of code for this issue. Replacing it with > > static int want_shortlog_group(struct shortlog *log, const char *group) > { > struct string_list_item *item; > if (!log->group_filter.nr) > return 1; > > for_each_string_list_item(item, &log->group_filter) { > if (strcasestr(group, item->string)) > return 1; > } > > return 0; > } > > Results in the case-insensitive substring search that I would expect > from this parameter. > > This would also solve the problem from Patch 2 where we want to search > by email address. Using '-e --group-filter="my@email.com"' works, though > it will catch users with 'tammy@email.com' emails, as well. Yeah, thanks for raising it. I wonder if there are other semantics that don't incorrectly return substring matches. You could conceivably extend the --group-filter argument to take an extended regular expression, and then just match on whatever's inside the last "<>". That approach was one that I considered, but feels a little heavyweight for what we're trying to do here. One potential approach is to keep things as-is, and then consider extending `--group-filter` to mean "extended regular expression" in the future. Although I'm not sure that would work, either, since fixed-string matches would suddenly match anywhere in the string, breaking backwards compatibility. You could add another option that specifies the behavior of `--group-filter`, or add some extra bit of information there like `--group-filter=egrep:<pattern>` or something. At least there's no shortage of options ;-). Thanks, Taylor
diff --git a/Documentation/git-shortlog.txt b/Documentation/git-shortlog.txt index 7d0277d033..dab6d09648 100644 --- a/Documentation/git-shortlog.txt +++ b/Documentation/git-shortlog.txt @@ -97,6 +97,11 @@ counts both authors and co-authors. If width is `0` (zero) then indent the lines of the output without wrapping them. +--group-filter=<group>:: + Only show output from the given group. If given more than once, + show output from any of the previously specified groups. May be + cleared with `--no-group-filter`. + <revision-range>:: Show only commits in the specified revision range. When no <revision-range> is specified, it defaults to `HEAD` (i.e. the diff --git a/builtin/shortlog.c b/builtin/shortlog.c index 46f4e0832a..679db22c57 100644 --- a/builtin/shortlog.c +++ b/builtin/shortlog.c @@ -365,6 +365,7 @@ void shortlog_init(struct shortlog *log) log->trailers.strdup_strings = 1; log->trailers.cmp = strcasecmp; log->format.strdup_strings = 1; + log->group_filter.strdup_strings = 1; } void shortlog_finish_setup(struct shortlog *log) @@ -377,6 +378,7 @@ void shortlog_finish_setup(struct shortlog *log) log->email ? "%cN <%cE>" : "%cN"); string_list_sort(&log->trailers); + string_list_sort(&log->group_filter); } int cmd_shortlog(int argc, const char **argv, const char *prefix) @@ -400,6 +402,8 @@ int cmd_shortlog(int argc, const char **argv, const char *prefix) &parse_wrap_args), OPT_CALLBACK(0, "group", &log, N_("field"), N_("group by field"), parse_group_option), + OPT_STRING_LIST(0, "group-filter", &log.group_filter, + N_("group"), N_("only show matching groups")), OPT_END(), }; @@ -476,6 +480,13 @@ static void add_wrapped_shortlog_msg(struct strbuf *sb, const char *s, strbuf_addch(sb, '\n'); } +static int want_shortlog_group(struct shortlog *log, const char *group) +{ + if (!log->group_filter.nr) + return 1; + return string_list_has_string(&log->group_filter, group); +} + void shortlog_output(struct shortlog *log) { size_t i, j; @@ -486,6 +497,9 @@ void shortlog_output(struct shortlog *log) log->summary ? compare_by_counter : compare_by_list); for (i = 0; i < log->list.nr; i++) { const struct string_list_item *item = &log->list.items[i]; + if (!want_shortlog_group(log, item->string)) + goto next; + if (log->summary) { fprintf(log->file, "%6d\t%s\n", (int)UTIL_TO_INT(item), item->string); @@ -505,11 +519,15 @@ void shortlog_output(struct shortlog *log) fprintf(log->file, " %s\n", msg); } putc('\n', log->file); + } + +next: + if (!log->summary) { + struct string_list *onelines = item->util; onelines->strdup_strings = 1; string_list_clear(onelines, 0); free(onelines); } - log->list.items[i].util = NULL; } diff --git a/shortlog.h b/shortlog.h index 28d04f951a..8ebee0e2d6 100644 --- a/shortlog.h +++ b/shortlog.h @@ -18,6 +18,8 @@ struct shortlog { int abbrev; struct date_mode date_mode; + struct string_list group_filter; + enum { SHORTLOG_GROUP_AUTHOR = (1 << 0), SHORTLOG_GROUP_COMMITTER = (1 << 1), diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh index 8e4effebdb..0695c42ca8 100755 --- a/t/t4201-shortlog.sh +++ b/t/t4201-shortlog.sh @@ -346,6 +346,60 @@ test_expect_success 'shortlog can match multiple groups' ' test_cmp expect actual ' +test_expect_success '--group-filter shows only matching groups (single groups)' ' + cat >expect <<-\EOF && + 1 A U Thor + EOF + git shortlog -ns \ + --group=trailer:another-trailer \ + --group-filter="A U Thor" \ + -2 HEAD >actual && + test_cmp expect actual +' + +test_expect_success '--group-filter shows only matching groups (multiple groups)' ' + cat >expect <<-\EOF && + 2 A U Thor + EOF + git shortlog -ns \ + --group=author \ + --group=trailer:some-trailer \ + --group=trailer:another-trailer \ + --group-filter="A U Thor" \ + -2 HEAD >actual && + test_cmp expect actual +' + +test_expect_success '--group-filter can be specified more than once' ' + cat >expect <<-\EOF && + 2 User B + 1 User A + EOF + git shortlog -ns \ + --group=author \ + --group=trailer:some-trailer \ + --group=trailer:another-trailer \ + --group-filter="User A" \ + --group-filter="User B" \ + -2 HEAD >actual && + test_cmp expect actual +' + +test_expect_success '--no-group-filter reset group filters' ' + cat >expect <<-\EOF && + 2 A U Thor + 2 User B + 1 User A + EOF + git shortlog -ns \ + --group=author \ + --group=trailer:some-trailer \ + --group=trailer:another-trailer \ + --group-filter="A U Thor" --no-group-filter \ + -2 HEAD >actual && + test_cmp expect actual +' + test_expect_success 'shortlog can match multiple format groups' ' GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" \ git commit --allow-empty -m "identical names" &&