Message ID | 20190410162409.117264-5-brho@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | blame: add the ability to ignore commits | expand |
Barret Rhoden <brho@google.com> writes: > Sometimes we are unable to even guess at what commit touched a line. > These lines are 'unblamable.' The second option, > blame.maskIgnoredUnblamables, will zero the hash of any unblamable line. > > For example, say we ignore e5e8d36d04cbe: > e5e8d36d04cbe (Barret Rhoden 2016-04-11 13:57:54 -0400 26) > appears as: > 0000000000000 (Barret Rhoden 2016-04-11 13:57:54 -0400 26) Wouldn't this make it impossible to tell between what's done by such a commit that was marked to be ignored, and what's done locally only in the working tree, which the users have long accustomed to see with the ^0*$ object name? I think it would make a lot more sense to show the object name of the "ignored" commit, which would be recognizable by the user who fed such an object name to the command in the first place. Alternatively, perhaps the same idea as replacing one of the hexdigits with '*' used by the other configuration can be applied to this as well?
On Sun, 14 Apr 2019 at 04:45, Junio C Hamano <gitster@pobox.com> wrote: > Wouldn't this make it impossible to tell between what's done by such > a commit that was marked to be ignored, and what's done locally only > in the working tree, which the users have long accustomed to see > with the ^0*$ object name? I think it would make a lot more sense > to show the object name of the "ignored" commit, which would be > recognizable by the user who fed such an object name to the command > in the first place. Alternatively, perhaps the same idea as replacing > one of the hexdigits with '*' used by the other configuration can be > applied to this as well? I had the same objection to zeroing out hashes, but this option is off by default so I think it's OK. If you enable both blame.markIgnoredLines and blame.maskIgnoredUnblamables then the hash does appear as "*0000000000" like you suggest. I think it's appropriate that the '*' is only added if you opt in with the markIgnoredLines option. If you only enable blame.markIgnoredLines then the hash for "unblamable" lines appears as e.g. "*3252488f5" - this doesn't seem right to me because the commit *wasn't* ignored, it is in fact the commit in which that line was added. I think '*' should denote "this information may be inaccurate" as that's what a typical user needs to be aware of. However given that "unblamable" lines tend to be either empty or a single character I'm not going to insist :)
Michael Platings <michael@platin.gs> writes: > If you only enable blame.markIgnoredLines then the hash for > "unblamable" lines appears as e.g. "*3252488f5" - this doesn't seem > right to me because the commit *wasn't* ignored, I think you misunderstood me. I was merely suggesting to use the approach to mark the line in a way other than using the NULLed out object name that has been reserved for something totally different, and hinting with "the same *idea*". And that idea is not even original to this series; the "^" marker that is used to say "the line is attributed to this commit, but that may only be because you blamed with commit range A..B and we reached the bottom of the range---if you dug further, you might find the line originates from another commit" is the origin of the same idea, and this topic borrows it and uses a different mark, i.e. '*', for the "we are not certain---take this with grain of salt" mark. If you ended up hitting the commit the user wanted to ignore, perhaps you can find another character that is different from '^' or '*' and use that, following the same idea. That is what I meant. So you shouldn't be worried about using the same '*' making the result ambiguous. By the way, a configuration only feature is something we usually do not accept. A feature must be guarded with --command-line-option and then optionally can have a corresponding configuration once the option proves to be useful enough that it becomes useful to be able to say "in this repository (or to this user), the feature is on by default".
On Sun, 14 Apr 2019 at 11:24, Junio C Hamano <gitster@pobox.com> wrote: > > If you only enable blame.markIgnoredLines then the hash for > > "unblamable" lines appears as e.g. "*3252488f5" - this doesn't seem > > right to me because the commit *wasn't* ignored, > > I think you misunderstood me. I was merely suggesting to use the > approach to mark the line in a way other than using the NULLed out > object name that has been reserved for something totally different, > and hinting with "the same *idea*". Hi Junio, that paragraph wasn't targetted at yourself, more a comment on the functionality as it exists in the latest patch series. Sorry for not making that clear. > the "^" marker > that is used to say "the line is attributed to this commit, but that > may only be because you blamed with commit range A..B and we reached > the bottom of the range---if you dug further, you might find the > line originates from another commit" is the origin of the same idea, > and this topic borrows it and uses a different mark, i.e. '*', for > the "we are not certain---take this with grain of salt" mark. So it sounds like we have many types of blame to consider: 1) This commit is truly the last one to touch this line, and you didn't ask to ignore it. 2) This commit is truly the last one to touch this line, but you asked to ignore it (AKA "unblamable"). 3) This commit is at the bottom of the range of commits (^) 4) The "true" commit was ignored but we guess this is the one you're actually interested in (*) 5) The "true" commit was ignored and we've reached the bottom of the range of commits (^*)? 6) This commit is at the bottom of the range of commits, and you asked to ignore it. > If you ended up hitting the commit the user wanted to ignore, > perhaps you can find another character that is different from '^' or > '*' and use that, following the same idea. I personally don't find the "unblamable" lines interesting enough to justify giving them a symbol. But if Barret strongly feels that such lines should get a '*' then I won't fight it - these lines tend to be as simple as "}". > By the way, a configuration only feature is something we usually do > not accept. A feature must be guarded with --command-line-option > and then optionally can have a corresponding configuration once the > option proves to be useful enough that it becomes useful to be able > to say "in this repository (or to this user), the feature is on by > default". In that case we definitely need a --mark-ignored-lines option to git blame, and I would strongly prefer that we also keep the blame.markIgnoredLines option as I for one will be switching it on.
Hi - On 4/14/19 7:27 AM, Michael Platings wrote: > On Sun, 14 Apr 2019 at 11:24, Junio C Hamano <gitster@pobox.com> wrote: >>> If you only enable blame.markIgnoredLines then the hash for >>> "unblamable" lines appears as e.g. "*3252488f5" - this doesn't seem >>> right to me because the commit *wasn't* ignored, >> >> I think you misunderstood me. I was merely suggesting to use the >> approach to mark the line in a way other than using the NULLed out >> object name that has been reserved for something totally different, >> and hinting with "the same *idea*". > > Hi Junio, that paragraph wasn't targetted at yourself, more a comment > on the functionality as it exists in the latest patch series. Sorry > for not making that clear. > >> the "^" marker >> that is used to say "the line is attributed to this commit, but that >> may only be because you blamed with commit range A..B and we reached >> the bottom of the range---if you dug further, you might find the >> line originates from another commit" is the origin of the same idea, >> and this topic borrows it and uses a different mark, i.e. '*', for >> the "we are not certain---take this with grain of salt" mark. > > So it sounds like we have many types of blame to consider: > > 1) This commit is truly the last one to touch this line, and you > didn't ask to ignore it. > 2) This commit is truly the last one to touch this line, but you asked > to ignore it (AKA "unblamable"). > 3) This commit is at the bottom of the range of commits (^) > 4) The "true" commit was ignored but we guess this is the one you're > actually interested in (*) > 5) The "true" commit was ignored and we've reached the bottom of the > range of commits (^*)? > 6) This commit is at the bottom of the range of commits, and you asked > to ignore it. > >> If you ended up hitting the commit the user wanted to ignore, >> perhaps you can find another character that is different from '^' or >> '*' and use that, following the same idea. > > I personally don't find the "unblamable" lines interesting enough to > justify giving them a symbol. But if Barret strongly feels that such > lines should get a '*' then I won't fight it - these lines tend to be > as simple as "}". I'm fine with not zeroing the hash, so long as there's some way to mark it. We could mark with another *, such that if we mark-ignored and mark-unblamable you get "**hash". You can't have an unblamable that isn't from an ignored commit, so a single '*' has only one meaning, based on your config options. If that works for you all, I can change this to markIgnoredUnblamables (instead of 'mask') in the next version. >> By the way, a configuration only feature is something we usually do >> not accept. A feature must be guarded with --command-line-option >> and then optionally can have a corresponding configuration once the >> option proves to be useful enough that it becomes useful to be able >> to say "in this repository (or to this user), the feature is on by >> default". > > In that case we definitely need a --mark-ignored-lines option to git > blame, and I would strongly prefer that we also keep the > blame.markIgnoredLines option as I for one will be switching it on. I'd also keep this set. I think the whole reason for these config options was that everyone has a different preference, but that preference rarely changes. I don't want to have to type --mark-ignored-lines every time I run git blame. If I had to, I'd have to alias git blame or something. I think having config options for these sorts of things is fine, since we know already that for a given user+repo, we want the feature on (or off). But if I have to remove it, then let me know. Thanks, Barret
diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt index 8f155196c6fe..e7b8b5e4b87b 100644 --- a/Documentation/blame-options.txt +++ b/Documentation/blame-options.txt @@ -115,7 +115,11 @@ take effect. change never happened. Lines that were changed or added by an ignored commit will be blamed on the previous commit that changed that line or nearby lines. This option may be specified multiple times to ignore - more than one revision. + more than one revision. If the `blame.markIgnoredLines` config option + is set, then lines that were changed by an ignored commit will be + marked with a `*` in the blame output. If the + `blame.maskIgnoredUnblamables` config option is set, then those lines that + we could not attribute to another revision are outputted as all zeros. --ignore-revs-file <file>:: Ignore revisions listed in `file`, one unabbreviated object name per line. diff --git a/Documentation/config/blame.txt b/Documentation/config/blame.txt index 4da2788f306d..bb6674227da1 100644 --- a/Documentation/config/blame.txt +++ b/Documentation/config/blame.txt @@ -26,3 +26,12 @@ blame.ignoreRevsFile:: `#` are ignored. This option may be repeated multiple times. Empty file names will reset the list of ignored revisions. This option will be handled before the command line option `--ignore-revs-file`. + +blame.maskIgnoredUnblamables:: + Output an object hash of all zeros for lines that were changed by an ignored + revision and that we could not attribute to another revision in the output + of linkgit:git-blame[1]. + +blame.markIgnoredLines:: + Mark lines that were changed by an ignored revision with a '*' in the + output of linkgit:git-blame[1]. diff --git a/blame.c b/blame.c index 0bbb86ad5985..a98ae00e2cfc 100644 --- a/blame.c +++ b/blame.c @@ -481,6 +481,7 @@ void blame_coalesce(struct blame_scoreboard *sb) for (ent = sb->ent; ent && (next = ent->next); ent = next) { if (ent->suspect == next->suspect && ent->s_lno + ent->num_lines == next->s_lno && + ent->ignored == next->ignored && ent->unblamable == next->unblamable) { ent->num_lines += next->num_lines; ent->next = next->next; @@ -733,6 +734,7 @@ static void split_overlap(struct blame_entry *split, int chunk_end_lno; memset(split, 0, sizeof(struct blame_entry [3])); + split[0].ignored = split[1].ignored = split[2].ignored = e->ignored; split[0].unblamable = e->unblamable; split[1].unblamable = e->unblamable; split[2].unblamable = e->unblamable; @@ -857,6 +859,7 @@ static struct blame_entry *split_blame_at(struct blame_entry *e, int len, struct blame_entry *n = xcalloc(1, sizeof(struct blame_entry)); n->suspect = new_suspect; + n->ignored = e->ignored; n->unblamable = e->unblamable; n->lno = e->lno + len; n->s_lno = e->s_lno + len; @@ -922,6 +925,7 @@ static void ignore_blame_entry(struct blame_entry *e, struct blame_line_tracker *line_blames; int entry_len, nr_lines, i; + e->ignored = 1; line_blames = xcalloc(sizeof(struct blame_line_tracker), e->num_lines); guess_line_blames(e, parent, target, offset, parent_slno, parent_len, diff --git a/blame.h b/blame.h index 91664913d7c4..53df8b4c5b3f 100644 --- a/blame.h +++ b/blame.h @@ -92,6 +92,7 @@ struct blame_entry { * scanning the lines over and over. */ unsigned score; + int ignored; int unblamable; }; diff --git a/builtin/blame.c b/builtin/blame.c index b48842d8459b..c10a6a802240 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -53,6 +53,8 @@ static int show_progress; static char repeated_meta_color[COLOR_MAXLEN]; static int coloring_mode; static struct string_list ignore_revs_file_list = STRING_LIST_INIT_NODUP; +static int mask_ignored_unblamables; +static int mark_ignored_lines; static struct date_mode blame_date_mode = { DATE_ISO8601 }; static size_t blame_date_width; @@ -347,7 +349,7 @@ static void emit_porcelain(struct blame_scoreboard *sb, struct blame_entry *ent, char hex[GIT_MAX_HEXSZ + 1]; oid_to_hex_r(hex, &suspect->commit->object.oid); - if (ent->unblamable) + if (mask_ignored_unblamables && ent->unblamable) memset(hex, '0', strlen(hex)); printf("%s %d %d %d\n", hex, @@ -482,7 +484,11 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int } } - if (ent->unblamable) + if (mark_ignored_lines && ent->ignored) { + length--; + putchar('*'); + } + if (mask_ignored_unblamables && ent->unblamable) memset(hex, '0', length); printf("%.*s", length, hex); if (opt & OUTPUT_ANNOTATE_COMPAT) { @@ -710,6 +716,14 @@ static int git_blame_config(const char *var, const char *value, void *cb) string_list_insert(&ignore_revs_file_list, str); return 0; } + if (!strcmp(var, "blame.maskignoredunblamables")) { + mask_ignored_unblamables = git_config_bool(var, value); + return 0; + } + if (!strcmp(var, "blame.markignoredlines")) { + mark_ignored_lines = git_config_bool(var, value); + return 0; + } if (!strcmp(var, "color.blame.repeatedlines")) { if (color_parse_mem(value, strlen(value), repeated_meta_color)) warning(_("invalid color '%s' in color.blame.repeatedLines"), diff --git a/t/t8013-blame-ignore-revs.sh b/t/t8013-blame-ignore-revs.sh index df4993f98682..cc049a390b0d 100755 --- a/t/t8013-blame-ignore-revs.sh +++ b/t/t8013-blame-ignore-revs.sh @@ -53,6 +53,7 @@ test_expect_success ignore_rev_changing_lines ' # to have nothing in common with "line-one" or "line-two", to keep any # heuristics from matching them with any lines in the parent. test_expect_success ignore_rev_adding_unblamable_lines ' + git config --add blame.maskIgnoredUnblamables true && test_write_lines line-one-change line-two-changed y3 y4 >file && git add file && test_tick && @@ -123,6 +124,39 @@ test_expect_success bad_files_and_revs ' test_i18ngrep "Invalid object name: NOREV" err ' +# Commit Z will touch the first two lines. Y touched all four. +# A--B--X--Y--Z +# The blame output when ignoring Z should be: +# ^Y ... 1) +# ^Y ... 2) +# Y ... 3) +# Y ... 4) +# We're checking only the first character +test_expect_success mark_ignored_lines ' + git config --add blame.markIgnoredLines true && + + test_write_lines line-one-Z line-two-Z y3 y4 >file && + git add file && + test_tick && + git commit -m Z && + git tag Z && + + git blame --ignore-rev Z file >blame_raw && + echo "*" >expect && + + sed -n "1p" blame_raw | cut -c1 >actual && + test_cmp expect actual && + + sed -n "2p" blame_raw | cut -c1 >actual && + test_cmp expect actual && + + sed -n "3p" blame_raw | cut -c1 >actual && + ! test_cmp expect actual && + + sed -n "4p" blame_raw | cut -c1 >actual && + ! test_cmp expect actual + ' + # Resetting the repo and creating: # # A--B--M
When ignoring commits, the commit that is blamed might not be responsible for the change. Users might want to know when a particular line has a potentially inaccurate blame. Furthermore, they might never want to see the object hash of an ignored commit. This patch adds two config options to control the output behavior. The first option can identify ignored lines by specifying blame.markIgnoredFiles. When this option is set, each blame line is marked with an '*'. For example: 278b6158d6fdb (Barret Rhoden 2016-04-11 13:57:54 -0400 26) appears as: *278b6158d6fd (Barret Rhoden 2016-04-11 13:57:54 -0400 26) where the '*' is placed before the commit, and the hash has one fewer characters. Sometimes we are unable to even guess at what commit touched a line. These lines are 'unblamable.' The second option, blame.maskIgnoredUnblamables, will zero the hash of any unblamable line. For example, say we ignore e5e8d36d04cbe: e5e8d36d04cbe (Barret Rhoden 2016-04-11 13:57:54 -0400 26) appears as: 0000000000000 (Barret Rhoden 2016-04-11 13:57:54 -0400 26) Signed-off-by: Barret Rhoden <brho@google.com> --- Documentation/blame-options.txt | 6 +++++- Documentation/config/blame.txt | 9 +++++++++ blame.c | 4 ++++ blame.h | 1 + builtin/blame.c | 18 +++++++++++++++-- t/t8013-blame-ignore-revs.sh | 34 +++++++++++++++++++++++++++++++++ 6 files changed, 69 insertions(+), 3 deletions(-)