Message ID | 20210818043749.85274-1-gwymor@tilde.club (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [filter-repo] filter-repo: add new --replace-message option | expand |
Hi, On Tue, Aug 17, 2021 at 9:38 PM Gwyneth Morgan <gwymor@tilde.club> wrote: > > Like --replace-text, add an option --replace-message which replaces text > in commit message bodies, so that users can easily replace text without > constructing a --message-callback. Interesting idea. Missing a Signed-off-by trailer. > --- > Documentation/git-filter-repo.txt | 19 +++++++- > git-filter-repo | 12 ++++- > t/t9390-filter-repo.sh | 1 + > t/t9390/basic-message | 78 +++++++++++++++++++++++++++++++ > t/t9390/sample-message | 2 + > 5 files changed, 110 insertions(+), 2 deletions(-) > create mode 100644 t/t9390/basic-message > create mode 100644 t/t9390/sample-message > > diff --git a/Documentation/git-filter-repo.txt b/Documentation/git-filter-repo.txt > index 2798378..7a71375 100644 > --- a/Documentation/git-filter-repo.txt > +++ b/Documentation/git-filter-repo.txt > @@ -181,6 +181,10 @@ Renaming of refs (see also --refname-callback) > Filtering of commit messages (see also --message-callback) > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > +--replace-message <expressions_file>:: > + A file with expressions that, if found in commit messages, will > + be replaced. This file uses the same syntax as --replace-text. > + Just commit messages? What about tag messages? > --preserve-commit-hashes:: > By default, since commits are rewritten and thus gain new > hashes, references to old commit hashes in commit messages are > @@ -894,7 +898,20 @@ YYYY-MM-DD. In the expressions file, there are a few things to note: > beginning and ends of lines rather than the beginning and end of file. > See https://docs.python.org/3/library/re.html for details. > > -See also the `--blob-callback` from <<CALLBACKS>>. > +See also the `--blob-callback` from <<CALLBACKS>>. Similarly, if you > +want to modify commit messages, you can do so with the same syntax. For > +example, with a file named expressions.txt containing > + > +-------------------------------------------------- > +foo==>bar > +-------------------------------------------------- > + > +then running > +-------------------------------------------------- > +git filter-repo --replace-message expressions.txt > +-------------------------------------------------- > + > +will replace `foo` in commit messages with `bar`. You've added this text to the "Content based filtering" section of the manual, which doesn't make sense. It should go in a section about updating commit/tag messages. > Refname based filtering > ~~~~~~~~~~~~~~~~~~~~~~~ > diff --git a/git-filter-repo b/git-filter-repo > index b91bd96..5fe0f91 100755 > --- a/git-filter-repo > +++ b/git-filter-repo > @@ -1843,6 +1843,10 @@ EXAMPLES > > messages = parser.add_argument_group(title=_("Filtering of commit messages " > "(see also --message-callback)")) > + messages.add_argument('--replace-message', metavar='EXPRESSIONS_FILE', > + help=_("A file with expressions that, if found in commit messages, " > + "will be replaced. This file uses the same syntax as " > + "--replace-text.")) > messages.add_argument('--preserve-commit-hashes', action='store_true', > help=_("By default, since commits are rewritten and thus gain new " > "hashes, references to old commit hashes in commit messages " > @@ -2189,6 +2193,8 @@ EXAMPLES > args.mailmap = MailmapInfo(args.mailmap) > if args.replace_text: > args.replace_text = FilteringOptions.get_replace_text(args.replace_text) > + if args.replace_message: > + args.replace_message = FilteringOptions.get_replace_text(args.replace_message) > if args.strip_blobs_with_ids: > with open(args.strip_blobs_with_ids, 'br') as f: > args.strip_blobs_with_ids = set(f.read().split()) > @@ -3374,9 +3380,13 @@ class RepoFilter(object): > if not self._args.preserve_commit_hashes: > commit.message = self._hash_re.sub(self._translate_commit_hash, > commit.message) > + if self._args.replace_message: > + for literal, replacement in self._args.replace_message['literals']: > + commit.message = commit.message.replace(literal, replacement) > + for regex, replacement in self._args.replace_message['regexes']: > + commit.message = regex.sub(replacement, commit.message) Makes sense. > if self._message_callback: > commit.message = self._message_callback(commit.message) > - Why this stray line removal? > # Change the author & committer according to mailmap rules > args = self._args > if args.mailmap: As noted above, just as --message-callback affects both commit and tag messages, shouldn't this option affect both (i.e. should there also be a section in tweak_tag() similar to the one you added to tweak_commit())? > diff --git a/t/t9390-filter-repo.sh b/t/t9390-filter-repo.sh > index 3f567e7..6d2d985 100755 > --- a/t/t9390-filter-repo.sh > +++ b/t/t9390-filter-repo.sh > @@ -39,6 +39,7 @@ filter_testcase basic basic-filename --invert-paths --path-glob 't*en*' > filter_testcase basic basic-numbers --invert-paths --path-regex 'f.*e.*e' > filter_testcase basic basic-mailmap --mailmap ../t9390/sample-mailmap > filter_testcase basic basic-replace --replace-text ../t9390/sample-replace > +filter_testcase basic basic-message --replace-message ../t9390/sample-message > filter_testcase empty empty-keepme --path keepme > filter_testcase empty more-empty-keepme --path keepme --prune-empty=always \ > --prune-degenerate=always > diff --git a/t/t9390/basic-message b/t/t9390/basic-message > new file mode 100644 > index 0000000..4ac1968 > --- /dev/null > +++ b/t/t9390/basic-message > @@ -0,0 +1,78 @@ > +feature done > +blob > +mark :1 > +data 8 > +initial > + > +reset refs/heads/B > +commit refs/heads/B > +mark :2 > +author Little O. Me <me@little.net> 1535228562 -0700 > +committer Little O. Me <me@little.net> 1535228562 -0700 > +data 9 > +Modified > +M 100644 :1 filename > +M 100644 :1 ten > +M 100644 :1 twenty > + > +blob > +mark :3 > +data 11 > +twenty-mod > + > +commit refs/heads/B > +mark :4 > +author Little 'ol Me <me@laptop.(none)> 1535229544 -0700 > +committer Little 'ol Me <me@laptop.(none)> 1535229544 -0700 > +data 18 > +add the number 20 > +from :2 > +M 100644 :3 twenty > + > +blob > +mark :5 > +data 8 > +ten-mod > + > +commit refs/heads/A > +mark :6 > +author Little O. Me <me@machine52.little.net> 1535229523 -0700 > +committer Little O. Me <me@machine52.little.net> 1535229523 -0700 > +data 8 > +add ten > +from :2 > +M 100644 :5 ten > + > +commit refs/heads/master > +mark :7 > +author Lit.e Me <me@fire.com> 1535229559 -0700 > +committer Lit.e Me <me@fire.com> 1535229580 -0700 > +data 24 > +Merge branch 'A' into B > +from :4 > +merge :6 > +M 100644 :5 ten > + > +blob > +mark :8 > +data 6 > +final > + > +commit refs/heads/master > +mark :9 > +author Little Me <me@bigcompany.com> 1535229601 -0700 > +committer Little Me <me@bigcompany.com> 1535229601 -0700 > +data 9 > +whatever > +from :7 > +M 100644 :8 filename > +M 100644 :8 ten > +M 100644 :8 twenty > + > +tag v1.0 > +from :9 > +tagger Little John <second@merry.men> 1535229618 -0700 > +data 5 > +v1.0 > + > +done > diff --git a/t/t9390/sample-message b/t/t9390/sample-message > new file mode 100644 > index 0000000..a374d61 > --- /dev/null > +++ b/t/t9390/sample-message > @@ -0,0 +1,2 @@ > +Initial==>Modified > +regex:tw.nty==>the number 20 > -- > 2.32.0 Testcase looks good. Thanks for sending this along; if you fix up the issues I pointed out, I'd be happy to apply this to git-filter-repo.
On 2021-08-23 10:34:07-0700, Elijah Newren wrote: > Hi, > > On Tue, Aug 17, 2021 at 9:38 PM Gwyneth Morgan <gwymor@tilde.club> wrote: > > > > Like --replace-text, add an option --replace-message which replaces text > > in commit message bodies, so that users can easily replace text without > > constructing a --message-callback. > > Interesting idea. > > Missing a Signed-off-by trailer. Will fix. > > @@ -894,7 +898,20 @@ YYYY-MM-DD. In the expressions file, there are a few things to note: > > beginning and ends of lines rather than the beginning and end of file. > > See https://docs.python.org/3/library/re.html for details. > > > > -See also the `--blob-callback` from <<CALLBACKS>>. > > +See also the `--blob-callback` from <<CALLBACKS>>. Similarly, if you > > +want to modify commit messages, you can do so with the same syntax. For > > +example, with a file named expressions.txt containing > > + > > +-------------------------------------------------- > > +foo==>bar > > +-------------------------------------------------- > > + > > +then running > > +-------------------------------------------------- > > +git filter-repo --replace-message expressions.txt > > +-------------------------------------------------- > > + > > +will replace `foo` in commit messages with `bar`. > > You've added this text to the "Content based filtering" section of the > manual, which doesn't make sense. It should go in a section about > updating commit/tag messages. Ah, got it. I'll move that into a new section. > > if self._message_callback: > > commit.message = self._message_callback(commit.message) > > - > > Why this stray line removal? That was accidental. Will fix. > > # Change the author & committer according to mailmap rules > > args = self._args > > if args.mailmap: > > As noted above, just as --message-callback affects both commit and tag > messages, shouldn't this option affect both (i.e. should there also be > a section in tweak_tag() similar to the one you added to > tweak_commit())? Yes, it should. I'll change that.
diff --git a/Documentation/git-filter-repo.txt b/Documentation/git-filter-repo.txt index 2798378..7a71375 100644 --- a/Documentation/git-filter-repo.txt +++ b/Documentation/git-filter-repo.txt @@ -181,6 +181,10 @@ Renaming of refs (see also --refname-callback) Filtering of commit messages (see also --message-callback) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +--replace-message <expressions_file>:: + A file with expressions that, if found in commit messages, will + be replaced. This file uses the same syntax as --replace-text. + --preserve-commit-hashes:: By default, since commits are rewritten and thus gain new hashes, references to old commit hashes in commit messages are @@ -894,7 +898,20 @@ YYYY-MM-DD. In the expressions file, there are a few things to note: beginning and ends of lines rather than the beginning and end of file. See https://docs.python.org/3/library/re.html for details. -See also the `--blob-callback` from <<CALLBACKS>>. +See also the `--blob-callback` from <<CALLBACKS>>. Similarly, if you +want to modify commit messages, you can do so with the same syntax. For +example, with a file named expressions.txt containing + +-------------------------------------------------- +foo==>bar +-------------------------------------------------- + +then running +-------------------------------------------------- +git filter-repo --replace-message expressions.txt +-------------------------------------------------- + +will replace `foo` in commit messages with `bar`. Refname based filtering ~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/git-filter-repo b/git-filter-repo index b91bd96..5fe0f91 100755 --- a/git-filter-repo +++ b/git-filter-repo @@ -1843,6 +1843,10 @@ EXAMPLES messages = parser.add_argument_group(title=_("Filtering of commit messages " "(see also --message-callback)")) + messages.add_argument('--replace-message', metavar='EXPRESSIONS_FILE', + help=_("A file with expressions that, if found in commit messages, " + "will be replaced. This file uses the same syntax as " + "--replace-text.")) messages.add_argument('--preserve-commit-hashes', action='store_true', help=_("By default, since commits are rewritten and thus gain new " "hashes, references to old commit hashes in commit messages " @@ -2189,6 +2193,8 @@ EXAMPLES args.mailmap = MailmapInfo(args.mailmap) if args.replace_text: args.replace_text = FilteringOptions.get_replace_text(args.replace_text) + if args.replace_message: + args.replace_message = FilteringOptions.get_replace_text(args.replace_message) if args.strip_blobs_with_ids: with open(args.strip_blobs_with_ids, 'br') as f: args.strip_blobs_with_ids = set(f.read().split()) @@ -3374,9 +3380,13 @@ class RepoFilter(object): if not self._args.preserve_commit_hashes: commit.message = self._hash_re.sub(self._translate_commit_hash, commit.message) + if self._args.replace_message: + for literal, replacement in self._args.replace_message['literals']: + commit.message = commit.message.replace(literal, replacement) + for regex, replacement in self._args.replace_message['regexes']: + commit.message = regex.sub(replacement, commit.message) if self._message_callback: commit.message = self._message_callback(commit.message) - # Change the author & committer according to mailmap rules args = self._args if args.mailmap: diff --git a/t/t9390-filter-repo.sh b/t/t9390-filter-repo.sh index 3f567e7..6d2d985 100755 --- a/t/t9390-filter-repo.sh +++ b/t/t9390-filter-repo.sh @@ -39,6 +39,7 @@ filter_testcase basic basic-filename --invert-paths --path-glob 't*en*' filter_testcase basic basic-numbers --invert-paths --path-regex 'f.*e.*e' filter_testcase basic basic-mailmap --mailmap ../t9390/sample-mailmap filter_testcase basic basic-replace --replace-text ../t9390/sample-replace +filter_testcase basic basic-message --replace-message ../t9390/sample-message filter_testcase empty empty-keepme --path keepme filter_testcase empty more-empty-keepme --path keepme --prune-empty=always \ --prune-degenerate=always diff --git a/t/t9390/basic-message b/t/t9390/basic-message new file mode 100644 index 0000000..4ac1968 --- /dev/null +++ b/t/t9390/basic-message @@ -0,0 +1,78 @@ +feature done +blob +mark :1 +data 8 +initial + +reset refs/heads/B +commit refs/heads/B +mark :2 +author Little O. Me <me@little.net> 1535228562 -0700 +committer Little O. Me <me@little.net> 1535228562 -0700 +data 9 +Modified +M 100644 :1 filename +M 100644 :1 ten +M 100644 :1 twenty + +blob +mark :3 +data 11 +twenty-mod + +commit refs/heads/B +mark :4 +author Little 'ol Me <me@laptop.(none)> 1535229544 -0700 +committer Little 'ol Me <me@laptop.(none)> 1535229544 -0700 +data 18 +add the number 20 +from :2 +M 100644 :3 twenty + +blob +mark :5 +data 8 +ten-mod + +commit refs/heads/A +mark :6 +author Little O. Me <me@machine52.little.net> 1535229523 -0700 +committer Little O. Me <me@machine52.little.net> 1535229523 -0700 +data 8 +add ten +from :2 +M 100644 :5 ten + +commit refs/heads/master +mark :7 +author Lit.e Me <me@fire.com> 1535229559 -0700 +committer Lit.e Me <me@fire.com> 1535229580 -0700 +data 24 +Merge branch 'A' into B +from :4 +merge :6 +M 100644 :5 ten + +blob +mark :8 +data 6 +final + +commit refs/heads/master +mark :9 +author Little Me <me@bigcompany.com> 1535229601 -0700 +committer Little Me <me@bigcompany.com> 1535229601 -0700 +data 9 +whatever +from :7 +M 100644 :8 filename +M 100644 :8 ten +M 100644 :8 twenty + +tag v1.0 +from :9 +tagger Little John <second@merry.men> 1535229618 -0700 +data 5 +v1.0 + +done diff --git a/t/t9390/sample-message b/t/t9390/sample-message new file mode 100644 index 0000000..a374d61 --- /dev/null +++ b/t/t9390/sample-message @@ -0,0 +1,2 @@ +Initial==>Modified +regex:tw.nty==>the number 20