Message ID | 20210706224321.14371-1-sandals@crustytoothpaste.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | rev-list: add option for --pretty without header | expand |
On Wed, Jul 7, 2021 at 12:47 AM brian m. carlson <sandals@crustytoothpaste.net> wrote: > > In general, we encourage users to use plumbing commands, like git > rev-list, over porcelain commands, like git log, when scripting. > However, git rev-list has one glaring problem that prevents it from > being used in certain cases: when --pretty is used, it always prints out > a line containing "commit" and the object ID. You say always, but it looks like it doesn't do it when --pretty=oneline is used. By the way this makes me think that the doc could be improved, because it describes the format this way: • online <hash> <title line> • short commit <hash> Author: <author> <title line> • reference <abbrev hash> (<title line>, <short author date>) • email From <hash> <date> From: <author> Date: <author date> Subject: [PATCH] <title line> <full commit message> while when --pretty=reference or --pretty=email is used the 'commit <hash>' line is still printed despite not being shown in the doc. > This makes it unsuitable > for many scripting needs, and forces users to use git log instead. I agree that it's annoying when using --pretty=format:'...' > While we can't change this behavior for backwards compatibility, we can > add an option to suppress this behavior, so let's do so, and call it > "--no-commit-header". Additionally, add the corresponding positive > option to switch it back on. It's not clear if this new option will remove the 'commit <hash>' line when a builtin format like --pretty=short, --pretty=medium, --pretty=reference or --pretty=email is used. > Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> > --- > This has come up a lot on the list and I'm sure most of the regulars > have run into it before. Yeah, I did run into it before. Thanks for doing something about it! > It came up at $DAYJOB and I said I intended to > send a patch, so here it is. > > I chose --no-commit-header because --header is already taken. Yeah, that looks like the most obvious option name. > --header:: > Print the contents of the commit in raw-format; each record is > separated with a NUL character. > + > +--no-commit-header:: > + Suppress the header line containing "commit" and the object ID printed before > + the specified format. This might want to tell what happens with builtin formats. > +--commit-header:: > + Overrides a previous `--no-commit-header`. > endif::git-rev-list[] > > --parents:: [...] > diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh > index 35a2f62392..e68bf9f21c 100755 > --- a/t/t6006-rev-list-format.sh > +++ b/t/t6006-rev-list-format.sh > @@ -54,9 +54,15 @@ test_expect_success 'setup' ' > > # usage: test_format name format_string [failure] <expected_output > test_format () { > + local header_arg= > + if test "$1" = "--no-commit-header" > + then > + header_arg="--no-commit-header" > + shift > + fi > cat >expect.$1 > test_expect_${3:-success} "format $1" " > - git rev-list --pretty=format:'$2' main >output.$1 && > + git rev-list $header_arg --pretty=format:'$2' main >output.$1 && > test_cmp expect.$1 output.$1 > " > } It looks like the tests only check what happens in case --pretty=format:'...' is used, but I wonder what the code does if a builtin format is used.
Christian Couder <christian.couder@gmail.com> writes: > On Wed, Jul 7, 2021 at 12:47 AM brian m. carlson > <sandals@crustytoothpaste.net> wrote: >> >> In general, we encourage users to use plumbing commands, like git >> rev-list, over porcelain commands, like git log, when scripting. >> However, git rev-list has one glaring problem that prevents it from >> being used in certain cases: when --pretty is used, it always prints out >> a line containing "commit" and the object ID. > > You say always, but it looks like it doesn't do it when > --pretty=oneline is used. I've understood that he meant "when --pretty=format is used", though, as it is the only format whose intent was to allow generation of output stream that does not have to be preprocessed with "grep -v '^[0-9a-f]{40}'" etc. > It looks like the tests only check what happens in case > --pretty=format:'...' is used, but I wonder what the code does if a > builtin format is used. Good point. I also think the handling of --abbrev-commit may need to be rethought with this change. See here: diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 7677b1af5a..f571cc9598 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -132,7 +132,7 @@ static void show_commit(struct commit *commit, void *data) if (revs->abbrev_commit && revs->abbrev) fputs(find_unique_abbrev(&commit->object.oid, revs->abbrev), stdout); - else + else if (revs->include_header) fputs(oid_to_hex(&commit->object.oid), stdout); if (revs->print_parents) { struct commit_list *parents = commit->parents; The original says that if --abbrev-commit is set and --abbrev is set to non-zero, we'd show unique abbreviation and if not, specifically, even when --abbrev-commit is set but --abbrev is set to 0, we didn't do the find_unique_abbrev() of full hexdigits but left the output to the else clause. This was because the original code KNOWS that the else clause unconditionally emits the full commit object name. That assumption, which made the original code's handling of "--abbrev-commit --abbrev=0" correct, no longer holds with the updated code. What happens is with --no-commit-header, if we give "--abbrev-commit --abbrev=4", we still see the unique abbreviation that is not shorter than 4 hexdigits (i.e. "--no-commit-header" is ignored), but if we say "--abbrev-commit --abbrev=0", we do not see any commit header. My hunch is that this "if / else", which determines if the commit header should give shortened or full length, should be skipped when the .include_header is false.
On 2021-07-09 at 15:44:20, Junio C Hamano wrote: > Christian Couder <christian.couder@gmail.com> writes: > > > On Wed, Jul 7, 2021 at 12:47 AM brian m. carlson > > <sandals@crustytoothpaste.net> wrote: > >> > >> In general, we encourage users to use plumbing commands, like git > >> rev-list, over porcelain commands, like git log, when scripting. > >> However, git rev-list has one glaring problem that prevents it from > >> being used in certain cases: when --pretty is used, it always prints out > >> a line containing "commit" and the object ID. > > > > You say always, but it looks like it doesn't do it when > > --pretty=oneline is used. > > I've understood that he meant "when --pretty=format is used", > though, as it is the only format whose intent was to allow > generation of output stream that does not have to be preprocessed > with "grep -v '^[0-9a-f]{40}'" etc. Yes, that's the case. I'll rephrase the commit message to reflect that's what I meant. > > It looks like the tests only check what happens in case > > --pretty=format:'...' is used, but I wonder what the code does if a > > builtin format is used. > > Good point. I'll add some more tests. > I also think the handling of --abbrev-commit may need to be > rethought with this change. See here: > > diff --git a/builtin/rev-list.c b/builtin/rev-list.c > index 7677b1af5a..f571cc9598 100644 > --- a/builtin/rev-list.c > +++ b/builtin/rev-list.c > @@ -132,7 +132,7 @@ static void show_commit(struct commit *commit, void *data) > if (revs->abbrev_commit && revs->abbrev) > fputs(find_unique_abbrev(&commit->object.oid, revs->abbrev), > stdout); > - else > + else if (revs->include_header) > fputs(oid_to_hex(&commit->object.oid), stdout); > if (revs->print_parents) { > struct commit_list *parents = commit->parents; > > The original says that if --abbrev-commit is set and --abbrev is set > to non-zero, we'd show unique abbreviation and if not, specifically, > even when --abbrev-commit is set but --abbrev is set to 0, we didn't > do the find_unique_abbrev() of full hexdigits but left the output to > the else clause. This was because the original code KNOWS that the > else clause unconditionally emits the full commit object name. > > That assumption, which made the original code's handling of > "--abbrev-commit --abbrev=0" correct, no longer holds with the > updated code. What happens is with --no-commit-header, if we give > "--abbrev-commit --abbrev=4", we still see the unique abbreviation > that is not shorter than 4 hexdigits (i.e. "--no-commit-header" is > ignored), but if we say "--abbrev-commit --abbrev=0", we do not see > any commit header. > > My hunch is that this "if / else", which determines if the commit > header should give shortened or full length, should be skipped when > the .include_header is false. I'll take a look at this for a v2.
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 5bf2a85f69..84b03a4cde 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -1064,6 +1064,13 @@ ifdef::git-rev-list[] --header:: Print the contents of the commit in raw-format; each record is separated with a NUL character. + +--no-commit-header:: + Suppress the header line containing "commit" and the object ID printed before + the specified format. + +--commit-header:: + Overrides a previous `--no-commit-header`. endif::git-rev-list[] --parents:: diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 7677b1af5a..f571cc9598 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -132,7 +132,7 @@ static void show_commit(struct commit *commit, void *data) if (revs->abbrev_commit && revs->abbrev) fputs(find_unique_abbrev(&commit->object.oid, revs->abbrev), stdout); - else + else if (revs->include_header) fputs(oid_to_hex(&commit->object.oid), stdout); if (revs->print_parents) { struct commit_list *parents = commit->parents; @@ -153,7 +153,7 @@ static void show_commit(struct commit *commit, void *data) show_decorations(revs, commit); if (revs->commit_format == CMIT_FMT_ONELINE) putchar(' '); - else + else if (revs->include_header) putchar('\n'); if (revs->verbose_header) { @@ -512,6 +512,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) repo_init_revisions(the_repository, &revs, prefix); revs.abbrev = DEFAULT_ABBREV; revs.commit_format = CMIT_FMT_UNSPECIFIED; + revs.include_header = 1; /* * Scan the argument list before invoking setup_revisions(), so that we @@ -627,6 +628,16 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) continue; } + if (!strcmp(arg, ("--commit-header"))) { + revs.include_header = 1; + continue; + } + + if (!strcmp(arg, ("--no-commit-header"))) { + revs.include_header = 0; + continue; + } + if (!strcmp(arg, "--disk-usage")) { show_disk_usage = 1; info.flags |= REV_LIST_QUIET; @@ -639,7 +650,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) if (revs.commit_format != CMIT_FMT_UNSPECIFIED) { /* The command line has a --pretty */ info.hdr_termination = '\n'; - if (revs.commit_format == CMIT_FMT_ONELINE) + if (revs.commit_format == CMIT_FMT_ONELINE || !revs.include_header) info.header_prefix = ""; else info.header_prefix = "commit "; diff --git a/revision.h b/revision.h index 17698cb51a..7464434f60 100644 --- a/revision.h +++ b/revision.h @@ -215,7 +215,8 @@ struct rev_info { missing_newline:1, date_mode_explicit:1, preserve_subject:1, - encode_email_headers:1; + encode_email_headers:1, + include_header:1; unsigned int disable_stdin:1; /* --show-linear-break */ unsigned int track_linear:1, diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh index 35a2f62392..e68bf9f21c 100755 --- a/t/t6006-rev-list-format.sh +++ b/t/t6006-rev-list-format.sh @@ -54,9 +54,15 @@ test_expect_success 'setup' ' # usage: test_format name format_string [failure] <expected_output test_format () { + local header_arg= + if test "$1" = "--no-commit-header" + then + header_arg="--no-commit-header" + shift + fi cat >expect.$1 test_expect_${3:-success} "format $1" " - git rev-list --pretty=format:'$2' main >output.$1 && + git rev-list $header_arg --pretty=format:'$2' main >output.$1 && test_cmp expect.$1 output.$1 " } @@ -93,6 +99,14 @@ $head1 $head1_short EOF +test_format --no-commit-header hash-no-header %H%n%h <<EOF +$head2 +$head2_short +$head1 +$head1_short +EOF + + test_format tree %T%n%t <<EOF commit $head2 $tree2 @@ -181,6 +195,13 @@ $added EOF +test_format --no-commit-header raw-body-no-header %B <<EOF +$changed + +$added + +EOF + test_expect_success 'basic colors' ' cat >expect <<-EOF && commit $head2
In general, we encourage users to use plumbing commands, like git rev-list, over porcelain commands, like git log, when scripting. However, git rev-list has one glaring problem that prevents it from being used in certain cases: when --pretty is used, it always prints out a line containing "commit" and the object ID. This makes it unsuitable for many scripting needs, and forces users to use git log instead. While we can't change this behavior for backwards compatibility, we can add an option to suppress this behavior, so let's do so, and call it "--no-commit-header". Additionally, add the corresponding positive option to switch it back on. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> --- This has come up a lot on the list and I'm sure most of the regulars have run into it before. It came up at $DAYJOB and I said I intended to send a patch, so here it is. I chose --no-commit-header because --header is already taken. Of course, suggestions for better names are welcome, since we all know my naming skills are terrible. Documentation/rev-list-options.txt | 7 +++++++ builtin/rev-list.c | 17 ++++++++++++++--- revision.h | 3 ++- t/t6006-rev-list-format.sh | 23 ++++++++++++++++++++++- 4 files changed, 45 insertions(+), 5 deletions(-)