diff mbox series

rev-list: add option for --pretty without header

Message ID 20210706224321.14371-1-sandals@crustytoothpaste.net (mailing list archive)
State New, archived
Headers show
Series rev-list: add option for --pretty without header | expand

Commit Message

brian m. carlson July 6, 2021, 10:43 p.m. UTC
In general, we encourage users to use plumbing commands, like git
rev-list, over porcelain commands, like git log, when scripting.
However, git rev-list has one glaring problem that prevents it from
being used in certain cases: when --pretty is used, it always prints out
a line containing "commit" and the object ID.  This makes it unsuitable
for many scripting needs, and forces users to use git log instead.

While we can't change this behavior for backwards compatibility, we can
add an option to suppress this behavior, so let's do so, and call it
"--no-commit-header".  Additionally, add the corresponding positive
option to switch it back on.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
This has come up a lot on the list and I'm sure most of the regulars
have run into it before.  It came up at $DAYJOB and I said I intended to
send a patch, so here it is.

I chose --no-commit-header because --header is already taken.  Of
course, suggestions for better names are welcome, since we all know my
naming skills are terrible.

 Documentation/rev-list-options.txt |  7 +++++++
 builtin/rev-list.c                 | 17 ++++++++++++++---
 revision.h                         |  3 ++-
 t/t6006-rev-list-format.sh         | 23 ++++++++++++++++++++++-
 4 files changed, 45 insertions(+), 5 deletions(-)

Comments

Christian Couder July 9, 2021, 8:01 a.m. UTC | #1
On Wed, Jul 7, 2021 at 12:47 AM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> In general, we encourage users to use plumbing commands, like git
> rev-list, over porcelain commands, like git log, when scripting.
> However, git rev-list has one glaring problem that prevents it from
> being used in certain cases: when --pretty is used, it always prints out
> a line containing "commit" and the object ID.

You say always, but it looks like it doesn't do it when
--pretty=oneline is used.

By the way this makes me think that the doc could be improved, because
it describes the format this way:

      •   online

              <hash> <title line>

      •   short

              commit <hash>
              Author: <author>

              <title line>

      •   reference

              <abbrev hash> (<title line>, <short author date>)

      •   email

              From <hash> <date>
              From: <author>
              Date: <author date>
              Subject: [PATCH] <title line>

              <full commit message>

while when --pretty=reference or --pretty=email is used the 'commit
<hash>' line is still printed despite not being shown in the doc.

> This makes it unsuitable
> for many scripting needs, and forces users to use git log instead.

I agree that it's annoying when using --pretty=format:'...'

> While we can't change this behavior for backwards compatibility, we can
> add an option to suppress this behavior, so let's do so, and call it
> "--no-commit-header".  Additionally, add the corresponding positive
> option to switch it back on.

It's not clear if this new option will remove the 'commit <hash>' line
when a builtin format like --pretty=short, --pretty=medium,
--pretty=reference or --pretty=email is used.

> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
> This has come up a lot on the list and I'm sure most of the regulars
> have run into it before.

Yeah, I did run into it before. Thanks for doing something about it!

> It came up at $DAYJOB and I said I intended to
> send a patch, so here it is.
>
> I chose --no-commit-header because --header is already taken.

Yeah, that looks like the most obvious option name.

>  --header::
>         Print the contents of the commit in raw-format; each record is
>         separated with a NUL character.
> +
> +--no-commit-header::
> +       Suppress the header line containing "commit" and the object ID printed before
> +       the specified format.

This might want to tell what happens with builtin formats.

> +--commit-header::
> +       Overrides a previous `--no-commit-header`.
>  endif::git-rev-list[]
>
>  --parents::

[...]

> diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh
> index 35a2f62392..e68bf9f21c 100755
> --- a/t/t6006-rev-list-format.sh
> +++ b/t/t6006-rev-list-format.sh
> @@ -54,9 +54,15 @@ test_expect_success 'setup' '
>
>  # usage: test_format name format_string [failure] <expected_output
>  test_format () {
> +       local header_arg=
> +       if test "$1" = "--no-commit-header"
> +       then
> +               header_arg="--no-commit-header"
> +               shift
> +       fi
>         cat >expect.$1
>         test_expect_${3:-success} "format $1" "
> -               git rev-list --pretty=format:'$2' main >output.$1 &&
> +               git rev-list $header_arg --pretty=format:'$2' main >output.$1 &&
>                 test_cmp expect.$1 output.$1
>         "
>  }

It looks like the tests only check what happens in case
--pretty=format:'...' is used, but I wonder what the code does if a
builtin format is used.
Junio C Hamano July 9, 2021, 3:44 p.m. UTC | #2
Christian Couder <christian.couder@gmail.com> writes:

> On Wed, Jul 7, 2021 at 12:47 AM brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
>>
>> In general, we encourage users to use plumbing commands, like git
>> rev-list, over porcelain commands, like git log, when scripting.
>> However, git rev-list has one glaring problem that prevents it from
>> being used in certain cases: when --pretty is used, it always prints out
>> a line containing "commit" and the object ID.
>
> You say always, but it looks like it doesn't do it when
> --pretty=oneline is used.

I've understood that he meant "when --pretty=format is used",
though, as it is the only format whose intent was to allow
generation of output stream that does not have to be preprocessed
with "grep -v '^[0-9a-f]{40}'" etc.

> It looks like the tests only check what happens in case
> --pretty=format:'...' is used, but I wonder what the code does if a
> builtin format is used.

Good point.

I also think the handling of --abbrev-commit may need to be
rethought with this change.  See here:

    diff --git a/builtin/rev-list.c b/builtin/rev-list.c
    index 7677b1af5a..f571cc9598 100644
    --- a/builtin/rev-list.c
    +++ b/builtin/rev-list.c
    @@ -132,7 +132,7 @@ static void show_commit(struct commit *commit, void *data)
            if (revs->abbrev_commit && revs->abbrev)
                    fputs(find_unique_abbrev(&commit->object.oid, revs->abbrev),
                          stdout);
    -	else
    +	else if (revs->include_header)
                    fputs(oid_to_hex(&commit->object.oid), stdout);
            if (revs->print_parents) {
                    struct commit_list *parents = commit->parents;

The original says that if --abbrev-commit is set and --abbrev is set
to non-zero, we'd show unique abbreviation and if not, specifically,
even when --abbrev-commit is set but --abbrev is set to 0, we didn't
do the find_unique_abbrev() of full hexdigits but left the output to
the else clause.  This was because the original code KNOWS that the
else clause unconditionally emits the full commit object name.

That assumption, which made the original code's handling of
"--abbrev-commit --abbrev=0" correct, no longer holds with the
updated code.  What happens is with --no-commit-header, if we give
"--abbrev-commit --abbrev=4", we still see the unique abbreviation
that is not shorter than 4 hexdigits (i.e. "--no-commit-header" is
ignored), but if we say "--abbrev-commit --abbrev=0", we do not see
any commit header.

My hunch is that this "if / else", which determines if the commit
header should give shortened or full length, should be skipped when
the .include_header is false.
brian m. carlson July 11, 2021, 5:02 p.m. UTC | #3
On 2021-07-09 at 15:44:20, Junio C Hamano wrote:
> Christian Couder <christian.couder@gmail.com> writes:
> 
> > On Wed, Jul 7, 2021 at 12:47 AM brian m. carlson
> > <sandals@crustytoothpaste.net> wrote:
> >>
> >> In general, we encourage users to use plumbing commands, like git
> >> rev-list, over porcelain commands, like git log, when scripting.
> >> However, git rev-list has one glaring problem that prevents it from
> >> being used in certain cases: when --pretty is used, it always prints out
> >> a line containing "commit" and the object ID.
> >
> > You say always, but it looks like it doesn't do it when
> > --pretty=oneline is used.
> 
> I've understood that he meant "when --pretty=format is used",
> though, as it is the only format whose intent was to allow
> generation of output stream that does not have to be preprocessed
> with "grep -v '^[0-9a-f]{40}'" etc.

Yes, that's the case.  I'll rephrase the commit message to reflect
that's what I meant.

> > It looks like the tests only check what happens in case
> > --pretty=format:'...' is used, but I wonder what the code does if a
> > builtin format is used.
> 
> Good point.

I'll add some more tests.

> I also think the handling of --abbrev-commit may need to be
> rethought with this change.  See here:
> 
>     diff --git a/builtin/rev-list.c b/builtin/rev-list.c
>     index 7677b1af5a..f571cc9598 100644
>     --- a/builtin/rev-list.c
>     +++ b/builtin/rev-list.c
>     @@ -132,7 +132,7 @@ static void show_commit(struct commit *commit, void *data)
>             if (revs->abbrev_commit && revs->abbrev)
>                     fputs(find_unique_abbrev(&commit->object.oid, revs->abbrev),
>                           stdout);
>     -	else
>     +	else if (revs->include_header)
>                     fputs(oid_to_hex(&commit->object.oid), stdout);
>             if (revs->print_parents) {
>                     struct commit_list *parents = commit->parents;
> 
> The original says that if --abbrev-commit is set and --abbrev is set
> to non-zero, we'd show unique abbreviation and if not, specifically,
> even when --abbrev-commit is set but --abbrev is set to 0, we didn't
> do the find_unique_abbrev() of full hexdigits but left the output to
> the else clause.  This was because the original code KNOWS that the
> else clause unconditionally emits the full commit object name.
> 
> That assumption, which made the original code's handling of
> "--abbrev-commit --abbrev=0" correct, no longer holds with the
> updated code.  What happens is with --no-commit-header, if we give
> "--abbrev-commit --abbrev=4", we still see the unique abbreviation
> that is not shorter than 4 hexdigits (i.e. "--no-commit-header" is
> ignored), but if we say "--abbrev-commit --abbrev=0", we do not see
> any commit header.
> 
> My hunch is that this "if / else", which determines if the commit
> header should give shortened or full length, should be skipped when
> the .include_header is false.

I'll take a look at this for a v2.
diff mbox series

Patch

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 5bf2a85f69..84b03a4cde 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -1064,6 +1064,13 @@  ifdef::git-rev-list[]
 --header::
 	Print the contents of the commit in raw-format; each record is
 	separated with a NUL character.
+
+--no-commit-header::
+	Suppress the header line containing "commit" and the object ID printed before
+	the specified format.
+
+--commit-header::
+	Overrides a previous `--no-commit-header`.
 endif::git-rev-list[]
 
 --parents::
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 7677b1af5a..f571cc9598 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -132,7 +132,7 @@  static void show_commit(struct commit *commit, void *data)
 	if (revs->abbrev_commit && revs->abbrev)
 		fputs(find_unique_abbrev(&commit->object.oid, revs->abbrev),
 		      stdout);
-	else
+	else if (revs->include_header)
 		fputs(oid_to_hex(&commit->object.oid), stdout);
 	if (revs->print_parents) {
 		struct commit_list *parents = commit->parents;
@@ -153,7 +153,7 @@  static void show_commit(struct commit *commit, void *data)
 	show_decorations(revs, commit);
 	if (revs->commit_format == CMIT_FMT_ONELINE)
 		putchar(' ');
-	else
+	else if (revs->include_header)
 		putchar('\n');
 
 	if (revs->verbose_header) {
@@ -512,6 +512,7 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	repo_init_revisions(the_repository, &revs, prefix);
 	revs.abbrev = DEFAULT_ABBREV;
 	revs.commit_format = CMIT_FMT_UNSPECIFIED;
+	revs.include_header = 1;
 
 	/*
 	 * Scan the argument list before invoking setup_revisions(), so that we
@@ -627,6 +628,16 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 			continue;
 		}
 
+		if (!strcmp(arg, ("--commit-header"))) {
+			revs.include_header = 1;
+			continue;
+		}
+
+		if (!strcmp(arg, ("--no-commit-header"))) {
+			revs.include_header = 0;
+			continue;
+		}
+
 		if (!strcmp(arg, "--disk-usage")) {
 			show_disk_usage = 1;
 			info.flags |= REV_LIST_QUIET;
@@ -639,7 +650,7 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (revs.commit_format != CMIT_FMT_UNSPECIFIED) {
 		/* The command line has a --pretty  */
 		info.hdr_termination = '\n';
-		if (revs.commit_format == CMIT_FMT_ONELINE)
+		if (revs.commit_format == CMIT_FMT_ONELINE || !revs.include_header)
 			info.header_prefix = "";
 		else
 			info.header_prefix = "commit ";
diff --git a/revision.h b/revision.h
index 17698cb51a..7464434f60 100644
--- a/revision.h
+++ b/revision.h
@@ -215,7 +215,8 @@  struct rev_info {
 			missing_newline:1,
 			date_mode_explicit:1,
 			preserve_subject:1,
-			encode_email_headers:1;
+			encode_email_headers:1,
+			include_header:1;
 	unsigned int	disable_stdin:1;
 	/* --show-linear-break */
 	unsigned int	track_linear:1,
diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh
index 35a2f62392..e68bf9f21c 100755
--- a/t/t6006-rev-list-format.sh
+++ b/t/t6006-rev-list-format.sh
@@ -54,9 +54,15 @@  test_expect_success 'setup' '
 
 # usage: test_format name format_string [failure] <expected_output
 test_format () {
+	local header_arg=
+	if test "$1" = "--no-commit-header"
+	then
+		header_arg="--no-commit-header"
+		shift
+	fi
 	cat >expect.$1
 	test_expect_${3:-success} "format $1" "
-		git rev-list --pretty=format:'$2' main >output.$1 &&
+		git rev-list $header_arg --pretty=format:'$2' main >output.$1 &&
 		test_cmp expect.$1 output.$1
 	"
 }
@@ -93,6 +99,14 @@  $head1
 $head1_short
 EOF
 
+test_format --no-commit-header hash-no-header %H%n%h <<EOF
+$head2
+$head2_short
+$head1
+$head1_short
+EOF
+
+
 test_format tree %T%n%t <<EOF
 commit $head2
 $tree2
@@ -181,6 +195,13 @@  $added
 
 EOF
 
+test_format --no-commit-header raw-body-no-header %B <<EOF
+$changed
+
+$added
+
+EOF
+
 test_expect_success 'basic colors' '
 	cat >expect <<-EOF &&
 	commit $head2