
[v3,0/3] Unify trailers formatting logic for pretty.c and ref-filter.c

Message ID pull.726.v3.git.1612602945.gitgitgadget@gmail.com (mailing list archive)

Message

Hariom Verma via GitGitGadget Feb. 6, 2021, 9:15 a.m. UTC
Currently, "pretty.{c,h}" and "ref-filter.{c,h}" each have their own logic
for %(trailers). Both do the same thing, so why not use the same code for
both of them?

This is the 3rd version of the patch series. It is focused on unifying the
"%(trailers)" logic for both 'pretty.{c,h}' and 'ref-filter.{c,h}', so that
we have a single implementation of the trailers logic.

v3 changes:

 * replaced echo with printf in the tests that failed in the previous
   version, for consistent behaviour.
 * strbuf_reset() is back in format_set_trailers_options().
 * made struct ref_trailer_buf static.
 * initialised the structure variable ref_trailer_buf, so that we do not
   run into any problem while doing strbuf_reset().
 * refer to the trailers part of "git-log"'s man page in
   "git-for-each-ref"'s man page.
 * improved commit messages.

/* TODO */

As suggested by Ævar Arnfjörð Bjarmason avarab@gmail.com here
[https://public-inbox.org/git/875z3ep30j.fsf@evledraar.gmail.com/], I plan
to unify the trailers-related tests for "git log" and "git for-each-ref" in
a new file. Maybe on top of this patch series?

Hariom Verma (3):
  pretty.c: refactor trailer logic to `format_set_trailers_options()`
  pretty.c: capture invalid trailer argument
  ref-filter: use pretty.c logic for trailers

 Documentation/git-for-each-ref.txt |   8 +-
 pretty.c                           |  98 ++++++++++++++----------
 pretty.h                           |  12 +++
 ref-filter.c                       |  36 +++++----
 t/t6300-for-each-ref.sh            | 119 +++++++++++++++++++++++++----
 5 files changed, 200 insertions(+), 73 deletions(-)


base-commit: e6362826a0409539642a5738db61827e5978e2e4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-726%2Fharry-hov%2Funify-trailers-logic-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-726/harry-hov/unify-trailers-logic-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/726

Range-diff vs v2:

 1:  fc5fd5217dfc ! 1:  81030f00b11b pretty.c: refactor trailer logic to `format_set_trailers_options()`
     @@ Commit message
          pretty.c: refactor trailer logic to `format_set_trailers_options()`
      
          Refactored trailers formatting logic inside pretty.c to a new function
     -    `format_set_trailers_options()`. This change will allow us to reuse
     -    the same logic in other places.
     +    `format_set_trailers_options()`. This new function returns the non-zero
     +    in case of unusual. The caller handles the non-zero by "goto trailers_out".
     +
     +    This change will allow us to reuse the same logic in other places.
      
          Mentored-by: Christian Couder <chriscool@tuxfamily.org>
          Mentored-by: Heba Waly <heba.waly@gmail.com>
     @@ pretty.c: static int format_trailer_match_cb(const struct strbuf *key, void *ud)
      +			uintptr_t len = arglen;
      +
      +			if (!argval)
     -+				return 1;
     ++				return -1;
      +
      +			if (len && argval[len - 1] == ':')
      +				len--;
     @@ pretty.c: static int format_trailer_match_cb(const struct strbuf *key, void *ud)
      +			opts->only_trailers = 1;
      +		} else if (match_placeholder_arg_value(*arg, "separator", arg, &argval, &arglen)) {
      +			char *fmt;
     ++
     ++			strbuf_reset(sepbuf);
      +			fmt = xstrndup(argval, arglen);
      +			strbuf_expand(sepbuf, fmt, strbuf_expand_literal_cb, NULL);
      +			free(fmt);
      +			opts->separator = sepbuf;
      +		} else if (match_placeholder_arg_value(*arg, "key_value_separator", arg, &argval, &arglen)) {
      +			char *fmt;
     ++
     ++			strbuf_reset(kvsepbuf);
      +			fmt = xstrndup(argval, arglen);
      +			strbuf_expand(kvsepbuf, fmt, strbuf_expand_literal_cb, NULL);
      +			free(fmt);
 2:  245e48eb6835 ! 2:  f4a6b2df1444 pretty.c: capture invalid trailer argument
     @@ Metadata
       ## Commit message ##
          pretty.c: capture invalid trailer argument
      
     -    As we would like to use this same logic in ref-filter, it's nice to
     -    get invalid trailer argument. This will allow us to print precise
     -    error message, while using `format_set_trailers_options()` in
     +    As we would like to use this trailers logic in the ref-filter, it's
     +    nice to get an invalid trailer argument. This will allow us to print
     +    precise error message while using `format_set_trailers_options()` in
          ref-filter.
      
     +    For capturing the invalid argument, we changed the working of
     +    `format_set_trailers_options()` a little bit.
     +    Original logic does "break" and fell through in mainly 2 cases -
     +        1. unknown/invalid argument
     +        2. end of the arg string
     +
     +    But now instead of "break", we capture invalid argument and return
     +    non-zero. And non-zero is handled by the caller.
     +    (We prepared the caller to handle non-zero in the previous commit).
     +
     +    Capturing invalid arguments this way will also affects the working
     +    of current logic. As at the end of the arg string it will return non-zero.
     +    So in order to make things correct, introduced an additional conditional
     +    statement i.e if encounter ")", do 'break'.
     +
          Mentored-by: Christian Couder <chriscool@tuxfamily.org>
          Mentored-by: Heba Waly <heba.waly@gmail.com>
          Signed-off-by: Hariom Verma <hariom18599@gmail.com>
     @@ pretty.c: int format_set_trailers_options(struct process_trailer_options *opts,
       		const char *argval;
       		size_t arglen;
       
     -+		if(**arg == ')') {
     ++		if (**arg == ')')
      +			break;
     -+		}
      +
       		if (match_placeholder_arg_value(*arg, "key", arg, &argval, &arglen)) {
       			uintptr_t len = arglen;
     @@ pretty.c: int format_set_trailers_options(struct process_trailer_options *opts,
      -			   !match_placeholder_bool_arg(*arg, "valueonly", arg, &opts->value_only))
      -			break;
      +			   !match_placeholder_bool_arg(*arg, "valueonly", arg, &opts->value_only)) {
     -+			size_t invalid_arg_len = strcspn(*arg, ",)");
     -+			*invalid_arg = xstrndup(*arg, invalid_arg_len);
     -+			return 1;
     ++			if (invalid_arg) {
     ++				size_t len = strcspn(*arg, ",)");
     ++				*invalid_arg = xstrndup(*arg, len);
     ++			}
     ++			return -1;
      +		}
       	}
       	return 0;
       }
      @@ pretty.c: static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
     - 		struct strbuf sepbuf = STRBUF_INIT;
     - 		struct strbuf kvsepbuf = STRBUF_INIT;
     - 		size_t ret = 0;
     -+		char *unused = NULL;
     - 
     - 		opts.no_divider = 1;
       
       		if (*arg == ':') {
       			arg++;
      -			if (format_set_trailers_options(&opts, &filter_list, &sepbuf, &kvsepbuf, &arg))
     -+			if (format_set_trailers_options(&opts, &filter_list, &sepbuf, &kvsepbuf, &arg, &unused))
     ++			if (format_set_trailers_options(&opts, &filter_list, &sepbuf, &kvsepbuf, &arg, NULL))
       				goto trailer_out;
       		}
       		if (*arg == ')') {
     -@@ pretty.c: static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
     - 	trailer_out:
     - 		string_list_clear(&filter_list, 0);
     - 		strbuf_release(&sepbuf);
     -+		free((char *)unused);
     - 		return ret;
     - 	}
     - 
      
       ## pretty.h ##
      @@ pretty.h: int format_set_trailers_options(struct process_trailer_options *opts,
 3:  7b8cfb2721c3 ! 3:  47d89f872314 ref-filter: use pretty.c logic for trailers
     @@ Commit message
            :key=<K> - only show trailers with specified key.
            :valueonly[=val] - only show the value part.
            :separator=<SEP> - inserted between trailer lines.
     -      :key_value_separator=<SEP> - inserted between trailer lines
     +      :key_value_separator=<SEP> - inserted between key and value in trailer lines
      
          Enhancement to existing options(now can take value and its optional):
            :only[=val]
     @@ Documentation/git-for-each-ref.txt: contents:lines=N::
      -that each trailer appears on a line by itself with its full content with
      -`trailers:unfold`. Both can be used together as `trailers:unfold,only`.
      +are obtained as `trailers[:options]` (or by using the historical alias
     -+`contents:trailers[:options]`). Valid [:option] are:
     -+** 'key=<K>': only show trailers with specified key. Matching is done
     -+   case-insensitively and trailing colon is optional. If option is
     -+   given multiple times trailer lines matching any of the keys are
     -+   shown. This option automatically enables the `only` option so that
     -+   non-trailer lines in the trailer block are hidden. If that is not
     -+   desired it can be disabled with `only=false`.  E.g.,
     -+   `%(trailers:key=Reviewed-by)` shows trailer lines with key
     -+   `Reviewed-by`.
     -+** 'only[=val]': select whether non-trailer lines from the trailer
     -+   block should be included. The `only` keyword may optionally be
     -+   followed by an equal sign and one of `true`, `on`, `yes` to omit or
     -+   `false`, `off`, `no` to show the non-trailer lines. If option is
     -+   given without value it is enabled. If given multiple times the last
     -+   value is used.
     -+** 'separator=<SEP>': specify a separator inserted between trailer
     -+   lines. When this option is not given each trailer line is
     -+   terminated with a line feed character. The string SEP may contain
     -+   the literal formatting codes. To use comma as separator one must use
     -+   `%x2C` as it would otherwise be parsed as next option. If separator
     -+   option is given multiple times only the last one is used.
     -+   E.g., `%(trailers:key=Ticket,separator=%x2C)` shows all trailer lines
     -+   whose key is "Ticket" separated by a comma.
     -+** 'key_value_separator=<SEP>': specify a separator inserted between
     -+   key and value. The string SEP may contain the literal formatting codes.
     -+   E.g., `%(trailers:key=Ticket,key_value_separator=%x2C)` shows all trailer
     -+   lines whose key is "Ticket" with key and value separated by a comma.
     -+** 'unfold[=val]': make it behave as if interpret-trailer's `--unfold`
     -+   option was given. In same way as to for `only` it can be followed
     -+   by an equal sign and explicit value. E.g.,
     -+   `%(trailers:only,unfold=true)` unfolds and shows all trailer lines.
     -+** 'valueonly[=val]': skip over the key part of the trailer line and only
     -+   show the value part. Also this optionally allows explicit value.
     ++`contents:trailers[:options]`). For valid [:option] values see `trailers`
     ++section of linkgit:git-log[1].
       
       For sorting purposes, fields with numeric values sort in numeric order
       (`objectsize`, `authordate`, `committerdate`, `creatordate`, `taggerdate`).
     @@ ref-filter.c: struct refname_atom {
       	int lstrip, rstrip;
       };
       
     -+struct ref_trailer_buf {
     ++static struct ref_trailer_buf {
      +	struct string_list filter_list;
      +	struct strbuf sepbuf;
      +	struct strbuf kvsepbuf;
     -+} ref_trailer_buf;
     ++} ref_trailer_buf = {STRING_LIST_INIT_NODUP, STRBUF_INIT, STRBUF_INIT};
      +
       static struct expand_data {
       	struct object_id oid;
     @@ ref-filter.c: static int subject_atom_parser(const struct ref_format *format, st
      +		char *invalid_arg = NULL;
      +
      +		if (format_set_trailers_options(&atom->u.contents.trailer_opts,
     -+			&ref_trailer_buf.filter_list,
     -+			&ref_trailer_buf.sepbuf,
     -+			&ref_trailer_buf.kvsepbuf,
     -+			&argbuf, &invalid_arg)) {
     ++		    &ref_trailer_buf.filter_list,
     ++		    &ref_trailer_buf.sepbuf,
     ++		    &ref_trailer_buf.kvsepbuf,
     ++		    &argbuf, &invalid_arg)) {
      +			if (!invalid_arg)
      +				strbuf_addf(err, _("expected %%(trailers:key=<value>)"));
      +			else
     @@ t/t6300-for-each-ref.sh: test_expect_success '%(trailers:only) and %(trailers:un
      +	option="$2"
      +	expect="$3"
      +	test_expect_success "$title" '
     -+		echo $expect >expect &&
     ++		printf "$expect\n" >expect &&
      +		git for-each-ref --format="%($option)" refs/heads/main >actual &&
      +		test_cmp expect actual &&
      +		git for-each-ref --format="%(contents:$option)" refs/heads/main >actual &&

Comments

Junio C Hamano Feb. 7, 2021, 3:33 a.m. UTC | #1
"Hariom Verma via GitGitGadget" <gitgitgadget@gmail.com> writes:

>      @@ t/t6300-for-each-ref.sh: test_expect_success '%(trailers:only) and %(trailers:un
>       +	option="$2"
>       +	expect="$3"
>       +	test_expect_success "$title" '
>      -+		echo $expect >expect &&
>      ++		printf "$expect\n" >expect &&

Are we sure that "$expect" would not ever have any '%' in it, to
confuse printf?  To be future-proof and safe, it would be prudent to
instead use

	printf "%s\n" "$expect"

to make sure that whatever is passed in $3 gets output LITERALLY.
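
The difference between the two spellings can be seen in isolation. The following is a minimal sketch, assuming a made-up expected value that happens to contain a per-cent sign and a backslash-en; the variable names are illustrative only and are not part of the patch.

```shell
#!/bin/sh
# Hypothetical expected value with a "%" and a literal backslash-en in it.
expect='100%s done\n(literally)'

# Unsafe: $expect is used as the format string, so printf consumes "%s"
# (with no argument, it expands to nothing) and turns "\n" into LF.
unsafe=$(printf "$expect\n")

# Safe: the format string is fixed and $expect is only data.
safe=$(printf "%s\n" "$expect")

printf "unsafe: %s\n" "$unsafe"
printf "safe:   %s\n" "$safe"
```

With the fixed format string, the helper never needs to know what characters the caller put into the expected output.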

The callers need to adopt the change I gave you in the review of the
previous round so that they do not assume backslash-en will be
changed to LF by somebody---instead if they mean LF, they just say
LF.

Thanks.
Junio C Hamano Feb. 7, 2021, 5:06 a.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> "Hariom Verma via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>>      @@ t/t6300-for-each-ref.sh: test_expect_success '%(trailers:only) and %(trailers:un
>>       +	option="$2"
>>       +	expect="$3"
>>       +	test_expect_success "$title" '
>>      -+		echo $expect >expect &&
>>      ++		printf "$expect\n" >expect &&
>
> Are we sure that "$expect" would not ever have any '%' in it, to
> confuse printf?

Just to make sure we won't waste your time in useless roundtrip(s),
let me say that possible unacceptable answers are:

 - no, right now nobody passes a % in it
 - no, I do not expect anybody needs to pass a % in it
 - when somebody really needs to pass a %, they can write it as %%

The last one is the worst one, by the way.

The point of adding a test_trailer_option HELPER function is to HELP
the developers who write tests, now and in the future.  There are
some things they MUST know to use the helper successfully, like it
takes three parameters, the first one being the test title, the
second one is the string you'd give as the "--format=<format>"
option to the for-each-ref command, and the third one is the
expected output.

Forcing them to know any more than that is *not* helping them.

The shell programming language is perfectly capable of passing an
argument that happens to be a multi-line string to functions and
external commands, and the developers who write tests know that
already (the last argument to test_expect_success used everywhere
in the test suite, which is a multi-line code snippet, is a good
example).  When they need to write an expected output that is
two lines, they expect to be able to write things like

	test_trailer_option title format-string \
	'expected line #1
	expected line #2'

without having to worry about the need for special formatting that
is applicable *only* when passing an argument to this helper.  They
do not need to know that they cannot pass backslash-en literally, and
have to say '\\n' instead, or that they have to double a per-cent
sign, only when using this helper but not other helper functions.
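
As a small illustration of that point, a multi-line string arrives in a shell function as one ordinary argument. This sketch uses a hypothetical function, show_args, which stands in for the helper and only reports what it received.

```shell
#!/bin/sh
# show_args is a hypothetical stand-in for test_trailer_option; it
# prints the number of arguments and then the third argument verbatim.
show_args () {
	printf "argc=%s\n" "$#"
	printf "%s\n" "$3"
}

# The third argument spans two lines; no escaping is involved.
out=$(show_args title format-string \
'expected line #1
expected line #2')

printf "%s\n" "$out"
```

The embedded newline is part of the single-quoted string, so the caller writes the expected output exactly as it should appear.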

In the message I am replying to, I used

	printf "%s\n" "$expect"

for a reason.  We expect that trailer options output are complete
lines, so it is annoying to force the caller to write the final
newline, especially if many of the callers have only one line of
expected output.  So

    test_trailer_option title format-string 'expected output'

would end up doing

	printf "%s\n" "expected output" >expect

to write a complete line, i.e. a caller does not have to say any of

    test_trailer_option title format-string 'expected output\n'

    test_trailer_option title format-string 'expected output
    '

    lf='
    '
    test_trailer_option title format-string "expected output$lf"

If we do not need to extend test_trailer_option with further
"features" (like "check output case insensitively this time" or
"allow output lines in any order"), we can make it even nicer and
easier to use for callers, by the way.

For example, with this (by the way, make sure there are SP on both
sides around "()", that's our house style):

	test_trailer_option () {
		title=$1 option=$2
		shift 2
		if test $# != 0
		then
			printf "%s\n" "$@"
		fi >expect
		test_expect_success "$title" '
			... >actual &&
			test_cmp expect actual &&
			... >actual &&
			test_cmp expect actual
		'
	}

the caller can do

	test_trailer_option 'single line output' \
		'trailers:key=Signed-off-by' \
		'Signed-off-by: A U Thor <author@example.com>'

	test_trailer_option 'expect two lines' \
		'trailers:key=Reviewed-by' \
		'Reviewed-by: A U Thor <author@example.com>' \
		'Reviewed-by: R E Viewer <reviewer@example.com>'

	test_trailer_option 'no output expected' 'trailers:key=no-such:' ''

That is, instead of "the first arg is title, the second is format
and the third is the entire expected output", the helper's manual
can say "give title and format as the first and the second argument.
Each argument after that is an expected output, one line per arg."
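
The "one line per arg" behaviour relies on printf reusing its format string until all arguments are consumed. The sketch below isolates only the part of the helper that writes the expected output; write_expect is a hypothetical name, and the test_expect_success part is left out on purpose.

```shell
#!/bin/sh
# write_expect is a hypothetical name for the expect-writing step.
# printf applies "%s\n" once per remaining argument.
write_expect () {
	if test $# != 0
	then
		printf "%s\n" "$@"
	fi
}

one=$(write_expect 'Signed-off-by: A U Thor <author@example.com>')
two=$(write_expect 'Reviewed-by: A' 'Reviewed-by: B' | wc -l | tr -d ' ')

# No arguments at all: nothing is written.
none=$(write_expect | wc -c | tr -d ' ')
# A single empty argument: one empty line (a lone LF) is written.
blank=$(write_expect '' | wc -c | tr -d ' ')

printf "lines=%s none=%s blank=%s\n" "$two" "$none" "$blank"
```

Note that an empty-string argument and no argument are not the same thing here: the former yields one empty line, the latter yields no output.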

Another possibility is to feed the expected output from the standard
input of the helper, e.g.

	test_trailer_option () {
		title=$1 option=$2
		cat >expect
		test_expect_success "$title" '
			... >actual &&
			test_cmp expect actual &&
			... >actual &&
			test_cmp expect actual
		'
	}

And the caller can now do:

	test_trailer_option 'expect two lines' 'trailers:key=Reviewed-by' <<-\EOF
	Reviewed-by: A U Thor <author@example.com>
	Reviewed-by: R E Viewer <reviewer@example.com>
	EOF

It is a bit cumbersome when the expected output is a single line:

	test_trailer_option 'single line output' 'trailers:key=Signed-off-by' <<-\EOF
	Signed-off-by: A U Thor <author@example.com>
	EOF

but the contrast between the "two-line expected" case and this one
would be easy to see when reading the tests.  The pattern to write
"expect no output" would be quite simple, too:

	test_trailer_option 'no output expected' 'trailers:key=no-such:' </dev/null
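
A minimal sketch of why the standard-input form needs no special quoting rules, assuming a hypothetical function save_expect in place of the "cat >expect" step and a made-up second line that contains characters printf or the shell would otherwise interpret:

```shell
#!/bin/sh
tmp=$(mktemp -d)

# save_expect is a hypothetical stand-in for the "cat >expect" step of
# the helper; it copies its standard input to the file named by $1.
save_expect () {
	cat >"$1"
}

# The quoted delimiter (\EOF) turns off all expansion in the body, so
# "%", "$" and backslashes reach the file exactly as typed.
save_expect "$tmp/two-lines" <<\EOF
Reviewed-by: A U Thor <author@example.com>
Note: 100% of $budget, with a literal \n
EOF

# No expected output: an empty standard input gives an empty file.
save_expect "$tmp/empty" </dev/null

cat "$tmp/two-lines"
```

With "<<-" instead of "<<", leading tabs (and only tabs) are stripped from the body, which is what lets the here-document be indented along with the rest of the test script.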

Among the ones designed while writing this response, I think I like
the last one best, i.e. "the first arg is title, the second arg is
format, and the expected output is given from the standard input".

Thanks.