[2/2] builtin/show-ref.c: limit output with `--count`

Message ID	3fcf1f555715e925385d37712ffe880bb869741e.1654552560.git.me@ttaylorr.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@kernel.org> Date: Mon, 6 Jun 2022 17:56:07 -0400 From: Taylor Blau <me@ttaylorr.com> To: git@vger.kernel.org Cc: derrickstolee@github.com, gitster@pobox.com Subject: [PATCH 2/2] builtin/show-ref.c: limit output with `--count` Message-ID: <3fcf1f555715e925385d37712ffe880bb869741e.1654552560.git.me@ttaylorr.com> References: <cover.1654552560.git.me@ttaylorr.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <cover.1654552560.git.me@ttaylorr.com> Precedence: bulk
Series	builtin/show-ref.c: support `--count` for limiting output \| expand [0/2] builtin/show-ref.c: support `--count` for limiting output [1/2] builtin/show-ref.c: rename `found_match` to `matches_nr` [2/2] builtin/show-ref.c: limit output with `--count`

Taylor Blau June 6, 2022, 9:56 p.m. UTC

`git show-ref` is a useful tool for answering existential questions
about whether or not certain (kinds of) references exist or not. For
example, to quickly determine whether or not a repository has any tags,
one could run:

    $ git show-ref -q --tags | head -1

and check whether there was any output.

But certain environments (like spawning Git processes from Ruby code) do
not have ergonomic ways to simulate sending SIGPIPE back to `show-ref`
when they have seen enough output.

Callers _could_ use `for-each-ref`, which has supports a `--count`
option to limit output, but this is sub-par for latency-sensitive
callers since `for-each-ref` buffers all results before printing
anything. In the above instance, even something like:

    $ git for-each-ref --count=1 -- 'refs/tags/*'

will enumerate every tag in a repository before deciding to print just
the first one. (The current behavior exists to accommodate sorting the
output from for-each-ref, and could be improved. See [1] for previous
work in this area).

In the meantime, introduce support for a similar `--count` option in
`show-ref`, so that callers can run:

    $ git show-ref -q --tags --count=1

to quickly determine whether a repository has any (e.g.) tags or not by
only enumerating at most one reference.

(This option is currently "incompatible" with `--verify`, though there
is no reason why the meaning of `--count` couldn't be extended to mean,
"with `--verify`, only verify `<n>` references".)

[1]: https://lore.kernel.org/git/YTNpQ7Od1U%2F5i0R7@coredump.intra.peff.net/

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-show-ref.txt |  7 ++++++-
 builtin/show-ref.c             | 23 ++++++++++++++++++++++-
 t/t1403-show-ref.sh            | 21 +++++++++++++++++++++
 3 files changed, 49 insertions(+), 2 deletions(-)

Junio C Hamano June 6, 2022, 11:09 p.m. UTC | #1

Taylor Blau <me@ttaylorr.com> writes:

> +--count::
> +
> +	Do not print more than `<n>` matching references, or print all
> +	references if `<n>` is 0. Incompatible with `--exclude-existing`
> +	and `--verify`.

I can easily understand why the option being incompatible with
"--verify", but I do not see a reason why this should not work with
"--exclude-existing" from end user's point of view.

I know it is processed in a completely separate code path, but that
does not stop end users from complaining.

Ævar Arnfjörð Bjarmason June 7, 2022, 8:07 a.m. UTC | #2

On Mon, Jun 06 2022, Taylor Blau wrote:

> diff --git a/Documentation/git-show-ref.txt b/Documentation/git-show-ref.txt
> index ab4d271925..28256c04dd 100644
> --- a/Documentation/git-show-ref.txt
> +++ b/Documentation/git-show-ref.txt
> @@ -10,7 +10,7 @@ SYNOPSIS
>  [verse]
>  'git show-ref' [-q|--quiet] [--verify] [--head] [-d|--dereference]
>  	     [-s|--hash[=<n>]] [--abbrev[=<n>]] [--tags]
> -	     [--heads] [--] [<pattern>...]
> +	     [--heads] [--count=<n>] [--] [<pattern>...]

In addition to what Junio noted, the SYNOPSIS is now inaccurate per your
documentation. I.e. if this option is incompatible with --verify and
--exclude-existing we should use "|" to indicate that, e.g.:

	[ [--verify] [--exclude-existing] | --count=<n> ]

> +	if (max_count) {
> +		int compatible = 0;
> +
> +		if (max_count < 0)
> +			error(_("invalid --count argument: (`%d' < 0)"),
> +			      max_count);
> +		else if (verify)
> +			error(_("--count is incompatible with %s"), "--verify");
> +		else if (exclude_arg)
> +			error(_("--count is incompatible with %s"),
> +			      "--exclude-existing");
> +		else
> +			compatible = 1;
> +
> +		if (!compatible)
> +			usage_with_options(show_ref_usage, show_ref_options);

Instead of this "int compatible" and if/else-if" just use usage_msg_optf().

That or die_for_incompatible_opt4(), at least the new _() messages
should make use of the same translations. I.e. we recently made these
parameterized.

> +	}
> +
>  	if (exclude_arg)
>  		return exclude_existing(exclude_existing_arg);
>  
> diff --git a/t/t1403-show-ref.sh b/t/t1403-show-ref.sh
> index 9252a581ab..b79e114c1e 100755
> --- a/t/t1403-show-ref.sh
> +++ b/t/t1403-show-ref.sh
> @@ -196,4 +196,25 @@ test_expect_success 'show-ref --verify with dangling ref' '
>  	)
>  '
>  
> +test_expect_success 'show-ref --count limits relevant output' '
> +	git show-ref --heads --count=1 >out &&
> +	test_line_count = 1 out
> +'
> +
> +test_expect_success 'show-ref --count rejects invalid input' '
> +	test_must_fail git show-ref --count=-1 2>err &&
> +	grep "invalid ..count argument: (.-1. < 0)" err

The use of .. here seems odd...

> +'
> +
> +test_expect_success 'show-ref --count incompatible with --verify' '
> +	test_must_fail git show-ref --count=1 --verify HEAD 2>err &&
> +	grep "..count is incompatible with ..verify" err

...i.e. this looks like a way to avoid "--" at the beginning, but then
why use it in the middle of the regex?

> +test_expect_success 'show-ref --count incompatible with --exclude-existing' '
> +	echo "refs/heads/main" >in &&
> +	test_must_fail git show-ref --count=1 --exclude-existing <in 2>err &&
> +	grep "..count is incompatible with ..exclude.existing" err

Seems like you could avoid it entirely by escaping it, e.g. "[-]-" at
the beginning, or in this case I think "test_expect_code 129" would be
more than sufficient, we use that to test "had usage spewed at us"
elsewhere.

Taylor Blau June 7, 2022, 9:13 p.m. UTC | #3

On Tue, Jun 07, 2022 at 10:07:32AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
> On Mon, Jun 06 2022, Taylor Blau wrote:
>
> > diff --git a/Documentation/git-show-ref.txt b/Documentation/git-show-ref.txt
> > index ab4d271925..28256c04dd 100644
> > --- a/Documentation/git-show-ref.txt
> > +++ b/Documentation/git-show-ref.txt
> > @@ -10,7 +10,7 @@ SYNOPSIS
> >  [verse]
> >  'git show-ref' [-q|--quiet] [--verify] [--head] [-d|--dereference]
> >  	     [-s|--hash[=<n>]] [--abbrev[=<n>]] [--tags]
> > -	     [--heads] [--] [<pattern>...]
> > +	     [--heads] [--count=<n>] [--] [<pattern>...]
>
> In addition to what Junio noted, the SYNOPSIS is now inaccurate per your
> documentation. I.e. if this option is incompatible with --verify and
> --exclude-existing we should use "|" to indicate that, e.g.:
>
> 	[ [--verify] [--exclude-existing] | --count=<n> ]

Good catch. Should this be squashed into the first example in the
SYNOPSIS, the second, or a new one?

> > +	if (max_count) {
> > +		int compatible = 0;
> > +
> > +		if (max_count < 0)
> > +			error(_("invalid --count argument: (`%d' < 0)"),
> > +			      max_count);
> > +		else if (verify)
> > +			error(_("--count is incompatible with %s"), "--verify");
> > +		else if (exclude_arg)
> > +			error(_("--count is incompatible with %s"),
> > +			      "--exclude-existing");
> > +		else
> > +			compatible = 1;
> > +
> > +		if (!compatible)
> > +			usage_with_options(show_ref_usage, show_ref_options);
>
> Instead of this "int compatible" and if/else-if" just use usage_msg_optf().
>
> That or die_for_incompatible_opt4(), at least the new _() messages
> should make use of the same translations. I.e. we recently made these
> parameterized.

Good catch again. I wasn't aware of usage_msg_optf(), but it's exactly
what I'm looking for here. It does mean that we'd only print one warning
at a time, but I think that's a fair tradeoff, and unlikely to matter in
practice anyways.

And I must have dropped the parameterized msgids on the floor when
preparing this patch, since I definitely have it locally. Oops, fixed.

> > +	}
> > +
> >  	if (exclude_arg)
> >  		return exclude_existing(exclude_existing_arg);
> >
> > diff --git a/t/t1403-show-ref.sh b/t/t1403-show-ref.sh
> > index 9252a581ab..b79e114c1e 100755
> > --- a/t/t1403-show-ref.sh
> > +++ b/t/t1403-show-ref.sh
> > @@ -196,4 +196,25 @@ test_expect_success 'show-ref --verify with dangling ref' '
> >  	)
> >  '
> >
> > +test_expect_success 'show-ref --count limits relevant output' '
> > +	git show-ref --heads --count=1 >out &&
> > +	test_line_count = 1 out
> > +'
> > +
> > +test_expect_success 'show-ref --count rejects invalid input' '
> > +	test_must_fail git show-ref --count=-1 2>err &&
> > +	grep "invalid ..count argument: (.-1. < 0)" err
>
> The use of .. here seems odd...
>
> > +'
> > +
> > +test_expect_success 'show-ref --count incompatible with --verify' '
> > +	test_must_fail git show-ref --count=1 --verify HEAD 2>err &&
> > +	grep "..count is incompatible with ..verify" err
>
> ...i.e. this looks like a way to avoid "--" at the beginning, but then
> why use it in the middle of the regex?

Muscle memory ;).

> > +test_expect_success 'show-ref --count incompatible with --exclude-existing' '
> > +	echo "refs/heads/main" >in &&
> > +	test_must_fail git show-ref --count=1 --exclude-existing <in 2>err &&
> > +	grep "..count is incompatible with ..exclude.existing" err
>
> Seems like you could avoid it entirely by escaping it, e.g. "[-]-" at
> the beginning, or in this case I think "test_expect_code 129" would be
> more than sufficient, we use that to test "had usage spewed at us"
> elsewhere.

I like having the extra test to ensure the error we got made sense, but
I agree either would work. I modified the grep expressions to replace
leading "."'s with "[-]", and "."'s in the middle of the expression with
"-".

Thanks,
Taylor

Ævar Arnfjörð Bjarmason June 7, 2022, 9:31 p.m. UTC | #4

On Tue, Jun 07 2022, Taylor Blau wrote:

> On Tue, Jun 07, 2022 at 10:07:32AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Mon, Jun 06 2022, Taylor Blau wrote:
>>
>> > diff --git a/Documentation/git-show-ref.txt b/Documentation/git-show-ref.txt
>> > index ab4d271925..28256c04dd 100644
>> > --- a/Documentation/git-show-ref.txt
>> > +++ b/Documentation/git-show-ref.txt
>> > @@ -10,7 +10,7 @@ SYNOPSIS
>> >  [verse]
>> >  'git show-ref' [-q|--quiet] [--verify] [--head] [-d|--dereference]
>> >  	     [-s|--hash[=<n>]] [--abbrev[=<n>]] [--tags]
>> > -	     [--heads] [--] [<pattern>...]
>> > +	     [--heads] [--count=<n>] [--] [<pattern>...]
>>
>> In addition to what Junio noted, the SYNOPSIS is now inaccurate per your
>> documentation. I.e. if this option is incompatible with --verify and
>> --exclude-existing we should use "|" to indicate that, e.g.:
>>
>> 	[ [--verify] [--exclude-existing] | --count=<n> ]
>
> Good catch. Should this be squashed into the first example in the
> SYNOPSIS, the second, or a new one?

Personally I really don't care if the end-state is good :)

>> > +	if (max_count) {
>> > +		int compatible = 0;
>> > +
>> > +		if (max_count < 0)
>> > +			error(_("invalid --count argument: (`%d' < 0)"),
>> > +			      max_count);
>> > +		else if (verify)
>> > +			error(_("--count is incompatible with %s"), "--verify");
>> > +		else if (exclude_arg)
>> > +			error(_("--count is incompatible with %s"),
>> > +			      "--exclude-existing");
>> > +		else
>> > +			compatible = 1;
>> > +
>> > +		if (!compatible)
>> > +			usage_with_options(show_ref_usage, show_ref_options);
>>
>> Instead of this "int compatible" and if/else-if" just use usage_msg_optf().
>>
>> That or die_for_incompatible_opt4(), at least the new _() messages
>> should make use of the same translations. I.e. we recently made these
>> parameterized.
>
> Good catch again. I wasn't aware of usage_msg_optf(), but it's exactly
> what I'm looking for here. It does mean that we'd only print one warning
> at a time, but I think that's a fair tradeoff, and unlikely to matter in
> practice anyways.

Yeah, I think that should be OK. We do that in other cases.

> And I must have dropped the parameterized msgids on the floor when
> preparing this patch, since I definitely have it locally. Oops, fixed.

*nod*

>> > +	}
>> > +
>> >  	if (exclude_arg)
>> >  		return exclude_existing(exclude_existing_arg);
>> >
>> > diff --git a/t/t1403-show-ref.sh b/t/t1403-show-ref.sh
>> > index 9252a581ab..b79e114c1e 100755
>> > --- a/t/t1403-show-ref.sh
>> > +++ b/t/t1403-show-ref.sh
>> > @@ -196,4 +196,25 @@ test_expect_success 'show-ref --verify with dangling ref' '
>> >  	)
>> >  '
>> >
>> > +test_expect_success 'show-ref --count limits relevant output' '
>> > +	git show-ref --heads --count=1 >out &&
>> > +	test_line_count = 1 out
>> > +'
>> > +
>> > +test_expect_success 'show-ref --count rejects invalid input' '
>> > +	test_must_fail git show-ref --count=-1 2>err &&
>> > +	grep "invalid ..count argument: (.-1. < 0)" err
>>
>> The use of .. here seems odd...
>>
>> > +'
>> > +
>> > +test_expect_success 'show-ref --count incompatible with --verify' '
>> > +	test_must_fail git show-ref --count=1 --verify HEAD 2>err &&
>> > +	grep "..count is incompatible with ..verify" err
>>
>> ...i.e. this looks like a way to avoid "--" at the beginning, but then
>> why use it in the middle of the regex?
>
> Muscle memory ;).
>
>> > +test_expect_success 'show-ref --count incompatible with --exclude-existing' '
>> > +	echo "refs/heads/main" >in &&
>> > +	test_must_fail git show-ref --count=1 --exclude-existing <in 2>err &&
>> > +	grep "..count is incompatible with ..exclude.existing" err
>>
>> Seems like you could avoid it entirely by escaping it, e.g. "[-]-" at
>> the beginning, or in this case I think "test_expect_code 129" would be
>> more than sufficient, we use that to test "had usage spewed at us"
>> elsewhere.
>
> I like having the extra test to ensure the error we got made sense, but
> I agree either would work. I modified the grep expressions to replace
> leading "."'s with "[-]", and "."'s in the middle of the expression with
> "-".

Yeah, fair enough, thanks!

Junio C Hamano June 8, 2022, 4:10 p.m. UTC | #5

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Tue, Jun 07 2022, Taylor Blau wrote:
>
>> On Tue, Jun 07, 2022 at 10:07:32AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> On Mon, Jun 06 2022, Taylor Blau wrote:
>>>
>>> > diff --git a/Documentation/git-show-ref.txt b/Documentation/git-show-ref.txt
>>> > index ab4d271925..28256c04dd 100644
>>> > --- a/Documentation/git-show-ref.txt
>>> > +++ b/Documentation/git-show-ref.txt
>>> > @@ -10,7 +10,7 @@ SYNOPSIS
>>> >  [verse]
>>> >  'git show-ref' [-q|--quiet] [--verify] [--head] [-d|--dereference]
>>> >  	     [-s|--hash[=<n>]] [--abbrev[=<n>]] [--tags]
>>> > -	     [--heads] [--] [<pattern>...]
>>> > +	     [--heads] [--count=<n>] [--] [<pattern>...]
>>>
>>> In addition to what Junio noted, the SYNOPSIS is now inaccurate per your
>>> documentation. I.e. if this option is incompatible with --verify and
>>> --exclude-existing we should use "|" to indicate that, e.g.:
>>>
>>> 	[ [--verify] [--exclude-existing] | --count=<n> ]
>>
>> Good catch. Should this be squashed into the first example in the
>> SYNOPSIS, the second, or a new one?
>
> Personally I really don't care if the end-state is good :)

Heh.  I actually do think that the proposed documentation is
correct; the implementation that excludes these two options
from the "count" feature is buggy.

[2/2] builtin/show-ref.c: limit output with `--count`

Commit Message

Comments

Patch