diff mbox series

describe: drop early return for max_candidates == 0

Message ID 20241205201449.GA2635755@coredump.intra.peff.net (mailing list archive)
State New
Headers show
Series describe: drop early return for max_candidates == 0 | expand

Commit Message

Jeff King Dec. 5, 2024, 8:14 p.m. UTC
On Wed, Dec 04, 2024 at 06:27:50PM -0500, Jeff King wrote:

> > Subject: Re: [PATCH] fixup! describe: stop traversing when we run out of names
> 
> This commit is already in 'next', so it's too late to squash in a change
> (though I'd have done this separately anyway, as it's already an issue
> for a manual --candidates=0 setting, as unlikely as that is).
> 
> Can you re-send with a full commit message?

Actually, after thinking on this a bit more, I think the solution below
is a bit more elegant. This can go on top of jk/describe-perf.

-- >8 --
From: Josh Steadmon <steadmon@google.com>
Subject: [PATCH] describe: drop early return for max_candidates == 0

Before we even start the describe algorithm, we check to see if
max_candidates is 0 and bail immediately if we did not find an exact
match. This comes from 2c33f75754 (Teach git-describe --exact-match to
avoid expensive tag searches, 2008-02-24), since the the --exact-match
option just sets max_candidates to 0.

But this interacts badly with the --always option (ironically added only
a week later in da2478dbb0 (describe --always: fall back to showing an
abbreviated object name, 2008-03-02)). With --always, we'd still want to
show the hash rather than calling die().

So this:

  git describe --exact-match --always

and likewise:

  git describe --exact-match --candidates=0

has always been broken. But nobody ever noticed, because using those
options together is rather unlikely. However, this bug became a lot
easier to trigger with a30154187a (describe: stop traversing when we run
out of names, 2024-10-31). There we reduce max_candidates automatically
based on the number of tags available. So in a repo with no tags (or one
where --match finds no tags), max_candidates becomes 0, and --always
will never show anything.

So that early check for --exact-match's zero candidates needs to be
adjusted. One way to do so is to have it check the "always" flag and
handle it specially, producing the expected hash. But that would require
duplicating the output code for "always".

Instead, we'd prefer to just fall through to the normal algorithm, which
should notice that we are not allowed to find any more candidates, stop
looking, and then hit the regular "always" output code. Back when
2c33f75754 was first done, this was a bad idea, since the normal
algorithm kept looking for the max+1 candidate. But since 082a4d90af
(describe: stop digging for max_candidates+1, 2024-10-31), we don't do
that anymore, and the algorithm is essentially a noop.

So we can drop the early return entirely, and the fact that
max_candidates is 0 will let us quit early without any special casing.

Reported-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Jeff King <peff@peff.net>
---
There is some small bit of setup work in the algorithm, like creating
the reverse index of commits->names in a slab. I don't think that's
worth worrying about. But if we did care, we could lazily initialize
that index, which would also benefit any other cases that bail before
needing it.

 builtin/describe.c  | 2 --
 t/t6120-describe.sh | 6 ++++++
 2 files changed, 6 insertions(+), 2 deletions(-)

Comments

Josh Steadmon Dec. 5, 2024, 10:28 p.m. UTC | #1
On 2024.12.05 15:14, Jeff King wrote:
> On Wed, Dec 04, 2024 at 06:27:50PM -0500, Jeff King wrote:
> 
> > > Subject: Re: [PATCH] fixup! describe: stop traversing when we run out of names
> > 
> > This commit is already in 'next', so it's too late to squash in a change
> > (though I'd have done this separately anyway, as it's already an issue
> > for a manual --candidates=0 setting, as unlikely as that is).
> > 
> > Can you re-send with a full commit message?
> 
> Actually, after thinking on this a bit more, I think the solution below
> is a bit more elegant. This can go on top of jk/describe-perf.
> 

Thanks, and sorry for not replying earlier, I got distracted by a
different $DAYJOB breakage:
https://lore.kernel.org/git/b41ae080654a3603af09801018df539f656cf9d8.1733430345.git.steadmon@google.com/
Jeff King Dec. 5, 2024, 11:21 p.m. UTC | #2
On Thu, Dec 05, 2024 at 02:28:45PM -0800, Josh Steadmon wrote:

> > Actually, after thinking on this a bit more, I think the solution below
> > is a bit more elegant. This can go on top of jk/describe-perf.
> > 
> 
> Thanks, and sorry for not replying earlier, I got distracted by a
> different $DAYJOB breakage:

No problem. Thanks for finding it!

-Peff
Junio C Hamano Dec. 6, 2024, 3:01 a.m. UTC | #3
Jeff King <peff@peff.net> writes:

> Actually, after thinking on this a bit more, I think the solution below
> is a bit more elegant. This can go on top of jk/describe-perf.
>
> -- >8 --
> From: Josh Steadmon <steadmon@google.com>
> Subject: [PATCH] describe: drop early return for max_candidates == 0

OK, so the patch authorship still blames Josh.  But there is no
sign-off because ... the approach to the fix is so different that
blaming Josh for this implementation is no longer appropriate?

> Reported-by: Josh Steadmon <steadmon@google.com>
> Signed-off-by: Jeff King <peff@peff.net>

If so, please take the authorship yourself.

> Before we even start the describe algorithm, we check to see if
> max_candidates is 0 and bail immediately if we did not find an exact
> match. This comes from 2c33f75754 (Teach git-describe --exact-match to
> avoid expensive tag searches, 2008-02-24), since the the --exact-match
> option just sets max_candidates to 0.
> ...
> So this:
>
>   git describe --exact-match --always
>
> and likewise:
>
>   git describe --exact-match --candidates=0

Did the latter mean to say "git decribe --candidates=0 --always", as
the earlier paragraph explains that "--exact" affects the number of
candidates?

Without this patch, all three give the same result:

    $ git describe --exact-match --always HEAD
    fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
    $ git describe --exact-match --candidates=0 HEAD
    fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
    $ git describe --candidates=0 --always HEAD
    fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'

With this patch, we instead get this:

    $ ./git describe --exact-match --always HEAD
    59d18088fe
    $ ./git describe --exact-match --candidates=0 HEAD
    fatal: No tags can describe '59d18088fe8ace4bf18ade27eeef3664fb6b0878'.
    Try --always, or create some tags.
    $ ./git describe --candidates=0 --always HEAD
    59d18088fe

> But this interacts badly with the --always option (ironically added only
> a week later in da2478dbb0 (describe --always: fall back to showing an
> abbreviated object name, 2008-03-02)). With --always, we'd still want to
> show the hash rather than calling die().
> ...

> has always been broken.

Hmph, I am not sure if the behaviour is _broken_ in the first place.

The user asks with "--exact-match" that a result based on some ref
that does not directly point at the object being described is *not*
acceptable, so with or without "--always", it looks to me that it is
doing the right thing, if there is no exact match (or there is no
tag and the user only allowed tag to describe the objects) and the
result is "no tag exactly matches object X" failure.

Or is our position that these mutually incompatible options, namely
"--exact-match" and "--always", follow the "last one wins" rule?
The implementation does not seem to say so.

If the earlier request is to describe only as exact tag (and fail if
there is no appropriate tag), but then we changed our mind and ask
to fall back to an abbreviation, this one is understandable:

    $ ./git describe --exact-match --always HEAD
    59d18088fe

But then this is not.  The last thing we explicitly told the command
is that we accept only the exact match, but this one does not fail,
which seems like a bug:

    $ ./git describe --always --exact-match HEAD
    59d18088fe

So I am not sure if the "buggy" behaviour is buggy to begin with.
The way these two are documented can be read both ways,

    --exact-match::
            Only output exact matches (a tag directly references the
            supplied commit).  This is a synonym for --candidates=0.

    --always::
            Show uniquely abbreviated commit object as fallback.

but my reading is when you give both and when the object given is
not directly pointed at by any existing tag, "ONLY output exact
matches" cannot be satisified.  And "show as fallback" cannot be
satisfied within the constraint that the command is allowed "only
output exact matches".

I think the complexity from the point of view of a calling script to
deal with either behaviour is probably similar.  If you ask for
"--exact-match" and there is no exact match, you can ask rev-parse
to give a shortened one, and you know which one you are giving the
user.  We can change what "--exact-match + --candidate=0" combination
means to let it fallback, but then you'd need to check the output to
see if you got an exact tag or a fallback, and for that you'd
probably need to ask "show-ref refs/tags/$output" or something.

So I am not sure if it is worth changing the behaviour this late in
the game?
Jeff King Dec. 6, 2024, 3:28 a.m. UTC | #4
On Fri, Dec 06, 2024 at 12:01:41PM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Actually, after thinking on this a bit more, I think the solution below
> > is a bit more elegant. This can go on top of jk/describe-perf.
> >
> > -- >8 --
> > From: Josh Steadmon <steadmon@google.com>
> > Subject: [PATCH] describe: drop early return for max_candidates == 0
> 
> OK, so the patch authorship still blames Josh.  But there is no
> sign-off because ... the approach to the fix is so different that
> blaming Josh for this implementation is no longer appropriate?

Oh, whoops. I had originally intended to just write the commit message
and leave him with credit, but then I ended up changing approach.
Leaving him as the author was an oversight.

> > Before we even start the describe algorithm, we check to see if
> > max_candidates is 0 and bail immediately if we did not find an exact
> > match. This comes from 2c33f75754 (Teach git-describe --exact-match to
> > avoid expensive tag searches, 2008-02-24), since the the --exact-match
> > option just sets max_candidates to 0.
> > ...
> > So this:
> >
> >   git describe --exact-match --always
> >
> > and likewise:
> >
> >   git describe --exact-match --candidates=0
> 
> Did the latter mean to say "git decribe --candidates=0 --always", as
> the earlier paragraph explains that "--exact" affects the number of
> candidates?

Urgh, yes, you are correct. I can resend, but I think we should resolve
the questions below.

> Without this patch, all three give the same result:
> 
>     $ git describe --exact-match --always HEAD
>     fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
>     $ git describe --exact-match --candidates=0 HEAD
>     fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
>     $ git describe --candidates=0 --always HEAD
>     fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
> 
> With this patch, we instead get this:
> 
>     $ ./git describe --exact-match --always HEAD
>     59d18088fe
>     $ ./git describe --exact-match --candidates=0 HEAD
>     fatal: No tags can describe '59d18088fe8ace4bf18ade27eeef3664fb6b0878'.
>     Try --always, or create some tags.
>     $ ./git describe --candidates=0 --always HEAD
>     59d18088fe

Right, exactly.

> > But this interacts badly with the --always option (ironically added only
> > a week later in da2478dbb0 (describe --always: fall back to showing an
> > abbreviated object name, 2008-03-02)). With --always, we'd still want to
> > show the hash rather than calling die().
> > ...
> > has always been broken.
> 
> Hmph, I am not sure if the behaviour is _broken_ in the first place.
> 
> The user asks with "--exact-match" that a result based on some ref
> that does not directly point at the object being described is *not*
> acceptable, so with or without "--always", it looks to me that it is
> doing the right thing, if there is no exact match (or there is no
> tag and the user only allowed tag to describe the objects) and the
> result is "no tag exactly matches object X" failure.
> 
> Or is our position that these mutually incompatible options, namely
> "--exact-match" and "--always", follow the "last one wins" rule?
> The implementation does not seem to say so.

I think you could argue that they are mutually incompatible. But we have
never marked them as such, nor do we do any sort of last-one-wins.
They are two distinct options, but in --exact-match mode, --always is
simply ignored. Which I think is a bug.

> So I am not sure if the "buggy" behaviour is buggy to begin with.
> The way these two are documented can be read both ways,
> 
>     --exact-match::
>             Only output exact matches (a tag directly references the
>             supplied commit).  This is a synonym for --candidates=0.
> 
>     --always::
>             Show uniquely abbreviated commit object as fallback.
> 
> but my reading is when you give both and when the object given is
> not directly pointed at by any existing tag, "ONLY output exact
> matches" cannot be satisified.  And "show as fallback" cannot be
> satisfied within the constraint that the command is allowed "only
> output exact matches".

I think there can be a more expansive reading of --exact-match (or of
--candidates=0), which is: only output a tag that matches exactly. And
then --always is orthogonal to that. There is no other output to
produce, so we show the commit object itself.

Now that more expansive reading is not what --exact-match says above.
But it is the only thing that makes sense to me for --candidates=0, and
the two are synonyms.

> I think the complexity from the point of view of a calling script to
> deal with either behaviour is probably similar.  If you ask for
> "--exact-match" and there is no exact match, you can ask rev-parse
> to give a shortened one, and you know which one you are giving the
> user.  We can change what "--exact-match + --candidate=0" combination
> means to let it fallback, but then you'd need to check the output to
> see if you got an exact tag or a fallback, and for that you'd
> probably need to ask "show-ref refs/tags/$output" or something.
> 
> So I am not sure if it is worth changing the behaviour this late in
> the game?

I think there are really two questions here:

  1. Is the current behavior of "describe --exact-match --always" a bug?
     I'll grant that probably nobody cares deeply, which is why the
     interaction has not been noticed for all of these years. I think
     the semantics this patch gives are the only ones that make sense,
     but I also don't care that deeply. But...

  2. What should we do about the new regression caused by limiting the
     candidate list? I.e., my earlier patches in this topic make us
     behave as if --candidates=<n> was given when there are fewer tags
     in the repo. That runs afoul of the special-casing of
     --candidates=0 when there are no tags in the repo (or you limit the
     candidates to zero via --match).

     If we are not going to address (1) as this patch does, then we need
     another solution. We can internally hold an extra variable to
     distinguish the number of user-requested candidates from the number
     of actual candidates available. But I think my solution to (1) here
     harmonizes the --candidates=0 case with --always, and then the
     auto-adjusted max-candidates case just falls out naturally.

-Peff
Junio C Hamano Dec. 6, 2024, 4:19 a.m. UTC | #5
Jeff King <peff@peff.net> writes:

>> Without this patch, all three give the same result:
>> 
>>     $ git describe --exact-match --always HEAD
>>     fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
>>     $ git describe --exact-match --candidates=0 HEAD
>>     fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
>>     $ git describe --candidates=0 --always HEAD
>>     fatal: no tag exactly matches '59d18088fe8ace4bf18ade27eeef3664fb6b0878'
>> 
>> With this patch, we instead get this:
>> 
>>     $ ./git describe --exact-match --always HEAD
>>     59d18088fe
>>     $ ./git describe --exact-match --candidates=0 HEAD
>>     fatal: No tags can describe '59d18088fe8ace4bf18ade27eeef3664fb6b0878'.
>>     Try --always, or create some tags.
>>     $ ./git describe --candidates=0 --always HEAD
>>     59d18088fe
> ...
> I think there are really two questions here:
>
>   1. Is the current behavior of "describe --exact-match --always" a bug?
>      I'll grant that probably nobody cares deeply, which is why the
>      interaction has not been noticed for all of these years. I think
>      the semantics this patch gives are the only ones that make sense,
>      but I also don't care that deeply. But...
>
>   2. What should we do about the new regression caused by limiting the
>      candidate list?

Ahh, OK, these --candidate=0 / --exact-match were for illustration
purposes only.  The real issue is that the user does not, with

  $ git describe --always HEAD

ask for exact matches only at all, but we internally pretend as if
they did, which is not nice.

My gut reaction is that it is wrong not to give the abbreviated
object name in this case, but the price to do so shouldn't be to
change the behaviour when --exact-match was requested the the user.

Loosening the interaction between the two options, when both were
given explicitly, may be an improvement, but I think that should be
treated as a separate topic, with its merit justified independently,
since the command has been behaving this way from fairly early
version, possibly the one that had both of the options for the first
time.

  $ rungit v2.20.0 describe --exact-match HEAD
  fatal: No names found, cannot describe anything.
  $ rungit v2.20.0 describe --exact-match --always HEAD
  fatal: no tag exactly matches '13a3dd7fe014658da465e9621ec3651f5473041e'
  $ rungit v2.20.0 describe --exact-match --candidate=0 HEAD
  fatal: No names found, cannot describe anything.

Thanks.
diff mbox series

Patch

diff --git a/builtin/describe.c b/builtin/describe.c
index 8ec3be87df..21e1c87c65 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -336,8 +336,6 @@  static void describe_commit(struct object_id *oid, struct strbuf *dst)
 		return;
 	}
 
-	if (!max_candidates)
-		die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid));
 	if (debug)
 		fprintf(stderr, _("No exact match on refs or tags, searching to describe\n"));
 
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 5633b11d01..009d84ff17 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -715,4 +715,10 @@  test_expect_success 'describe --broken --dirty with a file with changed stat' '
 	)
 '
 
+test_expect_success '--always with no refs falls back to commit hash' '
+	git rev-parse HEAD >expect &&
+	git describe --no-abbrev --always --match=no-such-tag >actual &&
+	test_cmp expect actual
+'
+
 test_done