diff mbox series

t1500: ensure current --since= behavior remains

Message ID 20210210215543.18960-1-e@80x24.org (mailing list archive)
State Accepted
Commit a5cdca452052e824c9f3f7cf78385fbec5bb1976
Headers show
Series t1500: ensure current --since= behavior remains | expand

Commit Message

Eric Wong Feb. 10, 2021, 9:55 p.m. UTC
This behavior of git-rev-parse is observed since git 1.8.3.1
at least(*), and likely earlier versions.

At least one git-reliant project in-the-wild relies on this
current behavior of git-rev-parse being able to handle multiple
--since= arguments without squeezing identical results together.
So add a test to prevent the potential for regression in
downstream projects.

(*) 1.8.3.1 the version packaged for CentOS 7.x

Signed-off-by: Eric Wong <e@80x24.org>
---
 Said git-reliant project does use allow /optional/ use of libgit2
 via Perl Inline::C; but libgit2's git__date_parse() isn't part of
 the public API; and it's unlikely anybody using CentOS 7.x would
 run the latest libgit2 even if it were added today.

 t/t1500-rev-parse.sh | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

Comments

Jeff King Feb. 11, 2021, 11:18 a.m. UTC | #1
On Wed, Feb 10, 2021 at 09:55:43PM +0000, Eric Wong wrote:

> This behavior of git-rev-parse is observed since git 1.8.3.1
> at least(*), and likely earlier versions.
> 
> At least one git-reliant project in-the-wild relies on this
> current behavior of git-rev-parse being able to handle multiple
> --since= arguments without squeezing identical results together.
> So add a test to prevent the potential for regression in
> downstream projects.

I had to read this a few times to understand what "this behavior" meant.
It is just: when given multiple --since options, output a --max-age for
each of them, even though internally, Git's revision traversal will only
use one (in the usual last-one-wins fashion).

I'm not sure if I was just being dense, or if this could be spelled out
more clearly. :)

Out of curiosity, why does the other project want that? From your
mention of libgit2's git__date_parse(), I assume it's something that
wants to parse approxidates into timestamps in a script. Maybe we ought
to provide a more direct and robust way of doing that. We have a similar
need in t0006, but we use a test-helper program for it.

(I have no problem in the meantime with this patch, though; any new
method for accomplishing this would want to give other projects time to
adapt to its use).

-Peff
Eric Wong Feb. 11, 2021, 5:06 p.m. UTC | #2
Jeff King <peff@peff.net> wrote:
> On Wed, Feb 10, 2021 at 09:55:43PM +0000, Eric Wong wrote:
> 
> > This behavior of git-rev-parse is observed since git 1.8.3.1
> > at least(*), and likely earlier versions.
> > 
> > At least one git-reliant project in-the-wild relies on this
> > current behavior of git-rev-parse being able to handle multiple
> > --since= arguments without squeezing identical results together.
> > So add a test to prevent the potential for regression in
> > downstream projects.
> 
> I had to read this a few times to understand what "this behavior" meant.
> It is just: when given multiple --since options, output a --max-age for
> each of them, even though internally, Git's revision traversal will only
> use one (in the usual last-one-wins fashion).
> 
> I'm not sure if I was just being dense, or if this could be spelled out
> more clearly. :)

*shrug* :>  My brain struggles with coherent thought so I'm
surprised anybody is able to understand me at all :x

> Out of curiosity, why does the other project want that? From your
> mention of libgit2's git__date_parse(), I assume it's something that
> wants to parse approxidates into timestamps in a script. Maybe we ought
> to provide a more direct and robust way of doing that. We have a similar
> need in t0006, but we use a test-helper program for it.

It takes about 5ms for my system to run git-rev-parse once.  I
may be getting multiple approxidates at once, 1-2 in a typical
input (start..end); but a strange or malicious input could have
hundreds/thousands of approxidates to parse.

Thus, I'm batching up all the approxidates into one rev-parse
invocation (up to system argv limits right now).  With the
output lines split into an array, walking the output/input
arrays in parallel will match them up.  Hypothetically, if
rev-parse were to get clever and deduplicate or reject identical
inputs; then the parallel walk would be broken.

> (I have no problem in the meantime with this patch, though; any new
> method for accomplishing this would want to give other projects time to
> adapt to its use).

Yes.  I think I've mentioned some years/decade ago having
general functionality along the lines of "git cat-file --batch"
or fast-import would be nice (even for some existing scripts and
tests shipped with git).  (v)fork+execve is painful even on
Linux.
diff mbox series

Patch

diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
index abdda360d5..deae916707 100755
--- a/t/t1500-rev-parse.sh
+++ b/t/t1500-rev-parse.sh
@@ -243,4 +243,19 @@  test_expect_success 'showing the superproject correctly' '
 	test_cmp expect out
 '
 
+# at least one external project depends on this behavior:
+test_expect_success 'rev-parse --since= unsqueezed ordering' '
+	x1=--since=1970-01-01T00:00:01Z &&
+	x2=--since=1970-01-01T00:00:02Z &&
+	x3=--since=1970-01-01T00:00:03Z &&
+	git rev-parse $x1 $x1 $x3 $x2 >actual &&
+	cat >expect <<-EOF &&
+	--max-age=1
+	--max-age=1
+	--max-age=3
+	--max-age=2
+	EOF
+	test_cmp expect actual
+'
+
 test_done