[1/3] t5318: demonstrate commit-graph generation v2 corruption

Message ID 0a49c86037bac200bb23e1abf9f67363e99c4b7c.1657667404.git.me@ttaylorr.com (mailing list archive)
State New, archived
Series commit-graph: fix corruption during generation v2 upgrade

Commit Message

Taylor Blau July 12, 2022, 11:10 p.m. UTC
When upgrading a commit-graph using generation v1 to one using
generation v2, it is possible to force Git into a corrupt state where it
(incorrectly) believes that a GDO2 chunk is necessary, *after* deciding
not to write one.

This makes subsequent reads using the commit-graph produce the following
error message:

    fatal: commit-graph requires overflow generation data but has none

Demonstrate this bug by increasing our test coverage to include a
minimal example of upgrading a commit-graph from generation v1 to v2.
The only notable components of this test are:

  - The committer date of the commit is chosen carefully so that the
    offset underflows when computed using a v1 generation number, but
    would not overflow when using v2 generation numbers.

  - The upgrade to generation number v2 must read in the v1 generation
    numbers, which we can do by passing `--changed-paths`, which will
    force the commit-graph internals to call `fill_commit_graph_info()`.

A future patch will squash this bug.

Reported-by: Jeff King <peff@peff.net>
Reproduced-by: Will Chandler <wfc@wfchandler.org>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5318-commit-graph.sh | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)
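
As a rough illustration of the first bullet in the commit message, the
snippet below works through the arithmetic, using the "generation - date"
order that the review comments confirm. It is a sketch, not Git's code:
the 32-bit mask is an assumption chosen to make the wraparound visible
(Git's internal integer width may differ), and it relies on the shell
evaluating arithmetic in more than 32 bits.

```shell
#!/bin/sh
# Values taken from the comments in the test.
generation=1   # v1 generation number of a root commit
date=2         # committer date: two seconds past the Epoch

# Assumption: model the offset as an unsigned 32-bit value by masking.
# Only the wraparound matters here, not the exact width Git uses.
offset=$(( (generation - date) & 0xFFFFFFFF ))

# The v2 "small" offset limit named in the test comment: 2^31-1.
small_limit=$(( (1 << 31) - 1 ))

if [ "$offset" -gt "$small_limit" ]
then
	verdict="needs overflow chunk"
else
	verdict="fits in small offset"
fi
echo "offset=$offset small_limit=$small_limit: $verdict"
```

With 64-bit shell arithmetic the wrapped offset comes out as 4294967295,
well above the limit of 2147483647, so this commit alone is enough to
require the large offset table.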

Comments

Derrick Stolee July 15, 2022, 3:15 a.m. UTC | #1
On 7/12/2022 7:10 PM, Taylor Blau wrote:
> When upgrading a commit-graph using generation v1 to one using
> generation v2, it is possible to force Git into a corrupt state where it
> (incorrectly) believes that a GDO2 chunk is necessary, *after* deciding
> not to write one.
> 
> This makes subsequent reads using the commit-graph produce the following
> error message:
> 
>     fatal: commit-graph requires overflow generation data but has none
> 
> Demonstrate this bug by increasing our test coverage to include a
> minimal example of upgrading a commit-graph from generation v1 to v2.
> The only notable components of this test are:
> 
>   - The committer date of the commit is chosen carefully so that the
>     offset underflows when computed using a v1 generation number, but
>     would not overflow when using v2 generation numbers.
> 
>   - The upgrade to generation number v2 must read in the v1 generation
>     numbers, which we can do by passing `--changed-paths`, which will
>     force the commit-graph internals to call `fill_commit_graph_info()`.
> 
> A future patch will squash this bug.

Thanks for finding a good test.

> +		# This commit will have a date at two seconds past the Epoch,
> +		# and a (v1) generation number of 1, since it is a root commit.
> +		#
> +		# The offset will then be computed as 2-1, which will underflow

I have verified that your test works, but this explanation is confusing me.
"2 - 1" is 1, which does not underflow. There must be something else going
on.

Looking ahead, you describe the situation correctly in Patch 3 to show that
we take "generation - date", so you really just need s/2-1/1-2/ here.

> +		# to 2^31, which is greater than the v2 offset small limit of
> +		# 2^31-1.
> +		#
> +		# This is sufficient to need a large offset table for the v2
> +		# generation numbers.
> +		test_commit --date "@2 +0000" base &&
> +		git repack -d &&
> +
> +		# Test that upgrading from generation v1 to v2 correctly
> +		# produces the overflow table.
> +		git -c commitGraph.generationVersion=1 commit-graph write &&
> +		git -c commitGraph.generationVersion=2 commit-graph write \
> +			--changed-paths &&

Simple and fast to set up and test. Thanks for using the config explicitly
in both commands so it is robust to possible default changes in the future.

Thanks,
-Stolee

Taylor Blau July 15, 2022, 10:05 p.m. UTC | #2
On Thu, Jul 14, 2022 at 11:15:42PM -0400, Derrick Stolee wrote:
> > +		# This commit will have a date at two seconds past the Epoch,
> > +		# and a (v1) generation number of 1, since it is a root commit.
> > +		#
> > +		# The offset will then be computed as 2-1, which will underflow
>
> I have verified that your test works, but this explanation is confusing me.
> "2 - 1" is 1, which does not underflow. There must be something else going
> on.
>
> Looking ahead, you describe the situation correctly in Patch 3 to show that
> we take "generation - date", so you really just need s/2-1/1-2/ here.

Yes, absolutely. Thanks for catching it.

Junio: you may want to s/2-1/1-2 in this patch's message, or I can send
you a replacement or reroll, whatever is easier.

Thanks,
Taylor

Junio C Hamano July 16, 2022, 12:01 a.m. UTC | #3
Taylor Blau <me@ttaylorr.com> writes:

> On Thu, Jul 14, 2022 at 11:15:42PM -0400, Derrick Stolee wrote:
>> > +		# This commit will have a date at two seconds past the Epoch,
>> > +		# and a (v1) generation number of 1, since it is a root commit.
>> > +		#
>> > +		# The offset will then be computed as 2-1, which will underflow
>>
>> I have verified that your test works, but this explanation is confusing me.
>> "2 - 1" is 1, which does not underflow. There must be something else going
>> on.
>>
>> Looking ahead, you describe the situation correctly in Patch 3 to show that
>> we take "generation - date", so you really just need s/2-1/1-2/ here.
>
> Yes, absolutely. Thanks for catching it.
>
> Junio: you may want to s/2-1/1-2 in this patch's message, or I can send
> you a replacement or reroll, whatever is easier.

I've already done "rebase -i" to do so.

But for future reference, the easiest for me is if the author said,
after saying "Thanks for catching it", "Will reroll after waiting
for a bit to see if there are other comments".  That way, I only
have to edit the latest draft of "What's cooking" report to mark the
topic to be expecting a reroll (which will prevent me from merging
the topic down to 'next' prematurely by mistake) and forget about
it, until I actually see the updated set of patches.  It would be
even easier if the updated set of patches said which topic it is
meant to replace.  That way, I can trust other reviewers about the
details of the change between iterations and play a patch monkey,
when I am short of time.

It is more work to (1) look at the message you are responding to and
understand what "2 minus 1" vs "1 minus 2" being discussed is about,
and (2) go one level up in the thread to find the line in the patch
that was being discussed, and (3) run "rebase -i" and change "pick"
to "edit" and (4) find the line in the file that was affected by the
hunk in question and edit it.  The worst part of it is I'd either
have to do it right away before I forget, or mark the message unread
so that I can revisit it when I have enough time.

Either way is fine, but a straight resend with two notices (one that
says "will reroll", the other that says "this replaces topic X") is
the easiest to handle for me.

Thanks.

Taylor Blau July 16, 2022, 12:17 a.m. UTC | #4
On Fri, Jul 15, 2022 at 05:01:16PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > On Thu, Jul 14, 2022 at 11:15:42PM -0400, Derrick Stolee wrote:
> >> > +		# This commit will have a date at two seconds past the Epoch,
> >> > +		# and a (v1) generation number of 1, since it is a root commit.
> >> > +		#
> >> > +		# The offset will then be computed as 2-1, which will underflow
> >>
> >> I have verified that your test works, but this explanation is confusing me.
> >> "2 - 1" is 1, which does not underflow. There must be something else going
> >> on.
> >>
> >> Looking ahead, you describe the situation correctly in Patch 3 to show that
> >> we take "generation - date", so you really just need s/2-1/1-2/ here.
> >
> > Yes, absolutely. Thanks for catching it.
> >
> > Junio: you may want to s/2-1/1-2 in this patch's message, or I can send
> > you a replacement or reroll, whatever is easier.
>
> I've already done "rebase -i" to do so.

Thanks very much.

> But for future reference, the easiest for me is if the author said,
> after saying "Thanks for catching it", "Will reroll after waiting
> for a bit to see if there are other comments".  That way, I only
> have to edit the latest draft of "What's cooking" report to mark the
> topic to be expecting a reroll (which will prevent me from merging
> the topic down to 'next' prematurely by mistake) and forget about
> it, until I actually see the updated set of patches.  It would be
> even easier if the updated set of patches said which topic it is
> meant to replace.  That way, I can trust other reviewers about the
> details of the change between iterations and play a patch monkey,
> when I am short of time.

Makes sense. I appreciate you clarifying it explicitly; I've wondered
over the years what is easiest for you when fixing a trivial issue in a
larger series.

I've tended to try and avoid resubmitting a whole series when there is
just a typo hoping to avoid flooding the list with too many (mostly
unchanged) messages. But that requires you to do more work to futz with
the patches before they hit your tree, so I try not to do it too often.

In any case, I'll try to err more often on the side of resubmitting a
series after acking the typo in the hopes it makes your life easier ;-).

Thanks,
Taylor

Patch

diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index fbf0d64578..4d9f62f22d 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -811,4 +811,31 @@  test_expect_success 'set up and verify repo with generation data overflow chunk'
 
 graph_git_behavior 'generation data overflow chunk repo' repo left right
 
+test_expect_failure 'overflow during generation version upgrade' '
+	git init overflow-v2-upgrade &&
+	(
+		cd overflow-v2-upgrade &&
+
+		# This commit will have a date at two seconds past the Epoch,
+		# and a (v1) generation number of 1, since it is a root commit.
+		#
+		# The offset will then be computed as 2-1, which will underflow
+		# to 2^31, which is greater than the v2 offset small limit of
+		# 2^31-1.
+		#
+		# This is sufficient to need a large offset table for the v2
+		# generation numbers.
+		test_commit --date "@2 +0000" base &&
+		git repack -d &&
+
+		# Test that upgrading from generation v1 to v2 correctly
+		# produces the overflow table.
+		git -c commitGraph.generationVersion=1 commit-graph write &&
+		git -c commitGraph.generationVersion=2 commit-graph write \
+			--changed-paths &&
+
+		git rev-list --all
+	)
+'
+
 test_done
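
Outside the test harness, the same steps can be approximated in plain
shell. This is a hedged sketch: the scratch directory, the identity
values, and the replacement of `test_commit` with explicit `git add` and
`git commit` calls are assumptions made for illustration; the remaining
`git` invocations mirror the test above. On a Git build with the bug, the
final read would be expected to die with the error quoted in the commit
message; on a fixed build it should succeed.

```shell
#!/bin/sh
# Hypothetical scratch location; any empty directory would do.
repo=$(mktemp -d)

git init -q "$repo" &&
echo base >"$repo/base.t" &&
git -C "$repo" add base.t &&
# Stand-in for: test_commit --date "@2 +0000" base
GIT_AUTHOR_DATE="@2 +0000" GIT_COMMITTER_DATE="@2 +0000" \
	git -C "$repo" -c user.name=tester -c user.email=tester@example.com \
	commit -q -m base &&
# Pack the commit so that a plain "commit-graph write" picks it up.
git -C "$repo" repack -q -d &&
# Write a v1 graph, then upgrade to v2 while forcing the old data to be read.
git -C "$repo" -c commitGraph.generationVersion=1 commit-graph write &&
git -C "$repo" -c commitGraph.generationVersion=2 commit-graph write \
	--changed-paths
setup_status=$?

# Any read that consults the commit-graph exercises the upgraded file.
if out=$(git -C "$repo" rev-list --all 2>&1)
then
	echo "read ok: $out"
else
	echo "read failed: $out"
fi
```

Passing `commitGraph.generationVersion` explicitly on both writes keeps
the sketch independent of whatever default the installed Git uses, as
noted in the review.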