Message ID | cover.1657667404.git.me@ttaylorr.com (mailing list archive) |
---|---|
Headers | show |
Series | commit-graph: fix corruption during generation v2 upgrade | expand |
Taylor Blau <me@ttaylorr.com> writes: > This brief series resolves a bug where a commit-graph would become > corrupt when upgrading from generation number v1 to v2, as originally > reported in [1]. > > Some speculation occurred as to what might be causing that bug in the > thread beginning at [1], until the problem was explained in more detail > by Will Chandler in [2]. > > The crux of the issue, as is described in [2] and [3], is that the > commit_graph_data slab is reused for read and write operations involving > the commit-graph, leading to situations where data computed in > preparation of a write is clobbered by a read of existing data. > > The first patch demonstrates the issue, and the second patch prepares to > fix it by introducing a helper function. The crux of the issue is > described and fixed in the third patch. > > [1]: https://lore.kernel.org/git/YqD5dgalb9EPnz85@coredump.intra.peff.net/ > [2]: https://lore.kernel.org/git/DD88D523-0ECA-4474-9AA5-1D4A431E532A@wfchandler.org/ > [3]: https://lore.kernel.org/git/YsS7H4i5DqUZVQ5i@nand.local/ Thanks. Do we know where this breaks? Applying [1/3] on Git 2.32, 2.34, and 2.35 seems to claim that "known breakage vanished".
On Wed, Jul 13, 2022 at 10:41:45AM -0700, Junio C Hamano wrote: > Thanks. Do we know where this breaks? Applying [1/3] on Git 2.32, > 2.34, and 2.35 seems to claim that "known breakage vanished". With a script like: --- >8 --- #!/bin/sh set -e rm -fr repo git init -q repo cd repo echo "x" >x git add x GIT_AUTHOR_DATE="@2 +0000" \ GIT_COMMITTER_DATE="@2 +0000" git commit -q -m "$(cat x)" git repack -d -q git.compile -c commitGraph.generationVersion=1 commit-graph write git.compile -c commitGraph.generationVersion=2 commit-graph write \ --changed-paths git.compile rev-list --all --- 8< --- You can bisect it to 3b0199d4c3 (commit-graph: start parsing generation v2 (again), 2022-03-01), but only because that patch teaches Git to recognize the existence of the generation v2 chunks. I suspect (but haven't confirmed) that it was probably broken before 3b0199d4c3. But such breakage wouldn't have mattered, since despite understanding generation v2, previous versions of Git never read those chunks, which would have masked over this bug. Thanks, Taylor
On 7/12/2022 7:10 PM, Taylor Blau wrote: > This brief series resolves a bug where a commit-graph would become > corrupt when upgrading from generation number v1 to v2, as originally > reported in [1]. > > Some speculation occurred as to what might be causing that bug in the > thread beginning at [1], until the problem was explained in more detail > by Will Chandler in [2]. > > The crux of the issue, as is described in [2] and [3], is that the > commit_graph_data slab is reused for read and write operations involving > the commit-graph, leading to situations where data computed in > preparation of a write is clobbered by a read of existing data. > > The first patch demonstrates the issue, and the second patch prepares to > fix it by introducing a helper function. The crux of the issue is > described and fixed in the third patch. Thanks, all, for identifying the scenario that causes this breakage. It was frustrating that we had reports of corruption without knowing where it was coming from. This solution looks like the best we can do right now, although it hints that some refactoring would be good to distinguish between "commits as read" and "commits to be written" during this upgrade phase. Something to think about, but not to block this fix. Thanks, -Stolee