mbox series

[v4,0/2] commit-graph: ignore duplicates when merging layers

Message ID pull.747.v4.git.1602276832.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series commit-graph: ignore duplicates when merging layers | expand

Message

Jean-Noël Avila via GitGitGadget Oct. 9, 2020, 8:53 p.m. UTC
Here is v4 of a series that was previously only one patch.

Thanks to the efforts of Thomas, Peff, and Taylor, we have a full
understanding of what went wrong here. Details are in the patches
themselves, but generally when writing a commit-graph chain we rely on the
commit-graph parsing to know which commits are already in the lower layers
of the chain. When 'core.commitGraph' is disabled, then we don't do that
parsing, and then we add all reachable commits to the top layer and merge
down. This causes us to hit the previous die() statement.

This fixes the problem by first handling any duplicates we see during a
merge (this is important for handling any other data out there in this
situation) and also to disable commit-graph writes when 'core.commitGraph'
is disabled.

Thanks, -Stolee

Derrick Stolee (2):
  commit-graph: ignore duplicates when merging layers
  commit-graph: don't write commit-graph when disabled

 Documentation/git-commit-graph.txt |  4 +++-
 commit-graph.c                     | 21 ++++++++++++++++++---
 t/t5324-split-commit-graph.sh      | 13 +++++++++++++
 3 files changed, 34 insertions(+), 4 deletions(-)


base-commit: d98273ba77e1ab9ec755576bc86c716a97bf59d7
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-747%2Fderrickstolee%2Fcommit-graph-dup-commits-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-747/derrickstolee/commit-graph-dup-commits-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/747

Range-diff vs v3:

 1:  9e760f07ac ! 1:  9279aea3ef commit-graph: ignore duplicates when merging layers
     @@ Commit message
          pointers. This allows us to get the end result we want without extra
          memory costs and minimal CPU time.
      
     -    Since the root cause for producing commit-graph layers with these
     -    duplicate commits is currently unknown, it is difficult to create a test
     -    for this scenario. For now, we must rely on testing the example data
     -    graciously provided in [1]. My local test successfully merged layers,
     -    and 'git commit-graph verify' passed.
     +    The root cause is due to disabling core.commitGraph, which prevents
     +    parsing commits from the lower layers during a 'git commit-graph write
     +    --split' command. Since we use the 'graph_pos' value to determine
     +    whether a commit is in a lower layer, we never discover that those
     +    commits are already in the commit-graph chain and add them to the top
     +    layer. This layer is then merged down, creating duplicates.
     +
     +    The test added in t5324-split-commit-graph.sh fails without this change.
     +    However, we still have not completely removed the need for this
     +    duplicate check. That will come in a follow-up change.
      
          Reported-by: Thomas Braun <thomas.braun@virtuell-zuhause.de>
          Helped-by: Taylor Blau <me@ttaylorr.com>
     @@ commit-graph.c: static void sort_and_scan_merged_commits(struct write_commit_gra
       	stop_progress(&ctx->progress);
       }
       
     +
     + ## t/t5324-split-commit-graph.sh ##
     +@@ t/t5324-split-commit-graph.sh: test_expect_success '--split=replace with partial Bloom data' '
     + 	verify_chain_files_exist $graphdir
     + '
     + 
     ++test_expect_success 'prevent regression for duplicate commits across layers' '
     ++	git init dup &&
     ++	git -C dup config core.commitGraph false &&
     ++	git -C dup commit --allow-empty -m one &&
     ++	git -C dup commit-graph write --split=no-merge --reachable &&
     ++	git -C dup commit --allow-empty -m two &&
     ++	git -C dup commit-graph write --split=no-merge --reachable &&
     ++	git -C dup commit --allow-empty -m three &&
     ++	git -C dup commit-graph write --split --reachable &&
     ++	git -C dup commit-graph verify
     ++'
     ++
     + test_done
 -:  ---------- > 2:  4439e8ae8f commit-graph: don't write commit-graph when disabled