mbox series

[v2,0/6] Generation Number v2: Fix a tricky split graph bug

Message ID pull.850.v2.git.1612234883.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series Generation Number v2: Fix a tricky split graph bug | expand

Message

Philippe Blain via GitGitGadget Feb. 2, 2021, 3:01 a.m. UTC
Here is a bugfix for the recently-landed-in-next generation v2 topic
(ak/corrected-commit-date).

This was occasionally hitting us when computing new commit-graphs on
existing repositories with the new bits. It was very hard to reproduce, and
it turns out to be due to not parsing commits before accessing generation
number data. Doing so in the right place demonstrates the bug of recomputing
the corrected commit date even for commits in lower layers with computed
values.

The fix is split into these steps:

 1. Parse commits more often before accessing their data. (This allows the
    bug to be demonstrated in the test suite.)
 2. Check the full commit-graph chain for generation data chunks.
 3. Don't compute corrected commit dates if the lower layers do not support
    them.
 4. Parse the commit-graph file more often.

Thanks, -Stolee


Updates in v2
=============

 * Fixed some typos or other clarifications in commit messages.

 * The loop assigning read_generation_data is skipped if they already all
   agree with value 1.

 * I split compute_generation_numbers into two methods. This essentially
   splits the previous patch 4 into patches 4 & 5 here. The new patch 4 just
   splits the logic as-is, then the new patch 5 does the re-initialization
   of generation values when in the upgrade scenario.

Derrick Stolee (6):
  commit-graph: use repo_parse_commit
  commit-graph: always parse before commit_graph_data_at()
  commit-graph: validate layers for generation data
  commit-graph: compute generations separately
  commit-graph: be extra careful about mixed generations
  commit-graph: prepare commit graph

 commit-graph.c          | 138 +++++++++++++++++++++++++++++-----------
 commit.h                |   5 +-
 t/t5318-commit-graph.sh |  21 ++++++
 3 files changed, 125 insertions(+), 39 deletions(-)


base-commit: 5a3b130cad0d5c770f766e3af6d32b41766374c0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-850%2Fderrickstolee%2Fgen-v2-upgrade-fix-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-850/derrickstolee/gen-v2-upgrade-fix-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/850

Range-diff vs v1:

 1:  9c605c99f66 = 1:  9c605c99f66 commit-graph: use repo_parse_commit
 2:  82afa811dff ! 2:  454b183b9ba commit-graph: always parse before commit_graph_data_at()
     @@ Commit message
          problems when writing a commit-graph with corrected commit dates based
          on a commit-graph without them.
      
     -    It has been difficult to identify the issue here becaus it was so hard
     +    It has been difficult to identify the issue here because it was so hard
          to reproduce. It relies on this uninitialized data having a non-zero
          value, but also on specifically in a way that overwrites the existing
          data.
 3:  d554fa30660 ! 3:  3d223fa2156 commit-graph: validate layers for generation data
     @@ commit-graph.c: static struct commit_graph *load_commit_graph_chain(struct repos
       
      -	if (!g)
      -		return;
     --
     --	read_generation_data = !!g->chunk_generation_data;
      +	while (read_generation_data && p) {
      +		read_generation_data = p->read_generation_data;
      +		p = p->base_graph;
      +	}
       
     +-	read_generation_data = !!g->chunk_generation_data;
     ++	if (read_generation_data)
     ++		return 1;
     + 
       	while (g) {
     - 		g->read_generation_data = read_generation_data;
     +-		g->read_generation_data = read_generation_data;
     ++		g->read_generation_data = 0;
       		g = g->base_graph;
       	}
      +
     -+	return read_generation_data;
     ++	return 0;
       }
       
       struct commit_graph *read_commit_graph_one(struct repository *r,
 -:  ----------- > 4:  05248ff222f commit-graph: compute generations separately
 4:  b267a9653a7 ! 5:  9bccee8fb63 commit-graph: be extra careful about mixed generations
     @@ Commit message
          existing commit-graph data has no corrected commit dates.
      
          While this improves our situation somewhat, we have not completely
     -    solved the issue for correctly computing generation numbers for mixes
     +    solved the issue for correctly computing generation numbers for mixed
          layers. That follows in the next change.
      
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
     @@ commit-graph.c: static void compute_generation_numbers(struct write_commit_graph
       					_("Computing commit graph generation numbers"),
       					ctx->commits.nr);
      +
     -+	if (ctx->write_generation_data && !ctx->trust_generation_numbers) {
     ++	if (!ctx->trust_generation_numbers) {
      +		for (i = 0; i < ctx->commits.nr; i++) {
      +			struct commit *c = ctx->commits.list[i];
      +			repo_parse_commit(ctx->r, c);
     @@ commit-graph.c: static void compute_generation_numbers(struct write_commit_graph
      +
       	for (i = 0; i < ctx->commits.nr; i++) {
       		struct commit *c = ctx->commits.list[i];
     - 		uint32_t level;
     -@@ commit-graph.c: static void compute_generation_numbers(struct write_commit_graph_context *ctx)
     - 				corrected_commit_date = commit_graph_data_at(parent->item)->generation;
     - 
     - 				if (level == GENERATION_NUMBER_ZERO ||
     --				    corrected_commit_date == GENERATION_NUMBER_ZERO) {
     -+				    (ctx->write_generation_data &&
     -+				     corrected_commit_date == GENERATION_NUMBER_ZERO)) {
     - 					all_parents_computed = 0;
     - 					commit_list_insert(parent->item, &list);
     - 					break;
     -@@ commit-graph.c: static void compute_generation_numbers(struct write_commit_graph_context *ctx)
     - 					max_level = GENERATION_NUMBER_V1_MAX - 1;
     - 				*topo_level_slab_at(ctx->topo_levels, current) = max_level + 1;
     - 
     --				if (current->date && current->date > max_corrected_commit_date)
     --					max_corrected_commit_date = current->date - 1;
     --				commit_graph_data_at(current)->generation = max_corrected_commit_date + 1;
     --
     --				if (commit_graph_data_at(current)->generation - current->date > GENERATION_NUMBER_V2_OFFSET_MAX)
     --					ctx->num_generation_data_overflows++;
     -+				if (ctx->write_generation_data) {
     -+					timestamp_t cur_g;
     -+					if (current->date && current->date > max_corrected_commit_date)
     -+						max_corrected_commit_date = current->date - 1;
     -+					cur_g = commit_graph_data_at(current)->generation
     -+					      = max_corrected_commit_date + 1;
     -+					if (cur_g - current->date > GENERATION_NUMBER_V2_OFFSET_MAX)
     -+						ctx->num_generation_data_overflows++;
     -+				}
     - 			}
     - 		}
     - 	}
     + 		timestamp_t corrected_commit_date;
      @@ commit-graph.c: int write_commit_graph(struct object_directory *odb,
       	} else
       		ctx->num_commit_graphs_after = 1;
     @@ commit-graph.c: int write_commit_graph(struct object_directory *odb,
      -	validate_mixed_generation_chain(ctx->r->objects->commit_graph);
      +	ctx->trust_generation_numbers = validate_mixed_generation_chain(ctx->r->objects->commit_graph);
       
     - 	compute_generation_numbers(ctx);
     - 
     + 	compute_topological_levels(ctx);
     + 	if (ctx->write_generation_data)
 5:  dddeec30ebf ! 6:  38086c85b52 commit-graph: prepare commit graph
     @@ Commit message
          commit-graph: prepare commit graph
      
          Before checking if the repository has a commit-graph loaded, be sure
     -    to run prepare_commit_graph(). This is necessary because without this
     -    instance we would not initialize the topo_levels slab for each of the
     -    struct commit_graphs in the chain before we start to parse the
     -    commits. This leads to possibly recomputing the topological levels for
     -    commits in lower layers even when we are adding a small number of
     -    commits on top.
     +    to run prepare_commit_graph(). This is necessary because otherwise
     +    the topo_levels slab is not initialized. As we compute topo_levels for
     +    the new commits, we iterate further into the lower layers since the
     +    first visit to each commit looks as though the topo_level is not
     +    populated.
      
          By properly initializing the topo_slab, we fix the previously broken
          case of a split commit graph where a base layer has the

Comments

Taylor Blau Feb. 2, 2021, 3:08 a.m. UTC | #1
On Tue, Feb 02, 2021 at 03:01:17AM +0000, Derrick Stolee via GitGitGadget wrote:
> Updates in v2
> =============
>
>  * Fixed some typos or other clarifications in commit messages.
>
>  * The loop assigning read_generation_data is skipped if they already all
>    agree with value 1.
>
>  * I split compute_generation_numbers into two methods. This essentially
>    splits the previous patch 4 into patches 4 & 5 here. The new patch 4 just
>    splits the logic as-is, then the new patch 5 does the re-initialization
>    of generation values when in the upgrade scenario.

Thanks for addressing my nits. I'm much happier with patch 5 as a result
of your changes, so I think this series was a positive step forward not
only for fixing a correctness bug, but also for improving the
readability of commit-graph.c.

Both the range-diff and the new patches look good to me, so this series
has my:

  Reviewed-by: Taylor Blau <me@ttaylorr.com>


Thanks,
Taylor
Abhishek Kumar Feb. 11, 2021, 4:44 a.m. UTC | #2
On Tue, Feb 02, 2021 at 03:01:17AM +0000, Derrick Stolee via GitGitGadget wrote:
> Here is a bugfix for the recently-landed-in-next generation v2 topic
> (ak/corrected-commit-date).
> 
> This was occasionally hitting us when computing new commit-graphs on
> existing repositories with the new bits. It was very hard to reproduce, and
> it turns out to be due to not parsing commits before accessing generation
> number data. Doing so in the right place demonstrates the bug of recomputing
> the corrected commit date even for commits in lower layers with computed
> values.
> 
> The fix is split into these steps:
> 
>  1. Parse commits more often before accessing their data. (This allows the
>     bug to be demonstrated in the test suite.)
>  2. Check the full commit-graph chain for generation data chunks.
>  3. Don't compute corrected commit dates if the lower layers do not support
>     them.
>  4. Parse the commit-graph file more often.
> 
> Thanks, -Stolee
> 

Thanks for the fast bug-fix. It must have been hard to track down why as
we weren't able to reproduce this behaviour before. As Taylor said
before, this patch is a clear improvement in readability and correctness.

Thanks
- Abhishek

> 
> Updates in v2
> =============
> 
>  * Fixed some typos or other clarifications in commit messages.
> 
>  * The loop assigning read_generation_data is skipped if they already all
>    agree with value 1.
> 
>  * I split compute_generation_numbers into two methods. This essentially
>    splits the previous patch 4 into patches 4 & 5 here. The new patch 4 just
>    splits the logic as-is, then the new patch 5 does the re-initialization
>    of generation values when in the upgrade scenario.
> 
> Derrick Stolee (6):
>   commit-graph: use repo_parse_commit
>   commit-graph: always parse before commit_graph_data_at()
>   commit-graph: validate layers for generation data
>   commit-graph: compute generations separately
>   commit-graph: be extra careful about mixed generations
>   commit-graph: prepare commit graph
> 
>  commit-graph.c          | 138 +++++++++++++++++++++++++++++-----------
>  commit.h                |   5 +-
>  t/t5318-commit-graph.sh |  21 ++++++
>  3 files changed, 125 insertions(+), 39 deletions(-)
> 
> 
> base-commit: 5a3b130cad0d5c770f766e3af6d32b41766374c0
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-850%2Fderrickstolee%2Fgen-v2-upgrade-fix-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-850/derrickstolee/gen-v2-upgrade-fix-v2
> Pull-Request: https://github.com/gitgitgadget/git/pull/850
> 
> Range-diff vs v1:
> 
>  1:  9c605c99f66 = 1:  9c605c99f66 commit-graph: use repo_parse_commit
>  2:  82afa811dff ! 2:  454b183b9ba commit-graph: always parse before commit_graph_data_at()
>      @@ Commit message
>           problems when writing a commit-graph with corrected commit dates based
>           on a commit-graph without them.
>       
>      -    It has been difficult to identify the issue here becaus it was so hard
>      +    It has been difficult to identify the issue here because it was so hard
>           to reproduce. It relies on this uninitialized data having a non-zero
>           value, but also on specifically in a way that overwrites the existing
>           data.
>  3:  d554fa30660 ! 3:  3d223fa2156 commit-graph: validate layers for generation data
>      @@ commit-graph.c: static struct commit_graph *load_commit_graph_chain(struct repos
>        
>       -	if (!g)
>       -		return;
>      --
>      --	read_generation_data = !!g->chunk_generation_data;
>       +	while (read_generation_data && p) {
>       +		read_generation_data = p->read_generation_data;
>       +		p = p->base_graph;
>       +	}
>        
>      +-	read_generation_data = !!g->chunk_generation_data;
>      ++	if (read_generation_data)
>      ++		return 1;
>      + 
>        	while (g) {
>      - 		g->read_generation_data = read_generation_data;
>      +-		g->read_generation_data = read_generation_data;
>      ++		g->read_generation_data = 0;
>        		g = g->base_graph;
>        	}
>       +
>      -+	return read_generation_data;
>      ++	return 0;
>        }
>        
>        struct commit_graph *read_commit_graph_one(struct repository *r,
>  -:  ----------- > 4:  05248ff222f commit-graph: compute generations separately
>  4:  b267a9653a7 ! 5:  9bccee8fb63 commit-graph: be extra careful about mixed generations
>      @@ Commit message
>           existing commit-graph data has no corrected commit dates.
>       
>           While this improves our situation somewhat, we have not completely
>      -    solved the issue for correctly computing generation numbers for mixes
>      +    solved the issue for correctly computing generation numbers for mixed
>           layers. That follows in the next change.
>       
>           Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>      @@ commit-graph.c: static void compute_generation_numbers(struct write_commit_graph
>        					_("Computing commit graph generation numbers"),
>        					ctx->commits.nr);
>       +
>      -+	if (ctx->write_generation_data && !ctx->trust_generation_numbers) {
>      ++	if (!ctx->trust_generation_numbers) {
>       +		for (i = 0; i < ctx->commits.nr; i++) {
>       +			struct commit *c = ctx->commits.list[i];
>       +			repo_parse_commit(ctx->r, c);
>      @@ commit-graph.c: static void compute_generation_numbers(struct write_commit_graph
>       +
>        	for (i = 0; i < ctx->commits.nr; i++) {
>        		struct commit *c = ctx->commits.list[i];
>      - 		uint32_t level;
>      -@@ commit-graph.c: static void compute_generation_numbers(struct write_commit_graph_context *ctx)
>      - 				corrected_commit_date = commit_graph_data_at(parent->item)->generation;
>      - 
>      - 				if (level == GENERATION_NUMBER_ZERO ||
>      --				    corrected_commit_date == GENERATION_NUMBER_ZERO) {
>      -+				    (ctx->write_generation_data &&
>      -+				     corrected_commit_date == GENERATION_NUMBER_ZERO)) {
>      - 					all_parents_computed = 0;
>      - 					commit_list_insert(parent->item, &list);
>      - 					break;
>      -@@ commit-graph.c: static void compute_generation_numbers(struct write_commit_graph_context *ctx)
>      - 					max_level = GENERATION_NUMBER_V1_MAX - 1;
>      - 				*topo_level_slab_at(ctx->topo_levels, current) = max_level + 1;
>      - 
>      --				if (current->date && current->date > max_corrected_commit_date)
>      --					max_corrected_commit_date = current->date - 1;
>      --				commit_graph_data_at(current)->generation = max_corrected_commit_date + 1;
>      --
>      --				if (commit_graph_data_at(current)->generation - current->date > GENERATION_NUMBER_V2_OFFSET_MAX)
>      --					ctx->num_generation_data_overflows++;
>      -+				if (ctx->write_generation_data) {
>      -+					timestamp_t cur_g;
>      -+					if (current->date && current->date > max_corrected_commit_date)
>      -+						max_corrected_commit_date = current->date - 1;
>      -+					cur_g = commit_graph_data_at(current)->generation
>      -+					      = max_corrected_commit_date + 1;
>      -+					if (cur_g - current->date > GENERATION_NUMBER_V2_OFFSET_MAX)
>      -+						ctx->num_generation_data_overflows++;
>      -+				}
>      - 			}
>      - 		}
>      - 	}
>      + 		timestamp_t corrected_commit_date;
>       @@ commit-graph.c: int write_commit_graph(struct object_directory *odb,
>        	} else
>        		ctx->num_commit_graphs_after = 1;
>      @@ commit-graph.c: int write_commit_graph(struct object_directory *odb,
>       -	validate_mixed_generation_chain(ctx->r->objects->commit_graph);
>       +	ctx->trust_generation_numbers = validate_mixed_generation_chain(ctx->r->objects->commit_graph);
>        
>      - 	compute_generation_numbers(ctx);
>      - 
>      + 	compute_topological_levels(ctx);
>      + 	if (ctx->write_generation_data)
>  5:  dddeec30ebf ! 6:  38086c85b52 commit-graph: prepare commit graph
>      @@ Commit message
>           commit-graph: prepare commit graph
>       
>           Before checking if the repository has a commit-graph loaded, be sure
>      -    to run prepare_commit_graph(). This is necessary because without this
>      -    instance we would not initialize the topo_levels slab for each of the
>      -    struct commit_graphs in the chain before we start to parse the
>      -    commits. This leads to possibly recomputing the topological levels for
>      -    commits in lower layers even when we are adding a small number of
>      -    commits on top.
>      +    to run prepare_commit_graph(). This is necessary because otherwise
>      +    the topo_levels slab is not initialized. As we compute topo_levels for
>      +    the new commits, we iterate further into the lower layers since the
>      +    first visit to each commit looks as though the topo_level is not
>      +    populated.
>       
>           By properly initializing the topo_slab, we fix the previously broken
>           case of a split commit graph where a base layer has the
> 
> -- 
> gitgitgadget