[v4,2/2] commit-graph: don't write commit-graph when disabled

Message ID 4439e8ae8fdc9abf28df29d3038a1483d9084cf2.1602276832.git.gitgitgadget@gmail.com
State New, archived
Series [v3] commit-graph: ignore duplicates when merging layers

Commit Message

Philippe Blain via GitGitGadget Oct. 9, 2020, 8:53 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

The core.commitGraph config setting can be set to 'false' to prevent
parsing commits from the commit-graph file(s). This causes an issue when
trying to write with "--split" which needs to distinguish between
commits that are in the existing commit-graph layers and commits that
are not. The existing mechanism uses parse_commit() and follows by
checking if there is a 'graph_pos' that shows the commit was parsed from
the commit-graph file.

When core.commitGraph=false, we do not parse the commits from the
commit-graph and 'graph_pos' indicates that no commits are in the
existing file. The --split logic moves forward creating a new layer on
top that holds all reachable commits, then possibly merges down into
those layers, resulting in duplicate commits. The previous change makes
that merging process more robust to such a situation in case it happens
in the written commit-graph data.

The easy answer here is to avoid writing a commit-graph if reading the
commit-graph is disabled, since the resulting commit-graph would not
be read by subsequent Git processes. This is more natural than forcing
core.commitGraph to be true for the 'write' process.

Reported-by: Thomas Braun <thomas.braun@virtuell-zuhause.de>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-commit-graph.txt | 4 +++-
 commit-graph.c                     | 5 +++++
 t/t5324-split-commit-graph.sh      | 3 ++-
 3 files changed, 10 insertions(+), 2 deletions(-)
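
The failure mode described in the commit message can be illustrated with a
toy model. This is only a sketch under stated assumptions, not Git's actual
code: `toy_commit`, `toy_parse()`, `toy_new_layer_size()` and
`TOY_NOT_FROM_GRAPH` are hypothetical names, and the existing layer is
reduced to a list of commit names. The point it shows is that when reading
is disabled, no commit ever gets a graph position, so every reachable commit
looks new to the writer.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical sentinel: "this commit was not parsed from a commit-graph". */
#define TOY_NOT_FROM_GRAPH (-1)

/* Toy commit: a name plus the position recorded when parsed from a graph. */
struct toy_commit {
	const char *name;
	int graph_pos;
};

/*
 * Toy stand-in for parsing a commit: a graph position is recorded only
 * when reading is enabled and the commit is found in the existing layer.
 */
void toy_parse(struct toy_commit *c, const char **layer, int nr_layer,
	       bool read_enabled)
{
	c->graph_pos = TOY_NOT_FROM_GRAPH;
	if (!read_enabled)
		return;
	for (int i = 0; i < nr_layer; i++) {
		if (!strcmp(layer[i], c->name)) {
			c->graph_pos = i;
			return;
		}
	}
}

/*
 * Count the reachable commits that would be placed in a new layer:
 * every commit whose graph_pos says it did not come from a graph.
 */
int toy_new_layer_size(struct toy_commit *reachable, int nr,
		       const char **layer, int nr_layer, bool read_enabled)
{
	int size = 0;

	assert(nr >= 0 && nr_layer >= 0);
	for (int i = 0; i < nr; i++) {
		toy_parse(&reachable[i], layer, nr_layer, read_enabled);
		if (reachable[i].graph_pos == TOY_NOT_FROM_GRAPH)
			size++;
	}
	return size;
}
```

With an existing layer holding `one` and the commits `one` and `two`
reachable, the model puts one commit in the new layer when reading is
enabled and two when it is disabled; in the second case `one` appears in
both layers.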

Comments

Junio C Hamano Oct. 9, 2020, 9:12 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +	if (!the_repository->settings.core_commit_graph) {
> +		warning(_("attempting to write a commit-graph, but 'core.commitGraph' is disabled"));
> +		return 0;
> +	}

Makes sense.  We probably would want to short-circuit invocation of
commit-graph related tasks in the maintenance by checking the
feature, if we are not already doing so.

Thanks.

Taylor Blau Oct. 9, 2020, 9:17 p.m. UTC | #2
On Fri, Oct 09, 2020 at 08:53:52PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The core.commitGraph config setting can be set to 'false' to prevent
> parsing commits from the commit-graph file(s). This causes an issue when
> trying to write with "--split" which needs to distinguish between
> commits that are in the existing commit-graph layers and commits that
> are not. The existing mechanism uses parse_commit() and follows by
> checking if there is a 'graph_pos' that shows the commit was parsed from
> the commit-graph file.
>
> When core.commitGraph=false, we do not parse the commits from the
> commit-graph and 'graph_pos' indicates that no commits are in the
> existing file. The --split logic moves forward creating a new layer on
> top that holds all reachable commits, then possibly merges down into
> those layers, resulting in duplicate commits. The previous change makes
> that merging process more robust to such a situation in case it happens
> in the written commit-graph data.

You're noting something interesting here, which is that I actually think
setting 'core.commitGraph' to false _would_ be OK for non-split writes
and for '--split=replace' (along with any other split that happens to
write a single layer).

But, I think that actually enforcing that rule (i.e., "if you have
core.commitGraph set to false, you can't run `git commit-graph write`
except in X Y Z certain situations") is overly-complex and confusing to
users. So, I like what you have here a lot.

> The easy answer here is to avoid writing a commit-graph if reading the
> commit-graph is disabled, since the resulting commit-graph would not
> be read by subsequent Git processes. This is more natural than forcing
> core.commitGraph to be true for the 'write' process.
>
> Reported-by: Thomas Braun <thomas.braun@virtuell-zuhause.de>
> Helped-by: Jeff King <peff@peff.net>
> Helped-by: Taylor Blau <me@ttaylorr.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

You can add my:

  Signed-off-by: Taylor Blau <me@ttaylorr.com>

to the patch below, too, unless you want to take my suggestion below...

> ---
>  Documentation/git-commit-graph.txt | 4 +++-
>  commit-graph.c                     | 5 +++++
>  t/t5324-split-commit-graph.sh      | 3 ++-
>  3 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
> index de6b6de230..e1f48c95b3 100644
> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt
> @@ -39,7 +39,9 @@ COMMANDS
>  --------
>  'write'::
>
> -Write a commit-graph file based on the commits found in packfiles.
> +Write a commit-graph file based on the commits found in packfiles. If
> +the config option `core.commitGraph` is disabled, then this command will
> +output a warning, then return success without writing a commit-graph file.
>  +
>  With the `--stdin-packs` option, generate the new commit graph by
>  walking objects only in the specified pack-indexes. (Cannot be combined
> diff --git a/commit-graph.c b/commit-graph.c
> index 0280dcb2ce..6f62a07313 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -2160,6 +2160,11 @@ int write_commit_graph(struct object_directory *odb,
>  	int replace = 0;
>  	struct bloom_filter_settings bloom_settings = DEFAULT_BLOOM_FILTER_SETTINGS;
>
> +	prepare_repo_settings(the_repository);
> +	if (!the_repository->settings.core_commit_graph) {
> +		warning(_("attempting to write a commit-graph, but 'core.commitGraph' is disabled"));
> +		return 0;
> +	}

Should this check be folded into 'commit_graph_compatible()'? Maybe in
'prepare_commit_graph()' which itself calls 'commit_graph_compatible()'?
I admit that I find this chain of callers to be confusing.

I suppose one argument for checking it here _before_ calling
'commit_graph_compatible()' is that it allows you to issue a specific
warning before returning from this function, so I'm OK with it.

I also don't have a concrete suggestion of where a better place for this
hunk might be, so I'm fine with what you wrote.

>  	if (!commit_graph_compatible(the_repository))
>  		return 0;
>
> diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
> index a314ce0368..4d3842b83b 100755
> --- a/t/t5324-split-commit-graph.sh
> +++ b/t/t5324-split-commit-graph.sh
> @@ -442,8 +442,9 @@ test_expect_success '--split=replace with partial Bloom data' '
>
>  test_expect_success 'prevent regression for duplicate commits across layers' '
>  	git init dup &&
> -	git -C dup config core.commitGraph false &&
>  	git -C dup commit --allow-empty -m one &&
> +	git -C dup -c core.commitGraph=false commit-graph write --split=no-merge --reachable 2>err &&
> +	test_i18ngrep "attempting to write a commit-graph" err &&
>  	git -C dup commit-graph write --split=no-merge --reachable &&
>  	git -C dup commit --allow-empty -m two &&
>  	git -C dup commit-graph write --split=no-merge --reachable &&

Hmm. I would have preferred to see a new test here. Unless I'm wrong, I
believe the patched version of this test _doesn't_ have a duplicate
commit across multiple layers:

  - We try to write a layer with 'one', but don't (because
    'core.commitGraph' is set to false).

  - Then we write a layer for 'one' with 'core.commitGraph' unset.

  - Then we write a layer for 'two' (and only 'two'), since we read the
    below layer containing 'one'.

But, I'm not sure of a better way to test this, either. You fixed the
bug that this is trying to exercise, so it's no longer being exercised
here, but then again neither is the new code that is supposed to handle
it.
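
The three steps of the patched test listed above can be traced with a toy
model. This is a sketch under stated assumptions rather than Git's
implementation: `toy_chain` and `toy_write()` are hypothetical, history is
assumed to be linear with commits numbered from 1, and each layer is reduced
to the range of commit numbers it covers. It mimics the guard from the patch
(warn, return 0, write nothing) and a no-merge style write that only adds
commits not covered by the layers below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TOY_MAX_LAYERS 8

/* Toy chain of split layers; layer i covers commits first[i]..last[i]. */
struct toy_chain {
	int nr_layers;
	int first[TOY_MAX_LAYERS];
	int last[TOY_MAX_LAYERS];
};

/*
 * Toy write over a linear history of nr_commits commits. Always returns 0,
 * mirroring "return success" in the patch, including when nothing is
 * written because core_commit_graph is false.
 */
int toy_write(struct toy_chain *chain, int nr_commits, bool core_commit_graph)
{
	int covered;

	if (!core_commit_graph) {
		fprintf(stderr, "warning: attempting to write a commit-graph, "
			"but 'core.commitGraph' is disabled\n");
		return 0; /* success, but no layer is added */
	}

	/* Highest commit number already covered by the existing layers. */
	covered = chain->nr_layers ? chain->last[chain->nr_layers - 1] : 0;
	if (covered >= nr_commits || chain->nr_layers == TOY_MAX_LAYERS)
		return 0; /* nothing new to add, or no room in the toy chain */

	/* New top layer holds only the commits not covered below. */
	chain->first[chain->nr_layers] = covered + 1;
	chain->last[chain->nr_layers] = nr_commits;
	chain->nr_layers++;
	return 0;
}
```

Replaying the test in this model (a disabled write with one commit, an
enabled write with one commit, then an enabled write with two commits)
leaves two layers covering commit 1 and commit 2 respectively, so no commit
is repeated across layers.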

I wonder if it is maybe worth having some sample commit-graphs lying
around in a t5324 directory that _would_ demonstrate this problem. OTOH,
maybe that is just me being overly pedantic and worrying about something
that isn't actually a problem.

I trust your judgement, so whatever you feel like is fine with me.

Thanks,
Taylor

Patch

diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
index de6b6de230..e1f48c95b3 100644
--- a/Documentation/git-commit-graph.txt
+++ b/Documentation/git-commit-graph.txt
@@ -39,7 +39,9 @@  COMMANDS
 --------
 'write'::
 
-Write a commit-graph file based on the commits found in packfiles.
+Write a commit-graph file based on the commits found in packfiles. If
+the config option `core.commitGraph` is disabled, then this command will
+output a warning, then return success without writing a commit-graph file.
 +
 With the `--stdin-packs` option, generate the new commit graph by
 walking objects only in the specified pack-indexes. (Cannot be combined
diff --git a/commit-graph.c b/commit-graph.c
index 0280dcb2ce..6f62a07313 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2160,6 +2160,11 @@  int write_commit_graph(struct object_directory *odb,
 	int replace = 0;
 	struct bloom_filter_settings bloom_settings = DEFAULT_BLOOM_FILTER_SETTINGS;
 
+	prepare_repo_settings(the_repository);
+	if (!the_repository->settings.core_commit_graph) {
+		warning(_("attempting to write a commit-graph, but 'core.commitGraph' is disabled"));
+		return 0;
+	}
 	if (!commit_graph_compatible(the_repository))
 		return 0;
 
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index a314ce0368..4d3842b83b 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -442,8 +442,9 @@  test_expect_success '--split=replace with partial Bloom data' '
 
 test_expect_success 'prevent regression for duplicate commits across layers' '
 	git init dup &&
-	git -C dup config core.commitGraph false &&
 	git -C dup commit --allow-empty -m one &&
+	git -C dup -c core.commitGraph=false commit-graph write --split=no-merge --reachable 2>err &&
+	test_i18ngrep "attempting to write a commit-graph" err &&
 	git -C dup commit-graph write --split=no-merge --reachable &&
 	git -C dup commit --allow-empty -m two &&
 	git -C dup commit-graph write --split=no-merge --reachable &&