diff mbox series

[v5,3/4] repo-settings: introduce commitgraph.changedPathsVersion

Message ID 131095666d414ff68416526112c8f8deb31ac9e1.1689283789.git.jonathantanmy@google.com (mailing list archive)
State Superseded
Headers show
Series Changed path filter hash fix and version bump | expand

Commit Message

Jonathan Tan July 13, 2023, 9:42 p.m. UTC
A subsequent commit will introduce another version of the changed-path
filter in the commit graph file. In order to control which version to
write (and read), a config variable is needed.

Therefore, introduce this config variable. For forwards compatibility,
teach Git to not read commit graphs when the config variable
is set to an unsupported version. Because we teach Git this,
commitgraph.readChangedPaths is now redundant, so deprecate it and
define its behavior in terms of the config variable we introduce.

This commit does not change the behavior of writing (Git writes changed
path filters when explicitly instructed regardless of any config
variable), but a subsequent commit will restrict Git such that it will
only write when commitgraph.changedPathsVersion is a recognized value.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/commitgraph.txt | 19 ++++++++++++++++---
 commit-graph.c                       |  2 +-
 oss-fuzz/fuzz-commit-graph.c         |  2 +-
 repo-settings.c                      |  6 +++++-
 repository.h                         |  2 +-
 5 files changed, 24 insertions(+), 7 deletions(-)

Comments

Taylor Blau July 19, 2023, 6:10 p.m. UTC | #1
On Thu, Jul 13, 2023 at 02:42:10PM -0700, Jonathan Tan wrote:
> diff --git a/Documentation/config/commitgraph.txt b/Documentation/config/commitgraph.txt
> index 30604e4a4c..07f3799e05 100644
> --- a/Documentation/config/commitgraph.txt
> +++ b/Documentation/config/commitgraph.txt
> @@ -9,6 +9,19 @@ commitGraph.maxNewFilters::
>  	commit-graph write` (c.f., linkgit:git-commit-graph[1]).
>
>  commitGraph.readChangedPaths::
> -	If true, then git will use the changed-path Bloom filters in the
> -	commit-graph file (if it exists, and they are present). Defaults to
> -	true. See linkgit:git-commit-graph[1] for more information.
> +	Deprecated. Equivalent to changedPathsVersion=-1 if true, and
> +	changedPathsVersion=0 if false.

What happens if we have a combination of the two, like:

    [commitGraph]
        readChangedPaths = false
        changedPathsVersion = 1

? From reading the implementation below, I think the answer is that
changedPathsVersion would win out. I think that's fine behavior to
implement (the more modern configuration option taking precedence over
the deprecated one). But I think that we should either (a) note that
precedence in the documentation here, or (b) issue a warning() when both
are set.

For my $.02, I think that doing just (a) would be sufficient.

> diff --git a/commit-graph.c b/commit-graph.c
> index c11b59f28b..9b72319450 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -399,7 +399,7 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
>  			graph->read_generation_data = 1;
>  	}
>
> -	if (s->commit_graph_read_changed_paths) {
> +	if (s->commit_graph_changed_paths_version != 0) {
>  		pair_chunk(cf, GRAPH_CHUNKID_BLOOMINDEXES,
>  			   &graph->chunk_bloom_indexes);
>  		read_chunk(cf, GRAPH_CHUNKID_BLOOMDATA,

Just a small note, but writing this as

    if (!s->commit_graph_changed_paths_version)

would probably be more in line with our coding guidelines.

> diff --git a/repo-settings.c b/repo-settings.c
> index 3dbd3f0e2e..e3b6565ffc 100644
> --- a/repo-settings.c
> +++ b/repo-settings.c
> @@ -24,6 +24,7 @@ void prepare_repo_settings(struct repository *r)
>  	int value;
>  	const char *strval;
>  	int manyfiles;
> +	int readChangedPaths;

Small note: this should be snake-cased like "read_changed_paths".

> diff --git a/repository.h b/repository.h
> index e8c67ffe16..1f1c32a6dd 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -32,7 +32,7 @@ struct repo_settings {
>
>  	int core_commit_graph;
>  	int commit_graph_generation_version;
> -	int commit_graph_read_changed_paths;

Nice, I'm glad that we're getting rid of this variable and replacing it
with commit_graph_changed_paths_version instead.

> +	int commit_graph_changed_paths_version;

Looking good.

Thanks,
Taylor
Jonathan Tan July 20, 2023, 8:42 p.m. UTC | #2
Taylor Blau <me@ttaylorr.com> writes:
> What happens if we have a combination of the two, like:
> 
>     [commitGraph]
>         readChangedPaths = false
>         changedPathsVersion = 1
> 
> ? From reading the implementation below, I think the answer is that
> changedPathsVersion would win out. I think that's fine behavior to
> implement (the more modern configuration option taking precedence over
> the deprecated one). But I think that we should either (a) note that
> precedence in the documentation here, or (b) issue a warning() when both
> are set.
> 
> For my $.02, I think that doing just (a) would be sufficient.

Thanks. I added a note to the documentation.

> > -	if (s->commit_graph_read_changed_paths) {
> > +	if (s->commit_graph_changed_paths_version != 0) {
> >  		pair_chunk(cf, GRAPH_CHUNKID_BLOOMINDEXES,
> >  			   &graph->chunk_bloom_indexes);
> >  		read_chunk(cf, GRAPH_CHUNKID_BLOOMDATA,
> 
> Just a small note, but writing this as
> 
>     if (!s->commit_graph_changed_paths_version)
> 
> would probably be more in line with our coding guidelines.

Ah, yes. Changed it (without the exclamation mark).

> > diff --git a/repo-settings.c b/repo-settings.c
> > index 3dbd3f0e2e..e3b6565ffc 100644
> > --- a/repo-settings.c
> > +++ b/repo-settings.c
> > @@ -24,6 +24,7 @@ void prepare_repo_settings(struct repository *r)
> >  	int value;
> >  	const char *strval;
> >  	int manyfiles;
> > +	int readChangedPaths;
> 
> Small note: this should be snake-cased like "read_changed_paths".

Whoops...thanks for the catch.
Taylor Blau July 20, 2023, 9:02 p.m. UTC | #3
On Thu, Jul 20, 2023 at 01:42:32PM -0700, Jonathan Tan wrote:
> > > -	if (s->commit_graph_read_changed_paths) {
> > > +	if (s->commit_graph_changed_paths_version != 0) {
> > >  		pair_chunk(cf, GRAPH_CHUNKID_BLOOMINDEXES,
> > >  			   &graph->chunk_bloom_indexes);
> > >  		read_chunk(cf, GRAPH_CHUNKID_BLOOMDATA,
> >
> > Just a small note, but writing this as
> >
> >     if (!s->commit_graph_changed_paths_version)
> >
> > would probably be more in line with our coding guidelines.
>
> Ah, yes. Changed it (without the exclamation mark).

Hah, whoops -- thanks for spotting my mistake there. :-)

Thanks,
Taylor
diff mbox series

Patch

diff --git a/Documentation/config/commitgraph.txt b/Documentation/config/commitgraph.txt
index 30604e4a4c..07f3799e05 100644
--- a/Documentation/config/commitgraph.txt
+++ b/Documentation/config/commitgraph.txt
@@ -9,6 +9,19 @@  commitGraph.maxNewFilters::
 	commit-graph write` (c.f., linkgit:git-commit-graph[1]).
 
 commitGraph.readChangedPaths::
-	If true, then git will use the changed-path Bloom filters in the
-	commit-graph file (if it exists, and they are present). Defaults to
-	true. See linkgit:git-commit-graph[1] for more information.
+	Deprecated. Equivalent to changedPathsVersion=-1 if true, and
+	changedPathsVersion=0 if false.
+
+commitGraph.changedPathsVersion::
+	Specifies the version of the changed-path Bloom filters that Git will read and
+	write. May be -1, 0 or 1. Any changed-path Bloom filters on disk that do not
+	match the version set in this config variable will be ignored.
++
+Defaults to -1.
++
+If -1, Git will use the version of the changed-path Bloom filters in the
+repository, defaulting to 1 if there are none.
++
+If 0, git will write version 1 Bloom filters when instructed to write.
++
+See linkgit:git-commit-graph[1] for more information.
diff --git a/commit-graph.c b/commit-graph.c
index c11b59f28b..9b72319450 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -399,7 +399,7 @@  struct commit_graph *parse_commit_graph(struct repo_settings *s,
 			graph->read_generation_data = 1;
 	}
 
-	if (s->commit_graph_read_changed_paths) {
+	if (s->commit_graph_changed_paths_version != 0) {
 		pair_chunk(cf, GRAPH_CHUNKID_BLOOMINDEXES,
 			   &graph->chunk_bloom_indexes);
 		read_chunk(cf, GRAPH_CHUNKID_BLOOMDATA,
diff --git a/oss-fuzz/fuzz-commit-graph.c b/oss-fuzz/fuzz-commit-graph.c
index 914026f5d8..b56731f51a 100644
--- a/oss-fuzz/fuzz-commit-graph.c
+++ b/oss-fuzz/fuzz-commit-graph.c
@@ -18,7 +18,7 @@  int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
 	 * possible.
 	 */
 	the_repository->settings.commit_graph_generation_version = 2;
-	the_repository->settings.commit_graph_read_changed_paths = 1;
+	the_repository->settings.commit_graph_changed_paths_version = 1;
 	g = parse_commit_graph(&the_repository->settings, (void *)data, size);
 	repo_clear(the_repository);
 	free_commit_graph(g);
diff --git a/repo-settings.c b/repo-settings.c
index 3dbd3f0e2e..e3b6565ffc 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -24,6 +24,7 @@  void prepare_repo_settings(struct repository *r)
 	int value;
 	const char *strval;
 	int manyfiles;
+	int readChangedPaths;
 
 	if (!r->gitdir)
 		BUG("Cannot add settings for uninitialized repository");
@@ -54,7 +55,10 @@  void prepare_repo_settings(struct repository *r)
 	/* Commit graph config or default, does not cascade (simple) */
 	repo_cfg_bool(r, "core.commitgraph", &r->settings.core_commit_graph, 1);
 	repo_cfg_int(r, "commitgraph.generationversion", &r->settings.commit_graph_generation_version, 2);
-	repo_cfg_bool(r, "commitgraph.readchangedpaths", &r->settings.commit_graph_read_changed_paths, 1);
+	repo_cfg_bool(r, "commitgraph.readchangedpaths", &readChangedPaths, 1);
+	repo_cfg_int(r, "commitgraph.changedpathsversion",
+		     &r->settings.commit_graph_changed_paths_version,
+		     readChangedPaths ? -1 : 0);
 	repo_cfg_bool(r, "gc.writecommitgraph", &r->settings.gc_write_commit_graph, 1);
 	repo_cfg_bool(r, "fetch.writecommitgraph", &r->settings.fetch_write_commit_graph, 0);
 
diff --git a/repository.h b/repository.h
index e8c67ffe16..1f1c32a6dd 100644
--- a/repository.h
+++ b/repository.h
@@ -32,7 +32,7 @@  struct repo_settings {
 
 	int core_commit_graph;
 	int commit_graph_generation_version;
-	int commit_graph_read_changed_paths;
+	int commit_graph_changed_paths_version;
 	int gc_write_commit_graph;
 	int gc_cruft_packs;
 	int fetch_write_commit_graph;