diff mbox series

setup: fix bug with "includeIf.onbranch" when initializing dir

Message ID cf182bb9ee7d4a7eb46e5dbf4f3ef5deb198d823.1716374321.git.ps@pks.im (mailing list archive)
State New, archived
Headers show
Series setup: fix bug with "includeIf.onbranch" when initializing dir | expand

Commit Message

Patrick Steinhardt May 22, 2024, 10:38 a.m. UTC
It was reported that git-init(1) can fail when initializing an existing
directory in case the config contains an "includeIf.onbranch:"
condition:

```shell
$ mkdir repo
$ git -c includeIf.onbranch:main.path=nonexistent init repo
BUG: refs.c:2056: reference backend is unknown
```

The same error can also be triggered when re-initializing an already
existing repository.

The bug has been introduced in 173761e21b (setup: start tracking ref
storage format, 2023-12-29), which wired up the ref storage format. The
root cause is in `init_db()`, which tries to read the config before we
have initialized `the_repository` and most importantly its ref storage
format. We eventually end up calling `include_by_branch()` and execute
`refs_resolve_ref_unsafe()`, but because we have not initialized the ref
storage format yet this will trigger the above bug.

Interestingly, `include_by_branch()` has a mechanism that will only
cause us to resolve the ref when `the_repository->gitdir` is set. This
is also the reason why this only happens when we initialize an already
existing directory or repository: `gitdir` is set in those cases, but
not when creating a new directory.

Now there are two ways to address the issue:

  - We can adapt `include_by_branch()` to also make the code conditional
    on whether `the_repository->ref_storage_format` is set.

  - We can shift around code such that we initialize the repository
    format before we read the config.

While the first approach would be safe, it may also cause us to paper
over issues where a ref store should have been set up. In our case for
example, it may be reasonable to expect that re-initializing the repo
will cause the "onbranch:" condition to trigger, but we would not do
that if the ref storage format was not set up yet. This also used to
work before the above commit that introduced this bug.

Rearrange the code such that we set up the repository format before
reading the config. This fixes the bug and ensures that "onbranch:"
conditions can trigger.

Reported-by: Heghedus Razvan <heghedus.razvan@protonmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 setup.c         |  21 +++++-----
 t/t0001-init.sh | 100 ++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 104 insertions(+), 17 deletions(-)

Comments

Heghedus Razvan May 22, 2024, 10:58 a.m. UTC | #1
On Wed May 22, 2024 at 1:38 PM EEST, Patrick Steinhardt wrote:
> It was reported that git-init(1) can fail when initializing an existing
> directory in case the config contains an "includeIf.onbranch:"
> condition:
>
> ```shell
> $ mkdir repo
> $ git -c includeIf.onbranch:main.path=nonexistent init repo
> BUG: refs.c:2056: reference backend is unknown
> ```
>
> The same error can also be triggered when re-initializing an already
> existing repository.
>
> The bug has been introduced in 173761e21b (setup: start tracking ref
> storage format, 2023-12-29), which wired up the ref storage format. The
> root cause is in `init_db()`, which tries to read the config before we
> have initialized `the_repository` and most importantly its ref storage
> format. We eventually end up calling `include_by_branch()` and execute
> `refs_resolve_ref_unsafe()`, but because we have not initialized the ref
> storage format yet this will trigger the above bug.
>
> Interestingly, `include_by_branch()` has a mechanism that will only
> cause us to resolve the ref when `the_repository->gitdir` is set. This
> is also the reason why this only happens when we initialize an already
> existing directory or repository: `gitdir` is set in those cases, but
> not when creating a new directory.
>
> Now there are two ways to address the issue:
>
>   - We can adapt `include_by_branch()` to also make the code conditional
>     on whether `the_repository->ref_storage_format` is set.
>
>   - We can shift around code such that we initialize the repository
>     format before we read the config.
>
> While the first approach would be safe, it may also cause us to paper
> over issues where a ref store should have been set up. In our case for
> example, it may be reasonable to expect that re-initializing the repo
> will cause the "onbranch:" condition to trigger, but we would not do
> that if the ref storage format was not set up yet. This also used to
> work before the above commit that introduced this bug.
>
> Rearrange the code such that we set up the repository format before
> reading the config. This fixes the bug and ensures that "onbranch:"
> conditions can trigger.
>
> Reported-by: Heghedus Razvan <heghedus.razvan@protonmail.com>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>

I can confirm it's fixing the issue. Feel free to add:

Tested-by: Heghedus Razvan <heghedus.razvan@protonmail.com>
Junio C Hamano May 22, 2024, 7:06 p.m. UTC | #2
Heghedus Razvan <heghedus.razvan@protonmail.com> writes:

>> Reported-by: Heghedus Razvan <heghedus.razvan@protonmail.com>
>> Signed-off-by: Patrick Steinhardt <ps@pks.im>
>
> I can confirm it's fixing the issue. Feel free to add:
>
> Tested-by: Heghedus Razvan <heghedus.razvan@protonmail.com>

Thanks, both.
Junio C Hamano May 23, 2024, 12:41 a.m. UTC | #3
Patrick Steinhardt <ps@pks.im> writes:

> The bug has been introduced in 173761e21b (setup: start tracking ref
> storage format, 2023-12-29), which wired up the ref storage format. The
> root cause is in `init_db()`, which tries to read the config before we
> have initialized `the_repository` and most importantly its ref storage
> format. We eventually end up calling `include_by_branch()` and execute
> `refs_resolve_ref_unsafe()`, but because we have not initialized the ref
> storage format yet this will trigger the above bug.

So I was trying to see how feasible it would be to backport this fix
on top of v2.44.0 which was the first version that was shipped with
173761e2 (setup: start tracking ref storage format, 2023-12-29),
which is v2.44.0-rc0~78^2~8 and noticed something curious.

As your addition to t0001-init.sh wants to try converting the ref
backends to and from files and reftable, it actually will barf until
b4385bf0 (Merge branch 'ps/reftable-backend', 2024-02-26), which is
v2.45.0-rc0~176, but even with a reinit with the same backend, the
problem should manifest itself, right?

> diff --git a/t/t0001-init.sh b/t/t0001-init.sh
> index b131d665db..2239bed198 100755
> --- a/t/t0001-init.sh
> +++ b/t/t0001-init.sh
> @@ -584,14 +584,38 @@ test_expect_success 'init with --ref-format=files' '
>  	test_cmp expect actual
>  '
>  
> -test_expect_success 're-init with same format' '
> -	test_when_finished "rm -rf refformat" &&
> -	git init --ref-format=files refformat &&
> -	git init --ref-format=files refformat &&
> -	echo files >expect &&
> -	git -C refformat rev-parse --show-ref-format >actual &&
> -	test_cmp expect actual
> -'

So we used to have "files to files" and nothing else.  But

> +for from_format in files reftable
> +do
> +	for to_format in files reftable
> +	do
> +		if test "$from_format" = "$to_format"
> +		then
> +			continue
> +		fi
> +
> +		test_expect_success "re-init with same format ($from_format)" '
> +			test_when_finished "rm -rf refformat" &&
> +			git init --ref-format=$from_format refformat &&
> +			git init --ref-format=$from_format refformat &&
> +			echo $from_format >expect &&
> +			git -C refformat rev-parse --show-ref-format >actual &&
> +			test_cmp expect actual
> +		'

This is very iffy, isn't it?  This one wants to do the same format
reinit, but it is behind "if from and to are the same, skip the rest"
in the loop.

I think the "same format" loop should be lifted outside and
immediately before the inner loop and we should be OK.

> +		test_expect_success "re-init with different format fails ($from_format -> $to_format)" '
> +			test_when_finished "rm -rf refformat" &&
> +			git init --ref-format=$from_format refformat &&
> +			cat >expect <<-EOF &&
> +			fatal: attempt to reinitialize repository with different reference storage format
> +			EOF
> +			test_must_fail git init --ref-format=$to_format refformat 2>err &&
> +			test_cmp expect err &&
> +			echo $from_format >expect &&
> +			git -C refformat rev-parse --show-ref-format >actual &&
> +			test_cmp expect actual
> +		'
> +	done
> +done

In any case, this "reinitialize to a different format is too late"
test has nothing to do with the problem we are fixing with this
patch, no?  It is a valuable set of tests in the post b4385bf0 world
where we can choose between the two, but it is somewhat out of scope
of the "we need to initialize the ref backend before we can do onbranch
config".

I'll send your patch backported to v2.44.0, plus the change needed
to review the merge of it into v2.45.0 codebase later.
Patrick Steinhardt May 23, 2024, 5:26 a.m. UTC | #4
On Wed, May 22, 2024 at 05:41:28PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
[snip]
> > +for from_format in files reftable
> > +do
> > +	for to_format in files reftable
> > +	do
> > +		if test "$from_format" = "$to_format"
> > +		then
> > +			continue
> > +		fi
> > +
> > +		test_expect_success "re-init with same format ($from_format)" '
> > +			test_when_finished "rm -rf refformat" &&
> > +			git init --ref-format=$from_format refformat &&
> > +			git init --ref-format=$from_format refformat &&
> > +			echo $from_format >expect &&
> > +			git -C refformat rev-parse --show-ref-format >actual &&
> > +			test_cmp expect actual
> > +		'
> 
> This is very iffy, isn't it?  This one wants to do the same format
> reinit, but it is behind "if from and to are the same, skip the rest"
> in the loop.
> 
> I think the "same format" loop should be lifted outside and
> immediately before the inner loop and we should be OK.

Yup, we can do that.

> > +		test_expect_success "re-init with different format fails ($from_format -> $to_format)" '
> > +			test_when_finished "rm -rf refformat" &&
> > +			git init --ref-format=$from_format refformat &&
> > +			cat >expect <<-EOF &&
> > +			fatal: attempt to reinitialize repository with different reference storage format
> > +			EOF
> > +			test_must_fail git init --ref-format=$to_format refformat 2>err &&
> > +			test_cmp expect err &&
> > +			echo $from_format >expect &&
> > +			git -C refformat rev-parse --show-ref-format >actual &&
> > +			test_cmp expect actual
> > +		'
> > +	done
> > +done
> 
> In any case, this "reinitialize to a different format is too late"
> test has nothing to do with the problem we are fixing with this
> patch, no?  It is a valuable set of tests in the post b4385bf0 world
> where we can choose between the two, but it is somewhat out of scope
> of the "we need to initialize the ref backend before we can do onbranch
> config".
> 
> I'll send your patch backported to v2.44.0, plus the change needed
> to review the merge of it into v2.45.0 codebase later.

Yeah, I wanted to play it safe and add some cases I felt were missing in
the existing test coverage so that there ideally aren't any other issues
and so that the proposed change doesn't regress functionality.

Patrick
diff mbox series

Patch

diff --git a/setup.c b/setup.c
index 9247cded6a..8c7bc7bdb1 100644
--- a/setup.c
+++ b/setup.c
@@ -2286,12 +2286,6 @@  int init_db(const char *git_dir, const char *real_git_dir,
 	}
 	startup_info->have_repository = 1;
 
-	/* Ensure `core.hidedotfiles` is processed */
-	git_config(platform_core_config, NULL);
-
-	safe_create_dir(git_dir, 0);
-
-
 	/* Check to see if the repository version is right.
 	 * Note that a newly created repository does not have
 	 * config file, so this will not fail.  What we are catching
@@ -2302,9 +2296,6 @@  int init_db(const char *git_dir, const char *real_git_dir,
 	validate_hash_algorithm(&repo_fmt, hash);
 	validate_ref_storage_format(&repo_fmt, ref_storage_format);
 
-	reinit = create_default_files(template_dir, original_git_dir,
-				      &repo_fmt, init_shared_repository);
-
 	/*
 	 * Now that we have set up both the hash algorithm and the ref storage
 	 * format we can update the repository's settings accordingly.
@@ -2312,6 +2303,18 @@  int init_db(const char *git_dir, const char *real_git_dir,
 	repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
 	repo_set_ref_storage_format(the_repository, repo_fmt.ref_storage_format);
 
+	/*
+	 * Ensure `core.hidedotfiles` is processed. This must happen after we
+	 * have set up the repository format such that we can evaluate
+	 * includeIf conditions correctly in the case of re-initialization.
+	 */
+	git_config(platform_core_config, NULL);
+
+	safe_create_dir(git_dir, 0);
+
+	reinit = create_default_files(template_dir, original_git_dir,
+				      &repo_fmt, init_shared_repository);
+
 	if (!(flags & INIT_DB_SKIP_REFDB))
 		create_reference_database(repo_fmt.ref_storage_format,
 					  initial_branch, flags & INIT_DB_QUIET);
diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index b131d665db..2239bed198 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -584,14 +584,38 @@  test_expect_success 'init with --ref-format=files' '
 	test_cmp expect actual
 '
 
-test_expect_success 're-init with same format' '
-	test_when_finished "rm -rf refformat" &&
-	git init --ref-format=files refformat &&
-	git init --ref-format=files refformat &&
-	echo files >expect &&
-	git -C refformat rev-parse --show-ref-format >actual &&
-	test_cmp expect actual
-'
+for from_format in files reftable
+do
+	for to_format in files reftable
+	do
+		if test "$from_format" = "$to_format"
+		then
+			continue
+		fi
+
+		test_expect_success "re-init with same format ($from_format)" '
+			test_when_finished "rm -rf refformat" &&
+			git init --ref-format=$from_format refformat &&
+			git init --ref-format=$from_format refformat &&
+			echo $from_format >expect &&
+			git -C refformat rev-parse --show-ref-format >actual &&
+			test_cmp expect actual
+		'
+
+		test_expect_success "re-init with different format fails ($from_format -> $to_format)" '
+			test_when_finished "rm -rf refformat" &&
+			git init --ref-format=$from_format refformat &&
+			cat >expect <<-EOF &&
+			fatal: attempt to reinitialize repository with different reference storage format
+			EOF
+			test_must_fail git init --ref-format=$to_format refformat 2>err &&
+			test_cmp expect err &&
+			echo $from_format >expect &&
+			git -C refformat rev-parse --show-ref-format >actual &&
+			test_cmp expect actual
+		'
+	done
+done
 
 test_expect_success 'init with --ref-format=garbage' '
 	test_when_finished "rm -rf refformat" &&
@@ -678,4 +702,64 @@  test_expect_success 'branch -m with the initial branch' '
 	test_cmp expect actual
 '
 
+test_expect_success 'init with includeIf.onbranch condition' '
+	test_when_finished "rm -rf repo" &&
+	git -c includeIf.onbranch:main.path=nonexistent init repo &&
+	echo $GIT_DEFAULT_REF_FORMAT >expect &&
+	git -C repo rev-parse --show-ref-format >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'init with includeIf.onbranch condition with existing directory' '
+	test_when_finished "rm -rf repo" &&
+	mkdir repo &&
+	git -c includeIf.onbranch:nonexistent.path=/does/not/exist init repo &&
+	echo $GIT_DEFAULT_REF_FORMAT >expect &&
+	git -C repo rev-parse --show-ref-format >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 're-init with includeIf.onbranch condition' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	git -c includeIf.onbranch:nonexistent.path=/does/not/exist init repo &&
+	echo $GIT_DEFAULT_REF_FORMAT >expect &&
+	git -C repo rev-parse --show-ref-format >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 're-init with includeIf.onbranch condition' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	git -c includeIf.onbranch:nonexistent.path=/does/not/exist init repo &&
+	echo $GIT_DEFAULT_REF_FORMAT >expect &&
+	git -C repo rev-parse --show-ref-format >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 're-init skips non-matching includeIf.onbranch' '
+	test_when_finished "rm -rf repo config" &&
+	cat >config <<-EOF &&
+	[
+	garbage
+	EOF
+	git init repo &&
+	git -c includeIf.onbranch:nonexistent.path="$(test-tool path-utils absolute_path config)" init repo
+'
+
+test_expect_success 're-init reads matching includeIf.onbranch' '
+	test_when_finished "rm -rf repo config" &&
+	cat >config <<-EOF &&
+	[
+	garbage
+	EOF
+	path="$(test-tool path-utils absolute_path config)" &&
+	git init --initial-branch=branch repo &&
+	cat >expect <<-EOF &&
+	fatal: bad config line 1 in file $path
+	EOF
+	test_must_fail git -c includeIf.onbranch:branch.path="$path" init repo 2>err &&
+	test_cmp expect err
+'
+
 test_done