diff mbox series

[01/24] init-db: remove unnecessary global variable & document existing bug

Message ID 12a251f77f679123d01892109694f8ee19b96252.1683431151.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Header cleanups (final splitting of cache.h, and some splitting of other headers) | expand

Commit Message

Elijah Newren May 7, 2023, 3:45 a.m. UTC
From: Elijah Newren <newren@gmail.com>

This commit was prompted by a desire to move the functions which
builtin/init-db.c and builtin/clone.c share out of the former file and
into setup.c.  One issue that made it difficult was the
init_is_bare_repository global variable.

init_is_bare_repository is actually not very useful.  It merely stores
the return value from is_bare_repository() and only for the duration of
a few additional function calls before its value is checked, and none of
those functions do anything that could change is_bare_repository()'s
return value.  So, we can simply dispense with the global by replacing
it with is_bare_repository().

However, the intervening code does talk about reading config from the
config file in the specified `--templates` directory, so touching this
code does lead to the question of whether core.bare could be set in such
a config file and thus whether the code is doing the right thing.  Long
story short is that the templates directory might have a config file
with core.bare set, but it has always been unconditionally ignored.
While fixing that might be nice, it looks to be a can of worms, and
cannot be fixed within this function anyway.  Instead of opening that
can of worms, document the problem with a TODO comment and a couple
test_expect_failure testcases.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/init-db.c        | 20 ++++++++++++++++----
 t/t1301-shared-repo.sh   | 22 ++++++++++++++++++++++
 t/t5606-clone-options.sh | 10 ++++++++++
 3 files changed, 48 insertions(+), 4 deletions(-)

Comments

Calvin Wan May 11, 2023, 5:24 p.m. UTC | #1
Splitting this patch into two would make it easier to review and follow.
Regardless I think both changes make sense and thanks for documenting a
preexisting bug.
Glen Choo May 11, 2023, 8:43 p.m. UTC | #2
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> This commit was prompted by a desire to move the functions which
> builtin/init-db.c and builtin/clone.c share out of the former file and
> into setup.c.  One issue that made it difficult was the
> init_is_bare_repository global variable.
>
> init_is_bare_repository is actually not very useful.  It merely stores
> the return value from is_bare_repository() and only for the duration of
> a few additional function calls before its value is checked, and none of
> those functions do anything that could change is_bare_repository()'s
> return value.  So, we can simply dispense with the global by replacing
> it with is_bare_repository().

I think the purpose of init_is_bare_repository is something different.
But based off my different understanding, I can't reproduce any
different behavior. I don't know if I'm just confused or not, but I'll
leave some breadcrumbs here to check my understanding.

Reordering the hunks for clarity,

> @@ -422,8 +436,6 @@ int init_db(const char *git_dir, const char *real_git_dir,
>  
>  	safe_create_dir(git_dir, 0);
>  
> -	init_is_bare_repository = is_bare_repository();
> -
>  	/* Check to see if the repository version is right.
>  	 * Note that a newly created repository does not have
>  	 * config file, so this will not fail.  What we are catching

Here, init_db() caches the value of is_bare_repository(), which itself
reads the value of is_bare_repository_cfg, which can be modified by when
we read "core.bare" via git_config(git_default_config) or similar
(basically, any config callback that uses git_default_config). It is
also modified in other places though, like setup.c.

IIUC, we haven't actually parsed "core.bare" at this point. The
git_config() call just above this calls "plaform_core_config", which is
either "mingw_core_config" (doesn't read "core.bare") or
noop_core_config (noop).

> @@ -231,9 +230,24 @@ static int create_default_files(const char *template_path,
>  	 * We must make sure command-line options continue to override any
>  	 * values we might have just re-read from the config.
>  	 */
> -	is_bare_repository_cfg = init_is_bare_repository || !work_tree;
>  	if (init_shared_repository != -1)
>  		set_shared_repository(init_shared_repository);
> +	/*
> +	 * TODO: heed core.bare from config file in templates if no
> +	 *       command-line override given
> +	 *
> +	 * Unfortunately, this location in the code is far too late to
> +	 * allow this to happen; both builtin/init.db and
> +	 * builtin/clone.c setup the new repository and call
> +	 * set_git_work_tree() before this point.  (Note that both do
> +	 * that repository setup before calling init_db(), which in
> +	 * turn calls create_default_files().)  Fixing it would
> +	 * require too much refactoring, and no one seems to have
> +	 * wanted this behavior in 15+ years, so we'll continue
> +	 * ignoring the config for now and just override
> +	 * is_bare_repository_cfg unconditionally.
> +	 */
> +	is_bare_repository_cfg = is_bare_repository() || !work_tree;
>  
>  	/*
>  	 * We would have created the above under user's umask -- under

Now, we're in the midst of the re-init. Expanding the context a little,
we see:

	git_config(git_default_config, NULL);

	/*
	 * We must make sure command-line options continue to override any
	 * values we might have just re-read from the config.
	 */
	is_bare_repository_cfg = init_is_bare_repository || !work_tree;

So now we've read the new config of the re-inited repo, which might have
"core.bare" set to a value other than what "git init-db [--bare]"
started with, so we want to _intentionally_ ignore it. We do this by
reading out the cached value, _not_ by calling is_bare_repository()
again. So it seems to me like this patch changes the intent.

Where I struggling with is how to make this behave badly. The lines
above seem to be defensive in nature - we never use
is_bare_repository_cfg past this point, but we want to guard against
unintentional behavior in the future.

But I thought that logging this value would show a difference in
behavior, e.g.

  diff --git a/builtin/init-db.c b/builtin/init-db.c
  index ba6e0b20fa..da3579d46d 100644
  --- a/builtin/init-db.c
  +++ b/builtin/init-db.c
  @@ -229,6 +229,8 @@ static int create_default_files(const char *template_path,
    * values we might have just re-read from the config.
    */
    is_bare_repository_cfg = init_is_bare_repository || !work_tree;
  +	printf_ln("is_bare_repository_cfg is %d", is_bare_repository_cfg);
  +
    if (init_shared_repository != -1)
      set_shared_repository(init_shared_repository);

  diff --git a/t/t1301-shared-repo.sh b/t/t1301-shared-repo.sh
  index 1b6437ec07..e4b978992e 100755
  --- a/t/t1301-shared-repo.sh
  +++ b/t/t1301-shared-repo.sh
  @@ -15,6 +15,14 @@ TEST_PASSES_SANITIZE_LEAK=true
  # Remove a default ACL from the test dir if possible.
  setfacl -k . 2>/dev/null

  +test_expect_success 're-init respects core.bare' '
  +	git init bare.git >actual &&
  +	grep "is_bare_repository_cfg is 0" actual &&
  +	git -C bare.git config core.bare true &&
  +	git init bare.git >actual &&
  +	grep "is_bare_repository_cfg is 1" actual
  +'
  +
  # User must have read permissions to the repo -> failure on --shared=0400
  test_expect_success 'shared = 0400 (faulty permission u-w)' '
    test_when_finished "rm -rf sub" &&

But apparently this doesn't work at all. I can't get this to print
"is_bare_repository_cfg is 1" no matter what I do, before or after your
patch. Maybe there's more at play here (e.g. with setup.c), but I
haven't figured that out.

If I'm right (which I'm not sure about), we might need to keep
init_is_bare_repository around _somewhere_. Not a global, but maybe
as a param.
Elijah Newren May 12, 2023, 1:21 a.m. UTC | #3
On Thu, May 11, 2023 at 1:43 PM Glen Choo <chooglen@google.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > This commit was prompted by a desire to move the functions which
> > builtin/init-db.c and builtin/clone.c share out of the former file and
> > into setup.c.  One issue that made it difficult was the
> > init_is_bare_repository global variable.
> >
> > init_is_bare_repository is actually not very useful.  It merely stores
> > the return value from is_bare_repository() and only for the duration of
> > a few additional function calls before its value is checked, and none of
> > those functions do anything that could change is_bare_repository()'s
> > return value.  So, we can simply dispense with the global by replacing
> > it with is_bare_repository().
>
> I think the purpose of init_is_bare_repository is something different.
> But based off my different understanding, I can't reproduce any
> different behavior. I don't know if I'm just confused or not, but I'll
> leave some breadcrumbs here to check my understanding.

So, while my patch does not change behavior under any circumstances,
my explanation was slightly wrong.  More on that below...

> Reordering the hunks for clarity,
>
> > @@ -422,8 +436,6 @@ int init_db(const char *git_dir, const char *real_git_dir,
> >
> >       safe_create_dir(git_dir, 0);
> >
> > -     init_is_bare_repository = is_bare_repository();
> > -
> >       /* Check to see if the repository version is right.
> >        * Note that a newly created repository does not have
> >        * config file, so this will not fail.  What we are catching
>
> Here, init_db() caches the value of is_bare_repository(), which itself
> reads the value of is_bare_repository_cfg, which can be modified by when
> we read "core.bare" via git_config(git_default_config) or similar
> (basically, any config callback that uses git_default_config). It is
> also modified in other places though, like setup.c.

Close.  is_bare_repository() not only reads the value of
is_bare_repository_cfg, but it ANDs the result with
!get_git_work_tree().  This turns out to be important.

> IIUC, we haven't actually parsed "core.bare" at this point. The
> git_config() call just above this calls "plaform_core_config", which is
> either "mingw_core_config" (doesn't read "core.bare") or
> noop_core_config (noop).

I believe this is correct.

> > @@ -231,9 +230,24 @@ static int create_default_files(const char *template_path,
> >        * We must make sure command-line options continue to override any
> >        * values we might have just re-read from the config.
> >        */
> > -     is_bare_repository_cfg = init_is_bare_repository || !work_tree;
> >       if (init_shared_repository != -1)
> >               set_shared_repository(init_shared_repository);
> > +     /*
> > +      * TODO: heed core.bare from config file in templates if no
> > +      *       command-line override given
> > +      *
> > +      * Unfortunately, this location in the code is far too late to
> > +      * allow this to happen; both builtin/init.db and
> > +      * builtin/clone.c setup the new repository and call
> > +      * set_git_work_tree() before this point.  (Note that both do
> > +      * that repository setup before calling init_db(), which in
> > +      * turn calls create_default_files().)  Fixing it would
> > +      * require too much refactoring, and no one seems to have
> > +      * wanted this behavior in 15+ years, so we'll continue
> > +      * ignoring the config for now and just override
> > +      * is_bare_repository_cfg unconditionally.
> > +      */
> > +     is_bare_repository_cfg = is_bare_repository() || !work_tree;
> >
> >       /*
> >        * We would have created the above under user's umask -- under
>
> Now, we're in the midst of the re-init. Expanding the context a little,
> we see:
>
>         git_config(git_default_config, NULL);
>
>         /*
>          * We must make sure command-line options continue to override any
>          * values we might have just re-read from the config.
>          */
>         is_bare_repository_cfg = init_is_bare_repository || !work_tree;
>
> So now we've read the new config of the re-inited repo, which might have
> "core.bare" set to a value other than what "git init-db [--bare]"
> started with

Correct.

> so we want to _intentionally_ ignore it.

That's what the code does, but not what it should do.  If neither
`--bare` nor `--no-bare` is given on the command line, we ought to pay
attention to the config setting.

The fact that we do ignore the config regardless of command line flags
is the bug that this patch documents.

> We do this by
> reading out the cached value, _not_ by calling is_bare_repository()
> again.

Almost, but you missed part of the line you are commenting on.  We
read out the cached value *and* OR it with !work_tree.  The
distinction may look small, but it turns out to be critical.

> So it seems to me like this patch changes the intent.

Yeah, looks like it does.

But it doesn't change the result.  Note the critical thing here is we
changed from computing:
   init_is_bare_repository || !work_tree;
to
   is_bare_repository() || !work_tree;

Further,
   init_is_bare_repository = [cached value of] is_bare_repository_cfg
&& [cached value of] !get_git_work_tree()
   is_bare_repository() = [current value of] is_bare_repository_cfg &&
[current value of] !get_git_work_tree()

However, nothing calls repo_init() or set_git_work_tree() between any
of these calls, so
   [current value of] !get_git_work_tree() == [cached value of]
!get_git_work_tree() == !work_tree

So, both before and after what we are computing simplifies to
   <something> && !work_tree || !work_tree
which further simplifies, regardless of whether <something> is true or
false, down to
   !work_tree

So whether we are using a cached value of is_bare_repository_cfg or a
current value of is_bare_repository_cfg is irrelevant.  In fact, from
the analysis above, we can simplify this line to just
   is_bare_repository_cfg = !work_tree;
completely ignoring both is_bare_repository_cfg and
is_bare_repository(), and it won't change the behavior.  I just did
it; it passes all tests.

> Where I struggling with is how to make this behave badly. The lines
> above seem to be defensive in nature - we never use
> is_bare_repository_cfg past this point, but we want to guard against
> unintentional behavior in the future.

Not quite true; there is another call to is_bare_repository() in the
same function about 60 lines later, which does pay attention to
is_bare_repository_cfg.

[...]
> If I'm right (which I'm not sure about), we might need to keep
> init_is_bare_repository around _somewhere_. Not a global, but maybe
> as a param.

This makes sense, despite the other bugs present.  I'll make this
change, as well as split this patch into two as Calvin suggested, and
update the description to correct my errors and explain the bug
better.
Elijah Newren May 12, 2023, 1:55 a.m. UTC | #4
On Thu, May 11, 2023 at 10:24 AM Calvin Wan <calvinwan@google.com> wrote:
>
> Splitting this patch into two would make it easier to review and follow.
> Regardless I think both changes make sense and thanks for documenting a
> preexisting bug.

Sure, I can do that.
Glen Choo May 12, 2023, 5:26 p.m. UTC | #5
Thanks for the response. I realized where my misunderstanding was.

Once again reordering slightly..

Elijah Newren <newren@gmail.com> writes:

>> Where I struggling with is how to make this behave badly. The lines
>> above seem to be defensive in nature - we never use
>> is_bare_repository_cfg past this point, but we want to guard against
>> unintentional behavior in the future.
>
> Not quite true; there is another call to is_bare_repository() in the
> same function about 60 lines later, which does pay attention to
> is_bare_repository_cfg.

Ah thanks. I didn't realize that the "is_bare_repository_cfg" that holds
the value of the CLI option is actually the global variable that also
stores cached value of "core.bare" (I thought it was a local variable).

This may not be in scope for your series, but if we want to preserve the
CLI option like we say we do...

>> Now, we're in the midst of the re-init. Expanding the context a little,
>> we see:
>>
>>         git_config(git_default_config, NULL);
>>
>>         /*
>>          * We must make sure command-line options continue to override any
>>          * values we might have just re-read from the config.
>>          */
>>         is_bare_repository_cfg = init_is_bare_repository || !work_tree;
>>
>> So now we've read the new config of the re-inited repo, which might have
>> "core.bare" set to a value other than what "git init-db [--bare]"
>> started with
>
> Correct.
>
>> so we want to _intentionally_ ignore it.
>
> That's what the code does, but not what it should do.  If neither
> `--bare` nor `--no-bare` is given on the command line, we ought to pay
> attention to the config setting.
>
> The fact that we do ignore the config regardless of command line flags
> is the bug that this patch documents.

wouldn't be a lot easier to parse the option into a local variable
instead of reusing the config one? Then we always have the original CLI
value available to us and can restore it back to
"is_bare_repository_cfg" if we really wanted to.

I'm not sure if this means we can/should drop the "|| !work_tree" in the
hunk above. It would nice if we could, I find it very confusing.

> So, both before and after what we are computing simplifies to
>    <something> && !work_tree || !work_tree
> which further simplifies, regardless of whether <something> is true or
> false, down to
>    !work_tree
>
> So whether we are using a cached value of is_bare_repository_cfg or a
> current value of is_bare_repository_cfg is irrelevant.  In fact, from
> the analysis above, we can simplify this line to just
>    is_bare_repository_cfg = !work_tree;
> completely ignoring both is_bare_repository_cfg and
> is_bare_repository(), and it won't change the behavior.  I just did
> it; it passes all tests.

Hm.. that doesn't sound intended, and fixing this is probably isn't in
scope for this series either.

>> If I'm right (which I'm not sure about), we might need to keep
>> init_is_bare_repository around _somewhere_. Not a global, but maybe
>> as a param.
>
> This makes sense, despite the other bugs present.  I'll make this
> change, as well as split this patch into two as Calvin suggested, and
> update the description to correct my errors and explain the bug
> better.

Yeah, I think this might be the smallest obviously correct thing. We can
clean up all of the other bits outside of a big refactor series.
diff mbox series

Patch

diff --git a/builtin/init-db.c b/builtin/init-db.c
index aef40361052..18733ef05aa 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -31,7 +31,6 @@ 
 
 #define GIT_DEFAULT_HASH_ENVIRONMENT "GIT_DEFAULT_HASH"
 
-static int init_is_bare_repository = 0;
 static int init_shared_repository = -1;
 
 static void copy_templates_1(struct strbuf *path, struct strbuf *template_path,
@@ -231,9 +230,24 @@  static int create_default_files(const char *template_path,
 	 * We must make sure command-line options continue to override any
 	 * values we might have just re-read from the config.
 	 */
-	is_bare_repository_cfg = init_is_bare_repository || !work_tree;
 	if (init_shared_repository != -1)
 		set_shared_repository(init_shared_repository);
+	/*
+	 * TODO: heed core.bare from config file in templates if no
+	 *       command-line override given
+	 *
+	 * Unfortunately, this location in the code is far too late to
+	 * allow this to happen; both builtin/init.db and
+	 * builtin/clone.c setup the new repository and call
+	 * set_git_work_tree() before this point.  (Note that both do
+	 * that repository setup before calling init_db(), which in
+	 * turn calls create_default_files().)  Fixing it would
+	 * require too much refactoring, and no one seems to have
+	 * wanted this behavior in 15+ years, so we'll continue
+	 * ignoring the config for now and just override
+	 * is_bare_repository_cfg unconditionally.
+	 */
+	is_bare_repository_cfg = is_bare_repository() || !work_tree;
 
 	/*
 	 * We would have created the above under user's umask -- under
@@ -422,8 +436,6 @@  int init_db(const char *git_dir, const char *real_git_dir,
 
 	safe_create_dir(git_dir, 0);
 
-	init_is_bare_repository = is_bare_repository();
-
 	/* Check to see if the repository version is right.
 	 * Note that a newly created repository does not have
 	 * config file, so this will not fail.  What we are catching
diff --git a/t/t1301-shared-repo.sh b/t/t1301-shared-repo.sh
index 1b6437ec079..c02fd64793b 100755
--- a/t/t1301-shared-repo.sh
+++ b/t/t1301-shared-repo.sh
@@ -52,6 +52,28 @@  test_expect_success 'shared=all' '
 	test 2 = $(git config core.sharedrepository)
 '
 
+test_expect_failure 'template can set core.bare' '
+	test_when_finished "rm -rf subdir" &&
+	test_when_finished "rm -rf templates" &&
+	test_config core.bare true &&
+	umask 0022 &&
+	mkdir -p templates/ &&
+	cp .git/config templates/config &&
+	git init --template=templates subdir &&
+	test_path_exists subdir/HEAD
+'
+
+test_expect_success 'template can set core.bare but overridden by command line' '
+	test_when_finished "rm -rf subdir" &&
+	test_when_finished "rm -rf templates" &&
+	test_config core.bare true &&
+	umask 0022 &&
+	mkdir -p templates/ &&
+	cp .git/config templates/config &&
+	git init --no-bare --template=templates subdir &&
+	test_path_exists subdir/.git/HEAD
+'
+
 test_expect_success POSIXPERM 'update-server-info honors core.sharedRepository' '
 	: > a1 &&
 	git add a1 &&
diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
index 27f9f776389..5890319b97b 100755
--- a/t/t5606-clone-options.sh
+++ b/t/t5606-clone-options.sh
@@ -120,6 +120,16 @@  test_expect_success 'prefers -c config over --template config' '
 
 '
 
+test_expect_failure 'prefers --template config even for core.bare' '
+
+	template="$TRASH_DIRECTORY/template-with-bare-config" &&
+	mkdir "$template" &&
+	git config --file "$template/config" core.bare true &&
+	git clone "--template=$template" parent clone-bare-config &&
+	test "$(git -C clone-bare-config config --local core.bare)" = "true" &&
+	test_path_is_file clone-bare-config/HEAD
+'
+
 test_expect_success 'prefers config "clone.defaultRemoteName" over default' '
 
 	test_config_global clone.defaultRemoteName from_config &&