diff mbox series

[3/5] repository: initialize index in `repo_init()`

Message ID 96256f9eb30e5aca632942169d5c311b1f245b40.1713180749.git.ps@pks.im (mailing list archive)
State Accepted
Commit 66bce9d00bc4e2f89b2aea21d1e162c9ee47f55c
Headers show
Series global: drop external `the_index` variable | expand

Commit Message

Patrick Steinhardt April 15, 2024, 11:42 a.m. UTC
When Git starts, one of the first things it will do is to call
`initialize_the_repository()`. This function sets up both the global
`the_repository` and `the_index` variables as required. Part of that
setup is also to set `the_repository.index = &the_index` so that the
index can be accessed via the repository.

When calling `repo_init()` on a repository though we set the complete
struct to all-zeroes, which will also cause us to unset the `index`
pointer. And as we don't re-initialize the index in that function, we
will end up with a `NULL` pointer here.

This has been fine until now becaues this function is only used to
create a new repository. git-init(1) does not access the index at all
after initializing the repository, whereas git-checkout(1) only uses
`the_index` directly. We are about to remove `the_index` though, which
will uncover this partially-initialized repository structure.

Refactor the code and create a common `initialize_repository()` function
that gets called from `repo_init()` and `initialize_the_repository()`.
This function sets up both the repository and the index as required.
Like this, we can easily special-case when `repo_init()` gets called
with `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 repository.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

Comments

karthik nayak April 17, 2024, 5:38 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> Refactor the code and create a common `initialize_repository()` function
> that gets called from `repo_init()` and `initialize_the_repository()`.
> This function sets up both the repository and the index as required.
> Like this, we can easily special-case when `repo_init()` gets called
> with `the_repository`.
>

Nit: `initialize_the_repository()` calling `initialize_repository()`
doesn't really indicate what each of them does and why we have two
functions which are similarly named but calling each other. Maybe we
should rename them?
Patrick Steinhardt April 18, 2024, 12:16 p.m. UTC | #2
On Wed, Apr 17, 2024 at 05:38:29PM +0000, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Refactor the code and create a common `initialize_repository()` function
> > that gets called from `repo_init()` and `initialize_the_repository()`.
> > This function sets up both the repository and the index as required.
> > Like this, we can easily special-case when `repo_init()` gets called
> > with `the_repository`.
> >
> 
> Nit: `initialize_the_repository()` calling `initialize_repository()`
> doesn't really indicate what each of them does and why we have two
> functions which are similarly named but calling each other. Maybe we
> should rename them?

I think once you know about `the_repository` it isn't all that bad.
Whereas `initialize_repository()` initializes any repository,
`initialize_the_repository()` only initializes `the_repository`.

But you know, let's further simplify this and move all initialization
logic into `initialize_repository()` to make this hopefully clearer. And
on top of that we can then also drop `initialize_the_repository()`
completely by statically initializing the `the_repository` pointer so
that callers can simply call `initialize_repository(&the_repository)`
instead.

Patrick
diff mbox series

Patch

diff --git a/repository.c b/repository.c
index e15b416944..d64d15d952 100644
--- a/repository.c
+++ b/repository.c
@@ -25,17 +25,20 @@  static struct repository the_repo;
 struct repository *the_repository;
 struct index_state the_index;
 
+static void initialize_repository(struct repository *repo,
+				  struct index_state *index)
+{
+	repo->index = index;
+	repo->objects = raw_object_store_new();
+	repo->remote_state = remote_state_new();
+	repo->parsed_objects = parsed_object_pool_new();
+	index_state_init(index, repo);
+}
+
 void initialize_the_repository(void)
 {
 	the_repository = &the_repo;
-
-	the_repo.index = &the_index;
-	the_repo.objects = raw_object_store_new();
-	the_repo.remote_state = remote_state_new();
-	the_repo.parsed_objects = parsed_object_pool_new();
-
-	index_state_init(&the_index, the_repository);
-
+	initialize_repository(the_repository, &the_index);
 	repo_set_hash_algo(&the_repo, GIT_HASH_SHA1);
 }
 
@@ -188,9 +191,12 @@  int repo_init(struct repository *repo,
 	struct repository_format format = REPOSITORY_FORMAT_INIT;
 	memset(repo, 0, sizeof(*repo));
 
-	repo->objects = raw_object_store_new();
-	repo->parsed_objects = parsed_object_pool_new();
-	repo->remote_state = remote_state_new();
+	if (repo == the_repository) {
+		initialize_repository(the_repository, &the_index);
+	} else {
+		ALLOC_ARRAY(repo->index, 1);
+		initialize_repository(repo, repo->index);
+	}
 
 	if (repo_init_gitdir(repo, gitdir))
 		goto error;