diff mbox series

[v4,2/8] builtin/refs: get worktrees without reading head information

Message ID Z67L_lmCd6NNXpWZ@ArchLinux (mailing list archive)
State Superseded
Headers show
Series add more ref consistency checks | expand

Commit Message

shejialuo Feb. 14, 2025, 4:52 a.m. UTC
In "packed-backend.c", there are some functions such as "create_snapshot"
and "next_record" which would check the correctness of the content of
the "packed-ref" file. When anything is bad, the program will die.

It may seem that we have nothing relevant to above feature, because we
are going to read and parse the raw "packed-ref" file without creating
the snapshot and using the ref iterator to check the consistency.

However, when using "get_worktrees" in "builtin/refs", we would parse
the "HEAD" information. If the referent of the "HEAD" is inside the
"packed-ref", we will call "create_snapshot" function to parse the
"packed-ref" to get the information. No matter whether the entry of
"HEAD" in "packed-ref" is correct, "create_snapshot" would call
"verify_buffer_safe" to check whether there is a newline in the last
line of the file. If not, the program will die.

Although this behavior has no harm for the program, it will
short-circuit the program. When the users execute "git refs verify" or
"git fsck", we should avoid reading the head information, which may
execute the read operation in packed backend with stricter checks to die
the program. Instead, we should continue to check other parts of the
"packed-refs" file completely.

Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing
worktrees, 2023-12-29), we have introduced a function
"get_worktrees_internal" which allows us to get worktrees without
reading head information.

Create a new exposed function "get_worktrees_without_reading_head", then
replace the "get_worktrees" in "builtin/refs" with the new created
function.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 builtin/refs.c | 2 +-
 worktree.c     | 5 +++++
 worktree.h     | 6 ++++++
 3 files changed, 12 insertions(+), 1 deletion(-)

Comments

Karthik Nayak Feb. 14, 2025, 9:19 a.m. UTC | #1
shejialuo <shejialuo@gmail.com> writes:

> In "packed-backend.c", there are some functions such as "create_snapshot"
> and "next_record" which would check the correctness of the content of
> the "packed-ref" file. When anything is bad, the program will die.
>
> It may seem that we have nothing relevant to above feature, because we
> are going to read and parse the raw "packed-ref" file without creating
> the snapshot and using the ref iterator to check the consistency.
>
> However, when using "get_worktrees" in "builtin/refs", we would parse
> the "HEAD" information. If the referent of the "HEAD" is inside the
> "packed-ref", we will call "create_snapshot" function to parse the
> "packed-ref" to get the information. No matter whether the entry of
> "HEAD" in "packed-ref" is correct, "create_snapshot" would call
> "verify_buffer_safe" to check whether there is a newline in the last
> line of the file. If not, the program will die.
>

Nit: while the second paragraph above makes sense in the context of what
we're trying to achieve in this patch series. It doesn't make much sense
for this patch in isolation. Perhaps we want to give some more context
around what we're trying to solve for in the upcoming patches and hence
how it hinders that.

> Although this behavior has no harm for the program, it will
> short-circuit the program. When the users execute "git refs verify" or
> "git fsck", we should avoid reading the head information, which may
> execute the read operation in packed backend with stricter checks to die
> the program. Instead, we should continue to check other parts of the
> "packed-refs" file completely.
>
> Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing
> worktrees, 2023-12-29), we have introduced a function
> "get_worktrees_internal" which allows us to get worktrees without
> reading head information.
>
> Create a new exposed function "get_worktrees_without_reading_head", then
> replace the "get_worktrees" in "builtin/refs" with the new created
> function.
>
> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
>  builtin/refs.c | 2 +-
>  worktree.c     | 5 +++++
>  worktree.h     | 6 ++++++
>  3 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/refs.c b/builtin/refs.c
> index a29f195834..55ff5dae11 100644
> --- a/builtin/refs.c
> +++ b/builtin/refs.c
> @@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
>  	git_config(git_fsck_config, &fsck_refs_options);
>  	prepare_repo_settings(the_repository);
>
> -	worktrees = get_worktrees();
> +	worktrees = get_worktrees_without_reading_head();
>  	for (size_t i = 0; worktrees[i]; i++)
>  		ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
>  				 &fsck_refs_options, worktrees[i]);
> diff --git a/worktree.c b/worktree.c
> index 248bbb39d4..89b7d86cef 100644
> --- a/worktree.c
> +++ b/worktree.c
> @@ -175,6 +175,11 @@ struct worktree **get_worktrees(void)
>  	return get_worktrees_internal(0);
>  }
>
> +struct worktree **get_worktrees_without_reading_head(void)
> +{
> +	return get_worktrees_internal(1);
> +}
> +
>  const char *get_worktree_git_dir(const struct worktree *wt)
>  {
>  	if (!wt)
> diff --git a/worktree.h b/worktree.h
> index 38145df80f..1ba4a161a0 100644
> --- a/worktree.h
> +++ b/worktree.h
> @@ -30,6 +30,12 @@ struct worktree {
>   */
>  struct worktree **get_worktrees(void);
>
> +/*
> + * Like `get_worktrees`, but does not read HEAD. This is useful when checking
> + * the consistency, as reading HEAD may not be necessary.

Checking what consistency? We should be a bit more verbose here. You can
mention that skipping HEAD allows to get the worktree without worrying
about failures pertaining to parsing the HEAD ref.

> + */
> +struct worktree **get_worktrees_without_reading_head(void);
> +
>  /*
>   * Returns 1 if linked worktrees exist, 0 otherwise.
>   */
> --
> 2.48.1
shejialuo Feb. 14, 2025, 12:20 p.m. UTC | #2
On Fri, Feb 14, 2025 at 01:19:53AM -0800, Karthik Nayak wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > In "packed-backend.c", there are some functions such as "create_snapshot"
> > and "next_record" which would check the correctness of the content of
> > the "packed-ref" file. When anything is bad, the program will die.
> >
> > It may seem that we have nothing relevant to above feature, because we
> > are going to read and parse the raw "packed-ref" file without creating
> > the snapshot and using the ref iterator to check the consistency.
> >
> > However, when using "get_worktrees" in "builtin/refs", we would parse
> > the "HEAD" information. If the referent of the "HEAD" is inside the
> > "packed-ref", we will call "create_snapshot" function to parse the
> > "packed-ref" to get the information. No matter whether the entry of
> > "HEAD" in "packed-ref" is correct, "create_snapshot" would call
> > "verify_buffer_safe" to check whether there is a newline in the last
> > line of the file. If not, the program will die.
> >
> 
> Nit: while the second paragraph above makes sense in the context of what
> we're trying to achieve in this patch series. It doesn't make much sense
> for this patch in isolation. Perhaps we want to give some more context
> around what we're trying to solve for in the upcoming patches and hence
> how it hinders that.
> 

Indeed, I think we should add this paragraph. We need to tell the
context about the motivation.

> > Although this behavior has no harm for the program, it will
> > short-circuit the program. When the users execute "git refs verify" or
> > "git fsck", we should avoid reading the head information, which may
> > execute the read operation in packed backend with stricter checks to die
> > the program. Instead, we should continue to check other parts of the
> > "packed-refs" file completely.
> >
> > Fortunately, in 465a22b338 (worktree: skip reading HEAD when repairing
> > worktrees, 2023-12-29), we have introduced a function
> > "get_worktrees_internal" which allows us to get worktrees without
> > reading head information.
> >
> > Create a new exposed function "get_worktrees_without_reading_head", then
> > replace the "get_worktrees" in "builtin/refs" with the new created
> > function.
> >
> > Mentored-by: Patrick Steinhardt <ps@pks.im>
> > Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> > Signed-off-by: shejialuo <shejialuo@gmail.com>
> > ---
> >  builtin/refs.c | 2 +-
> >  worktree.c     | 5 +++++
> >  worktree.h     | 6 ++++++
> >  3 files changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/builtin/refs.c b/builtin/refs.c
> > index a29f195834..55ff5dae11 100644
> > --- a/builtin/refs.c
> > +++ b/builtin/refs.c
> > @@ -88,7 +88,7 @@ static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
> >  	git_config(git_fsck_config, &fsck_refs_options);
> >  	prepare_repo_settings(the_repository);
> >
> > -	worktrees = get_worktrees();
> > +	worktrees = get_worktrees_without_reading_head();
> >  	for (size_t i = 0; worktrees[i]; i++)
> >  		ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
> >  				 &fsck_refs_options, worktrees[i]);
> > diff --git a/worktree.c b/worktree.c
> > index 248bbb39d4..89b7d86cef 100644
> > --- a/worktree.c
> > +++ b/worktree.c
> > @@ -175,6 +175,11 @@ struct worktree **get_worktrees(void)
> >  	return get_worktrees_internal(0);
> >  }
> >
> > +struct worktree **get_worktrees_without_reading_head(void)
> > +{
> > +	return get_worktrees_internal(1);
> > +}
> > +
> >  const char *get_worktree_git_dir(const struct worktree *wt)
> >  {
> >  	if (!wt)
> > diff --git a/worktree.h b/worktree.h
> > index 38145df80f..1ba4a161a0 100644
> > --- a/worktree.h
> > +++ b/worktree.h
> > @@ -30,6 +30,12 @@ struct worktree {
> >   */
> >  struct worktree **get_worktrees(void);
> >
> > +/*
> > + * Like `get_worktrees`, but does not read HEAD. This is useful when checking
> > + * the consistency, as reading HEAD may not be necessary.
> 
> Checking what consistency? We should be a bit more verbose here. You can
> mention that skipping HEAD allows to get the worktree without worrying
> about failures pertaining to parsing the HEAD ref.
> 

Good idea, I will improve this in the next version.

> > + */
> > +struct worktree **get_worktrees_without_reading_head(void);
> > +
> >  /*
> >   * Returns 1 if linked worktrees exist, 0 otherwise.
> >   */
> > --
> > 2.48.1
diff mbox series

Patch

diff --git a/builtin/refs.c b/builtin/refs.c
index a29f195834..55ff5dae11 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -88,7 +88,7 @@  static int cmd_refs_verify(int argc, const char **argv, const char *prefix,
 	git_config(git_fsck_config, &fsck_refs_options);
 	prepare_repo_settings(the_repository);
 
-	worktrees = get_worktrees();
+	worktrees = get_worktrees_without_reading_head();
 	for (size_t i = 0; worktrees[i]; i++)
 		ret |= refs_fsck(get_worktree_ref_store(worktrees[i]),
 				 &fsck_refs_options, worktrees[i]);
diff --git a/worktree.c b/worktree.c
index 248bbb39d4..89b7d86cef 100644
--- a/worktree.c
+++ b/worktree.c
@@ -175,6 +175,11 @@  struct worktree **get_worktrees(void)
 	return get_worktrees_internal(0);
 }
 
+struct worktree **get_worktrees_without_reading_head(void)
+{
+	return get_worktrees_internal(1);
+}
+
 const char *get_worktree_git_dir(const struct worktree *wt)
 {
 	if (!wt)
diff --git a/worktree.h b/worktree.h
index 38145df80f..1ba4a161a0 100644
--- a/worktree.h
+++ b/worktree.h
@@ -30,6 +30,12 @@  struct worktree {
  */
 struct worktree **get_worktrees(void);
 
+/*
+ * Like `get_worktrees`, but does not read HEAD. This is useful when checking
+ * the consistency, as reading HEAD may not be necessary.
+ */
+struct worktree **get_worktrees_without_reading_head(void);
+
 /*
  * Returns 1 if linked worktrees exist, 0 otherwise.
  */