mbox series

[v5,0/8] add more ref consistency checks

Message ID Z7NU5fZfc8vfSvZ0@ArchLinux (mailing list archive)
Headers show
Series add more ref consistency checks | expand

Message

shejialuo Feb. 17, 2025, 3:25 p.m. UTC
Hi All:

This changes enhances the following things:

1. [PATCH v5 2/8]: enhance the comment suggested by Karthik.
2. [PATCH v5 3/8]: use lstat to check whether the filetype of
   "packed-ref" is a regular file instead of using `open_nofollow`
   to check. And also enhance the commit message suggested by Karthik.
3. [PATCH v5 4/8]: move "open_nofollow" in original [PATCH v4 3/8] to
   this.

Also, I rebase due to the conflict that all *.txt files have been
renamed to *.adoc. However, I don't know whether this is a real
conflict. But I decide to rebase to make the life of Junio easy.

Thanks,
Jialuo

---

This series mainly does the following things:

1. Fix subshell issues
2. Add ref checks for packed-backend.
   1. Check whether the filetype of "packed-refs" is correct.
   2. Check whether the syntax of "packed-refs" is correct by using the
      rules from "packed-backend.c::create_snapshot" and
      "packed-backend.c::next_record".
   3. Check whether the pointed object exists and whether the
      "packed-refs" file is sorted.
3. Call "git refs verify" for "git-fsck(1)".

shejialuo (8):
  t0602: use subshell to ensure working directory unchanged
  builtin/refs: get worktrees without reading head information
  packed-backend: check whether the "packed-refs" is regular file
  packed-backend: add "packed-refs" header consistency check
  packed-backend: check whether the refname contains NUL characters
  packed-backend: add "packed-refs" entry consistency check
  packed-backend: check whether the "packed-refs" is sorted
  builtin/fsck: add `git refs verify` child process

 Documentation/fsck-msgids.adoc |   14 +
 Documentation/git-fsck.adoc    |    7 +-
 builtin/fsck.c                 |   33 +-
 builtin/refs.c                 |    2 +-
 fsck.h                         |    4 +
 refs/packed-backend.c          |  369 +++++++++-
 t/t0602-reffiles-fsck.sh       | 1205 +++++++++++++++++++-------------
 worktree.c                     |    5 +
 worktree.h                     |    7 +
 9 files changed, 1161 insertions(+), 485 deletions(-)

Range-diff against v4:
1:  20889b7b18 = 1:  b3952d80a2 t0602: use subshell to ensure working directory unchanged
2:  9d7780e953 ! 2:  3695586f58 builtin/refs: get worktrees without reading head information
    @@ worktree.h: struct worktree {
      struct worktree **get_worktrees(void);
      
     +/*
    -+ * Like `get_worktrees`, but does not read HEAD. This is useful when checking
    -+ * the consistency, as reading HEAD may not be necessary.
    ++ * Like `get_worktrees`, but does not read HEAD. Skip reading HEAD allows to
    ++ * get the worktree without worrying about failures pertaining to parsing
    ++ * the HEAD ref. This is useful when we want to check the ref db consistency.
     + */
     +struct worktree **get_worktrees_without_reading_head(void);
     +
3:  44d26f6440 ! 3:  cbaae00e8b packed-backend: check whether the "packed-refs" is regular file
    @@ Commit message
     
         Although "git-fsck(1)" and "packed-backend.c" will check some
         consistency and correctness of "packed-refs" file, they never check the
    -    filetype of the "packed-refs". The user should always use "git
    -    pack-refs" command to create the raw regular "packed-refs" file, so we
    -    need to explicitly check this in "git refs verify".
    +    filetype of the "packed-refs". Let's verify that the "packed-refs" has
    +    the expected filetype, confirming it is created by "git pack-refs"
    +    command.
     
    -    We could use "open_nofollow" wrapper to open the raw "packed-refs" file.
    -    If the returned "fd" value is less than 0, we could check whether the
    -    "errno" is "ELOOP" to report an error to the user.
    +    Use "lstat" to check the file mode. If we cannot check the file status
    +    due to there is no such file this is OK because there is a possibility
    +    that there is no "packed-refs" in the repo.
     
         Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to
         the user if "packed-refs" is not a regular file.
    @@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
      {
     +	struct packed_ref_store *refs = packed_downcast(ref_store,
     +							REF_STORE_READ, "fsck");
    ++	struct stat st;
     +	int ret = 0;
    -+	int fd;
      
      	if (!is_main_worktree(wt))
     -		return 0;
    @@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
     +	if (o->verbose)
     +		fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
     +
    -+	fd = open_nofollow(refs->path, O_RDONLY);
    -+	if (fd < 0) {
    ++	if (lstat(refs->path, &st) < 0) {
     +		/*
     +		 * If the packed-refs file doesn't exist, there's nothing
     +		 * to check.
     +		 */
     +		if (errno == ENOENT)
     +			goto cleanup;
    ++		ret = error_errno(_("unable to stat %s"), refs->path);
    ++		goto cleanup;
    ++	}
     +
    -+		if (errno == ELOOP) {
    -+			struct fsck_ref_report report = { 0 };
    -+			report.path = "packed-refs";
    -+			ret = fsck_report_ref(o, &report,
    -+					      FSCK_MSG_BAD_REF_FILETYPE,
    -+					      "not a regular file");
    -+			goto cleanup;
    -+		}
    -+
    -+		ret = error_errno(_("unable to open %s"), refs->path);
    ++	if (!S_ISREG(st.st_mode)) {
    ++		struct fsck_ref_report report = { 0 };
    ++		report.path = "packed-refs";
    ++		ret = fsck_report_ref(o, &report,
    ++				      FSCK_MSG_BAD_REF_FILETYPE,
    ++				      "not a regular file");
     +		goto cleanup;
     +	}
     +
    @@ t/t0602-reffiles-fsck.sh: test_expect_success 'ref content checks should work wi
     +		git pack-refs --all &&
     +
     +		mv .git/packed-refs .git/packed-refs-back &&
    -+		ln -sf packed-refs-bak .git/packed-refs &&
    ++		ln -sf packed-refs-back .git/packed-refs &&
     +		test_must_fail git refs verify 2>err &&
     +		cat >expect <<-EOF &&
     +		error: packed-refs: badRefFiletype: not a regular file
4:  976c5baba0 ! 4:  b9ce8734ac packed-backend: add "packed-refs" header consistency check
    @@ Commit message
            create a new fsck message "badPackedRefHeader(ERROR)" for this.
         3. If the header content is not the same as the constant string
            "PACKED_REFS_HEADER". This is expected because we make it extensible
    -       intentionally. So, there is no need to report.
    +       intentionally and runtime "create_snapshot" won't complain about
    +       unknown traits. In order to align with the runtime behavior. There is
    +       no need to report.
     
         As we have analyzed, we only need to check the case 2 in the above. In
    -    order to do this, read the "packed-refs" file via "strbuf_read". Like
    +    order to do this, use "open_nofollow" function to get the file
    +    descriptor and then read the "packed-refs" file via "strbuf_read". Like
         what "create_snapshot" and other functions do, we could split the line
         by finding the next newline in the buffer. When we cannot find a
         newline, we could report an error.
    @@ Commit message
         Mentored-by: Karthik Nayak <karthik.188@gmail.com>
         Signed-off-by: shejialuo <shejialuo@gmail.com>
     
    - ## Documentation/fsck-msgids.txt ##
    + ## Documentation/fsck-msgids.adoc ##
     @@
      `badObjectSha1`::
      	(ERROR) An object has a bad sha1.
    @@ refs/packed-backend.c: static struct ref_iterator *packed_reflog_iterator_begin(
      	struct packed_ref_store *refs = packed_downcast(ref_store,
      							REF_STORE_READ, "fsck");
     +	struct strbuf packed_ref_content = STRBUF_INIT;
    + 	struct stat st;
    ++	int fd;
      	int ret = 0;
    - 	int fd;
      
    + 	if (!is_main_worktree(wt))
     @@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
      		goto cleanup;
      	}
      
    ++	/*
    ++	 * There is a chance that "packed-refs" file is removed or converted to
    ++	 * a symlink after filetype check and before open. So we need to avoid
    ++	 * this race condition by opening the file.
    ++	 */
    ++	fd = open_nofollow(refs->path, O_RDONLY);
    ++	if (fd < 0) {
    ++		if (errno == ENOENT)
    ++			goto cleanup;
    ++
    ++		if (errno == ELOOP) {
    ++			struct fsck_ref_report report = { 0 };
    ++			report.path = "packed-refs";
    ++			ret = fsck_report_ref(o, &report,
    ++					      FSCK_MSG_BAD_REF_FILETYPE,
    ++					      "not a regular file");
    ++			goto cleanup;
    ++		}
    ++	}
    ++
     +	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
     +		ret = error_errno(_("unable to read %s"), refs->path);
     +		goto cleanup;
5:  b66f142d7f = 5:  9f638b3adf packed-backend: check whether the refname contains NUL characters
6:  f68028e171 ! 6:  2c5395bdd0 packed-backend: add "packed-refs" entry consistency check
    @@ Commit message
         Mentored-by: Karthik Nayak <karthik.188@gmail.com>
         Signed-off-by: shejialuo <shejialuo@gmail.com>
     
    - ## Documentation/fsck-msgids.txt ##
    + ## Documentation/fsck-msgids.adoc ##
     @@
      `badObjectSha1`::
      	(ERROR) An object has a bad sha1.
    @@ refs/packed-backend.c: static int packed_fsck_ref_header(struct fsck_options *o,
     +				      (int)(eol - p), p);
     +		goto cleanup;
     +	}
    ++
     +cleanup:
     +	strbuf_release(&packed_entry);
     +	return ret;
7:  4a7adf293f ! 7:  648404c60d packed-backend: check whether the "packed-refs" is sorted
    @@ Commit message
         Mentored-by: Karthik Nayak <karthik.188@gmail.com>
         Signed-off-by: shejialuo <shejialuo@gmail.com>
     
    - ## Documentation/fsck-msgids.txt ##
    + ## Documentation/fsck-msgids.adoc ##
     @@
      	(ERROR) The "packed-refs" file contains an entry that is
      	not terminated by a newline.
    @@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
      							REF_STORE_READ, "fsck");
      	struct strbuf packed_ref_content = STRBUF_INIT;
     +	unsigned int sorted = 0;
    + 	struct stat st;
    +-	int fd;
      	int ret = 0;
    - 	int fd;
    ++	int fd;
      
    + 	if (!is_main_worktree(wt))
    + 		goto cleanup;
     @@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
      		goto cleanup;
      	}
8:  2dd3437478 ! 8:  4dbbacf44b builtin/fsck: add `git refs verify` child process
    @@ Commit message
         Mentored-by: Karthik Nayak <karthik.188@gmail.com>
         Signed-off-by: shejialuo <shejialuo@gmail.com>
     
    - ## Documentation/git-fsck.txt ##
    -@@ Documentation/git-fsck.txt: SYNOPSIS
    + ## Documentation/git-fsck.adoc ##
    +@@ Documentation/git-fsck.adoc: SYNOPSIS
      'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
      	 [--[no-]full] [--strict] [--verbose] [--lost-found]
      	 [--[no-]dangling] [--[no-]progress] [--connectivity-only]
    @@ Documentation/git-fsck.txt: SYNOPSIS
      
      DESCRIPTION
      -----------
    -@@ Documentation/git-fsck.txt: care about this output and want to speed it up further.
    +@@ Documentation/git-fsck.adoc: care about this output and want to speed it up further.
      	progress status even if the standard error stream is not
      	directed to a terminal.