mbox series

[v5,0/9] add ref content check for files backend

Message ID Zvj-DgHqtC30KjJe@ArchLinux (mailing list archive)
Headers show
Series add ref content check for files backend | expand

Message

shejialuo Sept. 29, 2024, 7:13 a.m. UTC
Hi All:

This version handles a lot of review from Junio.

1. [PATCH v5 1/9] enhances the commit message compared with the previous
[PATCH v4 1/5].

2. [PATCH v5 2/9] is a new topic which has not never been introduced in
the previous. It supports multiple worktrees check for refs. During the
GSoC PATCH: <ZrSqMmD-quQ18a9F@ArchLinux.localdomain>, I do not implement
the code to support worktree check. However, we need to add this due to
the review from Junio:
  > > +`escapeReferent`::
  > > +	(ERROR) The referent of a symref is outside the "ref" directory.
  >
  > I am not sure starting this as ERROR is wise.  Users and third-party
  > tools make creative uses of the system and I cannot offhand think of
  > an argument why it should be forbidden to create a symbolic link to
  > our own HEAD or to some worktree-specific ref in another worktree.
When checking the escape situation of the referent, I didn't consider
the worktree. So, I decide to first add checks for multiple worktree.
And then add a new test for multiple worktrees.

3. The intention of the [PATCH v5 3/9] is the same as the [PATCH v4
2/5].

  + Enhance the commit message suggested by Junio.
  + Use "fsck_ref_report" to tell the user we cannot read the file
  instead of reporting general error.
  + For "FSCK_MSG_BAD_REF_CONTENT" message id, instead of just reporting
  the no-information message "invalid ref content", report the actual
  content of the ref, i.e., "ref_content.buf".

4. The intention of the [PATCH v5 4/9] is the same as the [PATCH v4
3/5].

  + Instead of using the concrete "refMissingNewline" and
  "trailingRefContent" fsck messages, create a fsck info message
  "unofficialFormattedRef"
  + Follow the advice from Junio, use "fsck_ref_report" to report more
  useful information. For example, what is the trailing garbage.

5. The PATCH[v4 4/5] is split into 4 commits from [PATCH v5 5/9] to
[PATCH v5 8/9]. The reason why I decide to do this is that I introduce
the check for worktree and the version 4 is a little messy for the
commit message. Although the C code is not changed too much, the commit
message is hard to write and make the reviewer confused.

6. [PATCH v5 5/9] aims to add checks for textual symref except escape
situation.

  + Because I split commit here, it's easy to write the clean commit
  message, which should be changed according to the review from Junio.
  + Followed the advice from Junio to gracefully check the symref. Thus,
  the commit message is more clean.
  + Drop the check for "referent" pointing to a directory. We allow
  this, it's a dangling symref. No need to check this. So we could drop
  the parameter "referent_path" in "files_fsck_symref_target()".
  + Enhance the "fsck_ref_report" to report more useful information.

7. [PATCH v5 6/9] enhances the check for escape situation. Introduce a
new fsck message "escapeReferent(INFO)".

8. [PATCH v5 7/9] enhances the situation where we use multiple
worktrees. In practice, we allow point to ref of one of the linked
worktrees from primary worktree or one of the linked worktrees. We
should not warn about this.

9. [PATCH v5 8/9] enhances the test script for worktrees.

10. The intention of [PATCH v5 9/9] is the same as the [PATCH v4 5/5].
Not so much change.

Because I do not sync the upstream for a long time. For this series, I
sync the latest upstream and generate the patch, it is based on

  3857aae53f (Git 2.47-rc0, 2024-09-25)

And I don't think range-diff is useful, it is messy for the reviewers.
Actually, there are not so many logic changes in this new version.

Thanks,
Jialuo

shejialuo (9):
  ref: initialize "fsck_ref_report" with zero
  builtin/refs: support multiple worktrees check for refs.
  ref: port git-fsck(1) regular refs check for files backend
  ref: add more strict checks for regular refs
  ref: add basic symref content check for files backend
  ref: add escape check for the referent of symref
  ref: enhance escape situation for worktrees
  t0602: add ref content checks for worktrees
  ref: add symlink ref content check for files backend

 Documentation/fsck-msgids.txt |  28 +++
 builtin/refs.c                |  11 +-
 fsck.h                        |   5 +
 refs.c                        |   2 +-
 refs/files-backend.c          | 168 ++++++++++++-
 refs/refs-internal.h          |   2 +-
 t/t0602-reffiles-fsck.sh      | 442 ++++++++++++++++++++++++++++++++++
 7 files changed, 646 insertions(+), 12 deletions(-)

Comments

Junio C Hamano Sept. 30, 2024, 6:57 p.m. UTC | #1
shejialuo <shejialuo@gmail.com> writes:

> Because I do not sync the upstream for a long time. For this series, I
> sync the latest upstream and generate the patch, it is based on
>
>   3857aae53f (Git 2.47-rc0, 2024-09-25)

Does this help to reduce conflicts when merging the topic to say
'next' or 'seen'?  If so, such a rebase and noting it in the cover
letter message, like you just did, is very much appreciated.

If not, please don't ;-).

> And I don't think range-diff is useful, it is messy for the reviewers.
> Actually, there are not so many logic changes in this new version.

OK, so this needs a fresh full review.  Thanks.
shejialuo Oct. 1, 2024, 3:40 a.m. UTC | #2
On Mon, Sep 30, 2024 at 11:57:19AM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > Because I do not sync the upstream for a long time. For this series, I
> > sync the latest upstream and generate the patch, it is based on
> >
> >   3857aae53f (Git 2.47-rc0, 2024-09-25)
> 
> Does this help to reduce conflicts when merging the topic to say
> 'next' or 'seen'?  If so, such a rebase and noting it in the cover
> letter message, like you just did, is very much appreciated.
> 
> If not, please don't ;-).
> 

Actually, I am sure that there is no conflicts after squashing the
following two patches.

  <xmqqle0gzdyh.fsf_-_@gitster.g>
  <xmqqbk1cz69c.fsf@gitster.g>

The reason why I just sync the upstream is that the build system (such
as warning about unused parameters) and CIs are all changed.

I will remember this.

Thanks,
Jialuo
shejialuo Oct. 7, 2024, 12:49 p.m. UTC | #3
From the discussion with Patrick in v5 and Junio in v4. I conclude the
follow things:

1. "fsck_ref_report" should not be refactored to accept `NULL`. There
would be only one situation where it will be a little bad (the content
of a ref does not end with a newline). In the other situations, the
message part will be useful, such as:

  refs/heads/garbage-branch: trailingRefContent: ' garbage'.
  refs/heads/escape: escapeReferent: referent 'xxx' is outside.

Although for some messages, only use fsck message id is enough. But we
could also specify the message. It's not harmful anyway.

2. The mapping from fsck message id to error case should be one to one.
This is essentially important because the user could set the fsck error
levels. If we use multiple to one, we will give the user a bad
experience. We should avoid this.

I will wait for more comments to ensure the next version will be better.

Thanks,
Jialuo