diff mbox series

[GSoC,v2,7/7] fsck: add ref content check for files backend

Message ID 20240612085349.710785-8-shejialuo@gmail.com (mailing list archive)
State New, archived
Headers show
Series [GSoC,v2,1/7] fsck: add refs check interfaces to interface with fsck error levels | expand

Commit Message

shejialuo June 12, 2024, 8:53 a.m. UTC
Enhance the git-fsck(1) command by adding a check for reference content
in the files backend. The new functionality ensures that symrefs, real
symbolic link and regular refs are validated correctly.

Add a new function "files_fsck_symref" to check whether the symrefs and
symbolic link points to a valid object and a new function
"files_fsck_refs_content" handles both regular refs and symbolic refs.

In order to check the trailing content, add a new parameter
"trailing" to "parse_loose_ref_contents" function.

Last, add the following FSCK MESSAGEs:

1. "badRefSha(ERROR)": A ref has a bad sha.
2. "danglingSymre(WARN)": Found a dangling symref.
3. "trailingRefContent(WARN)": A ref content has trailing contents.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 Documentation/fsck-msgids.txt |   9 +++
 fsck.h                        |   3 +
 refs.c                        |   2 +-
 refs/files-backend.c          | 124 +++++++++++++++++++++++++++++++++-
 refs/refs-internal.h          |   5 +-
 t/t0602-reffiles-fsck.sh      | 110 ++++++++++++++++++++++++++++++
 6 files changed, 248 insertions(+), 5 deletions(-)

Comments

Junio C Hamano June 13, 2024, 7:38 p.m. UTC | #1
shejialuo <shejialuo@gmail.com> writes:

> In order to check the trailing content, add a new parameter
> "trailing" to "parse_loose_ref_contents" function.

About this one.

>  int parse_loose_ref_contents(const char *buf, struct object_id *oid,
>  			     struct strbuf *referent, unsigned int *type,
> -			     int *failure_errno)
> +			     int *failure_errno, unsigned int *trailing)
>  {
>  	const char *p;
>  	if (skip_prefix(buf, "ref:", &buf)) {
> @@ -607,6 +607,10 @@ int parse_loose_ref_contents(const char *buf, struct object_id *oid,
>  		*failure_errno = EINVAL;
>  		return -1;
>  	}
> +
> +	if (trailing && (*p != '\0' && *p != '\n'))
> +		*trailing = 1;
> +
>  	return 0;
>  }

We know what the garbage looked like at this point.  The caller owns
the "buf" pointer and we are pointing into that buffer with the
pointer p, and the garbage is right there.

So I am not sure if losing information by using "uint *" is a good
idea.  Wouldn't it make more sense to take "const char **trailing"
as a parameter and tell the caller where the trailing junk begins?

> +static int files_fsck_symref(struct fsck_refs_options *o,
> +			     struct strbuf *refname,
> +			     struct strbuf *path)

This does not take things like HEAD or refs/remotes/origin/HEAD to
validate.  Instead, the caller is responsible for either doing a
readlink on a symbolic link, or reading a textual symref and
stripping "ref: " prefix from it, before calling this function.
The "refname" parameter is not HEAD or refs/remotes/origin/HEAD but
the pointee of the symref.

So I'd imagine that renaming it to fsck_symref_target or along that
line to clarify that we are not checking the symref, but the target
of a symref, would be a good idea.

> +{
> +	struct stat st;
> +	int ret = 0;
> +
> +	if (lstat(path->buf, &st) < 0) {
> +		ret = fsck_refs_report(o, refname->buf,
> +				       FSCK_MSG_DANGLING_SYMREF,
> +				       "point to non-existent ref");
> +		goto out;
> +	}

Is that an error?  Just like being on an unborn branch is not an
error, it could be argued that a symref that points at a branch yet
to be born wouldn't be an error, either, no?

> +	if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)) {
> +		ret = fsck_refs_report(o, refname->buf,
> +				       FSCK_MSG_DANGLING_SYMREF,
> +				       "point to invalid object");
> +		goto out;

The use of "object" here is highly misleading.  Yes, you can call a
filesystem entity like "a regular file", "a directory", etc. "an
object", but the word can refer to many other kinds of "object".  In
fact, I originally read this to mean "we are referring to an object
in the object database that is corrupt" or something, but of course
that is not what we are complaining about. We are complaining that
the symbolic link points at a file of wrong type (like a directory).

So, in short, missing is probably OK.  Pointing at a wrong thing
(like a directory or block device) is not.  Where, if any, do we
catch a symbolic ref that tries to escape the refs/* hierarchy
(e.g. ".git/refs/heads/my-crazy-ref" that is a symbolic link that
points at "../../../../else/where" that is not even part of the
repository), by the way?

Thanks.
shejialuo June 14, 2024, 5:20 a.m. UTC | #2
On Thu, Jun 13, 2024 at 12:38:45PM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > In order to check the trailing content, add a new parameter
> > "trailing" to "parse_loose_ref_contents" function.
> 
> About this one.
> 
> >  int parse_loose_ref_contents(const char *buf, struct object_id *oid,
> >  			     struct strbuf *referent, unsigned int *type,
> > -			     int *failure_errno)
> > +			     int *failure_errno, unsigned int *trailing)
> >  {
> >  	const char *p;
> >  	if (skip_prefix(buf, "ref:", &buf)) {
> > @@ -607,6 +607,10 @@ int parse_loose_ref_contents(const char *buf, struct object_id *oid,
> >  		*failure_errno = EINVAL;
> >  		return -1;
> >  	}
> > +
> > +	if (trailing && (*p != '\0' && *p != '\n'))
> > +		*trailing = 1;
> > +
> >  	return 0;
> >  }
> 
> We know what the garbage looked like at this point.  The caller owns
> the "buf" pointer and we are pointing into that buffer with the
> pointer p, and the garbage is right there.
> 
> So I am not sure if losing information by using "uint *" is a good
> idea.  Wouldn't it make more sense to take "const char **trailing"
> as a parameter and tell the caller where the trailing junk begins?
> 

Yes, I totally agree that using "uint *" will lose a lot of information
here. Actually I have used the "const char **trailing", but I made a
mistake to result the wild pointer. This is because when
"parse_loose_ref_contents" handles symref, it will never handle `*p`.
When the caller defines `const char *trailing`, it will be wild pointer.
But I think we could set it to `NULL` when handling symref.

I will change the code in the next version.

> > +static int files_fsck_symref(struct fsck_refs_options *o,
> > +			     struct strbuf *refname,
> > +			     struct strbuf *path)
> 
> This does not take things like HEAD or refs/remotes/origin/HEAD to
> validate.  Instead, the caller is responsible for either doing a
> readlink on a symbolic link, or reading a textual symref and
> stripping "ref: " prefix from it, before calling this function.
> The "refname" parameter is not HEAD or refs/remotes/origin/HEAD but
> the pointee of the symref.
> 
> So I'd imagine that renaming it to fsck_symref_target or along that
> line to clarify that we are not checking the symref, but the target
> of a symref, would be a good idea.
> 

That's not correct. The "refname" parameter is EXACTLY the symref
itself. What we do here is to check the "path" paramteter, there are two
situations:

1. For symref we will strip "ref: " prefix, and combine the girdir and
the stripped content to get the "path" parameter.
2. For symbolic we will get its actual path (here I made a mistake, I
totally forget the situation when it points to absolute path, I will
revise the code for handling it).

The design here is just check whether the symref points to the correct
thing. It does not care about the pointee. The code will traverse every
regular file under the "refs/" directory, eventually we will check the
"pointee" status. For example, a symref "sym-branch" and a regular ref
"branch".

  sym-branch: "ref: refs/heads/branch".
  branch: "xxxx"

The design will not report any error for "sym-branch". I think we should
discuss here whether this design is OK.

> > +{
> > +	struct stat st;
> > +	int ret = 0;
> > +
> > +	if (lstat(path->buf, &st) < 0) {
> > +		ret = fsck_refs_report(o, refname->buf,
> > +				       FSCK_MSG_DANGLING_SYMREF,
> > +				       "point to non-existent ref");
> > +		goto out;
> > +	}
> 
> Is that an error?  Just like being on an unborn branch is not an
> error, it could be argued that a symref that points at a branch yet
> to be born wouldn't be an error, either, no?
> 

The reason why I choose "danglingSymref" and warn severity is that I let
the code be align with "git checkout". When we use "git checkout" for a
dangling symref. It would produce the following output:

  $ git checkout branch-3
  warning: ignoring dangling symref refs/heads/branch-3
  error: pathspec 'branch-3' did not match any file(s) known to git

So I prefer to warn severity.

> > +	if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)) {
> > +		ret = fsck_refs_report(o, refname->buf,
> > +				       FSCK_MSG_DANGLING_SYMREF,
> > +				       "point to invalid object");
> > +		goto out;
> 
> The use of "object" here is highly misleading.  Yes, you can call a
> filesystem entity like "a regular file", "a directory", etc. "an
> object", but the word can refer to many other kinds of "object".  In
> fact, I originally read this to mean "we are referring to an object
> in the object database that is corrupt" or something, but of course
> that is not what we are complaining about. We are complaining that
> the symbolic link points at a file of wrong type (like a directory).
> 

Yes, it brings a lot of misleading here. I will clean the code and
commit message (I also used object in commit message).

> So, in short, missing is probably OK.  Pointing at a wrong thing
> (like a directory or block device) is not.  Where, if any, do we
> catch a symbolic ref that tries to escape the refs/* hierarchy
> (e.g. ".git/refs/heads/my-crazy-ref" that is a symbolic link that
> points at "../../../../else/where" that is not even part of the
> repository), by the way?
> 

I intentionally ignored the "escape" situation. Actually, the path could
be either absolute or relative. It may be a little complicated. I will
find a way to support this in the next version.

> Thanks.

Thanks,
Jialuo
Junio C Hamano June 14, 2024, 3:23 p.m. UTC | #3
shejialuo <shejialuo@gmail.com> writes:

>> > +static int files_fsck_symref(struct fsck_refs_options *o,
>> > +			     struct strbuf *refname,
>> > +			     struct strbuf *path)
>> 
>> This does not take things like HEAD or refs/remotes/origin/HEAD to
>> validate.  Instead, the caller is responsible for either doing a
>> readlink on a symbolic link, or reading a textual symref and
>> stripping "ref: " prefix from it, before calling this function.
>> The "refname" parameter is not HEAD or refs/remotes/origin/HEAD but
>> the pointee of the symref.
>> 
>> So I'd imagine that renaming it to fsck_symref_target or along that
>> line to clarify that we are not checking the symref, but the target
>> of a symref, would be a good idea.
>
> That's not correct. The "refname" parameter is EXACTLY the symref
> itself.

Yeah, but the story is the same.  We are not really checking
anything about the symref (i.e. the thing in "refname") being funny.
We are checking what is in "path" (like "does it exist?") and the
"refname" is there only for reporting purposes (i.e. "we have a
symbolic ref REFNAME, that points at PATH which is a wrong thing").

>> > +{
>> > +	struct stat st;
>> > +	int ret = 0;
>> > +
>> > +	if (lstat(path->buf, &st) < 0) {
>> > +		ret = fsck_refs_report(o, refname->buf,
>> > +				       FSCK_MSG_DANGLING_SYMREF,
>> > +				       "point to non-existent ref");
>> > +		goto out;
>> > +	}
>> 
>> Is that an error?  Just like being on an unborn branch is not an
>> error, it could be argued that a symref that points at a branch yet
>> to be born wouldn't be an error, either, no?
>> 
>
> The reason why I choose "danglingSymref" and warn severity is that I let
> the code be align with "git checkout". When we use "git checkout" for a
> dangling symref. It would produce the following output:
>
>   $ git checkout branch-3
>   warning: ignoring dangling symref refs/heads/branch-3
>   error: pathspec 'branch-3' did not match any file(s) known to git
>
> So I prefer to warn severity.

If you do this from that situation, 

    $ git branch branch-3 master

what happens is that the pointee of branch-3 is created at the
commit pointed at by 'master'.  No error.  No warnings.

In a freshly created respository, HEAD is a dangling symbolic ref,
and that is not an error.  You can create a root commit from there
just fine.

If there is anything that needs improvement in your example, it is
that "checkout branch-3" should be taught to either (1) not warn
about dangling symbolic link and just give the error, which is in
line with how "git checkout HEAD" in a freshly created repository
behaves, or (2) just like unborn 'master' pointed at by 'HEAD' is
perfectly happy to be checked out, allow the unborn 'branch-3' to be
pointed at by 'HEAD', and arrange the first commit (which will be a
root commit) you create in that state to be pointed by the ref
'branch-3' points at.

So from all of these reasons, I do not think missing target should
be treated as any error worthy event.  Not even warning.

On the other hand, the target of the symref in "path" must be
checked, even if it does not currently exist, for its validity, the
same way an existing ref gets checked (lives inside refs/, passes
check-ref-format, etc.).

> I intentionally ignored the "escape" situation. Actually, the path could
> be either absolute or relative. It may be a little complicated. I will
> find a way to support this in the next version.

Yes, if this wants to claim to be part of "FSCK", it should catch
all the errors the regular runtime would complain about, and
"escape" thing is one of the first things that you need to get
right.  Whatever refs.c:read_ref_internal() for S_ISLNK(st.st_mode)
case takes as legit should be considered legit, the "fallback - read
through the symlink as if it were a non-symlink" case probably wants
to be warned.  refs.c:resolve_ref_unsafe(), which is used at the low
level from object-name.c:get_oid_basic() via refs.c:repo_dwim_ref(),
has further checks to see if a refname that a symref resolves to is
valid and the runtime sanity relies on these checks.

Thanks.
shejialuo June 14, 2024, 5:05 p.m. UTC | #4
Junio C Hamano <gitster@pobox.com> writes:
>
> shejialuo <shejialuo@gmail.com> writes:
>
> >> > +static int files_fsck_symref(struct fsck_refs_options *o,
> >> > +                       struct strbuf *refname,
> >> > +                       struct strbuf *path)
> >>
> >> This does not take things like HEAD or refs/remotes/origin/HEAD to
> >> validate.  Instead, the caller is responsible for either doing a
> >> readlink on a symbolic link, or reading a textual symref and
> >> stripping "ref: " prefix from it, before calling this function.
> >> The "refname" parameter is not HEAD or refs/remotes/origin/HEAD but
> >> the pointee of the symref.
> >>
> >> So I'd imagine that renaming it to fsck_symref_target or along that
> >> line to clarify that we are not checking the symref, but the target
> >> of a symref, would be a good idea.
> >
> > That's not correct. The "refname" parameter is EXACTLY the symref
> > itself.
>
> Yeah, but the story is the same.  We are not really checking
> anything about the symref (i.e. the thing in "refname") being funny.
> We are checking what is in "path" (like "does it exist?") and the
> "refname" is there only for reporting purposes (i.e. "we have a
> symbolic ref REFNAME, that points at PATH which is a wrong thing").
>

Yes, I agree, will rename the function in the next version.

> >> > +{
> >> > +  struct stat st;
> >> > +  int ret = 0;
> >> > +
> >> > +  if (lstat(path->buf, &st) < 0) {
> >> > +          ret = fsck_refs_report(o, refname->buf,
> >> > +                                 FSCK_MSG_DANGLING_SYMREF,
> >> > +                                 "point to non-existent ref");
> >> > +          goto out;
> >> > +  }
> >>
> >> Is that an error?  Just like being on an unborn branch is not an
> >> error, it could be argued that a symref that points at a branch yet
> >> to be born wouldn't be an error, either, no?
> >>
> >
> > The reason why I choose "danglingSymref" and warn severity is that I let
> > the code be align with "git checkout". When we use "git checkout" for a
> > dangling symref. It would produce the following output:
> >
> >   $ git checkout branch-3
> >   warning: ignoring dangling symref refs/heads/branch-3
> >   error: pathspec 'branch-3' did not match any file(s) known to git
> >
> > So I prefer to warn severity.
>
> If you do this from that situation,
>
>     $ git branch branch-3 master
>
> what happens is that the pointee of branch-3 is created at the
> commit pointed at by 'master'.  No error.  No warnings.
>
> In a freshly created respository, HEAD is a dangling symbolic ref,
> and that is not an error.  You can create a root commit from there
> just fine.
>

I am totally shocked by the fact that it will create the pointee file. I am
a little curious about the design here. I know when creating a new repo,
HEAD is a dangling symbolic ref. When we do first commit, it will
create the pointee file. But HEAD is a special file, it should be treated
differently. Why we treat the other symrefs the same way.

> If there is anything that needs improvement in your example, it is
> that "checkout branch-3" should be taught to either (1) not warn
> about dangling symbolic link and just give the error, which is in
> line with how "git checkout HEAD" in a freshly created repository
> behaves, or (2) just like unborn 'master' pointed at by 'HEAD' is
> perfectly happy to be checked out, allow the unborn 'branch-3' to be
> pointed at by 'HEAD', and arrange the first commit (which will be a
> root commit) you create in that state to be pointed by the ref
> 'branch-3' points at.
>

Should we really improve this? From my perspective, a user will get
more information when this warn message shows up. The
"error: pathspec 'branch-3' did not match any file(s) known to git" is
less informative.

Maybe the pro git users do not use git command to write to the git file
system. However, this is a bad way to do this. We always want to
the user to use the command to handle with git file system.

Instead, we should warn about the user when using

  $ git branch branch-3 master

It will create a new pointee here, I suppose we should warn the user
here except HEAD ref. The user could get the dangling symref by
deleting the pointee ref. For example:

  $ git symbolic-ref refs/heads/main master
  $ git checkout -b master-copy
  $ git branch --delete master

Should we warn this for user in git-fsck(1)?

> So from all of these reasons, I do not think missing target should
> be treated as any error worthy event.  Not even warning.
>

Yes, if git-branch implicitly creates the pointee ref file. We should not
handle such case. But I think giving a warning here except HEAD
ref is worthful.

> On the other hand, the target of the symref in "path" must be
> checked, even if it does not currently exist, for its validity, the
> same way an existing ref gets checked (lives inside refs/, passes
> check-ref-format, etc.).
>

Yes!

> > I intentionally ignored the "escape" situation. Actually, the path could
> > be either absolute or relative. It may be a little complicated. I will
> > find a way to support this in the next version.
>
> Yes, if this wants to claim to be part of "FSCK", it should catch
> all the errors the regular runtime would complain about, and
> "escape" thing is one of the first things that you need to get
> right.  Whatever refs.c:read_ref_internal() for S_ISLNK(st.st_mode)
> case takes as legit should be considered legit, the "fallback - read
> through the symlink as if it were a non-symlink" case probably wants
> to be warned.  refs.c:resolve_ref_unsafe(), which is used at the low
> level from object-name.c:get_oid_basic() via refs.c:repo_dwim_ref(),
> has further checks to see if a refname that a symref resolves to is
> valid and the runtime sanity relies on these checks.
>

Thanks for your help! I will look at these codes to implement the
"escape" check.

> Thanks.
>
Junio C Hamano June 14, 2024, 6:56 p.m. UTC | #5
jialuo she <shejialuo@gmail.com> writes:

> Instead, we should warn about the user when using
>
>   $ git branch branch-3 master
>
> It will create a new pointee here, I suppose we should warn the user
> here except HEAD ref.

The user created branch-3 as a symlink, fully expecting that the
pointee will be updated.

By the way, a seemingly normal branch looking symref is very handy
and that is what my "maint" branch looks like.  Right now it is
pointing at maint-2.45 (because 2.45.0 was the last feature release)

        $ git symbolic-ref refs/heads/maint
        refs/heads/maint-2.45

and I would imagine many other users are using the trick to maintain
multiple maintenance tracks.  They would be upset if their next "Now
2.46.0 release is done, let's start the 2.46 maintenance track"
mantra, which would look like so:

	$ git symbolic-ref refs/heads/maint refs/heads/maint-2.46
	$ git branch maint v2.46.0

starts to give an unnecessary warning.
diff mbox series

Patch

diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index cc85c897cc..69f86c5345 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -22,6 +22,9 @@ 
 `badRefName`::
 	(ERROR) A ref has a bad name.
 
+`badRefSha`::
+	(ERROR) A ref has a bad sha.
+
 `badTagName`::
 	(INFO) A tag has an invalid format.
 
@@ -37,6 +40,9 @@ 
 `badType`::
 	(ERROR) Found an invalid object type.
 
+`danglingSymref`::
+	(WARN) Found a dangling symref.
+
 `duplicateEntries`::
 	(ERROR) A tree contains duplicate file entries.
 
@@ -179,6 +185,9 @@ 
 `symlinkTargetMissing`::
 	(ERROR) Unable to read symbolic link target's blob.
 
+`trailingRefContent`::
+	(WARN) A ref content has trailing contents.
+
 `treeNotSorted`::
 	(ERROR) A tree is not properly sorted.
 
diff --git a/fsck.h b/fsck.h
index 1423a5e428..5a55a567b0 100644
--- a/fsck.h
+++ b/fsck.h
@@ -32,6 +32,7 @@  enum fsck_msg_type {
 	FUNC(BAD_OBJECT_SHA1, ERROR) \
 	FUNC(BAD_PARENT_SHA1, ERROR) \
 	FUNC(BAD_REF_NAME, ERROR) \
+	FUNC(BAD_REF_SHA, ERROR) \
 	FUNC(BAD_TIMEZONE, ERROR) \
 	FUNC(BAD_TREE, ERROR) \
 	FUNC(BAD_TREE_SHA1, ERROR) \
@@ -69,11 +70,13 @@  enum fsck_msg_type {
 	FUNC(SYMLINK_TARGET_BLOB, ERROR) \
 	/* warnings */ \
 	FUNC(EMPTY_NAME, WARN) \
+	FUNC(DANGLING_SYMREF, WARN) \
 	FUNC(FULL_PATHNAME, WARN) \
 	FUNC(HAS_DOT, WARN) \
 	FUNC(HAS_DOTDOT, WARN) \
 	FUNC(HAS_DOTGIT, WARN) \
 	FUNC(NULL_SHA1, WARN) \
+	FUNC(TRAILING_REF_CONTENT, WARN) \
 	FUNC(ZERO_PADDED_FILEMODE, WARN) \
 	FUNC(NUL_IN_COMMIT, WARN) \
 	FUNC(LARGE_PATHNAME, WARN) \
diff --git a/refs.c b/refs.c
index 0922439275..1325f83269 100644
--- a/refs.c
+++ b/refs.c
@@ -1744,7 +1744,7 @@  static int refs_read_special_head(struct ref_store *ref_store,
 	}
 
 	result = parse_loose_ref_contents(content.buf, oid, referent, type,
-					  failure_errno);
+					  failure_errno, NULL);
 
 done:
 	strbuf_release(&full_path);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 266f1ffe8a..17d3e433f1 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -549,7 +549,7 @@  static int read_ref_internal(struct ref_store *ref_store, const char *refname,
 	strbuf_rtrim(&sb_contents);
 	buf = sb_contents.buf;
 
-	ret = parse_loose_ref_contents(buf, oid, referent, type, &myerr);
+	ret = parse_loose_ref_contents(buf, oid, referent, type, &myerr, NULL);
 
 out:
 	if (ret && !myerr)
@@ -585,7 +585,7 @@  static int files_read_symbolic_ref(struct ref_store *ref_store, const char *refn
 
 int parse_loose_ref_contents(const char *buf, struct object_id *oid,
 			     struct strbuf *referent, unsigned int *type,
-			     int *failure_errno)
+			     int *failure_errno, unsigned int *trailing)
 {
 	const char *p;
 	if (skip_prefix(buf, "ref:", &buf)) {
@@ -607,6 +607,10 @@  int parse_loose_ref_contents(const char *buf, struct object_id *oid,
 		*failure_errno = EINVAL;
 		return -1;
 	}
+
+	if (trailing && (*p != '\0' && *p != '\n'))
+		*trailing = 1;
+
 	return 0;
 }
 
@@ -3432,6 +3436,121 @@  static int files_fsck_refs_name(struct fsck_refs_options *o,
 	return ret;
 }
 
+static int files_fsck_symref(struct fsck_refs_options *o,
+			     struct strbuf *refname,
+			     struct strbuf *path)
+{
+	struct stat st;
+	int ret = 0;
+
+	if (lstat(path->buf, &st) < 0) {
+		ret = fsck_refs_report(o, refname->buf,
+				       FSCK_MSG_DANGLING_SYMREF,
+				       "point to non-existent ref");
+		goto out;
+	}
+
+	if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)) {
+		ret = fsck_refs_report(o, refname->buf,
+				       FSCK_MSG_DANGLING_SYMREF,
+				       "point to invalid object");
+		goto out;
+	}
+out:
+	return ret;
+}
+
+static int files_fsck_refs_content(struct fsck_refs_options *o,
+				   const char *gitdir,
+				   const char *refs_check_dir,
+				   struct dir_iterator *iter)
+{
+	struct strbuf path = STRBUF_INIT,
+		      refname = STRBUF_INIT,
+		      ref_content = STRBUF_INIT,
+		      referent = STRBUF_INIT;
+	unsigned int trailing = 0;
+	int failure_errno = 0;
+	unsigned int type = 0;
+	struct object_id oid;
+	int ret = 0;
+
+	strbuf_addbuf(&path, &iter->path);
+	strbuf_addf(&refname, "%s/%s", refs_check_dir, iter->relative_path);
+
+	/*
+	 * If the file is a symlink, we need to only check the connectivity
+	 * of the destination object.
+	 */
+	if (S_ISLNK(iter->st.st_mode)) {
+		strbuf_strip_file_from_path(&path);
+		ret = strbuf_readlink(&ref_content,
+				      iter->path.buf, iter->st.st_size);
+		if (ret < 0) {
+			ret = error_errno(_("could not read link '%s'"),
+					  iter->path.buf);
+			goto clean;
+		}
+		strbuf_addbuf(&path, &ref_content);
+		strbuf_reset(&ref_content);
+
+		ret = files_fsck_symref(o, &refname, &path);
+		goto clean;
+	}
+
+	if (strbuf_read_file(&ref_content, path.buf, 0) < 0) {
+		ret = error_errno(_("%s/%s: unable to read the ref"),
+				  refs_check_dir, iter->relative_path);
+		goto clean;
+	}
+
+	if (parse_loose_ref_contents(ref_content.buf, &oid,
+				     &referent, &type,
+				     &failure_errno, &trailing)) {
+		ret = fsck_refs_report(o, refname.buf,
+				       FSCK_MSG_BAD_REF_SHA,
+				       "invalid ref content");
+		goto clean;
+	}
+
+	/*
+	 * If the ref is a symref, we need to check the destination name and
+	 * connectivity.
+	 */
+	if (referent.len && (type & REF_ISSYMREF)) {
+		strbuf_reset(&path);
+		strbuf_addf(&path, "%s/%s", gitdir, referent.buf);
+
+		if (check_refname_format(referent.buf, 0)) {
+			ret = fsck_refs_report(o, refname.buf,
+					       FSCK_MSG_DANGLING_SYMREF,
+					       "point to invalid refname");
+			goto clean;
+		}
+
+		ret = files_fsck_symref(o, &refname, &path);
+		goto clean;
+	} else {
+		/*
+		 * Only regular refs could have a trailing garbage. Should
+		 * be reported as a warning.
+		 */
+		if (trailing) {
+			ret = fsck_refs_report(o, refname.buf,
+					       FSCK_MSG_TRAILING_REF_CONTENT,
+					       "trailing garbage in ref");
+			goto clean;
+		}
+	}
+
+clean:
+	strbuf_release(&path);
+	strbuf_release(&refname);
+	strbuf_release(&ref_content);
+	strbuf_release(&referent);
+	return ret;
+}
+
 static int files_fsck_refs_dir(struct ref_store *ref_store,
 			       struct fsck_refs_options *o,
 			       const char *refs_check_dir,
@@ -3484,6 +3603,7 @@  static int files_fsck_refs(struct ref_store *ref_store,
 	int ret;
 	files_fsck_refs_fn fsck_refs_fns[]= {
 		files_fsck_refs_name,
+		files_fsck_refs_content,
 		NULL
 	};
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 8f42f21e77..eb3a7cdcc1 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -709,11 +709,12 @@  struct ref_store {
 
 /*
  * Parse contents of a loose ref file. *failure_errno maybe be set to EINVAL for
- * invalid contents.
+ * invalid contents. Also trailing is set to 1 when there is any bytes after the
+ * hex.
  */
 int parse_loose_ref_contents(const char *buf, struct object_id *oid,
 			     struct strbuf *referent, unsigned int *type,
-			     int *failure_errno);
+			     int *failure_errno, unsigned int *trailing);
 
 /*
  * Fill in the generic part of refs and add it to our collection of
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index b2db58d2c6..94cb93bf92 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -98,4 +98,114 @@  test_expect_success 'ref name check should be adapted into fsck messages' '
 	)
 '
 
+test_expect_success 'regular ref content should be checked' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	branch_dir_prefix=.git/refs/heads &&
+	tag_dir_prefix=.git/refs/tags &&
+	(
+		cd repo &&
+		git commit --allow-empty -m initial &&
+		git checkout -b branch-1 &&
+		git tag tag-1 &&
+		git commit --allow-empty -m second &&
+		git checkout -b branch-2 &&
+		git tag tag-2 &&
+		git checkout -b a/b/tag-2
+	) &&
+	(
+		cd repo &&
+		printf "%s garbage" "$(git rev-parse branch-1)" > $branch_dir_prefix/branch-1-garbage &&
+		git fsck 2>err &&
+		cat >expect <<-EOF &&
+		warning: refs/heads/branch-1-garbage: trailingRefContent: trailing garbage in ref
+		EOF
+		rm $branch_dir_prefix/branch-1-garbage &&
+		test_cmp expect err
+	) &&
+	(
+		cd repo &&
+		printf "%s garbage" "$(git rev-parse tag-1)" > $tag_dir_prefix/tag-1-garbage &&
+		test_must_fail git -c fsck.trailingRefContent=error fsck 2>err &&
+		cat >expect <<-EOF &&
+		error: refs/tags/tag-1-garbage: trailingRefContent: trailing garbage in ref
+		EOF
+		rm $tag_dir_prefix/tag-1-garbage &&
+		test_cmp expect err
+	) &&
+	(
+		cd repo &&
+		printf "%s    " "$(git rev-parse tag-2)" > $tag_dir_prefix/tag-2-garbage &&
+		git fsck 2>err &&
+		cat >expect <<-EOF &&
+		warning: refs/tags/tag-2-garbage: trailingRefContent: trailing garbage in ref
+		EOF
+		rm $tag_dir_prefix/tag-2-garbage &&
+		test_cmp expect err
+	) &&
+	(
+		cd repo &&
+		printf "xfsazqfxcadas" > $tag_dir_prefix/tag-2-bad &&
+		test_must_fail git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		error: refs/tags/tag-2-bad: badRefSha: invalid ref content
+		EOF
+		rm $tag_dir_prefix/tag-2-bad &&
+		test_cmp expect err
+	) &&
+	(
+		cd repo &&
+		printf "xfsazqfxcadas" > $branch_dir_prefix/a/b/branch-2-bad &&
+		test_must_fail git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		error: refs/heads/a/b/branch-2-bad: badRefSha: invalid ref content
+		EOF
+		rm $branch_dir_prefix/a/b/branch-2-bad &&
+		test_cmp expect err
+	)
+'
+
+test_expect_success 'symbolic ref content should be checked' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	branch_dir_prefix=.git/refs/heads &&
+	tag_dir_prefix=.git/refs/tags &&
+	(
+		cd repo &&
+		git commit --allow-empty -m initial &&
+		git checkout -b branch-1 &&
+		git tag tag-1
+	) &&
+	(
+		cd repo &&
+		printf "ref: refs/heads/branch-3" > $branch_dir_prefix/branch-2-bad &&
+		git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		warning: refs/heads/branch-2-bad: danglingSymref: point to non-existent ref
+		EOF
+		rm $branch_dir_prefix/branch-2-bad &&
+		test_cmp expect err
+	) &&
+	(
+		cd repo &&
+		printf "ref: refs/heads/.branch" > $branch_dir_prefix/branch-2-bad &&
+		git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		warning: refs/heads/branch-2-bad: danglingSymref: point to invalid refname
+		EOF
+		rm $branch_dir_prefix/branch-2-bad &&
+		test_cmp expect err
+	) &&
+	(
+		cd repo &&
+		printf "ref: refs/heads" > $branch_dir_prefix/branch-2-bad &&
+		git refs verify 2>err &&
+		cat >expect <<-EOF &&
+		warning: refs/heads/branch-2-bad: danglingSymref: point to invalid object
+		EOF
+		rm $branch_dir_prefix/branch-2-bad &&
+		test_cmp expect err
+	)
+'
+
 test_done