Message ID | Z3qN6C2IpQTdVn_S@ArchLinux (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | add more ref consistency checks | expand |
shejialuo <shejialuo@gmail.com> writes: > Although "git-fsck(1)" and "packed-backend.c" will check some > consistency and correctness of "packed-refs" file, they never check the > filetype of the "packed-refs". The user should always use "git > packed-refs" command to create the raw regular "packed-refs" file, so we > need to explicitly check this in "git refs verify". > > Use "lstat" to check the file mode. If we cannot check the file status, > this is OK because there is a chance that there is no "packed-refs" in > the repo. > > Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to > the user if "packed-refs" is not a regular file. > > Mentored-by: Patrick Steinhardt <ps@pks.im> > Mentored-by: Karthik Nayak <karthik.188@gmail.com> > Signed-off-by: shejialuo <shejialuo@gmail.com> > --- > refs/packed-backend.c | 33 +++++++++++++++++++++++++++++---- > t/t0602-reffiles-fsck.sh | 20 ++++++++++++++++++++ > 2 files changed, 49 insertions(+), 4 deletions(-) > > diff --git a/refs/packed-backend.c b/refs/packed-backend.c > index 3406f1e71d..d9eb2f8b71 100644 > --- a/refs/packed-backend.c > +++ b/refs/packed-backend.c > @@ -4,6 +4,7 @@ > #include "../config.h" > #include "../dir.h" > #include "../gettext.h" > +#include "../fsck.h" > #include "../hash.h" > #include "../hex.h" > #include "../refs.h" > @@ -1747,15 +1748,39 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s > return empty_ref_iterator_begin(); > } > > -static int packed_fsck(struct ref_store *ref_store UNUSED, > - struct fsck_options *o UNUSED, > +static int packed_fsck(struct ref_store *ref_store, > + struct fsck_options *o, > struct worktree *wt) > { > + struct packed_ref_store *refs = packed_downcast(ref_store, > + REF_STORE_READ, "fsck"); > + struct stat st; > + int ret = 0; > > if (!is_main_worktree(wt)) > - return 0; > + goto cleanup; > > - return 0; > + /* > + * If the packed-refs file doesn't exist, there's nothing to > + * check. > + */ > + if (lstat(refs->path, &st) < 0) > + goto cleanup; Since `lstat` return '-1' for all errors, we should check that the `errno == ENOENT`. > + if (o->verbose) > + fprintf_ln(stderr, "Checking packed-refs file %s", refs->path); > + > + if (!S_ISREG(st.st_mode)) { > + struct fsck_ref_report report = { 0 }; > + report.path = "packed-refs"; > + > + ret = fsck_report_ref(o, &report, FSCK_MSG_BAD_REF_FILETYPE, > + "not a regular file"); > + goto cleanup; > + } > + > +cleanup: > + return ret; > } > > struct ref_storage_be refs_be_packed = { > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh > index 75f234a94a..307f94a3ca 100755 > --- a/t/t0602-reffiles-fsck.sh > +++ b/t/t0602-reffiles-fsck.sh > @@ -626,4 +626,24 @@ test_expect_success 'ref content checks should work with worktrees' ' > test_cmp expect err > ' > > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' ' > + test_when_finished "rm -rf repo" && > + git init repo && > + cd repo && This should be in a subshell, so that at the end we can actually remove the repo. This seems to be applicable to most of the other tests in this file too. Perhaps, we should clean it up as a precursor commit to this series? > + test_commit default && > + git branch branch-1 && > + git branch branch-2 && > + git branch branch-3 && > + git pack-refs --all && > + > + mv .git/packed-refs .git/packed-refs-back && > + ln -sf packed-refs-bak .git/packed-refs && This should be `ln -sf .git/packed-refs-back .git/packed-refs` no? > + test_must_fail git refs verify 2>err && > + cat >expect <<-EOF && > + error: packed-refs: badRefFiletype: not a regular file > + EOF > + rm .git/packed-refs && > + test_cmp expect err > +' > + > test_done > -- > 2.47.1
On Sun, Jan 05, 2025 at 09:49:28PM +0800, shejialuo wrote: > diff --git a/refs/packed-backend.c b/refs/packed-backend.c > index 3406f1e71d..d9eb2f8b71 100644 > --- a/refs/packed-backend.c > +++ b/refs/packed-backend.c > @@ -4,6 +4,7 @@ > #include "../config.h" > #include "../dir.h" > #include "../gettext.h" > +#include "../fsck.h" Let's keep the alphabetic ordering here. Other than that I have nothing to add on top of what Karthik mentioned already. Patrick
On Tue, Jan 07, 2025 at 08:33:56AM -0800, Karthik Nayak wrote: > shejialuo <shejialuo@gmail.com> writes: [snip] > > -static int packed_fsck(struct ref_store *ref_store UNUSED, > > - struct fsck_options *o UNUSED, > > +static int packed_fsck(struct ref_store *ref_store, > > + struct fsck_options *o, > > struct worktree *wt) > > { > > + struct packed_ref_store *refs = packed_downcast(ref_store, > > + REF_STORE_READ, "fsck"); > > + struct stat st; > > + int ret = 0; > > > > if (!is_main_worktree(wt)) > > - return 0; > > + goto cleanup; > > > > - return 0; > > + /* > > + * If the packed-refs file doesn't exist, there's nothing to > > + * check. > > + */ > > + if (lstat(refs->path, &st) < 0) > > + goto cleanup; > > Since `lstat` return '-1' for all errors, we should check that the > `errno == ENOENT`. > I agree here, if the reason is not "errno == ENOENT", we should report an error to the user. [snip] > > --- a/t/t0602-reffiles-fsck.sh > > +++ b/t/t0602-reffiles-fsck.sh > > @@ -626,4 +626,24 @@ test_expect_success 'ref content checks should work with worktrees' ' > > test_cmp expect err > > ' > > > > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' ' > > + test_when_finished "rm -rf repo" && > > + git init repo && > > + cd repo && > > This should be in a subshell, so that at the end we can actually remove > the repo. This seems to be applicable to most of the other tests in this > file too. Perhaps, we should clean it up as a precursor commit to this > series? I have searched the usage of "test_when_finished", and I don't know why we need to use subshell. Could you please explain this further here. > > > + test_commit default && > > + git branch branch-1 && > > + git branch branch-2 && > > + git branch branch-3 && > > + git pack-refs --all && > > + > > + mv .git/packed-refs .git/packed-refs-back && > > + ln -sf packed-refs-bak .git/packed-refs && > > This should be `ln -sf .git/packed-refs-back .git/packed-refs` no? > No. This should not be `ln -sf .git/packed-refs-back .git/packed-refs`. This is because it is a relative symlink. And the file ".git/packed-refs-back" and ".git/packed-refs" are in the same directory. So, from the perspective of ".git/packed-refs", it should be the "packed-refs-back". Thanks, Jialuo
On Fri, Jan 17, 2025 at 8:59 AM shejialuo <shejialuo@gmail.com> wrote: > On Tue, Jan 07, 2025 at 08:33:56AM -0800, Karthik Nayak wrote: > > shejialuo <shejialuo@gmail.com> writes: > > > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' ' > > > + test_when_finished "rm -rf repo" && > > > + git init repo && > > > + cd repo && > > > > This should be in a subshell, so that at the end we can actually remove > > the repo. This seems to be applicable to most of the other tests in this > > file too. Perhaps, we should clean it up as a precursor commit to this > > series? > > I have searched the usage of "test_when_finished", and I don't know why > we need to use subshell. Could you please explain this further here. Karthik may have been thinking about operating systems, such as Microsoft Windows, which won't allow a directory to be deleted if that directory is in use. In this case, because the test cd's into "repo" and never cd's elsewhere, the directory is still in use when test_when_finished() tries to delete "repo". However, there is an even more important reason to use a subshell, and that is because a subshell ensures that the current working directory is effectively restored to the path which was current before the cd command. This is important since it guarantees that subsequent tests will be run in the correct directory even if the preceding test bombed out part way through. For example: test_expect_success 'foo' ' git init repo && cd repo && ...some more commands... && cd .. ' If one of the commands in "...some more commands..." fails, then the `cd ..` will never be reached, and the current working directory will remain "repo" rather than reverting to the path prior to the cd command. Thus, any tests which follow this one in the script will end up running in the wrong directory. The proper way to protect against this is: test_expect_success 'foo' ' git init repo && ( cd repo && ...some more commands... ) ' Exiting the subshell will correctly restore the current working directory to the original path _regardless_ of whether the test succeeds or fails somewhere in "...some more commands...". Using a subshell also means that you don't have to manually restore the working directory via `cd ..` or similar.
On Fri, Jan 17, 2025 at 05:01:21PM -0500, Eric Sunshine wrote: > On Fri, Jan 17, 2025 at 8:59 AM shejialuo <shejialuo@gmail.com> wrote: > > On Tue, Jan 07, 2025 at 08:33:56AM -0800, Karthik Nayak wrote: > > > shejialuo <shejialuo@gmail.com> writes: > > > > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' ' > > > > + test_when_finished "rm -rf repo" && > > > > + git init repo && > > > > + cd repo && > > > > > > This should be in a subshell, so that at the end we can actually remove > > > the repo. This seems to be applicable to most of the other tests in this > > > file too. Perhaps, we should clean it up as a precursor commit to this > > > series? > > > > I have searched the usage of "test_when_finished", and I don't know why > > we need to use subshell. Could you please explain this further here. > > Karthik may have been thinking about operating systems, such as > Microsoft Windows, which won't allow a directory to be deleted if that > directory is in use. In this case, because the test cd's into "repo" > and never cd's elsewhere, the directory is still in use when > test_when_finished() tries to delete "repo". > > However, there is an even more important reason to use a subshell, and > that is because a subshell ensures that the current working directory > is effectively restored to the path which was current before the cd > command. This is important since it guarantees that subsequent tests > will be run in the correct directory even if the preceding test bombed > out part way through. For example: > > test_expect_success 'foo' ' > git init repo && > cd repo && > ...some more commands... && > cd .. > ' > > If one of the commands in "...some more commands..." fails, then the > `cd ..` will never be reached, and the current working directory will > remain "repo" rather than reverting to the path prior to the cd > command. Thus, any tests which follow this one in the script will end > up running in the wrong directory. The proper way to protect against > this is: > > test_expect_success 'foo' ' > git init repo && > ( > cd repo && > ...some more commands... > ) > ' > > Exiting the subshell will correctly restore the current working > directory to the original path _regardless_ of whether the test > succeeds or fails somewhere in "...some more commands...". Using a > subshell also means that you don't have to manually restore the > working directory via `cd ..` or similar. Thanks for above detailed explanation. I somehow understand why there would be so many "repo/repo/repo" when I execute the test. I have thought that `test_expect_success` command will make the environment of each test totally independent. I will improve this in the next version. Thanks, Jialuo
Eric Sunshine <sunshine@sunshineco.com> writes: > On Fri, Jan 17, 2025 at 8:59 AM shejialuo <shejialuo@gmail.com> wrote: >> On Tue, Jan 07, 2025 at 08:33:56AM -0800, Karthik Nayak wrote: >> > shejialuo <shejialuo@gmail.com> writes: >> > > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' ' >> > > + test_when_finished "rm -rf repo" && >> > > + git init repo && >> > > + cd repo && >> > >> > This should be in a subshell, so that at the end we can actually remove >> > the repo. This seems to be applicable to most of the other tests in this >> > file too. Perhaps, we should clean it up as a precursor commit to this >> > series? >> >> I have searched the usage of "test_when_finished", and I don't know why >> we need to use subshell. Could you please explain this further here. > > Karthik may have been thinking about operating systems, such as > Microsoft Windows, which won't allow a directory to be deleted if that > directory is in use. In this case, because the test cd's into "repo" > and never cd's elsewhere, the directory is still in use when > test_when_finished() tries to delete "repo". > I didn't know this either. I was mostly talking about what you mentioned below. > However, there is an even more important reason to use a subshell, and > that is because a subshell ensures that the current working directory > is effectively restored to the path which was current before the cd > command. This is important since it guarantees that subsequent tests > will be run in the correct directory even if the preceding test bombed > out part way through. For example: > > test_expect_success 'foo' ' > git init repo && > cd repo && > ...some more commands... && > cd .. > ' > > If one of the commands in "...some more commands..." fails, then the > `cd ..` will never be reached, and the current working directory will > remain "repo" rather than reverting to the path prior to the cd > command. Thus, any tests which follow this one in the script will end > up running in the wrong directory. The proper way to protect against > this is: > > test_expect_success 'foo' ' > git init repo && > ( > cd repo && > ...some more commands... > ) > ' > > Exiting the subshell will correctly restore the current working > directory to the original path _regardless_ of whether the test > succeeds or fails somewhere in "...some more commands...". Using a > subshell also means that you don't have to manually restore the > working directory via `cd ..` or similar. This is was a super nice explanation compared to my single sentence. Thanks!
diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 3406f1e71d..d9eb2f8b71 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -4,6 +4,7 @@ #include "../config.h" #include "../dir.h" #include "../gettext.h" +#include "../fsck.h" #include "../hash.h" #include "../hex.h" #include "../refs.h" @@ -1747,15 +1748,39 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s return empty_ref_iterator_begin(); } -static int packed_fsck(struct ref_store *ref_store UNUSED, - struct fsck_options *o UNUSED, +static int packed_fsck(struct ref_store *ref_store, + struct fsck_options *o, struct worktree *wt) { + struct packed_ref_store *refs = packed_downcast(ref_store, + REF_STORE_READ, "fsck"); + struct stat st; + int ret = 0; if (!is_main_worktree(wt)) - return 0; + goto cleanup; - return 0; + /* + * If the packed-refs file doesn't exist, there's nothing to + * check. + */ + if (lstat(refs->path, &st) < 0) + goto cleanup; + + if (o->verbose) + fprintf_ln(stderr, "Checking packed-refs file %s", refs->path); + + if (!S_ISREG(st.st_mode)) { + struct fsck_ref_report report = { 0 }; + report.path = "packed-refs"; + + ret = fsck_report_ref(o, &report, FSCK_MSG_BAD_REF_FILETYPE, + "not a regular file"); + goto cleanup; + } + +cleanup: + return ret; } struct ref_storage_be refs_be_packed = { diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh index 75f234a94a..307f94a3ca 100755 --- a/t/t0602-reffiles-fsck.sh +++ b/t/t0602-reffiles-fsck.sh @@ -626,4 +626,24 @@ test_expect_success 'ref content checks should work with worktrees' ' test_cmp expect err ' +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' ' + test_when_finished "rm -rf repo" && + git init repo && + cd repo && + test_commit default && + git branch branch-1 && + git branch branch-2 && + git branch branch-3 && + git pack-refs --all && + + mv .git/packed-refs .git/packed-refs-back && + ln -sf packed-refs-bak .git/packed-refs && + test_must_fail git refs verify 2>err && + cat >expect <<-EOF && + error: packed-refs: badRefFiletype: not a regular file + EOF + rm .git/packed-refs && + test_cmp expect err +' + test_done
Although "git-fsck(1)" and "packed-backend.c" will check some consistency and correctness of "packed-refs" file, they never check the filetype of the "packed-refs". The user should always use "git packed-refs" command to create the raw regular "packed-refs" file, so we need to explicitly check this in "git refs verify". Use "lstat" to check the file mode. If we cannot check the file status, this is OK because there is a chance that there is no "packed-refs" in the repo. Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to the user if "packed-refs" is not a regular file. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> --- refs/packed-backend.c | 33 +++++++++++++++++++++++++++++---- t/t0602-reffiles-fsck.sh | 20 ++++++++++++++++++++ 2 files changed, 49 insertions(+), 4 deletions(-)