Message ID | 20240623214301.143796-1-abhijeet.nkt@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Series | describe: refresh the index when 'broken' flag is used |
This is my first ever patch submission and I am excited to contribute
even if it is such a little thing!

Please let me know of any etiquette or convention violations; I will do
my best to follow the conventions moving forward.

Thanks

On 24/06/24 03:12, Abhijeet Sonar wrote:
> When describe is run with 'dirty' flag, we refresh the index
> to make sure it is in sync with the filesystem before
> determining if the working tree is dirty.  However, this is
> not done for the codepath where the 'broken' flag is used.
>
> This causes `git describe --broken --dirty` to false
> positively report the worktree being dirty.  Refreshing the
> index before running diff-index fixes the problem.
>
> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
> Reported-by: Paul Millar <paul.millar@desy.de>
> Suggested-by: Junio C Hamano <gitster@pobox.com>
> ---
>  builtin/describe.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/builtin/describe.c b/builtin/describe.c
> index e5287eddf2..2b443c155e 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -645,6 +645,20 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
>  	if (argc == 0) {
>  		if (broken) {
>  			struct child_process cp = CHILD_PROCESS_INIT;
> +			struct lock_file index_lock = LOCK_INIT;
> +			int fd;
> +
> +			setup_work_tree();
> +			prepare_repo_settings(the_repository);
> +			repo_read_index(the_repository);
> +			refresh_index(the_repository->index, REFRESH_QUIET|REFRESH_UNMERGED,
> +				      NULL, NULL, NULL);
> +			fd = repo_hold_locked_index(the_repository,
> +						    &index_lock, 0);
> +			if (0 <= fd)
> +				repo_update_index_if_able(the_repository, &index_lock);
> +
> +
>  			strvec_pushv(&cp.args, diff_index_args);
>  			cp.git_cmd = 1;
>  			cp.no_stdin = 1;
Abhijeet Sonar <abhijeet.nkt@gmail.com> writes:

> When describe is run with 'dirty' flag, we refresh the index
> to make sure it is in sync with the filesystem before
> determining if the working tree is dirty.  However, this is
> not done for the codepath where the 'broken' flag is used.
>
> This causes `git describe --broken --dirty` to false
> positively report the worktree being dirty.  Refreshing the
> index before running diff-index fixes the problem.
>
> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
> Reported-by: Paul Millar <paul.millar@desy.de>
> Suggested-by: Junio C Hamano <gitster@pobox.com>
> ---
>  builtin/describe.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/builtin/describe.c b/builtin/describe.c
> index e5287eddf2..2b443c155e 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -645,6 +645,20 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
>  	if (argc == 0) {
>  		if (broken) {
>  			struct child_process cp = CHILD_PROCESS_INIT;
> +			struct lock_file index_lock = LOCK_INIT;
> +			int fd;
> +
> +			setup_work_tree();
> +			prepare_repo_settings(the_repository);
> +			repo_read_index(the_repository);
> +			refresh_index(the_repository->index, REFRESH_QUIET|REFRESH_UNMERGED,
> +				      NULL, NULL, NULL);
> +			fd = repo_hold_locked_index(the_repository,
> +						    &index_lock, 0);
> +			if (0 <= fd)
> +				repo_update_index_if_able(the_repository, &index_lock);
> +
> +
>

I'm wondering why this needs to be done, as I can see, when we use the
'--broken' flag, we create a child process to run `git diff-index
--quiet HEAD`. As such, we shouldn't have to refresh the index here.

Could you perhaps state how you can reproduce the issue mentioned?

Also apart from that, we should add a test to capture the changes.

>  			cp.git_cmd = 1;
>  			cp.no_stdin = 1;
> --
> 2.45.GIT
Hi Abhijeet and Karthik

On 24/06/2024 11:56, Karthik Nayak wrote:
> Abhijeet Sonar <abhijeet.nkt@gmail.com> writes:
>
>> When describe is run with 'dirty' flag, we refresh the index
>> to make sure it is in sync with the filesystem before
>> determining if the working tree is dirty.  However, this is
>> not done for the codepath where the 'broken' flag is used.
>>
>> This causes `git describe --broken --dirty` to false
>> positively report the worktree being dirty.  Refreshing the
>> index before running diff-index fixes the problem.

This is a good description of the problem the patch fixes.

>> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
>> Reported-by: Paul Millar <paul.millar@desy.de>
>> Suggested-by: Junio C Hamano <gitster@pobox.com>
>> ---
>>  builtin/describe.c | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/builtin/describe.c b/builtin/describe.c
>> index e5287eddf2..2b443c155e 100644
>> --- a/builtin/describe.c
>> +++ b/builtin/describe.c
>> @@ -645,6 +645,20 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
>>  	if (argc == 0) {
>>  		if (broken) {
>>  			struct child_process cp = CHILD_PROCESS_INIT;
>> +			struct lock_file index_lock = LOCK_INIT;
>> +			int fd;
>> +
>> +			setup_work_tree();
>> +			prepare_repo_settings(the_repository);
>> +			repo_read_index(the_repository);
>> +			refresh_index(the_repository->index, REFRESH_QUIET|REFRESH_UNMERGED,
>> +				      NULL, NULL, NULL);
>> +			fd = repo_hold_locked_index(the_repository,
>> +						    &index_lock, 0);
>> +			if (0 <= fd)
>> +				repo_update_index_if_able(the_repository, &index_lock);
>> +

As we're dealing with a repository that might be broken I suspect we'd
be better to run "git update-index --unmerged -q --refresh" as a
subprocess in the same way that we run "git diff-index" so that "git
describe --broken" does not die if the index cannot be refreshed.

> I'm wondering why this needs to be done, as I can see, when we use the
> '--broken' flag, we create a child process to run `git diff-index
> --quiet HEAD`. As such, we shouldn't have to refresh the index here.

"git diff-index" and "git diff-files" do not refresh the index. This is
by design so that a script can refresh the index once and run "git
diff-index" several times without wasting time updating the index each
time.

> Also apart from that, we should add a test to capture the changes.

That would be nice

Best Wishes

Phillip

>>  			cp.git_cmd = 1;
>>  			cp.no_stdin = 1;
>> --
>> 2.45.GIT
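The point about "git diff-index" not refreshing the index can be tried out in a scratch repository. The following is only a sketch: the temporary directory, the file name and the committer identity are placeholders, and it assumes a POSIX shell with git on the PATH. The file is rewritten with identical content so that only its stat data changes; diff-index is then expected to report a change until the index has been refreshed once.

```shell
#!/bin/sh
# Sketch: a stat-only change is reported by diff-index until the index
# is refreshed. Directory, file name and identity are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "Hello, world" >README
git add README
git -c user.name=Example -c user.email=example@example.com \
	commit -q -m "Initial commit"

# Rewrite the file with identical content; only its stat data changes.
cat README >README.tmp && mv README.tmp README

# diff-index compares against the stale stat data cached in the index.
if git diff-index --quiet HEAD; then before=clean; else before=dirty; fi

# Refresh once; later diff-index calls can reuse the refreshed index.
git update-index --unmerged -q --refresh
if git diff-index --quiet HEAD; then after=clean; else after=dirty; fi

echo "before refresh: $before"
echo "after refresh: $after"
```

The refresh is the same "git update-index --unmerged -q --refresh" invocation suggested above for the '--broken' codepath.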
On 24/06/24 16:50, Phillip Wood wrote:
> This is a good description of the problem the patch fixes.

Thanks!

> As we're dealing with a repository that might be broken I suspect we'd
> be better to run "git update-index --unmerged -q --refresh" as a
> subprocess in the same way that we run "git diff-index" so that "git
> describe --broken" does not die if the index cannot be refreshed.

I see, that makes sense. I will change it to launch `update-index` in a
sub-process instead.

>> Also apart from that, we should add a test to capture the changes.
> That would be nice

Got it, I will add some tests as well.

Thanks
I have a question: I would like to change the owner of a file in the test case I am writing -- an operation that requires super-user privileges. I am not sure if it is okay to do that in tests. Since that would require running tests with `sudo`. What would be the correct way to do this?
Abhijeet Sonar <abhijeet.nkt@gmail.com> writes:

> I would like to change the owner of a file
> in the test case I am writing -- an operation
> that requires super-user privileges. I am not
> sure if it is okay to do that in tests. Since
> that would require running tests with `sudo`.

What is the reason why you want to change the owner of a file in
your test?

If it is merely to make sure you cannot write to the .git/index
file, temporarily doing chmod of the .git directory in a test (with
POSIXPERM prerequisite) may be one way to do so, and you do not need
the second user on the system the test is running on.

Or if you pretend that you have a second process that is holding the
lock in .git/index by creating .git/index.lock file yourself, that
would also prevent your tested command from touching the index.

The latter approach would result in a test that may look like so (I
am writing this in my mail client, and I expect there may be some
fix ups needed):

    test_expect_success 'see what --broken does upon unwritable index' '
	test_when_finished "rm -f .git/index.lock" &&
	test_commit A A.file &&
	echo changed >>A.file &&

	>.git/index.lock &&
	test_must_fail git describe --dirty >actual 2>error &&
	test_grep "could not write index" error &&

	git describe --broken --dirty >actual 2>error &&
	test_grep ! "could not write index" error &&
	echo ...expected.describe.result... >expect &&
	test_cmp expect actual
    '

HTH.
> What is the reason why you want to change the owner of a file in
> your test?
>
> If it is merely to make sure you cannot write to the .git/index
> file, temporarily doing chmod of the .git directory in a test (with
> POSIXPERM prerequisite) may be one way to do so, and you do not need
> the second user on the system the test is running on.

I want to change the owner of a checked-in file and not the `.git`
directory. This is because of what you noted in an earlier message:

> As many attributes of each file (like the
> last modified timestamp and who owns the file) are recorded in the
> index for files that were verified to be unmodified (this is done so
> that by doing lstat() on a path and comparing the result with the
> information saved in the index, we can notice that the path was
> modified without actually opening the file and looking at the
> contents), after doing something (like "git diff") that causes this
> information updated while the files appear to be owned by you

Currently, `git describe --dirty --broken` reports the working tree as
dirty if you change the owner of a file. And as Phillip pointed out,
calling `git update-index --unmerged -q --refresh` to update the index
fixes this.

What I want to test looks something like this:

    # initially, the file is owned by a non-root user
    chown root file
    git describe --dirty --broken # incorrectly suffixes the output with '-dirty'

As mentioned earlier, the dirty suffix goes away if the index is
refreshed before running describe. This is what I really want to
assert -- that there is no '-dirty' suffix when the owner of a file is
changed.

This kind of simulates the scenario where `git describe` is run in a
docker container, as was originally reported by Paul:

> mkdir test-container
> cd test-container
> cat >Dockerfile <<EOF
> FROM docker.io/debian:bookworm-slim
> WORKDIR /work
> RUN apt-get update && apt-get -y install git
> EOF
> podman build -t test-image .
>
> mkdir test-repo
> cd test-repo
> git init
> echo "Hello, world" > README
> git add README
> git commit -m "Initial commit" README
> git tag v1.0.0
>
> git describe --tags --dirty --broken
>
> podman run -v `pwd`:/work --rm -it --entrypoint '["/usr/bin/git",
> "describe", "--tags", "--dirty", "--broken"]' test-image

Thanks
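The quoted explanation says that attributes such as the owner and the timestamps are recorded in the index. One way to look at that cached data is `git ls-files --debug`, which prints extra information about each cache entry. The sketch below uses a throwaway repository; the directory and file name are placeholders, and the exact layout of the debug output is not something to rely on.

```shell
#!/bin/sh
# Sketch: print the stat data cached in the index for one tracked file.
# The scratch directory and the file name are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "Hello, world" >README
git add README

# --debug adds the cached fields (ctime, mtime, dev, ino, uid, gid,
# size, flags) after the path name.
entry=$(git ls-files --debug README)
echo "$entry"
```

Because the uid is among the cached fields, the lstat() data seen inside the container (where the file appears to be owned by a different user) no longer matches the index, which is consistent with the report above.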
Abhijeet Sonar <abhijeet.nkt@gmail.com> writes:

> Currently, `git describe --dirty --broken` reports the working tree as
> dirty if you change the owner of a file. And as Phillip pointed out,
> calling `git update-index --unmerged -q --refresh` to update the index
> fixes this.

Starting from a clean state with a tracked file COPYING, I can do
this:

    $ git describe --dirty --broken
    v2.45.2-862-g39ba10deb2
    $ cat COPYING >RENAMING && mv RENAMING COPYING
    $ git diff-index --abbrev=8 HEAD
    :100644 100644 536e5552 00000000 M	COPYING
    $ git describe --dirty --broken
    v2.45.2-862-g39ba10deb2-dirty
    $ git describe --dirty
    v2.45.2-862-g39ba10deb2

This is with a version of Git _without_ your fix, i.e. the one whose
"describe --broken --dirty" does not do "git update-index --refresh".

In other words, the stat-only change that causes "diff-index" to
report a path as "suspected to be modified" does not have to be a
change of the file's owner. So I still do not understand why you
want a second user in this test.
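The same sequence can be replayed in a scratch repository instead of git.git. This is a sketch only: the directory, file, tag name and identity are placeholders, and the lightweight tag is why `--tags` is passed. The result of the `--broken` run depends on whether the git being used already refreshes the index on that codepath, so the script just prints both results.

```shell
#!/bin/sh
# Sketch: stat-only change, then describe with and without --broken.
# Directory, file, tag name and identity are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "Hello, world" >README
git add README
git -c user.name=Example -c user.email=example@example.com \
	commit -q -m "Initial commit"
git tag v1.0.0

# Same trick as above: identical content, different stat data.
cat README >README.tmp && mv README.tmp README

# Run --broken first; the plain --dirty run refreshes the index.
broken_out=$(git describe --tags --dirty --broken)
dirty_out=$(git describe --tags --dirty)

echo "--dirty --broken: $broken_out"
echo "--dirty:          $dirty_out"
```

A git without the fix is expected to add the '-dirty' suffix only in the first line of output; no second user is involved.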
> So I still do not understand why you
> want a second user in this test.

What I really wanted to do was closely mirror the environment in the
reproduction steps from the original bug report, which I figured
could be done by changing the owner to a second user.

On 24/06/24 23:28, Junio C Hamano wrote:
>     $ git describe --dirty --broken
>     v2.45.2-862-g39ba10deb2
>     $ cat COPYING >RENAMING && mv RENAMING COPYING
>     $ git diff-index --abbrev=8 HEAD
>     :100644 100644 536e5552 00000000 M	COPYING
>     $ git describe --dirty --broken
>     v2.45.2-862-g39ba10deb2-dirty
>     $ git describe --dirty
>     v2.45.2-862-g39ba10deb2

Thanks, I will use this in the tests.
Phillip Wood <phillip.wood123@gmail.com> writes:

> Hi Abhijeet and Karthik
>
> On 24/06/2024 11:56, Karthik Nayak wrote:
>> Abhijeet Sonar <abhijeet.nkt@gmail.com> writes:
>>
>>> When describe is run with 'dirty' flag, we refresh the index
>>> to make sure it is in sync with the filesystem before
>>> determining if the working tree is dirty.  However, this is
>>> not done for the codepath where the 'broken' flag is used.
>>>
>>> This causes `git describe --broken --dirty` to false
>>> positively report the worktree being dirty.  Refreshing the
>>> index before running diff-index fixes the problem.
>
> This is a good description of the problem the patch fixes.
>
>>> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
>>> Reported-by: Paul Millar <paul.millar@desy.de>
>>> Suggested-by: Junio C Hamano <gitster@pobox.com>
>>> ---
>>>  builtin/describe.c | 14 ++++++++++++++
>>>  1 file changed, 14 insertions(+)
>>>
>>> diff --git a/builtin/describe.c b/builtin/describe.c
>>> index e5287eddf2..2b443c155e 100644
>>> --- a/builtin/describe.c
>>> +++ b/builtin/describe.c
>>> @@ -645,6 +645,20 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
>>>  	if (argc == 0) {
>>>  		if (broken) {
>>>  			struct child_process cp = CHILD_PROCESS_INIT;
>>> +			struct lock_file index_lock = LOCK_INIT;
>>> +			int fd;
>>> +
>>> +			setup_work_tree();
>>> +			prepare_repo_settings(the_repository);
>>> +			repo_read_index(the_repository);
>>> +			refresh_index(the_repository->index, REFRESH_QUIET|REFRESH_UNMERGED,
>>> +				      NULL, NULL, NULL);
>>> +			fd = repo_hold_locked_index(the_repository,
>>> +						    &index_lock, 0);
>>> +			if (0 <= fd)
>>> +				repo_update_index_if_able(the_repository, &index_lock);
>>> +
>
> As we're dealing with a repository that might be broken I suspect we'd
> be better to run "git update-index --unmerged -q --refresh" as a
> subprocess in the same way that we run "git diff-index" so that "git
> describe --broken" does not die if the index cannot be refreshed.
>
>> I'm wondering why this needs to be done, as I can see, when we use the
>> '--broken' flag, we create a child process to run `git diff-index
>> --quiet HEAD`. As such, we shouldn't have to refresh the index here.
>
> "git diff-index" and "git diff-files" do not refresh the index. This is
> by design so that a script can refresh the index once and run "git
> diff-index" several times without wasting time updating the index each
> time.
>

I see. Thanks for correcting me!

>> Also apart from that, we should add a test to capture the changes.
>
> That would be nice
>
> Best Wishes
>
> Phillip
>
>
>>>  			cp.git_cmd = 1;
>>>  			cp.no_stdin = 1;
>>> --
>>> 2.45.GIT
diff --git a/builtin/describe.c b/builtin/describe.c
index e5287eddf2..2b443c155e 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -645,6 +645,20 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
 	if (argc == 0) {
 		if (broken) {
 			struct child_process cp = CHILD_PROCESS_INIT;
+			struct lock_file index_lock = LOCK_INIT;
+			int fd;
+
+			setup_work_tree();
+			prepare_repo_settings(the_repository);
+			repo_read_index(the_repository);
+			refresh_index(the_repository->index, REFRESH_QUIET|REFRESH_UNMERGED,
+				      NULL, NULL, NULL);
+			fd = repo_hold_locked_index(the_repository,
+						    &index_lock, 0);
+			if (0 <= fd)
+				repo_update_index_if_able(the_repository, &index_lock);
+
+
 			strvec_pushv(&cp.args, diff_index_args);
 			cp.git_cmd = 1;
 			cp.no_stdin = 1;
When describe is run with 'dirty' flag, we refresh the index
to make sure it is in sync with the filesystem before
determining if the working tree is dirty.  However, this is
not done for the codepath where the 'broken' flag is used.

This causes `git describe --broken --dirty` to false
positively report the worktree being dirty.  Refreshing the
index before running diff-index fixes the problem.

Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com>
Reported-by: Paul Millar <paul.millar@desy.de>
Suggested-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/describe.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)