Message ID | 20230906060241.944886-2-sokcevic@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [diff-lib] Fix check_removed when fsmonitor is on | expand |
Josip Sokcevic <sokcevic@google.com> writes: > git-diff-index may return incorrect deleted entries when fsmonitor is used in a > repository with git submodules. This can be observed on Mac machines, but it > can affect all other supported platforms too. > > If fsmonitor is used, `stat *st` may not be initialized. Since `lstat` calls > aren't not desired when fsmonitor is on, skip the entire gitlink check using > the same condition used to initialize `stat *st`. Ah, that's a great catch. Thanks for narrowing it down to check_removed(). I see from "git blame" that this was introduced in 4f3d6d0261 (fsmonitor: skip lstat deletion check during git diff-index, 2021-03- 17). In that commit, subsequent code in check_removed() already used "st" when it was possibly uninitialized, so perhaps no one noticed it. I don't think anyone discussed it either on the mailing list thread that introduced this patch [1]. [1] https://lore.kernel.org/git/pull.903.git.1615760258.gitgitgadget@gmail.com/ Looking at this function in more detail, though, I see that the solution in this patch set is incomplete - both before and after this patch, there are code paths in which "st" is never initialized, but "st->mode" is used (not in check_removed(), but in its callers). I think a more complete solution needs to look something like this: diff --git a/diff-lib.c b/diff-lib.c index d8aa777a73..5637237003 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -39,10 +39,20 @@ static int check_removed(const struct index_state *istate, const struct cache_entry *ce, struct stat *st) { assert(is_fsmonitor_refreshed(istate)); - if (!(ce->ce_flags & CE_FSMONITOR_VALID) && lstat(ce->name, st) < 0) { - if (!is_missing_file_error(errno)) - return -1; - return 1; + if (ce->ce_flags & CE_FSMONITOR_VALID) { + /* + * Both check_removed() and its callers expect lstat() to have + * happened and, in particular, the st_mode field to be set. + * Simulate this with the contents of ce. + */ + memset(st, 0, sizeof(*st)); + st->st_mode = ce->ce_mode; + } else { + if (lstat(ce->name, st) < 0) { + if (!is_missing_file_error(errno)) + return -1; + return 1; + } } if (has_symlink_leading_path(ce->name, ce_namelen(ce))) return 1; Also, regarding the commit message, the title should be formatted like: diff-lib: simulate stat when using fsmonitor or something like that (file, then colon, then message). You'll also need a sign-off. See the [[sign-off]] section in Documentation/SubmittingPatches for more information. > diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh > index 0c241d6c14..894a80fbe8 100755 > --- a/t/t7527-builtin-fsmonitor.sh > +++ b/t/t7527-builtin-fsmonitor.sh > @@ -824,6 +824,10 @@ test_expect_success 'submodule always visited' ' > > create_super super && > create_sub sub && > + git -C super --no-optional-locks diff-index --name-status HEAD >actual.with && > + git -C super --no-optional-locks -c core.fsmonitor=false \ > + diff-index --name-status HEAD >actual.without && > + test_cmp actual.with actual.without && > > git -C super submodule add ../sub ./dir_1/dir_2/sub && > git -C super commit -m "add sub" && Any reason why these additions are here instead of where I put them (in my_match_and_clean())? I think it makes more sense where I put them, as then they would automatically apply to all tests. > @@ -837,6 +841,8 @@ test_expect_success 'submodule always visited' ' > # some dirt in the submodule and confirm matching output. > > # Completely clean status. > + echo Now running for clean status && > + > my_match_and_clean && > > # .M S..U You can remove this - I included this to show exactly where in the code the test fails.
diff --git a/diff-lib.c b/diff-lib.c index d8aa777a73..c67aa5ff89 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -46,7 +46,7 @@ static int check_removed(const struct index_state *istate, const struct cache_en } if (has_symlink_leading_path(ce->name, ce_namelen(ce))) return 1; - if (S_ISDIR(st->st_mode)) { + if (!(ce->ce_flags & CE_FSMONITOR_VALID) && S_ISDIR(st->st_mode)) { struct object_id sub; /* diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh index 0c241d6c14..894a80fbe8 100755 --- a/t/t7527-builtin-fsmonitor.sh +++ b/t/t7527-builtin-fsmonitor.sh @@ -824,6 +824,10 @@ test_expect_success 'submodule always visited' ' create_super super && create_sub sub && + git -C super --no-optional-locks diff-index --name-status HEAD >actual.with && + git -C super --no-optional-locks -c core.fsmonitor=false \ + diff-index --name-status HEAD >actual.without && + test_cmp actual.with actual.without && git -C super submodule add ../sub ./dir_1/dir_2/sub && git -C super commit -m "add sub" && @@ -837,6 +841,8 @@ test_expect_success 'submodule always visited' ' # some dirt in the submodule and confirm matching output. # Completely clean status. + echo Now running for clean status && + my_match_and_clean && # .M S..U