Message ID | 20240531193156.28046-1-tboegi@web.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v4,1/1] macOS: ls-files path fails if path of workdir is NFD | expand |
tboegi@web.de writes: > The problem is the_repository->config->hash_initialized > is set to 1 before the_repository->commondir is set to ".git". > Due to this, .git/config is never read, and precomposed_unicode > is never set to 1 (remains -1). The "is never read" part is a bit confusing and misleading. If it were At the point of code flow where we would want to learn the value of precompose configuration, the local configuration has not been read. then I would understand it, though. Is the analysis telling us that we need to rethink the order of things setup_git_directory() does? Or is this inherently unsolvable because we need to discover the .git/ directory and path to it before we can read configuration to learn from it, but we need the value of precompose setting to compute the "path to it"? Presumably chdir() done in setup_discovered_git_dir() can be done with either NFC or NFD if the filesystem is squashing the differences between them, so perhaps doing the repo_set_worktree() done in setup_discovered_git_dir() is wrong, and we could delay populating the .worktree member until much later? For reading the local config, it does matter we know where the git-dir and common-dir are, but the location of worktree is immaterial. Anyway, thanks for a patch. Will queue. > run_builtin() { > setup_git_directory() { > strbuf_getcwd() { # setup.c:1542 > precompose_{strbuf,string}_if_needed() { > # precomposed_unicode is still -1 > git_congig_get_bool("core.precomposeunicode") { > git_config_check_init() { > repo_read_config() { > git_config_init() { > # !!! > the_repository->config->hash_initialized=1 > # !!! > } > # does not read .git/config since > # the_repository->commondir is still NULL > } > } > } > returns without converting to NFC > } > returns cwd in NFD > } > > setup_discovered_git_dir() { > set_git_work_tree(".") { > repo_set_worktree() { > # this function indirectly calls strbuf_getcwd() > # --> precompose_{strbuf,string}_if_needed() --> > # {git,repo}_config_get_bool("core.precomposeunicode"), > # but does not try to read .git/config since > # the_repository->config->hash_initialized > # is already set to 1 above. And it will not read > # .git/config even if hash_initialized is 0 > # since the_repository->commondir is still NULL. > > the_repository->worktree = NFD > } > } > } > > setup_git_env() { > repo_setup_gitdir() { > repo_set_commondir() { > # finally commondir is set here > the_repository->commondir = ".git" > } > } > } > > } // END setup_git_directory > > Reported-by: Jun T <takimoto-j@kba.biglobe.ne.jp> > Signed-off-by: Torsten Bögershausen <tboegi@web.de> > Signed-off-by: Junio C Hamano <gitster@pobox.com> > --- > setup.c | 2 +- > t/t3910-mac-os-precompose.sh | 39 +++++++++++++++++++++++++++++++++++- > 2 files changed, 39 insertions(+), 2 deletions(-) > > diff --git a/setup.c b/setup.c > index 2e607632db..61f61496ec 100644 > --- a/setup.c > +++ b/setup.c > @@ -48,7 +48,7 @@ static int abspath_part_inside_repo(char *path) > size_t wtlen; > char *path0; > int off; > - const char *work_tree = get_git_work_tree(); > + const char *work_tree = precompose_string_if_needed(get_git_work_tree()); > struct strbuf realpath = STRBUF_INIT; > > if (!work_tree) > diff --git a/t/t3910-mac-os-precompose.sh b/t/t3910-mac-os-precompose.sh > index 898267a6bd..6d5918c8fe 100755 > --- a/t/t3910-mac-os-precompose.sh > +++ b/t/t3910-mac-os-precompose.sh > @@ -37,6 +37,27 @@ Alongc=$Alongc$Alongc$Alongc$Alongc$Alongc #50 Byte > Alongc=$Alongc$Alongc$Alongc$Alongc$Alongc #250 Byte > Alongc=$Alongc$AEligatu$AEligatu #254 Byte > > + > +ls_files_nfc_nfd () { > + test_when_finished "git config --global --unset core.precomposeunicode" && > + prglbl=$1 > + prlocl=$2 > + aumlcreat=$3 > + aumllist=$4 > + git config --global core.precomposeunicode $prglbl && > + ( > + rm -rf .git && > + mkdir -p "somewhere/$prglbl/$prlocl/$aumlcreat" && > + mypwd=$PWD && > + cd "somewhere/$prglbl/$prlocl/$aumlcreat" && > + git init && > + git config core.precomposeunicode $prlocl && > + git --literal-pathspecs ls-files "$mypwd/somewhere/$prglbl/$prlocl/$aumllist" 2>err && > + >expected && > + test_cmp expected err > + ) > +} > + > test_expect_success "detect if nfd needed" ' > precomposeunicode=$(git config core.precomposeunicode) && > test "$precomposeunicode" = true && > @@ -211,8 +232,8 @@ test_expect_success "unicode decomposed: git restore -p . " ' > ' > > # Test if the global core.precomposeunicode stops autosensing > -# Must be the last test case > test_expect_success "respect git config --global core.precomposeunicode" ' > + test_when_finished "git config --global --unset core.precomposeunicode" && > git config --global core.precomposeunicode true && > rm -rf .git && > git init && > @@ -220,4 +241,20 @@ test_expect_success "respect git config --global core.precomposeunicode" ' > test "$precomposeunicode" = "true" > ' > > +test_expect_success "ls-files false false nfd nfd" ' > + ls_files_nfc_nfd false false $Adiarnfd $Adiarnfd > +' > + > +test_expect_success "ls-files false true nfd nfd" ' > + ls_files_nfc_nfd false true $Adiarnfd $Adiarnfd > +' > + > +test_expect_success "ls-files true false nfd nfd" ' > + ls_files_nfc_nfd true false $Adiarnfd $Adiarnfd > +' > + > +test_expect_success "ls-files true true nfd nfd" ' > + ls_files_nfc_nfd true true $Adiarnfd $Adiarnfd > +' > + > test_done > -- > 2.41.0.394.ge43f4fd0bd
On Sat, Jun 01, 2024 at 08:55:03AM -0700, Junio C Hamano wrote: > tboegi@web.de writes: > > > The problem is the_repository->config->hash_initialized > > is set to 1 before the_repository->commondir is set to ".git". > > Due to this, .git/config is never read, and precomposed_unicode > > is never set to 1 (remains -1). > > The "is never read" part is a bit confusing and misleading. If it > were > > At the point of code flow where we would want to learn the value > of precompose configuration, the local configuration has not > been read. > > then I would understand it, though. Yes, the thing is that it has not been read, and will never been read, in this very very usage of Git. > > Is the analysis telling us that we need to rethink the order of > things setup_git_directory() does? Or is this inherently unsolvable > because we need to discover the .git/ directory and path to it > before we can read configuration to learn from it, but we need the > value of precompose setting to compute the "path to it"? I think it is solvable, see below. > > Presumably chdir() done in setup_discovered_git_dir() can be done > with either NFC or NFD if the filesystem is squashing the > differences between them, so perhaps doing the repo_set_worktree() > done in setup_discovered_git_dir() is wrong, and we could delay > populating the .worktree member until much later? For reading the > local config, it does matter we know where the git-dir and > common-dir are, but the location of worktree is immaterial. > > Anyway, thanks for a patch. Will queue. My understanding is that detecting the .git/ dir and then reading the config file from here does not work here. That may be better documented in this very commit message, please feel free to ammend it. The root cause may be fixed in a different commit later. [snip] The best info that I have at the moment is the call stack analysis done by Jun. T, where the global core.precomposeunicode is not set, and reading the local one "fails". I hope this is a useful repetition: The following is some info I got (hope it is correct and useful), but I have no idea how to fix the problem. precompose_string_if_needed() works only if: precomposed_unicode is already set to 1, or git_config_get_bool("core.precomposeunicode") sets it to 1. But git_config_get_bool() reads the file .git/config only if: the_repository->commondir is already set to ".git". Back trace when the strbuf_getcwd() is called for the 3rd time is (frame #4 is set_git_work_tree()): * frame #0: git`strbuf_getcwd(sb=0x00007ff7bfeff0a8) at strbuf.c:588:20 frame #1: git`strbuf_realpath_1(resolved=0x00007ff7bfeff0a8, path=".", flags=2) at abspath.c:101:7 frame #2: git`strbuf_realpath(resolved=0x00007ff7bfeff0a8, path=".", die_on_error=1) at abspath.c:219:9 frame #3: git`real_pathdup(path=".", die_on_error=1) at abspath.c:240:6 frame #4: git`repo_set_worktree(repo=0x000000010044eb98, path=".") at repository.c:145:19 frame #5: git`set_git_work_tree(new_work_tree=".") at environment.c:278:2 frame #6: git`setup_discovered_git_dir(gitdir=".git", cwd=0x0000000100435238, offset=16, repo_fmt=0x00007ff7bfeff1d8, nongit_ok=0x0000000000000000) at setup.c:1119:2 frame #7: git`setup_git_directory_gently(nongit_ok=0x0000000000000000) at setup.c:1606:12 frame #8: git`setup_git_directory at setup.c:1815:9 frame #9: git`run_builtin(p=0x0000000100424d58, argc=2, argv=0x00007ff7bfeff6d8) at git.c:448:12 frame #10: git`handle_builtin(argc=2, argv=0x00007ff7bfeff6d8) at git.c:729:3 frame #11: git`run_argv(argcp=0x00007ff7bfeff54c, argv=0x00007ff7bfeff540) at git.c:793:4 frame #12: git`cmd_main(argc=2, argv=0x00007ff7bfeff6d8) at git.c:928:19 frame #13: git`main(argc=3, argv=0x00007ff7bfeff6d0) at common-main.c:62:11 At this point, precomposed_unicode is still -1 and the_repository->commondir is still NULL. This means strbuf_getcwd() retuns NFD, and the_repository->worktree is set to NFD. Moreover, precompose_string_if_needed() calls git_config_get_bool("core.precomposeunicode"), and this function indirecly sets the_repository->config->hash_initialized = 1 Later setup_git_directory_gently() (frame #7) calls setup_git_env() --> repo_set_gitdir() --> repo_set_commondir() and the_repository->commondir is now set to ".git". Then run_builtin() (frame #10) calls precompose_argv_prefix() --> precompose_string_if_needed(). Here we have precomposed_unicode = -1 the_repository->config->hash_initialized = 1 This means git_config_check_init() does not read .git/config (does not call repo_read_config()) even if the_repository->commondir is set to ".git", and precomposed_unicode is not set to 1. [snip]
> 2024/06/01 4:3, tboegi@web.de <mailto:tboegi@web.de> wrote: > > The solution is to add a call to precompose_string_if_needed() > to this code in setup.c : > `work_tree = precompose_string_if_needed(get_git_work_tree());` This simple patch works for both 'ls-files NFD' and 'ls-files NFC'. > > There is, however, a limitation with this very usage of Git: > The (repo) local .gitconfig file is not used, core.precomposeunicode in .git/config is read, in function precompose_argv_prefix(), and NFD in argv is converted to NFC. But, as you know, the variable the_repository->worktree or the return value of get_git_work_tree() is in NFD. > 2024/06/03 4:40, Torsten Bögershausen <tboegi@web.de> wrote: > > The root cause may be fixed in a different commit later. Of course fixing the 'root cause' is better, but I don't know whether it's easy or not. Anyway, thank you for woking on this problem. -- Jun
diff --git a/setup.c b/setup.c index 2e607632db..61f61496ec 100644 --- a/setup.c +++ b/setup.c @@ -48,7 +48,7 @@ static int abspath_part_inside_repo(char *path) size_t wtlen; char *path0; int off; - const char *work_tree = get_git_work_tree(); + const char *work_tree = precompose_string_if_needed(get_git_work_tree()); struct strbuf realpath = STRBUF_INIT; if (!work_tree) diff --git a/t/t3910-mac-os-precompose.sh b/t/t3910-mac-os-precompose.sh index 898267a6bd..6d5918c8fe 100755 --- a/t/t3910-mac-os-precompose.sh +++ b/t/t3910-mac-os-precompose.sh @@ -37,6 +37,27 @@ Alongc=$Alongc$Alongc$Alongc$Alongc$Alongc #50 Byte Alongc=$Alongc$Alongc$Alongc$Alongc$Alongc #250 Byte Alongc=$Alongc$AEligatu$AEligatu #254 Byte + +ls_files_nfc_nfd () { + test_when_finished "git config --global --unset core.precomposeunicode" && + prglbl=$1 + prlocl=$2 + aumlcreat=$3 + aumllist=$4 + git config --global core.precomposeunicode $prglbl && + ( + rm -rf .git && + mkdir -p "somewhere/$prglbl/$prlocl/$aumlcreat" && + mypwd=$PWD && + cd "somewhere/$prglbl/$prlocl/$aumlcreat" && + git init && + git config core.precomposeunicode $prlocl && + git --literal-pathspecs ls-files "$mypwd/somewhere/$prglbl/$prlocl/$aumllist" 2>err && + >expected && + test_cmp expected err + ) +} + test_expect_success "detect if nfd needed" ' precomposeunicode=$(git config core.precomposeunicode) && test "$precomposeunicode" = true && @@ -211,8 +232,8 @@ test_expect_success "unicode decomposed: git restore -p . " ' ' # Test if the global core.precomposeunicode stops autosensing -# Must be the last test case test_expect_success "respect git config --global core.precomposeunicode" ' + test_when_finished "git config --global --unset core.precomposeunicode" && git config --global core.precomposeunicode true && rm -rf .git && git init && @@ -220,4 +241,20 @@ test_expect_success "respect git config --global core.precomposeunicode" ' test "$precomposeunicode" = "true" ' +test_expect_success "ls-files false false nfd nfd" ' + ls_files_nfc_nfd false false $Adiarnfd $Adiarnfd +' + +test_expect_success "ls-files false true nfd nfd" ' + ls_files_nfc_nfd false true $Adiarnfd $Adiarnfd +' + +test_expect_success "ls-files true false nfd nfd" ' + ls_files_nfc_nfd true false $Adiarnfd $Adiarnfd +' + +test_expect_success "ls-files true true nfd nfd" ' + ls_files_nfc_nfd true true $Adiarnfd $Adiarnfd +' + test_done