Message ID | 20210819200953.2105230-3-emilyshaffer@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | cache parent project's gitdir in submodules |
On 8/19/2021 4:09 PM, Emily Shaffer wrote: ... > +submodule.superprojectGitDir:: > + The relative path from the submodule's worktree to its superproject's > + gitdir. When Git is run in a repository, it usually makes no difference > + whether this repository is standalone or a submodule, but if this > + configuration variable is present, additional behavior may be possible, > + such as "git status" printing additional information about this > + submodule's status with respect to its superproject. This config should > + only be present in projects which are submodules, but is not guaranteed > + to be present in every submodule, so only optional value-added behavior > + should be linked to it. It is set automatically during > + submodule creation. > ++ > + Because of this configuration variable, it is forbidden to use the > + same submodule worktree shared by multiple superprojects. nit: this paragraph linked with the "+" line should have no tabbing. Also, could we use the same submodule worktree for multiple superprojects _before_ this configuration variable? That seems wild to me. Or, is that not a new requirement? Perhaps you mean something like this instead: It is forbidden to use the same submodule worktree for multiple superprojects, so this configuration variable stores the unique superproject and is not multi-valued. > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c > index d55f6262e9..d60fcd2c7d 100644 > --- a/builtin/submodule--helper.c > +++ b/builtin/submodule--helper.c > @@ -1910,6 +1910,10 @@ static int module_clone(int argc, const char **argv, const char *prefix) > git_config_set_in_file(p, "submodule.alternateErrorStrategy", > error_strategy); > > + git_config_set_in_file(p, "submodule.superprojectGitdir", > + relative_path(absolute_path(get_git_dir()), > + path, &sb)); > + I see that all new submodules will have this configuration set. 
But we will also live in a world where some existing submodules do not have this variable set. I'll look elsewhere for compatibility checks. > inspect() { > - dir=$1 && > - > - git -C "$dir" for-each-ref --format='%(refname)' 'refs/heads/*' >heads && > - { git -C "$dir" symbolic-ref HEAD || :; } >head && > - git -C "$dir" rev-parse HEAD >head-sha1 && > - git -C "$dir" update-index --refresh && > - git -C "$dir" diff-files --exit-code && > - git -C "$dir" clean -n -d -x >untracked > + sub_dir=$1 && > + super_dir=$2 && > + > + git -C "$sub_dir" for-each-ref --format='%(refname)' 'refs/heads/*' >heads && > + { git -C "$sub_dir" symbolic-ref HEAD || :; } >head && > + git -C "$sub_dir" rev-parse HEAD >head-sha1 && > + git -C "$sub_dir" update-index --refresh && > + git -C "$sub_dir" diff-files --exit-code && > + cached_super_dir="$(git -C "$sub_dir" config --get submodule.superprojectGitDir)" && > + [ "$(git -C "$super_dir" rev-parse --absolute-git-dir)" \ > + -ef "$sub_dir/$cached_super_dir" ] && > + git -C "$sub_dir" clean -n -d -x >untracked You rewrote this test in the previous patch, and now every line is changed because you renamed 'dir' to 'sub_dir'. Could the previous patch use 'sub_dir' from the start so this change only shows the new lines instead of many edited lines? > } > > test_expect_success 'submodule add' ' > @@ -138,7 +142,7 @@ test_expect_success 'submodule add' ' > ) && > > rm -f heads head untracked && > - inspect addtest/submod && > + inspect addtest/submod addtest && Similarly, I would not be upset to see these lines be changed just the once, even if the second argument is ignored for a single commit. But this nitpick is definitely less important since I could see taste swaying things either way. Thanks, -Stolee
Emily Shaffer <emilyshaffer@google.com> wrote: > > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c > index d55f6262e9..d60fcd2c7d 100644 > --- a/builtin/submodule--helper.c > +++ b/builtin/submodule--helper.c > @@ -1910,6 +1910,10 @@ static int module_clone(int argc, const char **argv, const char *prefix) > git_config_set_in_file(p, "submodule.alternateErrorStrategy", > error_strategy); > > + git_config_set_in_file(p, "submodule.superprojectGitdir", > + relative_path(absolute_path(get_git_dir()), > + path, &sb)); This will be executed when cloning a submodule with `git submodule add <url/path> <path>`. Do we also want to set submodule.superprojectGitdir when adding a repository that already exists in the working tree as a submodule? I.e., something like: git init super git init super/sub [ make commits in super/sub ] git -C super submodule add ./sub I don't know if this workflow is so commonly used, though... It may not be worth the additional work. Another option, which I believe was suggested by Jonathan Nieder on the Review Club, is to change the code to absorb the gitdir when adding the local submodule. Then, the configuration would already be set by the `absorb_git_dir...()` function itself. 
> free(sm_alternate); > free(error_strategy); > > diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh > index 4bc6b6c886..e407329d81 100755 > --- a/t/t7400-submodule-basic.sh > +++ b/t/t7400-submodule-basic.sh > @@ -108,14 +108,18 @@ test_expect_success 'setup - repository to add submodules to' ' > submodurl=$(pwd -P) > > inspect() { > - dir=$1 && > - > - git -C "$dir" for-each-ref --format='%(refname)' 'refs/heads/*' >heads && > - { git -C "$dir" symbolic-ref HEAD || :; } >head && > - git -C "$dir" rev-parse HEAD >head-sha1 && > - git -C "$dir" update-index --refresh && > - git -C "$dir" diff-files --exit-code && > - git -C "$dir" clean -n -d -x >untracked > + sub_dir=$1 && > + super_dir=$2 && > + > + git -C "$sub_dir" for-each-ref --format='%(refname)' 'refs/heads/*' >heads && > + { git -C "$sub_dir" symbolic-ref HEAD || :; } >head && > + git -C "$sub_dir" rev-parse HEAD >head-sha1 && > + git -C "$sub_dir" update-index --refresh && > + git -C "$sub_dir" diff-files --exit-code && > + cached_super_dir="$(git -C "$sub_dir" config --get submodule.superprojectGitDir)" && > + [ "$(git -C "$super_dir" rev-parse --absolute-git-dir)" \ > + -ef "$sub_dir/$cached_super_dir" ] && To avoid the non-POSIX `-ef`, we could perhaps do something like: super_gitdir="$(git -C "$super_dir" rev-parse --absolute-git-dir)" && cached_gitdir="$(git -C "$sub_dir" config --get submodule.superprojectGitDir)" && test "$cached_gitdir" = "$(test-tool path-utils relative_path "$super_gitdir" "$PWD/$sub_dir")" && (We need the "$PWD/" at the last command because `path.c:relative_path()` returns the first argument as-is when one of the two paths given to it is absolute and the other is not.) One bonus of testing the cached path this way is that we also check that it is indeed being stored as a relative path :)
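Outside Git's test suite, where `test-tool` is not available, one portable way to ask "do these two paths name the same directory?" without `-ef` is to canonicalize both with `cd` and `pwd -P` and compare the strings. The sketch below is only an illustration: `same_dir` is a hypothetical helper, not part of Git, and the directory layout is made up. Unlike the `test-tool` approach above, it does not verify that the cached value is stored as a relative path.

```shell
# same_dir A B: succeed when A and B resolve to the same physical directory.
# Hypothetical helper for illustration; it is not part of Git's test library.
same_dir() {
	a=$(cd "$1" 2>/dev/null && pwd -P) || return 1
	b=$(cd "$2" 2>/dev/null && pwd -P) || return 1
	test "$a" = "$b"
}

# Made-up layout standing in for a superproject and its submodule worktree.
tmp=$(mktemp -d)
mkdir -p "$tmp/super/.git" "$tmp/super/sub"

# Stand-in for the value read from submodule.superprojectGitDir.
cached=../.git

if same_dir "$tmp/super/.git" "$tmp/super/sub/$cached"
then
	echo "cached path points at the superproject gitdir"
fi
```

Each `cd` runs inside a command substitution, so the caller's working directory is left untouched.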
On Thu, Aug 19, 2021 at 08:38:19PM -0400, Derrick Stolee wrote: > > On 8/19/2021 4:09 PM, Emily Shaffer wrote: > ... > > +submodule.superprojectGitDir:: > > + The relative path from the submodule's worktree to its superproject's > > + gitdir. When Git is run in a repository, it usually makes no difference > > + whether this repository is standalone or a submodule, but if this > > + configuration variable is present, additional behavior may be possible, > > + such as "git status" printing additional information about this > > + submodule's status with respect to its superproject. This config should > > + only be present in projects which are submodules, but is not guaranteed > > + to be present in every submodule, so only optional value-added behavior > > + should be linked to it. It is set automatically during > > + submodule creation. > > ++ > > + Because of this configuration variable, it is forbidden to use the > > + same submodule worktree shared by multiple superprojects. > > nit: this paragraph linked with the "+" line should have no tabbing. Done. > > Also, could we use the same submodule worktree for multiple superprojects > _before_ this configuration variable? That seems wild to me. Or, is that > not a new requirement? I guess it'd be possible to do something pretty evil with symlinks? I'm not sure why you would want to, though. But now that I think about it more, I'm not sure that it would work, at least if we understand submodule to mean "...and the gitdir lives in .git/modules/ of the superproject". If superA contains sub and superB contains a symlink to 'sub''s worktree in superA, then wouldn't superA and superB both be trying to contain their own gitdirs for sub? And having multiple gitdirs for a worktree is an unacceptable state anyway. Or maybe the issue is more like: you have super, which contains sub, and you have super-wt, which is a worktree of super; to avoid duplicating sub, you decided to use a symlink. 
So there's only one sub gitdir, and only one super gitdir. It's a little awkward, but since submodule worktrees aren't currently supported, I can see the appeal. In this configuration, a path from submodule *worktree* to superproject gitdir, which is what v3 and earlier propose, would be broken in one superproject worktree or the other.

But as I'm proposing in v4, folks in the review club pointed out to me that a pointer from gitdir to gitdir makes more sense - and that would fix this concern, too, because sub and the symlink of sub would both share a single gitdir, and that gitdir would point to the single gitdir of super and super-wt.

All a long way to say: I think v4 will fix it by originating the relative path from submodule gitdir, instead. And I will remove the extra paragraph - I think it is just adding confusion around a case that nobody would really want to use...

> Perhaps you mean something like this instead:
>
>   It is forbidden to use the same submodule worktree for multiple
>   superprojects, so this configuration variable stores the unique
>   superproject and is not multi-valued.
> > > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c > > index d55f6262e9..d60fcd2c7d 100644 > > --- a/builtin/submodule--helper.c > > +++ b/builtin/submodule--helper.c > > @@ -1910,6 +1910,10 @@ static int module_clone(int argc, const char **argv, const char *prefix) > > git_config_set_in_file(p, "submodule.alternateErrorStrategy", > > error_strategy); > > > > + git_config_set_in_file(p, "submodule.superprojectGitdir", > > + relative_path(absolute_path(get_git_dir()), > > + path, &sb)); > > + > > I see that all new submodules will have this configuration set. But we will > also live in a world where some existing submodules do not have this variable > set. I'll look elsewhere for compatibility checks. Yep, the series intended to add them piecemeal where possible, over the course of a handful of commits. > > > inspect() { > > - dir=$1 && > > - > > - git -C "$dir" for-each-ref --format='%(refname)' 'refs/heads/*' >heads && > > - { git -C "$dir" symbolic-ref HEAD || :; } >head && > > - git -C "$dir" rev-parse HEAD >head-sha1 && > > - git -C "$dir" update-index --refresh && > > - git -C "$dir" diff-files --exit-code && > > - git -C "$dir" clean -n -d -x >untracked > > + sub_dir=$1 && > > + super_dir=$2 && > > + > > + git -C "$sub_dir" for-each-ref --format='%(refname)' 'refs/heads/*' >heads && > > + { git -C "$sub_dir" symbolic-ref HEAD || :; } >head && > > + git -C "$sub_dir" rev-parse HEAD >head-sha1 && > > + git -C "$sub_dir" update-index --refresh && > > + git -C "$sub_dir" diff-files --exit-code && > > + cached_super_dir="$(git -C "$sub_dir" config --get submodule.superprojectGitDir)" && > > + [ "$(git -C "$super_dir" rev-parse --absolute-git-dir)" \ > > + -ef "$sub_dir/$cached_super_dir" ] && > > + git -C "$sub_dir" clean -n -d -x >untracked > > You rewrote this test in the previous patch, and now every line is changed > because you renamed 'dir' to 'sub_dir'. 
Could the previous patch use > 'sub_dir' from the start so this change only shows the new lines instead of > many edited lines? Sure. > > > } > > > > test_expect_success 'submodule add' ' > > @@ -138,7 +142,7 @@ test_expect_success 'submodule add' ' > > ) && > > > > rm -f heads head untracked && > > - inspect addtest/submod && > > + inspect addtest/submod addtest && > > Similarly, I would not be upset to see these lines be changed just the > once, even if the second argument is ignored for a single commit. But > this nitpick is definitely less important since I could see taste > swaying things either way. I feel less interested in that nit; I think a mechanical "strip the useless arg" change + a mechanical "add an unrelated useful arg" change is easier to review than doing both at once. - Emily
On Sat, Sep 04, 2021 at 02:20:51PM -0300, Matheus Tavares wrote: > > Emily Shaffer <emilyshaffer@google.com> wrote: > > > > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c > > index d55f6262e9..d60fcd2c7d 100644 > > --- a/builtin/submodule--helper.c > > +++ b/builtin/submodule--helper.c > > @@ -1910,6 +1910,10 @@ static int module_clone(int argc, const char **argv, const char *prefix) > > git_config_set_in_file(p, "submodule.alternateErrorStrategy", > > error_strategy); > > > > + git_config_set_in_file(p, "submodule.superprojectGitdir", > > + relative_path(absolute_path(get_git_dir()), > > + path, &sb)); > > This will be executed when cloning a submodule with > `git submodule add <url/path> <path>`. Do we also want to set > submodule.superprojectGitdir when adding a repository that already exists in > the working tree as a submodule? I.e., something like: > > git init super > git init super/sub > [ make commits in super/sub ] > git -C super submodule add ./sub > > I don't know if this workflow is so commonly used, though... It may not be > worth the additional work. Yeah, I think it is covered in the next patch with 'git submodule absorbgitdirs'. > > Another option, which I believe was suggested by Jonathan Nieder on the Review > Club, is to change the code to absorb the gitdir when adding the local > submodule. Then, the configuration would already be set by the > `absorb_git_dir...()` function itself. 
> > > free(sm_alternate); > > free(error_strategy); > > > > diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh > > index 4bc6b6c886..e407329d81 100755 > > --- a/t/t7400-submodule-basic.sh > > +++ b/t/t7400-submodule-basic.sh > > @@ -108,14 +108,18 @@ test_expect_success 'setup - repository to add submodules to' ' > > submodurl=$(pwd -P) > > > > inspect() { > > - dir=$1 && > > - > > - git -C "$dir" for-each-ref --format='%(refname)' 'refs/heads/*' >heads && > > - { git -C "$dir" symbolic-ref HEAD || :; } >head && > > - git -C "$dir" rev-parse HEAD >head-sha1 && > > - git -C "$dir" update-index --refresh && > > - git -C "$dir" diff-files --exit-code && > > - git -C "$dir" clean -n -d -x >untracked > > + sub_dir=$1 && > > + super_dir=$2 && > > + > > + git -C "$sub_dir" for-each-ref --format='%(refname)' 'refs/heads/*' >heads && > > + { git -C "$sub_dir" symbolic-ref HEAD || :; } >head && > > + git -C "$sub_dir" rev-parse HEAD >head-sha1 && > > + git -C "$sub_dir" update-index --refresh && > > + git -C "$sub_dir" diff-files --exit-code && > > + cached_super_dir="$(git -C "$sub_dir" config --get submodule.superprojectGitDir)" && > > + [ "$(git -C "$super_dir" rev-parse --absolute-git-dir)" \ > > + -ef "$sub_dir/$cached_super_dir" ] && > > To avoid the non-POSIX `-ef`, we could perhaps do something like: > > super_gitdir="$(git -C "$super_dir" rev-parse --absolute-git-dir)" && > cached_gitdir="$(git -C "$sub_dir" config --get submodule.superprojectGitDir)" && > test "$cached_gitdir" = "$(test-tool path-utils relative_path "$super_gitdir" "$PWD/$sub_dir")" && > > (We need the "$PWD/" at the last command because `path.c:relative_path()` > returns the first argument as-is when one of the two paths given to it is > absolute and the other is not.) > > One bonus of testing the cached path this way is that we also check that > it is indeed being stored as a relative path :) Yep, that is what I settled on. Thanks. - Emily
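To make the expected stored value concrete, here is a rough pure-shell sketch of a relative-path computation. It is not Git's `relative_path()`: `relpath` is a hypothetical helper, the `/work/...` paths are invented, and it assumes both arguments are absolute, already normalized (no `.`/`..` components, no trailing slash), and that the base is not the root directory.

```shell
# relpath TARGET BASE: print TARGET expressed relative to BASE.
# Hypothetical helper; assumes absolute, normalized paths and BASE != "/".
relpath() {
	target=$1 base=$2 up=
	# Strip trailing components from BASE until it is a prefix of TARGET,
	# adding one "../" for every component removed.
	while :
	do
		case "$target/" in
		"$base"/*) break ;;
		esac
		base=${base%/*}
		up=../$up
	done
	rest=${target#"$base"}
	rest=${rest#/}
	result=$up$rest
	# An empty result means TARGET and BASE are the same directory.
	printf '%s\n' "${result:-.}"
}

# From a submodule worktree to the superproject gitdir (the v3 layout).
relpath /work/super/.git /work/super/sub

# From a submodule gitdir to the superproject gitdir (the v4 proposal).
relpath /work/super/.git /work/super/.git/modules/sub
```

With these inputs the first call prints `../.git` and the second prints `../../`.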
diff --git a/Documentation/config/submodule.txt b/Documentation/config/submodule.txt
index d7a63c8c12..23e0a01d90 100644
--- a/Documentation/config/submodule.txt
+++ b/Documentation/config/submodule.txt
@@ -90,3 +90,18 @@ submodule.alternateErrorStrategy::
 	`ignore`, `info`, `die`. Default is `die`. Note that if set to `ignore`
 	or `info`, and if there is an error with the computed alternate, the
 	clone proceeds as if no alternate was specified.
+
+submodule.superprojectGitDir::
+	The relative path from the submodule's worktree to its superproject's
+	gitdir. When Git is run in a repository, it usually makes no difference
+	whether this repository is standalone or a submodule, but if this
+	configuration variable is present, additional behavior may be possible,
+	such as "git status" printing additional information about this
+	submodule's status with respect to its superproject. This config should
+	only be present in projects which are submodules, but is not guaranteed
+	to be present in every submodule, so only optional value-added behavior
+	should be linked to it. It is set automatically during
+	submodule creation.
++
+	Because of this configuration variable, it is forbidden to use the
+	same submodule worktree shared by multiple superprojects.
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index d55f6262e9..d60fcd2c7d 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1910,6 +1910,10 @@ static int module_clone(int argc, const char **argv, const char *prefix)
 		git_config_set_in_file(p, "submodule.alternateErrorStrategy",
 					   error_strategy);
 
+	git_config_set_in_file(p, "submodule.superprojectGitdir",
+			       relative_path(absolute_path(get_git_dir()),
+					     path, &sb));
+
 	free(sm_alternate);
 	free(error_strategy);
 
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 4bc6b6c886..e407329d81 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -108,14 +108,18 @@ test_expect_success 'setup - repository to add submodules to' '
 submodurl=$(pwd -P)
 
 inspect() {
-	dir=$1 &&
-
-	git -C "$dir" for-each-ref --format='%(refname)' 'refs/heads/*' >heads &&
-	{ git -C "$dir" symbolic-ref HEAD || :; } >head &&
-	git -C "$dir" rev-parse HEAD >head-sha1 &&
-	git -C "$dir" update-index --refresh &&
-	git -C "$dir" diff-files --exit-code &&
-	git -C "$dir" clean -n -d -x >untracked
+	sub_dir=$1 &&
+	super_dir=$2 &&
+
+	git -C "$sub_dir" for-each-ref --format='%(refname)' 'refs/heads/*' >heads &&
+	{ git -C "$sub_dir" symbolic-ref HEAD || :; } >head &&
+	git -C "$sub_dir" rev-parse HEAD >head-sha1 &&
+	git -C "$sub_dir" update-index --refresh &&
+	git -C "$sub_dir" diff-files --exit-code &&
+	cached_super_dir="$(git -C "$sub_dir" config --get submodule.superprojectGitDir)" &&
+	[ "$(git -C "$super_dir" rev-parse --absolute-git-dir)" \
+		-ef "$sub_dir/$cached_super_dir" ] &&
+	git -C "$sub_dir" clean -n -d -x >untracked
 }
 
 test_expect_success 'submodule add' '
@@ -138,7 +142,7 @@ test_expect_success 'submodule add' '
 	) &&
 
 	rm -f heads head untracked &&
-	inspect addtest/submod &&
+	inspect addtest/submod addtest &&
 	test_cmp expect heads &&
 	test_cmp expect head &&
 	test_must_be_empty untracked
@@ -229,7 +233,7 @@ test_expect_success 'submodule add --branch' '
 	) &&
 
 	rm -f heads head untracked &&
-	inspect addtest/submod-branch &&
+	inspect addtest/submod-branch addtest &&
 	test_cmp expect-heads heads &&
 	test_cmp expect-head head &&
 	test_must_be_empty untracked
@@ -245,7 +249,7 @@ test_expect_success 'submodule add with ./ in path' '
 	) &&
 
 	rm -f heads head untracked &&
-	inspect addtest/dotsubmod/frotz &&
+	inspect addtest/dotsubmod/frotz addtest &&
 	test_cmp expect heads &&
 	test_cmp expect head &&
 	test_must_be_empty untracked
@@ -261,7 +265,7 @@ test_expect_success 'submodule add with /././ in path' '
 	) &&
 
 	rm -f heads head untracked &&
-	inspect addtest/dotslashdotsubmod/frotz &&
+	inspect addtest/dotslashdotsubmod/frotz addtest &&
 	test_cmp expect heads &&
 	test_cmp expect head &&
 	test_must_be_empty untracked
@@ -277,7 +281,7 @@ test_expect_success 'submodule add with // in path' '
 	) &&
 
 	rm -f heads head untracked &&
-	inspect addtest/slashslashsubmod/frotz &&
+	inspect addtest/slashslashsubmod/frotz addtest &&
 	test_cmp expect heads &&
 	test_cmp expect head &&
 	test_must_be_empty untracked
@@ -293,7 +297,7 @@ test_expect_success 'submodule add with /.. in path' '
 	) &&
 
 	rm -f heads head untracked &&
-	inspect addtest/realsubmod &&
+	inspect addtest/realsubmod addtest &&
 	test_cmp expect heads &&
 	test_cmp expect head &&
 	test_must_be_empty untracked
@@ -309,7 +313,7 @@ test_expect_success 'submodule add with ./, /.. and // in path' '
 	) &&
 
 	rm -f heads head untracked &&
-	inspect addtest/realsubmod2 &&
+	inspect addtest/realsubmod2 addtest &&
 	test_cmp expect heads &&
 	test_cmp expect head &&
 	test_must_be_empty untracked
@@ -340,7 +344,7 @@ test_expect_success 'submodule add in subdirectory' '
 	) &&
 
 	rm -f heads head untracked &&
-	inspect addtest/realsubmod3 &&
+	inspect addtest/realsubmod3 addtest &&
 	test_cmp expect heads &&
 	test_cmp expect head &&
 	test_must_be_empty untracked
@@ -481,7 +485,7 @@ test_expect_success 'update should work when path is an empty dir' '
 	git submodule update -q >update.out &&
 	test_must_be_empty update.out &&
 
-	inspect init &&
+	inspect init . &&
 	test_cmp expect head-sha1
 '
 
@@ -540,7 +544,7 @@ test_expect_success 'update should checkout rev1' '
 
 	echo "$rev1" >expect &&
 	git submodule update init &&
-	inspect init &&
+	inspect init . &&
 	test_cmp expect head-sha1
 '
 
Teach submodules a reference to their superproject's gitdir. This allows
us to A) know that we're running from a submodule, and B) have a
shortcut to the superproject's vitals, for example, configs.

By using a relative path instead of an absolute path, we can move the
superproject directory around on the filesystem without breaking the
submodule's cache.

Since this hint value is only introduced during new submodule creation
via `git submodule add`, though, there is more work to do to allow the
record to be created at other times.

If the new config is present, we can do some optional value-added
behavior, like letting "git status" print additional info about the
submodule's status in relation to its superproject, or like letting the
superproject and submodule share an additional config file separate from
either one's local config.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/submodule.txt | 15 +++++++++++
 builtin/submodule--helper.c        |  4 +++
 t/t7400-submodule-basic.sh         | 40 ++++++++++++++++--------------
 3 files changed, 41 insertions(+), 18 deletions(-)
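The commit message's argument for a relative path can be seen with plain directories, no Git required. The sketch below is only an illustration under made-up assumptions: the `super`, `sub` and `moved` names are hypothetical, and `rel` and `abs` stand in for two possible ways of recording the superproject gitdir from the submodule worktree.

```shell
# Made-up layout: a superproject directory containing a gitdir and a
# submodule worktree.
tmp=$(mktemp -d)
mkdir -p "$tmp/super/.git" "$tmp/super/sub"

# Two candidate ways to record the superproject gitdir from inside sub/.
rel=../.git
abs=$tmp/super/.git

# Move the whole superproject somewhere else on the filesystem.
mv "$tmp/super" "$tmp/moved"

# The relative pointer is resolved from the submodule's new location,
# so it still finds the gitdir.
if test -d "$tmp/moved/sub/$rel"
then
	echo "relative pointer still resolves"
fi

# The absolute pointer names the old location, which no longer exists.
if ! test -d "$abs"
then
	echo "absolute pointer is stale"
fi
```

Both messages are printed: the relative value survives the move, while the absolute one would need to be rewritten.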