Message ID | 20211117005701.371808-1-emilyshaffer@google.com (mailing list archive) |
---|---|
Headers | show |
Series | teach submodules to know they're submodules | expand |
Emily Shaffer <emilyshaffer@google.com> writes: > For the original cover letter, see > https://lore.kernel.org/git/20210611225428.1208973-1-emilyshaffer%40google.com. Also for reference, v4 and v5 (as a reply to v4) can be found here: https://lore.kernel.org/git/20211014203416.2802639-1-emilyshaffer@google.com/ > Since v5: > > A couple things. Firstly, a semantics change *back* to the semantics of > v3 - we map from gitdir to gitdir, *not* from common dir to common dir, > so that theoretically a submodule with multiple worktrees in multiple > superproject worktrees will be able to figure out which worktree of the > superproject it's in. (Realistically, that's not really possible right > now, but I'd like to change that soon.) Makes sense. Also, thanks for the tests covering what happens when this is run from worktrees. > Secondly, a rewording of comments and commit messages to indicate that > this isn't a cache of some expensive operation, but rather intended to > be the source of truth for all submodules. I also added a fifth commit > rewriting `git rev-parse --show-superproject-working-tree` to > demonstrate what that means in practice - but from a practical > standpoint, I'm a little worried about that fifth patch. More details in > the patch 5 description. OK - this is not the "this variable being missing is OK" idea that I had [1], but we want to be able to depend on it to some extent. (And it is not a cache either - we are not planning to perform an operation to obtain the superproject gitdir if the cache is missing, but we are just going to assume that there is no superproject.) To that end, the 5th patch is misleading - it is behaving exactly like a cache. I think it's better to drop it. What would make sense to me (and seems to be in the spirit of this patch set) is to describe this as something that Git commands can rely on to determine if the current repo is a submodule, for performance reasons. So maybe Git commands/parameters that directly reference the submodule concept like "--show-superproject-working-tree" will work hard to find the superproject (by searching the filesystem), but those that do not (e.g. "git status") can make assumptions. Making this variable a source of truth wouldn't work, I think, because the source of truth is whether this repo appears in a .gitmodules file (and that hasn't changed). To this end, I'll comment on the changes I'd like to see on the individual patches too. [1] https://lore.kernel.org/git/20210727174650.2462099-1-jonathantanmy@google.com/
On Wed, Nov 17, 2021 at 03:28:46PM -0800, Jonathan Tan wrote: > > Emily Shaffer <emilyshaffer@google.com> writes: > > For the original cover letter, see > > https://lore.kernel.org/git/20210611225428.1208973-1-emilyshaffer%40google.com. > > Also for reference, v4 and v5 (as a reply to v4) can be found here: > https://lore.kernel.org/git/20211014203416.2802639-1-emilyshaffer@google.com/ > > > Since v5: > > > > A couple things. Firstly, a semantics change *back* to the semantics of > > v3 - we map from gitdir to gitdir, *not* from common dir to common dir, > > so that theoretically a submodule with multiple worktrees in multiple > > superproject worktrees will be able to figure out which worktree of the > > superproject it's in. (Realistically, that's not really possible right > > now, but I'd like to change that soon.) > > Makes sense. Also, thanks for the tests covering what happens when this > is run from worktrees. > > > Secondly, a rewording of comments and commit messages to indicate that > > this isn't a cache of some expensive operation, but rather intended to > > be the source of truth for all submodules. I also added a fifth commit > > rewriting `git rev-parse --show-superproject-working-tree` to > > demonstrate what that means in practice - but from a practical > > standpoint, I'm a little worried about that fifth patch. More details in > > the patch 5 description. > > OK - this is not the "this variable being missing is OK" idea that I had > [1], but we want to be able to depend on it to some extent. (And it is > not a cache either - we are not planning to perform an operation to > obtain the superproject gitdir if the cache is missing, but we are just > going to assume that there is no superproject.) > > To that end, the 5th patch is misleading - it is behaving exactly like a > cache. I think it's better to drop it. Yeah, I think you are right. > > What would make sense to me (and seems to be in the spirit of this patch > set) is to describe this as something that Git commands can rely on to > determine if the current repo is a submodule, for performance reasons. > So maybe Git commands/parameters that directly reference the submodule > concept like "--show-superproject-working-tree" will work hard to find > the superproject (by searching the filesystem), but those that do not > (e.g. "git status") can make assumptions. Oh interesting, I like the idea of that distinction. Thanks for the suggestion. I wonder, though, how best to delineate. I'd almost rather say like I suggested to Ævar (https://lore.kernel.org/git/YZ1KLNwsxx7IR1%2B5%40google.com), that we can treat "--show-superproject-working-tree" as a "legacy" option people are already using, and treat anything added from now on as "if it doesn't think it is a submodule, you should 'git submodule update' in the superproject". That relies on us being able to reliably keep the value of submodule.superprojectGitDir correct, but I think the gitdir->gitdir linking helps with that (as opposed to earlier iterations). > > Making this variable a source of truth wouldn't work, I think, because > the source of truth is whether this repo appears in a .gitmodules file > (and that hasn't changed). Interestingly, '--show-superproject-working-tree' doesn't check the .gitmodules file at all. It checks whether the project found in the filesystem above it knows about an object named path/to/superproject/ which is a gitlink - that is, it asks the index, not .gitmodules, as far as I understand it. So if the only existing alternative to submodule.superprojectGitDir isn't treating .gitmodules as source of truth, then what does that mean? > > To this end, I'll comment on the changes I'd like to see on the > individual patches too. > > [1] https://lore.kernel.org/git/20210727174650.2462099-1-jonathantanmy@google.com/
For the original cover letter, see https://lore.kernel.org/git/20210611225428.1208973-1-emilyshaffer%40google.com. CI run: https://github.com/nasamuffin/git/actions/runs/1780282431 Since v6: I've dropped the fifth commit to use this new config for `git rev-parse --show-superproject-working-tree`. I think it did more harm than good - that tool uses an odd way to determine whether the superproject is actually the superproject, anyways. I poked a little bit at trying to find some benchmark to demonstrate that "submodule.superprojectGitDir" is actually faster - but I didn't end up able to write one without writing a ton of new code to traverse the filesystem. To be honest, I'm not all that interested in performance - I want the config added for correctness, instead. So, the only real changes between v6 and v7 are some documentation changes suggested by Jonathan Tan (https://lore.kernel.org/git/20211117234300.2598132-1-jonathantanmy%40google.com). Since v5: A couple things. Firstly, a semantics change *back* to the semantics of v3 - we map from gitdir to gitdir, *not* from common dir to common dir, so that theoretically a submodule with multiple worktrees in multiple superproject worktrees will be able to figure out which worktree of the superproject it's in. (Realistically, that's not really possible right now, but I'd like to change that soon.) Secondly, a rewording of comments and commit messages to indicate that this isn't a cache of some expensive operation, but rather intended to be the source of truth for all submodules. I also added a fifth commit rewriting `git rev-parse --show-superproject-working-tree` to demonstrate what that means in practice - but from a practical standpoint, I'm a little worried about that fifth patch. More details in the patch 5 description. I did discuss Ævar's idea of relying on in-process filesystem digging to find the superproject's gitdir with the rest of the Google team, but in the end decided that there are some worries about filesystem digging in this way (namely, some ugly interactions with network drives that are actually already an issue for Googler Linux machines). Plus, the allure of being able to definitively know that we're a submodule is pretty strong. ;) But overall, this is the direction I'd prefer to keep going in, rather than trying to guess from the filesystem going forward. Since v4: The only real change here is a slight semantics change to map from <submodule gitdir> to <superproject common git dir>. In every case *except* for when the superproject has a worktree, this changes nothing. For the case when the superproject has a worktree, this means that now submodules will refer to the general superproject common dir (e.g. no worktree-specific refs or configs or whatnot). I *think* that because a submodule should exist in the context of the common dir, not the worktree gitdir, that is ok. However, it does mean it would be difficult to do something like sharing a config specific to the worktree (the initial goal of this series). $ROOT/.git $ROOT/.git/config.superproject <- shared by $ROOT/.git/modules/sub $ROOT/.git/modules/sub <- points to $ROOT/.git $ROOT/.git/worktrees/wt $ROOT/.git/worktrees/wt/config.superproject <- contains a certain config-based pre-commit hook If the submodule only knows about the common dir, that is tough, because the submodule would basically have to guess which worktree it's in from its own path. There would be no way for '$WT/sub' to inherit '$ROOT/.git/worktrees/wt/config.superproject'. That said... right now, we don't support submodules in worktrees very well at all. A submodule in a worktree will get a brand new gitdir in $ROOT/.git/worktrees/modules/ (and that brand new gitdir would point to the super's common dir). So I think we can punt on this entire question until we teach submodules and worktrees to play more gracefully together (it's on my long list...), and at that time we can probably introduce a pointer from $ROOT/.git/modules/sub/worktrees/wt/ to $ROOT/.git/worktrees/wt/.... Or, to summarize the long ramble above: "this is still kind of weird with worktrees, but let's fix it later when we fix worktrees more thoroughly". (More rambling about worktree weirdness here: https://lore.kernel.org/git/YYRaII8YWVxlBqsF%40google.com ) Since v3, a pretty major change: the semantics of submodule.superprojectGitDir has changed, to point from the submodule's gitdir to the superproject's gitdir (in v3 and earlier, we kept a path from the submodule's *worktree* to the superproject's gitdir instead). This cleans up some of the confusions about the behavior when a submodule worktree moves around in the superproject's tree, or in a future when we support submodules having multiple worktrees. I also tried to simplify the tests to use 'test-tool path-utils relative_path' everywhere - I think that makes them much more clear for a test reader, but if you're reviewing and it isn't obvious what we're testing for, please speak up. I think this is pretty mature and there was a lot of general agreement that the gitdir->gitdir association was the way to go, so please be brutal and look for nits, leaks, etc. this round ;) [/v4 cover letter] Emily Shaffer (4): t7400-submodule-basic: modernize inspect() helper introduce submodule.superprojectGitDir record submodule: record superproject gitdir during absorbgitdirs submodule: record superproject gitdir during 'update' Documentation/config/submodule.txt | 9 ++++ builtin/submodule--helper.c | 11 ++++ git-submodule.sh | 15 ++++++ submodule.c | 23 +++++++++ t/t7400-submodule-basic.sh | 54 +++++++++++--------- t/t7406-submodule-update.sh | 27 ++++++++++ t/t7412-submodule-absorbgitdirs.sh | 82 +++++++++++++++++++++++++++++- 7 files changed, 194 insertions(+), 27 deletions(-) Range-diff against v6: 1: f1b08a7057 = 1: 251510c687 t7400-submodule-basic: modernize inspect() helper 2: d46c8439ab ! 2: 1a85deb1c5 introduce submodule.superprojectGitDir record @@ Metadata ## Commit message ## introduce submodule.superprojectGitDir record - Teach submodules a reference to their superproject's gitdir. This allows - us to A) know that we're running from a submodule, and B) have a - shortcut to the superproject's vitals, for example, configs. + Teach submodules a config variable referencing to their superproject's + gitdir. Git commands can rely on this reference to determine whether the + current repo is a submodule to another repo. If this reference is + absent, Git may assume the current repo is not a submodule. + + In practice, some commands which specifically reference the submodule + relationship, like 'git rev-parse --show-superproject-working-tree', + will still search the parent directory. Other newly added or implicit + behavior, such as "git status" showing the submodule's status in + relation to the superproject, or a config which is shared between the + superproject and the submodule, should not search the parent directory + and instead rely on this "submodule.superprojectGitDir" config. By using a relative path instead of an absolute path, we can move the superproject directory around on the filesystem without breaking the @@ Commit message superproject's worktree gitdir (if it exists), we ensure that we can tell which worktree contains our submodule. - Since this hint value is only introduced during new submodule creation - via `git submodule add`, though, there is more work to do to allow the - record to be created at other times. - - Once this new config is reliably in place, we can use it to know - definitively that we are working in a submodule, and to know which - superproject we are a submodule of. This allows us to do some - value-added behavior, like letting "git status" print additional info - about the submodule's status in relation to its superproject, or like - letting the superproject and submodule share an additional config file - separate from either one's local config. + This commit teaches "git submodule add" to add the aformentioned config + variable. Subsequent commits will teach other commands to do so. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Helped-by: Junio C Hamano <gitster@pobox.com> @@ Documentation/config/submodule.txt: submodule.alternateErrorStrategy:: clone proceeds as if no alternate was specified. + +submodule.superprojectGitDir:: -+ The relative path from the submodule's gitdir to its superproject's -+ gitdir. When Git is run in a repository, it usually makes no -+ difference whether this repository is standalone or a submodule, but if -+ this configuration variable is present, additional behavior may be -+ possible, such as "git status" printing additional information about -+ this submodule's status with respect to its superproject. This config -+ should only be present in projects which are submodules, but is not -+ guaranteed to be present in every submodule, so only optional -+ value-added behavior should be linked to it. It is set automatically -+ during submodule creation. ++ If this repository is a submodule, the relative path from this repo's ++ gitdir to its superproject's gitdir. Git commands may rely on this ++ reference to determine whether the current repo is a submodule to ++ another repo; if this reference is absent, Git will treat the current ++ repo as though it is not a submodule (this does not make a difference to ++ most Git commands). It is set automatically during submodule creation. ## builtin/submodule--helper.c ## @@ builtin/submodule--helper.c: static int clone_submodule(struct module_clone_data *clone_data) 3: 63ddaf5608 ! 3: 7a44b0edf9 submodule: record superproject gitdir during absorbgitdirs @@ Commit message Signed-off-by: Emily Shaffer <emilyshaffer@google.com> + ## Documentation/config/submodule.txt ## +@@ Documentation/config/submodule.txt: submodule.superprojectGitDir:: + reference to determine whether the current repo is a submodule to + another repo; if this reference is absent, Git will treat the current + repo as though it is not a submodule (this does not make a difference to +- most Git commands). It is set automatically during submodule creation. ++ most Git commands). It is set automatically during submodule creation ++ and 'git submodule absorbgitdir'. + ## submodule.c ## @@ submodule.c: static void relocate_single_git_dir_into_superproject(const char *path) char *old_git_dir = NULL, *real_old_git_dir = NULL, *real_new_git_dir = NULL; 4: 33a582ef13 ! 4: 63e736e69d submodule: record superproject gitdir during 'update' @@ Commit message Signed-off-by: Emily Shaffer <emilyshaffer@google.com> + ## Documentation/config/submodule.txt ## +@@ Documentation/config/submodule.txt: submodule.superprojectGitDir:: + reference to determine whether the current repo is a submodule to + another repo; if this reference is absent, Git will treat the current + repo as though it is not a submodule (this does not make a difference to +- most Git commands). It is set automatically during submodule creation +- and 'git submodule absorbgitdir'. ++ most Git commands). It is set automatically during submodule creation, ++ update, and 'git submodule absorbgitdir'. + ## git-submodule.sh ## @@ git-submodule.sh: cmd_update() ;; 5: a8b5d40a77 < -: ---------- submodule: use config to find superproject worktree
Emily Shaffer <emilyshaffer@google.com> writes: > A couple things. Firstly, a semantics change *back* to the semantics of > v3 - we map from gitdir to gitdir, *not* from common dir to common dir, > so that theoretically a submodule with multiple worktrees in multiple > superproject worktrees will be able to figure out which worktree of the > superproject it's in. (Realistically, that's not really possible right > now, but I'd like to change that soon.) Sounds sensible. > Secondly, a rewording of comments and commit messages to indicate that > this isn't a cache of some expensive operation, but rather intended to > be the source of truth for all submodules. I'd expect that there is a way (e.g. "git fsck") that helps the users notice when the actual filesystem layout contradicts with what the gitdir-to-gitdir link says, and repair the repositories when they go out of sync if possible. It would be similar to "git worktree", where a link between the ".git" file that records "gitdir:" in a secondary worktree and the repository's $GIT_DIR/worktrees/*/gitdir, and the "repair" command can be used to bring them back in sync after moving the real repository without telling the secondary worktree about the move. > I did discuss Ævar's idea of relying on in-process filesystem digging to > find the superproject's gitdir with the rest of the Google team, but in > the end decided that there are some worries about filesystem digging in > this way (namely, some ugly interactions with network drives that are > actually already an issue for Googler Linux machines). Plus, the allure > of being able to definitively know that we're a submodule is pretty > strong. The other side of the coin is that, even when a configuration variable says that you are a submodule of the superproject at location X, if such a submodule gets moved out of the superproject (perhaps because the end-user wanted to concentrate on that submodule project alone as an independent project) and the superproject that used to be at location X got archived away, trusting and relying on what the configuration variable says would not help us access the now-gone superproject. And that would not change no matter how strongly we declare that it is the source of truth. Unless we have a very good way to detect inconsistency and stop spreading the damage (e.g. the setting thought our superproject sits at directory X, but that location is now occupied by a different repository that is not related), I am still skeptical about the "setting is the sole truth" design.
On Thu, Feb 03 2022, Emily Shaffer wrote: > Since v6: Thanks for the re-roll! > I've dropped the fifth commit to use this new config for `git rev-parse > --show-superproject-working-tree`. I think it did more harm than good - > that tool uses an odd way to determine whether the superproject is > actually the superproject, anyways. > > I poked a little bit at trying to find some benchmark to demonstrate > that "submodule.superprojectGitDir" is actually faster - but I didn't > end up able to write one without writing a ton of new code to traverse > the filesystem. I'm assuming that's tested against some variant of the submodule-in-C[1] conversion. I.e. at least when I tested it [2][3] it seemed easy to come up with (probably overly artificial) benchmarks where it would matter for the shell-out in this series. The question performance-wise was rather whether we'd just be introducing the config mechanism as a transitory performance workaround, and the need for it would evaporate once that C migration happened (re original CL quoted in [3]). > To be honest, I'm not all that interested in performance > - I want the config added for correctness, instead. And I'm honestly still at the point of not even being against this whole thing, although it probably sounds like that. I'm really not. I just genuinely don't get where this is headed. I.e. for the last iteration I did a demo patch on top that showed that there was no case added by the series where the on-the-fly discovery wasn't equivalent to the set-in-config value[4]. That change showed that after this series in a state where the config *is* redundant to on the fly discovery (or maybe not, and we're just missing test coverage). But since you're citing correctness do you have some repo->sub relationship in mind that would be ambiguous in a way where the configuration would resolve the ambiguity? I can imagine how such a thing might work, e.g. if we gave submodules some git-worktree-like method of being completely detached from the parent. I.e. being able to place a /usr/me/repo.git whose submodule entry for a "test" dir lives in /opt/tests or something. So when you "cd /opt/tests" you wouldn't be able to detect you're within a submodule. (I'm assuming the case where the submodule has its own "in-tree" .git, is that even supported anymore...?) But I can't think of one where such an ambiguity would arise within our current featureset. What I really can't see is how if the need for such "config [...] for correctness" would arise how that doesn't also invalidate the assumptions you're making in 3/4 and 4/4. I.e. surely if we need the config for correctness it's also true that we can't after-the-fact add the config on the fly to (such) existing submodules without user intervention. Or maybe the ambiguity would only arise from the POV of the submodule, but not for commands executed within the parent? 1. https://lore.kernel.org/git/cover-v5-0.9-00000000000-20220128T125206Z-avarab@gmail.com/ 2. https://lore.kernel.org/git/RFC-cover-0.2-00000000000-20211117T113134Z-avarab@gmail.com/ 3. https://lore.kernel.org/git/211124.86a6hue2wk.gmgdl@evledraar.gmail.com/ 4. https://lore.kernel.org/git/RFC-patch-2.2-b49d4c8db7d-20211117T113134Z-avarab@gmail.com/
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > I just genuinely don't get where this is headed. I.e. for the last > iteration I did a demo patch on top that showed that there was no case > added by the series where the on-the-fly discovery wasn't equivalent to > the set-in-config value[4]. > > That change showed that after this series in a state where the config > *is* redundant to on the fly discovery (or maybe not, and we're just > missing test coverage). > > But since you're citing correctness do you have some repo->sub > relationship in mind that would be ambiguous in a way where the > configuration would resolve the ambiguity? This is an excellent question, which I wish I could have raised in my earlier response. A clear explanation why this setting is not a redundant copy but adds real information on top of what we should be able to learn from the filesystem structure would really help in justifying the new thing. Thanks.
Hi, Ævar Arnfjörð Bjarmason wrote: > On Thu, Feb 03 2022, Emily Shaffer wrote: >> To be honest, I'm not all that interested in performance >> - I want the config added for correctness, instead. > > And I'm honestly still at the point of not even being against this whole > thing, although it probably sounds like that. I'm really not. > > I just genuinely don't get where this is headed. I.e. for the last > iteration I did a demo patch on top that showed that there was no case > added by the series where the on-the-fly discovery wasn't equivalent to > the set-in-config value[4]. Here's a few examples: 1. Suppose I track my $HOME directory as a git repository. Within my home directory, I have a src/git/ subdirectory with a clone of git.git, but I never intended to treat this as a submodule. If I run "git rev-parse --show-superproject-working-tree", then it will discover my home directory repository, run ls-files in there to see if it has GITLINK entries, and either see one for src/git if I had "git add"ed it by mistake or not see one. In either case, it would it would view my src/git/ directory as being a submodule of my home directory even though I hadn't intended it to be so. 2. Suppose I have a copy of a repository such as https://gerrit.googlesource.com/gerrit/, with all its submodules. I am in the plugins/replication/ directory. If I run "git rev-parse --show-superproject-working-tree", then it will discover my gerrit repository, run ls-files in there to see if it has GITLINK entries, and use the result to decide whether the cwd is a submodule. So for example, if I had run "git rm --cached plugins/replication" to _prepare to_ remove the plugins/replication submodule, then "git rev-parse --show-superproject-working-tree" will produce the wrong result. 3. Suppose I am not using submodules at all. I have a clone of mawk.git and I am working there. If I run "git rev-parse --show-superproject-working-tree", then I'm presumably interested in doing something submodule-specific; nothing wrong with that. But the series we're responding to is meant to support a wider variety of operations --- for example, suppose I am running a plain "git status" operation. If "git status" runs "git rev-parse --show-superproject-working-tree", then git would walk up the filesystem above my mawk/ directory, looking for another .git dir. We can reach an NFS automounter directory and just hang. Even without an NFS automounter, we'd expect this to take a while because, unlike normal repository discovery, we have no reason to believe that the walk is going to quickly discover a .git directory and terminate. So this would violate user expectations. Thanks and hope that helps, Jonathan > 4. https://lore.kernel.org/git/RFC-patch-2.2-b49d4c8db7d-20211117T113134Z-avarab@gmail.com/
Jonathan Nieder <jrnieder@gmail.com> writes: > Here's a few examples: > > 1. Suppose I track my $HOME directory as a git repository. Within my > home directory, I have a src/git/ subdirectory with a clone of > git.git, but I never intended to treat this as a submodule. > > If I run "git rev-parse --show-superproject-working-tree", then it > will discover my home directory repository, run ls-files in there > to see if it has GITLINK entries, and either see one for src/git if > I had "git add"ed it by mistake or not see one. In either case, > it would it would view my src/git/ directory as being a submodule > of my home directory even though I hadn't intended it to be so. I am not sure about this one. If you added an unrelated one with "git add" by mistake, you'd want to know about the mistake sooner rather than later, no? > 2. Suppose I have a copy of a repository such as > https://gerrit.googlesource.com/gerrit/, with all its submodules. > I am in the plugins/replication/ directory. > > If I run "git rev-parse --show-superproject-working-tree", then it > will discover my gerrit repository, run ls-files in there to see if > it has GITLINK entries, and use the result to decide whether the > cwd is a submodule. So for example, if I had run "git rm --cached > plugins/replication" to _prepare to_ remove the plugins/replication > submodule, then "git rev-parse --show-superproject-working-tree" > will produce the wrong result. Yes, looking only at the index of the superproject will have that problem, but don't other things in the superproject point at the submodule, too, e.g. submodule.<name>.* configuration variables? And then, after removing them to truly dissociate the submodule from the superproject, "git rev-parse --show-superproject-working-tree" may stop saying that it is a submodule, but this series wants to make it irrelevant what the command says. Until you unset the configuration variable in the submodule, it will stay to be a submodule of the superproject, but the superproject no longer thinks it is responsible for the submodule. You'll have to deal with an inconsistent state during the transition either way, so I am not sure it is the best solution to introduce an extra setting that can easily go out of sync. > 3. Suppose I am not using submodules at all. I have a clone of > mawk.git and I am working there. > > If I run "git rev-parse --show-superproject-working-tree", then I'm > presumably interested in doing something submodule-specific; > nothing wrong with that. But the series we're responding to is > meant to support a wider variety of operations --- for example, > suppose I am running a plain "git status" operation. > > If "git status" runs "git rev-parse > --show-superproject-working-tree", then git would walk up the > filesystem above my mawk/ directory, looking for another .git dir. > We can reach an NFS automounter directory and just hang. Even > without an NFS automounter, we'd expect this to take a while > because, unlike normal repository discovery, we have no reason to > believe that the walk is going to quickly discover a .git directory > and terminate. So this would violate user expectations. It would be a problem, but I do not know if "this is a submodule of that superproject" link is the only solution, let alone the most effective one. It seems to me that you are looking more for something like GIT_CEILING_DIRECTORIES.
Junio C Hamano wrote: > Jonathan Nieder <jrnieder@gmail.com> writes: >> Here's a few examples: >> >> 1. Suppose I track my $HOME directory as a git repository. Within my >> home directory, I have a src/git/ subdirectory with a clone of >> git.git, but I never intended to treat this as a submodule. >> >> If I run "git rev-parse --show-superproject-working-tree", then it >> will discover my home directory repository, run ls-files in there >> to see if it has GITLINK entries, and either see one for src/git if >> I had "git add"ed it by mistake or not see one. In either case, >> it would it would view my src/git/ directory as being a submodule >> of my home directory even though I hadn't intended it to be so. > > I am not sure about this one. If you added an unrelated one with > "git add" by mistake, you'd want to know about the mistake sooner > rather than later, no? My point with this example is that it's useful to have a definition of what is a submodule repository, to make it unambiguous whether this repository is a submodule or whether it's just a repository that happens to have been cloned inside of a git-managed worktree. For the specific example of having run "git add", I don't have any very strong opinions. [...] >> 2. Suppose I have a copy of a repository such as >> https://gerrit.googlesource.com/gerrit/, with all its submodules. >> I am in the plugins/replication/ directory. [...] >> So for example, if I had run "git rm --cached >> plugins/replication" to _prepare to_ remove the plugins/replication >> submodule, then "git rev-parse --show-superproject-working-tree" >> will produce the wrong result. > > Yes, looking only at the index of the superproject will have that > problem, but don't other things in the superproject point at the > submodule, too, e.g. submodule.<name>.* configuration variables? What all of those suggested alternatives have in common is that they are pointers from another repository to the submodule. This would be the first time in git history that we are saying a property of a repository depends on having to examine files outside of it. I guess the main question I'd have is, why _wouldn't_ I want a submodule to be able to point to the superproject containing it? I can think of many advantages to having that linkage, and the main disadvantage I can think of is that it is a change. I don't think that submodule.<name>.* is an adequate substitute for having this setting, because it requires - finding the superproject - mapping the <name> to a path, using .gitmodules - comparing the path to the submodule location which would be complex, slow, and error-prone. The one thing that I think could approach being an adequate substitute is examining the path to the current repository and stripping off path components until we find modules/; then the parent is the containing superproject. That would only work for absorbed submodules, though, and it would be less explicit than having a config item. > And then, after removing them to truly dissociate the submodule from > the superproject, "git rev-parse --show-superproject-working-tree" > may stop saying that it is a submodule, but this series wants to > make it irrelevant what the command says. Until you unset the > configuration variable in the submodule, it will stay to be a > submodule of the superproject, but the superproject no longer thinks > it is responsible for the submodule. You'll have to deal with an > inconsistent state during the transition either way, so I am not > sure it is the best solution to introduce an extra setting that can > easily go out of sync. This hints at a reason why one wouldn't want the linkage back --- dealing with the ambiguity of inconsistencies (what if a submodule declares a superproject but the superproject does not declare the submodule?). I would not expect that ambiguity to be much of a problem, because the typical way to use superproject linkage would be to print output from commands like "git status": for example, This is a submodule of ../../gerrit; you can run git -C ../../gerrit status to get the status of the superproject. An inconsistency could occur due to the user using "mv" (instead of "git mv") to move a submodule to a path a different number of path components from its superproject. One way to handle that would be to make submodules record a boolean setting reflecting whether they are a submodule, instead of the path to the superproject. (This would be similar to settings like core.bare.) Alternatively, if the path to the superproject is recorded and if "git fsck" is able to notice such an inconsistency, then the user should be able to have an okay experience repairing it. [...] >> If "git status" runs "git rev-parse >> --show-superproject-working-tree", then git would walk up the >> filesystem above my mawk/ directory, looking for another .git dir. >> We can reach an NFS automounter directory and just hang. Even >> without an NFS automounter, we'd expect this to take a while >> because, unlike normal repository discovery, we have no reason to >> believe that the walk is going to quickly discover a .git directory >> and terminate. So this would violate user expectations. > > It would be a problem, but I do not know if "this is a submodule of > that superproject" link is the only solution, let alone the most > effective one. It seems to me that you are looking more for > something like GIT_CEILING_DIRECTORIES. Who is the "you" addressed here? The end user can use GIT_CEILING_DIRECTORIES if they are expecting to run git commands within an NFS automounter directory and outside of any git repository, but they'd be right to be surprised if that suddenly became required when inside git repositories. I don't think we should assume that running an extra .git discovery walk is cost-free to users who are not using submodules and an acceptable burden to impose on them for the sake of submodule users. Thanks and hope that helps, Jonathan
Jonathan Nieder <jrnieder@gmail.com> writes: > My point with this example is that it's useful to have a definition of > what is a submodule repository, to make it unambiguous whether this > repository is a submodule or whether it's just a repository that > happens to have been cloned inside of a git-managed worktree. OK, together with the other "no need to let NFS automounter worry about parent directories", it makes a very successful argument for a single bit (i.e. this is a free-standing repository and is not a submodule, so no need to auto-discover if it is one). I think the "Alternatively" you later mention to solve ambiguity with just a single bit, without "this is a submodule of that superproject" linkage, is essentially the same? But I do not think it argues for the need to say "a config, not filesystem layout, must be the single source of truth to say which superproject this repository belongs as its submodule". > This would be the first time in git history that we are saying a > property of a repository depends on having to examine files outside of > it. Well, path-based configuration inclusion, with configuration driven hooks, I do not think the distinction matters much anymore in these days. > I guess the main question I'd have is, why _wouldn't_ I want a > submodule to be able to point to the superproject containing it? Because with (the absense of) a single "this is freestanding" bit, by default the filesystem layout can already "point" at it?
On Tue, Feb 08, 2022 at 10:24:49AM -0800, Junio C Hamano wrote: > > Jonathan Nieder <jrnieder@gmail.com> writes: > > > My point with this example is that it's useful to have a definition of > > what is a submodule repository, to make it unambiguous whether this > > repository is a submodule or whether it's just a repository that > > happens to have been cloned inside of a git-managed worktree. > > OK, together with the other "no need to let NFS automounter worry > about parent directories", it makes a very successful argument for a > single bit (i.e. this is a free-standing repository and is not a > submodule, so no need to auto-discover if it is one). I think the > "Alternatively" you later mention to solve ambiguity with just a > single bit, without "this is a submodule of that superproject" > linkage, is essentially the same? That resolution - "teach submodules to know they're submodules, but not whose submodule they are" - would still count as a success to me. The reason I proposed a path instead of a boolean here was simply because storing a path is a boolean (by whether it's present or not) *and* additional information (the path to the superproject), and it seemed silly to me to opt for less information. Or, to put it another way - "am I a submodule?" seems pretty vital, and "yes, and I belong to xyz" is an optimization on top of that. So I don't terribly mind sending this as just a boolean, if we feel that the effort to keep it up outweighs the benefit of saving us a filesystem walk. I'm not completely convinced that it does, though - would the addition of a 'git fsck' check for this config be satisfactory? In other words, is the problem that the execution of this series wasn't thorough enough and it should be refined, or that the concept itself is beyond saving? - Emily > > But I do not think it argues for the need to say "a config, not > filesystem layout, must be the single source of truth to say which > superproject this repository belongs as its submodule". > > > This would be the first time in git history that we are saying a > > property of a repository depends on having to examine files outside of > > it. > > Well, path-based configuration inclusion, with configuration driven > hooks, I do not think the distinction matters much anymore in these > days. > > > I guess the main question I'd have is, why _wouldn't_ I want a > > submodule to be able to point to the superproject containing it? > > Because with (the absense of) a single "this is freestanding" bit, > by default the filesystem layout can already "point" at it?
Emily Shaffer wrote: > On Tue, Feb 08, 2022 at 10:24:49AM -0800, Junio C Hamano wrote: >> Jonathan Nieder <jrnieder@gmail.com> writes: >>> My point with this example is that it's useful to have a definition of >>> what is a submodule repository, to make it unambiguous whether this >>> repository is a submodule or whether it's just a repository that >>> happens to have been cloned inside of a git-managed worktree. >> >> OK, together with the other "no need to let NFS automounter worry >> about parent directories", it makes a very successful argument for a >> single bit (i.e. this is a free-standing repository and is not a >> submodule, so no need to auto-discover if it is one). I think the >> "Alternatively" you later mention to solve ambiguity with just a >> single bit, without "this is a submodule of that superproject" >> linkage, is essentially the same? > > That resolution - "teach submodules to know they're submodules, but not > whose submodule they are" - would still count as a success to me. Thanks, both. Sounds like a good path forward. [...] > So I don't terribly mind sending this as > just a boolean, if we feel that the effort to keep it up outweighs the > benefit of saving us a filesystem walk. > > I'm not completely convinced that it does, though Personally, I'm convinced --- e.g., life gets painful enough when core.worktree ends up pointing to the wrong path, so being able to avoid that complexity seems like a nice outcome. > - would the addition > of a 'git fsck' check for this config be satisfactory? In other words, > is the problem that the execution of this series wasn't thorough enough > and it should be refined, or that the concept itself is beyond saving? For the absorbed case, the path to the superproject should be pretty stable, and in that case it's probably possible to make this robust enough. (That said, the path to the superproject gitdir would typically just be "../..", at least as long as we have the other patch to escape slashes in the submodule name.) In the non-absorbed case, it seems likely to get messy fast because a user can "mv" the submodule around. Another kind of case that gets interesting is when there are multiple superproject worktrees and multiple submodule worktrees. Does the relative path become a per-worktree variable? Using a boolean saves us from having to think through it. Thanks for the clear explanations, Jonathan
On Mon, Feb 07 2022, Jonathan Nieder wrote: > Hi, > > Ævar Arnfjörð Bjarmason wrote: >> On Thu, Feb 03 2022, Emily Shaffer wrote: > >>> To be honest, I'm not all that interested in performance >>> - I want the config added for correctness, instead. >> >> And I'm honestly still at the point of not even being against this whole >> thing, although it probably sounds like that. I'm really not. >> >> I just genuinely don't get where this is headed. I.e. for the last >> iteration I did a demo patch on top that showed that there was no case >> added by the series where the on-the-fly discovery wasn't equivalent to >> the set-in-config value[4]. > > Here's a few examples: I've read the downthread, but it's probably best to reply to this... > 1. Suppose I track my $HOME directory as a git repository. Within my > home directory, I have a src/git/ subdirectory with a clone of > git.git, but I never intended to treat this as a submodule. > > If I run "git rev-parse --show-superproject-working-tree", then it > will discover my home directory repository, run ls-files in there > to see if it has GITLINK entries, and either see one for src/git if > I had "git add"ed it by mistake or not see one. In either case, > it would it would view my src/git/ directory as being a submodule > of my home directory even though I hadn't intended it to be so. > > 2. Suppose I have a copy of a repository such as > https://gerrit.googlesource.com/gerrit/, with all its submodules. > I am in the plugins/replication/ directory. > > If I run "git rev-parse --show-superproject-working-tree", then it > will discover my gerrit repository, run ls-files in there to see if > it has GITLINK entries, and use the result to decide whether the > cwd is a submodule. So for example, if I had run "git rm --cached > plugins/replication" to _prepare to_ remove the plugins/replication > submodule, then "git rev-parse --show-superproject-working-tree" > will produce the wrong result. These both seem like valid edge cases, but they're still going to be the same edge case on the "parent" side even with a proposed cache (whether it's a boolean or a path). I.e. the question here is really not one of caching, but of what it means for Y to be a submodule of X. I assumed that we'd prefer a 1=1 relationship between the parent reporting that Y is a submodule of it, and Y reporting that it is a submodule (of the parent at some <path>). If that's the case we can walk up and ask parent .git's whether they think the <path> is their submodule. If it's not the case perhaps a config is needed, but then that surely has wider implications. I.e. won't it be the case that we can't add the config after-the-fact as this series proposes in those some ambiguous cases?xo > 3. Suppose I am not using submodules at all. I have a clone of > mawk.git and I am working there. > > If I run "git rev-parse --show-superproject-working-tree", then I'm > presumably interested in doing something submodule-specific; > nothing wrong with that. But the series we're responding to is > meant to support a wider variety of operations --- for example, > suppose I am running a plain "git status" operation. > > If "git status" runs "git rev-parse > --show-superproject-working-tree", then git would walk up the > filesystem above my mawk/ directory, looking for another .git dir. > We can reach an NFS automounter directory and just hang. Even > without an NFS automounter, we'd expect this to take a while > because, unlike normal repository discovery, we have no reason to > believe that the walk is going to quickly discover a .git directory > and terminate. So this would violate user expectations. We have a /a/b/c/d.git mounted, but not a parent /a/b/, and walking upwards causes it to be mounted?
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > We have a /a/b/c/d.git mounted, but not a parent /a/b/, and walking > upwards causes it to be mounted? ;-)
For the original cover letter, see https://lore.kernel.org/git/20210611225428.1208973-1-emilyshaffer%40google.com. CI run: https://github.com/nasamuffin/git/actions/runs/1469463328 Since v5: A couple things. Firstly, a semantics change *back* to the semantics of v3 - we map from gitdir to gitdir, *not* from common dir to common dir, so that theoretically a submodule with multiple worktrees in multiple superproject worktrees will be able to figure out which worktree of the superproject it's in. (Realistically, that's not really possible right now, but I'd like to change that soon.) Secondly, a rewording of comments and commit messages to indicate that this isn't a cache of some expensive operation, but rather intended to be the source of truth for all submodules. I also added a fifth commit rewriting `git rev-parse --show-superproject-working-tree` to demonstrate what that means in practice - but from a practical standpoint, I'm a little worried about that fifth patch. More details in the patch 5 description. I did discuss Ævar's idea of relying on in-process filesystem digging to find the superproject's gitdir with the rest of the Google team, but in the end decided that there are some worries about filesystem digging in this way (namely, some ugly interactions with network drives that are actually already an issue for Googler Linux machines). Plus, the allure of being able to definitively know that we're a submodule is pretty strong. ;) But overall, this is the direction I'd prefer to keep going in, rather than trying to guess from the filesystem going forward. Since v4: The only real change here is a slight semantics change to map from <submodule gitdir> to <superproject common git dir>. In every case *except* for when the superproject has a worktree, this changes nothing. For the case when the superproject has a worktree, this means that now submodules will refer to the general superproject common dir (e.g. no worktree-specific refs or configs or whatnot). I *think* that because a submodule should exist in the context of the common dir, not the worktree gitdir, that is ok. However, it does mean it would be difficult to do something like sharing a config specific to the worktree (the initial goal of this series). $ROOT/.git $ROOT/.git/config.superproject <- shared by $ROOT/.git/modules/sub $ROOT/.git/modules/sub <- points to $ROOT/.git $ROOT/.git/worktrees/wt $ROOT/.git/worktrees/wt/config.superproject <- contains a certain config-based pre-commit hook If the submodule only knows about the common dir, that is tough, because the submodule would basically have to guess which worktree it's in from its own path. There would be no way for '$WT/sub' to inherit '$ROOT/.git/worktrees/wt/config.superproject'. That said... right now, we don't support submodules in worktrees very well at all. A submodule in a worktree will get a brand new gitdir in $ROOT/.git/worktrees/modules/ (and that brand new gitdir would point to the super's common dir). So I think we can punt on this entire question until we teach submodules and worktrees to play more gracefully together (it's on my long list...), and at that time we can probably introduce a pointer from $ROOT/.git/modules/sub/worktrees/wt/ to $ROOT/.git/worktrees/wt/.... Or, to summarize the long ramble above: "this is still kind of weird with worktrees, but let's fix it later when we fix worktrees more thoroughly". (More rambling about worktree weirdness here: https://lore.kernel.org/git/YYRaII8YWVxlBqsF%40google.com ) Since v3, a pretty major change: the semantics of submodule.superprojectGitDir has changed, to point from the submodule's gitdir to the superproject's gitdir (in v3 and earlier, we kept a path from the submodule's *worktree* to the superproject's gitdir instead). This cleans up some of the confusions about the behavior when a submodule worktree moves around in the superproject's tree, or in a future when we support submodules having multiple worktrees. I also tried to simplify the tests to use 'test-tool path-utils relative_path' everywhere - I think that makes them much more clear for a test reader, but if you're reviewing and it isn't obvious what we're testing for, please speak up. I think this is pretty mature and there was a lot of general agreement that the gitdir->gitdir association was the way to go, so please be brutal and look for nits, leaks, etc. this round ;) [/v4 cover letter] Emily Shaffer (5): t7400-submodule-basic: modernize inspect() helper introduce submodule.superprojectGitDir record submodule: record superproject gitdir during absorbgitdirs submodule: record superproject gitdir during 'update' submodule: use config to find superproject worktree Documentation/config/submodule.txt | 12 ++++ builtin/submodule--helper.c | 11 +++ git-submodule.sh | 15 ++++ submodule.c | 108 ++++++++++++++++++++++++++++- t/t1500-rev-parse.sh | 9 +++ t/t7400-submodule-basic.sh | 54 ++++++++------- t/t7406-submodule-update.sh | 27 ++++++++ t/t7412-submodule-absorbgitdirs.sh | 82 +++++++++++++++++++++- 8 files changed, 290 insertions(+), 28 deletions(-) Range-diff against v5: 1: 6ff10beaf2 = 1: f1b08a7057 t7400-submodule-basic: modernize inspect() helper 2: d4f4627585 ! 2: d46c8439ab introduce submodule.superprojectGitDir record @@ Commit message By using a relative path instead of an absolute path, we can move the superproject directory around on the filesystem without breaking the - submodule's cache. And by using the path from gitdir to gitdir, we can + submodule's pointer. And by using the path from gitdir to gitdir, we can move the submodule within the superproject's tree structure without - breaking the submodule's cache, too. Finally, by pointing at - "get_git_common_dir()" instead of "get_git_dir()", we ensure the link - will refer to the parent repo, not to a specific worktree. + breaking the submodule's pointer, too. Finally, by pointing at the + superproject's worktree gitdir (if it exists), we ensure that we can + tell which worktree contains our submodule. Since this hint value is only introduced during new submodule creation via `git submodule add`, though, there is more work to do to allow the record to be created at other times. - If the new config is present, we can do some optional value-added - behavior, like letting "git status" print additional info about the - submodule's status in relation to its superproject, or like letting the - superproject and submodule share an additional config file separate from - either one's local config. + Once this new config is reliably in place, we can use it to know + definitively that we are working in a submodule, and to know which + superproject we are a submodule of. This allows us to do some + value-added behavior, like letting "git status" print additional info + about the submodule's status in relation to its superproject, or like + letting the superproject and submodule share an additional config file + separate from either one's local config. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Helped-by: Junio C Hamano <gitster@pobox.com> @@ Documentation/config/submodule.txt: submodule.alternateErrorStrategy:: + +submodule.superprojectGitDir:: + The relative path from the submodule's gitdir to its superproject's -+ common dir. When Git is run in a repository, it usually makes no ++ gitdir. When Git is run in a repository, it usually makes no + difference whether this repository is standalone or a submodule, but if + this configuration variable is present, additional behavior may be + possible, such as "git status" printing additional information about @@ builtin/submodule--helper.c: static int clone_submodule(struct module_clone_data git_config_set_in_file(p, "submodule.alternateErrorStrategy", error_strategy); -+ git_config_set_in_file(p, "submodule.superprojectGitdir", -+ relative_path(absolute_path(get_git_common_dir()), ++ /* ++ * Set the path from submodule's new gitdir to superproject's gitdir. ++ * The latter may be a worktree gitdir. However, it is not possible for ++ * the submodule to have a worktree-specific gitdir or config at clone ++ * time, because "extensions.worktreeConfig" is only valid when set in ++ * the local gitconfig, which the brand new submodule does not have yet. ++ */ ++ git_config_set_in_file(p, "submodule.superprojectGitDir", ++ relative_path(absolute_path(get_git_dir()), + sm_gitdir, &sb)); + free(sm_alternate); @@ t/t7400-submodule-basic.sh: submodurl=$(pwd -P) + # Ensure that submodule.superprojectGitDir contains the path from the + # submodule's gitdir to the superproject's gitdir. + -+ super_abs_gitdir=$(git -C "$super_dir" rev-parse --path-format=absolute --git-common-dir) && -+ sub_abs_gitdir=$(git -C "$sub_dir" rev-parse --path-format=absolute --git-common-dir) && ++ super_abs_gitdir=$(git -C "$super_dir" rev-parse --absolute-git-dir) && ++ sub_abs_gitdir=$(git -C "$sub_dir" rev-parse --absolute-git-dir) && + + [ "$(git -C "$sub_dir" config --get submodule.superprojectGitDir)" = \ + "$(test-tool path-utils relative_path "$super_abs_gitdir" \ 3: 2dae297943 ! 3: 63ddaf5608 submodule: record superproject gitdir during absorbgitdirs @@ submodule.c: static void relocate_single_git_dir_into_superproject(const char *p char *old_git_dir = NULL, *real_old_git_dir = NULL, *real_new_git_dir = NULL; struct strbuf new_gitdir = STRBUF_INIT; const struct submodule *sub; ++ struct config_set sub_cs; + struct strbuf config_path = STRBUF_INIT, sb = STRBUF_INIT; ++ int tmp; if (submodule_uses_worktrees(path)) die(_("relocate_gitdir for submodule '%s' with " @@ submodule.c: static void relocate_single_git_dir_into_superproject(const char *p relocate_gitdir(path, real_old_git_dir, real_new_git_dir); -+ /* cache pointer to superproject's gitdir */ ++ /* ++ * Note location of superproject's gitdir. Because the submodule already ++ * has a gitdir and local config, we can store this pointer from ++ * worktree config to worktree config, if the submodule has ++ * extensions.worktreeConfig set. ++ */ + strbuf_addf(&config_path, "%s/config", real_new_git_dir); ++ git_configset_init(&sub_cs); ++ git_configset_add_file(&sub_cs, config_path.buf); ++ /* return 0 indicates config was found - we have a worktree config */ ++ if (!git_configset_get_bool(&sub_cs, "extensions.worktreeConfig", &tmp)) ++ strbuf_addstr(&config_path, ".worktree"); ++ + git_config_set_in_file(config_path.buf, "submodule.superprojectGitdir", -+ relative_path(absolute_path(get_git_common_dir()), ++ relative_path(absolute_path(get_git_dir()), + real_new_git_dir, &sb)); + ++ git_configset_clear(&sub_cs); + strbuf_release(&config_path); + strbuf_release(&sb); free(old_git_dir); @@ t/t7412-submodule-absorbgitdirs.sh: test_expect_success 'absorbing fails for a s + # absorb the git dir + git submodule absorbgitdirs sub4 && + -+ # make sure the submodule cached the superproject gitdir correctly -+ submodule_gitdir="$(git -C sub4 rev-parse --path-format=absolute --git-common-dir)" && -+ superproject_gitdir="$(git rev-parse --path-format=absolute --git-common-dir)" && ++ # make sure the submodule noted the superproject gitdir correctly ++ submodule_gitdir="$(git -C sub4 rev-parse --absolute-git-dir)" && ++ superproject_gitdir="$(git rev-parse --absolute-git-dir)" && + + test-tool path-utils relative_path "$superproject_gitdir" \ + "$submodule_gitdir" >expect && @@ t/t7412-submodule-absorbgitdirs.sh: test_expect_success 'absorbing fails for a s + test_cmp expect actual + ) +' ++ ++test_expect_success 'absorbgitdirs works with a submodule with worktree config' ' ++ # reuse the worktree of the superproject ++ ( ++ cd wt && ++ ++ # create a new unembedded git dir ++ git init sub5 && ++ test_commit -C sub5 first && ++ git submodule add ./sub5 && ++ test_tick && ++ ++ # turn on worktree configs for submodule ++ git -C sub5 config extensions.worktreeConfig true && ++ ++ # absorb the git dir ++ git submodule absorbgitdirs sub5 && ++ ++ # make sure the submodule noted the superproject gitdir correctly ++ submodule_gitdir="$(git -C sub5 rev-parse --absolute-git-dir)" && ++ superproject_gitdir="$(git rev-parse --absolute-git-dir)" && ++ ++ test-tool path-utils relative_path "$superproject_gitdir" \ ++ "$submodule_gitdir" >expect && ++ git -C sub5 config submodule.superprojectGitDir >actual && ++ ++ test_cmp expect actual && ++ ++ # make sure the config went into the submodule config.worktree ++ test_file_not_empty "$submodule_gitdir/config.worktree" ++ ) ++' + test_done 4: f0412d6d34 ! 4: 33a582ef13 submodule: record superproject gitdir during 'update' @@ Metadata ## Commit message ## submodule: record superproject gitdir during 'update' - A recorded hint path to the superproject's gitdir might be added during + A recorded path to the superproject's gitdir might be added during 'git submodule add', but in some cases - like submodules which were created before 'git submodule add' learned to record that info - it might be useful to update the hint. Let's do it during 'git submodule @@ git-submodule.sh: cmd_update() ;; esac -+ # Cache a pointer to the superproject's common dir. This may have -+ # changed, unless it's a fresh clone. Writes it to worktree -+ # if applicable, otherwise to local. ++ # Store a poitner to the superproject's gitdir. This may have ++ # changed, unless it's a fresh clone. Write to worktree if ++ # applicable, and point to superproject's worktree gitdir if ++ # applicable. + if test -z "$just_cloned" + then + sm_gitdir="$(git -C "$sm_path" rev-parse --absolute-git-dir)" + relative_gitdir="$(git rev-parse --path-format=relative \ + --prefix "${sm_gitdir}" \ -+ --git-common-dir)" ++ --git-dir)" + + git -C "$sm_path" config --worktree \ + submodule.superprojectgitdir "$relative_gitdir" @@ t/t7406-submodule-update.sh: test_expect_success 'submodule update --quiet passe + git -C submodule config --unset submodule.superprojectGitdir && + git submodule update && + test-tool path-utils relative_path \ -+ "$(git rev-parse --path-format=absolute --git-common-dir)" \ -+ "$(git -C submodule rev-parse --path-format=absolute --git-common-dir)" >expect && ++ "$(git rev-parse --absolute-git-dir)" \ ++ "$(git -C submodule rev-parse --absolute-git-dir)" >expect && + git -C submodule config submodule.superprojectGitdir >actual && + test_cmp expect actual + ) +' ++ ++test_expect_success 'submodule update uses config.worktree if applicable' ' ++ (cd super && ++ git -C submodule config --unset submodule.superprojectGitDir && ++ git -C submodule config extensions.worktreeConfig true && ++ git submodule update && ++ test-tool path-utils relative_path \ ++ "$(git rev-parse --absolute-git-dir)" \ ++ "$(git -C submodule rev-parse --absolute-git-dir)" >expect && ++ git -C submodule config submodule.superprojectGitdir >actual && ++ test_cmp expect actual && ++ ++ test_file_not_empty "$(git -C submodule rev-parse --absolute-git-dir)/config.worktree" ++ ) ++' + test_done -: ---------- > 5: a8b5d40a77 submodule: use config to find superproject worktree