diff mbox series

[v9,2/3] introduce submodule.hasSuperproject record

Message ID 20220310004423.2627181-3-emilyshaffer@google.com (mailing list archive)
State New, archived
Headers show
Series teach submodules to know they're submodules | expand

Commit Message

Emily Shaffer March 10, 2022, 12:44 a.m. UTC
Teach submodules a config variable indicating the fact that they are a
submodule. If this config is set to false or unset, Git may assume the
current repo is not a submodule.

Git commands can use this variable to decide whether to traverse the
filesystem and look for a superproject at all. 'git rev-parse
--show-superproject-working-tree' can learn to exit early if this config
is unset or false. Other newly added or implicit behavior - like "git
status" showing the submodule's status in relation to the superproject,
or a config shared between the superproject and submodule - can use this
config to decide whether to search the parent directory to find a
superproject.

Introduce this config everywhere we add a new submodule, or touch one
that already exists, so that we can proliferate it in repos which are
already out in the world using submodules.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/submodule.txt |  6 ++++
 builtin/submodule--helper.c        | 11 +++++++
 submodule.c                        | 12 +++++++
 t/t7400-submodule-basic.sh         |  4 +++
 t/t7406-submodule-update.sh        |  8 +++++
 t/t7412-submodule-absorbgitdirs.sh | 50 ++++++++++++++++++++++++++++--
 6 files changed, 89 insertions(+), 2 deletions(-)

Comments

Junio C Hamano March 10, 2022, 2:09 a.m. UTC | #1
Emily Shaffer <emilyshaffer@google.com> writes:

> @@ -2617,6 +2622,12 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
>  
>  	free(prefixed_path);
>  
> +	/*
> +	 * This entry point is always called from a submodule, so this is a
> +	 * good place to set a hint that this repo is a submodule.
> +	 */
> +	git_config_set("submodule.hasSuperproject", "true");
> +
>  	if (!oideq(&update_data.oid, &update_data.suboid) || update_data.force)
>  		return do_run_update_procedure(&update_data);

In Glen's update to rewrite "submodule update" in C, this part is
replaced with a call to update_submodule2().  I am not sure what the
current repository is at this point of the code with and without
Glen's topic, but are we sure we are in a submodule we discovered?

builtin/submodule--helper.c::run_update_procedure() takes sm_path to
the path to the submodule, presumably from superproject's point of
view, and the callchain leads to a call to run_update_command()
eventually, which uses run_command_v_opt_cd_env() to go in to the
submodule repository and run an external git command (like
"checkout"), so it looks like what git_config_set() updates is the
superprojects' configuration, not the configuration of a particular
submodule being updated.

The other one, where cmd_clone() sets the variable in submodule's
configuration file, looks good, but I am not sure about this one.
Junio C Hamano March 10, 2022, 2:32 a.m. UTC | #2
Emily Shaffer <emilyshaffer@google.com> writes:

> +test_expect_success 'submodule update adds submodule.hasSuperproject to older repos' '
> +	(cd super &&
> +	 test_unconfig submodule.hasSuperproject &&

So, before we run the test, we unconfig the variable in the SUPER
repository, and then

> +	 git submodule update &&

update the submodule, and then

> +	 test_cmp_config -C submodule true --type=bool submodule.hasSuperproject

go in to the submodule and check the value of the variable?

Shouldn't the first part be more like

	(cd super &&
	 test_unconfig -C submodule submodule.hasSuperproject &&

if we want to make sure "submodule update" sets it there?

> +	)
> +'
Glen Choo March 10, 2022, 9:29 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> Emily Shaffer <emilyshaffer@google.com> writes:
>
>> @@ -2617,6 +2622,12 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
>>  
>>  	free(prefixed_path);
>>  
>> +	/*
>> +	 * This entry point is always called from a submodule, so this is a
>> +	 * good place to set a hint that this repo is a submodule.
>> +	 */
>> +	git_config_set("submodule.hasSuperproject", "true");
>> +
>>  	if (!oideq(&update_data.oid, &update_data.suboid) || update_data.force)
>>  		return do_run_update_procedure(&update_data);
>
> In Glen's update to rewrite "submodule update" in C, this part is
> replaced with a call to update_submodule2().  I am not sure what the
> current repository is at this point of the code with and without
> Glen's topic, but are we sure we are in a submodule we discovered?

With my topic, this call would be moved into update_submodule2(). The
repository at that point is the superproject.

> builtin/submodule--helper.c::run_update_procedure() takes sm_path to
> the path to the submodule, presumably from superproject's point of
> view, and the callchain leads to a call to run_update_command()
> eventually, which uses run_command_v_opt_cd_env() to go in to the
> submodule repository and run an external git command (like
> "checkout"), so it looks like what git_config_set() updates is the
> superprojects' configuration, not the configuration of a particular
> submodule being updated.
>
> The other one, where cmd_clone() sets the variable in submodule's
> configuration file, looks good, but I am not sure about this one.

But in this series, the current repository is the submodule because this
part happens in a "run-update-procedure" child process.

So there is a slight conflict here, but the conflict existed even before
this change (we used to do this twice in git-submodule.sh and
module_clone()).

Because of that conflict, I was planning to base "part2" on this series
anyway, and if anything, this change makes the conflict better because
we now set "submodule.hasSuperproject" in only one place
(run_update_procedure()) instead of two.

So I think this change improves this series and improves the interaction
with mine.
Glen Choo March 10, 2022, 9:40 p.m. UTC | #4
Junio C Hamano <gitster@pobox.com> writes:

> Emily Shaffer <emilyshaffer@google.com> writes:
>
>> @@ -2617,6 +2622,12 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
>>  
>>  	free(prefixed_path);
>>  
>> +	/*
>> +	 * This entry point is always called from a submodule, so this is a
>> +	 * good place to set a hint that this repo is a submodule.
>> +	 */
>> +	git_config_set("submodule.hasSuperproject", "true");
>> +
>>  	if (!oideq(&update_data.oid, &update_data.suboid) || update_data.force)
>>  		return do_run_update_procedure(&update_data);
>
> In Glen's update to rewrite "submodule update" in C, this part is
> replaced with a call to update_submodule2().  I am not sure what the
> current repository is at this point of the code with and without
> Glen's topic, but are we sure we are in a submodule we discovered?

Rereading this, I realize you probably meant that this conflicts with
part1, not part2...

At the end of part1, update_submodule2() is called from inside the
submodule (specifically from run_update_procedure()). So a good merge
conflict resolution would be to set the config _before_ calling
update_submodule2(). e.g.

----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index bef9ab22d4..f53808d995 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2672,6 +2677,11 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
                                            &update_data.update_strategy);

        free(prefixed_path);
+       /*
+        * This entry point is always called from a submodule, so this is a
+        * good place to set a hint that this repo is a submodule.
+        */
+       git_config_set("submodule.hasSuperproject", "true");
        return update_submodule2(&update_data);
 }
Glen Choo March 10, 2022, 9:54 p.m. UTC | #5
Emily Shaffer <emilyshaffer@google.com> writes:

> diff --git a/Documentation/config/submodule.txt b/Documentation/config/submodule.txt
> index ee454f8126..99d5260b8e 100644
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index c5d3fc3817..eda9ed550e 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -1839,6 +1839,11 @@ static int clone_submodule(struct module_clone_data *clone_data)
>  		git_config_set_in_file(p, "submodule.alternateErrorStrategy",
>  				       error_strategy);
>  
> +	/*
> +	 * Teach the submodule that it's a submodule.
> +	 */
> +	git_config_set_in_file(p, "submodule.hasSuperproject", "true");
> +
>  	free(sm_alternate);
>  	free(error_strategy);

This git_config_set_* is superfluous - it sets the config in newly
cloned submodules..

>  
> @@ -2617,6 +2622,12 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
>  
>  	free(prefixed_path);
>  
> +	/*
> +	 * This entry point is always called from a submodule, so this is a
> +	 * good place to set a hint that this repo is a submodule.
> +	 */
> +	git_config_set("submodule.hasSuperproject", "true");
> +
>  	if (!oideq(&update_data.oid, &update_data.suboid) || update_data.force)
>  		return do_run_update_procedure(&update_data);
>  

but this is called over *all* submodules, so we're guaranteed to always
set the config if "git submodule update" isn't interrupted halfway.

I don't think we guarantee correctness if it is interrupted halfway e.g.
core.worktree can be unset if it is interrupted halfway (because
ensure-core-worktree is called adjacent to run-update-procedure, not
inside of update-clone).

So I think it's better to just drop the previous hunk - it will
disappear anyway in gc/submodule-update-part2.
Junio C Hamano March 10, 2022, 10:10 p.m. UTC | #6
Glen Choo <chooglen@google.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> Emily Shaffer <emilyshaffer@google.com> writes:
>>
>>> @@ -2617,6 +2622,12 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
>>>  
>>>  	free(prefixed_path);
>>>  
>>> +	/*
>>> +	 * This entry point is always called from a submodule, so this is a
>>> +	 * good place to set a hint that this repo is a submodule.
>>> +	 */
>>> +	git_config_set("submodule.hasSuperproject", "true");
>>> +
>>>  	if (!oideq(&update_data.oid, &update_data.suboid) || update_data.force)
>>>  		return do_run_update_procedure(&update_data);
>>
>> In Glen's update to rewrite "submodule update" in C, this part is
>> replaced with a call to update_submodule2().  I am not sure what the
>> current repository is at this point of the code with and without
>> Glen's topic, but are we sure we are in a submodule we discovered?
>
> Rereading this, I realize you probably meant that this conflicts with
> part1, not part2...
>
> At the end of part1, update_submodule2() is called from inside the
> submodule (specifically from run_update_procedure()). So a good merge
> conflict resolution would be to set the config _before_ calling
> update_submodule2(). e.g.
>
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index bef9ab22d4..f53808d995 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -2672,6 +2677,11 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
>                                             &update_data.update_strategy);
>
>         free(prefixed_path);
> +       /*
> +        * This entry point is always called from a submodule, so this is a
> +        * good place to set a hint that this repo is a submodule.
> +        */
> +       git_config_set("submodule.hasSuperproject", "true");
>         return update_submodule2(&update_data);
>  }

That matched my tentative resolution I made last night, but what do
you think about this part of the test added by the patch?

diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 11cccbb333..ec2397fc69 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -1061,4 +1061,12 @@ test_expect_success 'submodule update --quiet passes quietness to fetch with a s
 	)
 '
 
+test_expect_success 'submodule update adds submodule.hasSuperproject to older repos' '
+	(cd super &&
+	 test_unconfig submodule.hasSuperproject &&
+	 git submodule update &&
+	 test_cmp_config -C submodule true --type=bool submodule.hasSuperproject
+	)
+'
+
 test_done

We go to "super", make sure that superproject does not have
submodule.hasSuperproject set, run "git submodule update", and see
if the configuration file in "submodule" subdirectory has the
variable set.  It does not clear the variable from the submodule
before starting, so the variable given to the submodule when it was
cloned would be there, even if "git submodule update" failed to set
it.

I am wondering if it should do something like the attached instead.

We

 * clear the variable from "super" and "super/submodule"
   repositories;

 * run "git submodule update";

 * ensure that "git submodule update" did not touch "super/.git/config";

 * ensure that "git submodule update" added the variable to
   "super/submodule/.git/config".

Clearing the variable from "super" is technically wrong because the
repository is set up as a submodule of "recursivesuper" and if we
had further tests, we should restore it in "super", but the point is
that we are makng sure "git submodule update" sets the variable in
the configuration file of the submodule, and not in the superproject's. 

With the conflict resolution above, this "corrected" test fails and
shows that superproject's configuration file is updated after "git
submodule update".

This series alone, without your topic, this "corrected" test fails,
and that is where my "are we sure we are mucking with the
configuration file in the submodule"? comes from.

diff --git c/t/t7406-submodule-update.sh w/t/t7406-submodule-update.sh
index 000e055811..c9912bb242 100755
--- c/t/t7406-submodule-update.sh
+++ w/t/t7406-submodule-update.sh
@@ -1083,4 +1083,16 @@ test_expect_success 'submodule update --filter sets partial clone settings' '
 	test_cmp_config -C super-filter/submodule blob:none remote.origin.partialclonefilter
 '
 
+test_expect_success 'submodule update adds submodule.hasSuperproject to older repos' '
+	(cd super &&
+	 test_unconfig submodule.hasSuperproject &&
+	 test_unconfig -C submodule submodule.hasSuperproject &&
+	 git submodule update &&
+	 echo in super &&
+	 test_cmp_config false --type=bool submodule.hasSuperproject &&
+	 echo in submodule &&
+	 test_cmp_config -C submodule true --type=bool submodule.hasSuperproject
+	)
+'
+
 test_done
Glen Choo March 10, 2022, 11:42 p.m. UTC | #7
Junio C Hamano <gitster@pobox.com> writes:

>> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
>> index bef9ab22d4..f53808d995 100644
>> --- a/builtin/submodule--helper.c
>> +++ b/builtin/submodule--helper.c
>> @@ -2672,6 +2677,11 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
>>                                             &update_data.update_strategy);
>>
>>         free(prefixed_path);
>> +       /*
>> +        * This entry point is always called from a submodule, so this is a
>> +        * good place to set a hint that this repo is a submodule.
>> +        */
>> +       git_config_set("submodule.hasSuperproject", "true");
>>         return update_submodule2(&update_data);
>>  }
>
> That matched my tentative resolution I made last night, but what do
> you think about this part of the test added by the patch?
>
> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
> index 11cccbb333..ec2397fc69 100755
> --- a/t/t7406-submodule-update.sh
> +++ b/t/t7406-submodule-update.sh
> @@ -1061,4 +1061,12 @@ test_expect_success 'submodule update --quiet passes quietness to fetch with a s
>  	)
>  '
>  
> +test_expect_success 'submodule update adds submodule.hasSuperproject to older repos' '
> +	(cd super &&
> +	 test_unconfig submodule.hasSuperproject &&
> +	 git submodule update &&
> +	 test_cmp_config -C submodule true --type=bool submodule.hasSuperproject
> +	)
> +'
> +
>  test_done
>
> We go to "super", make sure that superproject does not have
> submodule.hasSuperproject set, run "git submodule update", and see
> if the configuration file in "submodule" subdirectory has the
> variable set.  It does not clear the variable from the submodule
> before starting, so the variable given to the submodule when it was
> cloned would be there, even if "git submodule update" failed to set
> it.
>
> I am wondering if it should do something like the attached instead.
>
> We
>
>  * clear the variable from "super" and "super/submodule"
>    repositories;
>
>  * run "git submodule update";
>
>  * ensure that "git submodule update" did not touch "super/.git/config";
>
>  * ensure that "git submodule update" added the variable to
>    "super/submodule/.git/config".
>
> Clearing the variable from "super" is technically wrong because the
> repository is set up as a submodule of "recursivesuper" and if we
> had further tests, we should restore it in "super", but the point is
> that we are makng sure "git submodule update" sets the variable in
> the configuration file of the submodule, and not in the superproject's. 

Yes, the test you've described is closer to what I thought the original
test was trying to do. Seeing this test pass gave me a false sense of
confidence hm..

> With the conflict resolution above, this "corrected" test fails and
> shows that superproject's configuration file is updated after "git
> submodule update".
>
> This series alone, without your topic, this "corrected" test fails,
> and that is where my "are we sure we are mucking with the
> configuration file in the submodule"? comes from.

Yeah looks like we aren't in the submodule after all:

		out=$(git submodule--helper run-update-procedure \
			  ${wt_prefix:+--prefix "$wt_prefix"} \
			  ${GIT_QUIET:+--quiet} \
			  ${force:+--force} \
			  ${just_cloned:+--just-cloned} \
			  ${nofetch:+--no-fetch} \
			  ${depth:+"$depth"} \
			  ${update:+--update "$update"} \
			  ${prefix:+--recursive-prefix "$prefix"} \
			  ${sha1:+--oid "$sha1"} \
			  ${subsha1:+--suboid "$subsha1"} \
			  "--" \
			  "$sm_path")

This says "do the update at this submodule path", but this is being run
from the superproject.

So I suppose the way forward is one of the following:

- Revert my original suggestion
- Revert my original suggestion AND remove the git_config_set from
  "module_clone()" (before this, we unconditionally set this value in
  git-submodule.sh anyway)
- Set the config in the submodule even though we are running from the
  superproject (this is possible, ensure_core_worktree() does this).

In any case, sorry for the faulty suggestion :(
Glen Choo March 10, 2022, 11:53 p.m. UTC | #8
Glen Choo <chooglen@google.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>>> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
>>> index bef9ab22d4..f53808d995 100644
>>> --- a/builtin/submodule--helper.c
>>> +++ b/builtin/submodule--helper.c
>>> @@ -2672,6 +2677,11 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
>>>                                             &update_data.update_strategy);
>>>
>>>         free(prefixed_path);
>>> +       /*
>>> +        * This entry point is always called from a submodule, so this is a
>>> +        * good place to set a hint that this repo is a submodule.
>>> +        */
>>> +       git_config_set("submodule.hasSuperproject", "true");
>>>         return update_submodule2(&update_data);
>>>  }
>>
>> That matched my tentative resolution I made last night, but what do
>> you think about this part of the test added by the patch?
>>
>> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
>> index 11cccbb333..ec2397fc69 100755
>> --- a/t/t7406-submodule-update.sh
>> +++ b/t/t7406-submodule-update.sh
>> @@ -1061,4 +1061,12 @@ test_expect_success 'submodule update --quiet passes quietness to fetch with a s
>>  	)
>>  '
>>  
>> +test_expect_success 'submodule update adds submodule.hasSuperproject to older repos' '
>> +	(cd super &&
>> +	 test_unconfig submodule.hasSuperproject &&
>> +	 git submodule update &&
>> +	 test_cmp_config -C submodule true --type=bool submodule.hasSuperproject
>> +	)
>> +'
>> +
>>  test_done
>>
>> We go to "super", make sure that superproject does not have
>> submodule.hasSuperproject set, run "git submodule update", and see
>> if the configuration file in "submodule" subdirectory has the
>> variable set.  It does not clear the variable from the submodule
>> before starting, so the variable given to the submodule when it was
>> cloned would be there, even if "git submodule update" failed to set
>> it.
>>
>> I am wondering if it should do something like the attached instead.
>>
>> We
>>
>>  * clear the variable from "super" and "super/submodule"
>>    repositories;
>>
>>  * run "git submodule update";
>>
>>  * ensure that "git submodule update" did not touch "super/.git/config";
>>
>>  * ensure that "git submodule update" added the variable to
>>    "super/submodule/.git/config".
>>
>> Clearing the variable from "super" is technically wrong because the
>> repository is set up as a submodule of "recursivesuper" and if we
>> had further tests, we should restore it in "super", but the point is
>> that we are makng sure "git submodule update" sets the variable in
>> the configuration file of the submodule, and not in the superproject's. 
>
> Yes, the test you've described is closer to what I thought the original
> test was trying to do. Seeing this test pass gave me a false sense of
> confidence hm..

Correction, seeing the _original_ test pass gave me false sense of
confidence.

>> With the conflict resolution above, this "corrected" test fails and
>> shows that superproject's configuration file is updated after "git
>> submodule update".
>>
>> This series alone, without your topic, this "corrected" test fails,
>> and that is where my "are we sure we are mucking with the
>> configuration file in the submodule"? comes from.
> - Set the config in the submodule even though we are running from the
>   superproject (this is possible, ensure_core_worktree() does this).

If it helps, I was able to do this up by copying
ensure_core_worktree(), and this passes the amended test.

----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 4d02dd05ca..3bb7a65762 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1838,11 +1838,6 @@ static int clone_submodule(struct module_clone_data *clone_data)
    git_config_set_in_file(p, "submodule.alternateErrorStrategy",
              error_strategy);

-	/*
-	 * Teach the submodule that it's a submodule.
-	 */
-	git_config_set_in_file(p, "submodule.hasSuperproject", "true");
-
  free(sm_alternate);
  free(error_strategy);

@@ -2560,6 +2555,20 @@ static int update_clone(int argc, const char **argv, const char *prefix)
  return update_submodules(&suc);
}

+static void set_hassuperproject(const char *sm_path)
+{
+	struct repository subrepo;
+	char *cfg_file;
+
+	if (repo_submodule_init(&subrepo, the_repository, sm_path, null_oid()))
+		die(_("could not get a repository handle for submodule '%s'"), sm_path);
+
+	cfg_file = repo_git_path(&subrepo, "config");
+	git_config_set_in_file(cfg_file, "submodule.hasSuperproject", "true");
+
+	free(cfg_file);
+}
+
static int run_update_procedure(int argc, const char **argv, const char *prefix)
{
  int force = 0, quiet = 0, nofetch = 0, just_cloned = 0;
@@ -2622,10 +2631,9 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
  free(prefixed_path);

  /*
-	 * This entry point is always called from a submodule, so this is a
-	 * good place to set a hint that this repo is a submodule.
+	 * Teach the submodule that it's a submodule.
  */
-	git_config_set("submodule.hasSuperproject", "true");
+	set_hassuperproject(update_data.sm_path);

  if (!oideq(&update_data.oid, &update_data.suboid) || update_data.force)
    return do_run_update_procedure(&update_data);
Emily Shaffer March 15, 2022, 6:27 p.m. UTC | #9
On Thu, Mar 10, 2022 at 01:54:12PM -0800, Glen Choo wrote:
> 
> Emily Shaffer <emilyshaffer@google.com> writes:
> 
> > diff --git a/Documentation/config/submodule.txt b/Documentation/config/submodule.txt
> > index ee454f8126..99d5260b8e 100644
> > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> > index c5d3fc3817..eda9ed550e 100644
> > --- a/builtin/submodule--helper.c
> > +++ b/builtin/submodule--helper.c
> > @@ -1839,6 +1839,11 @@ static int clone_submodule(struct module_clone_data *clone_data)
> >  		git_config_set_in_file(p, "submodule.alternateErrorStrategy",
> >  				       error_strategy);
> >  
> > +	/*
> > +	 * Teach the submodule that it's a submodule.
> > +	 */
> > +	git_config_set_in_file(p, "submodule.hasSuperproject", "true");
> > +
> >  	free(sm_alternate);
> >  	free(error_strategy);
> 
> This git_config_set_* is superfluous - it sets the config in newly
> cloned submodules..
> 
> >  
> > @@ -2617,6 +2622,12 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
> >  
> >  	free(prefixed_path);
> >  
> > +	/*
> > +	 * This entry point is always called from a submodule, so this is a
> > +	 * good place to set a hint that this repo is a submodule.
> > +	 */
> > +	git_config_set("submodule.hasSuperproject", "true");
> > +
> >  	if (!oideq(&update_data.oid, &update_data.suboid) || update_data.force)
> >  		return do_run_update_procedure(&update_data);
> >  
> 
> but this is called over *all* submodules, so we're guaranteed to always
> set the config if "git submodule update" isn't interrupted halfway.
> 
> I don't think we guarantee correctness if it is interrupted halfway e.g.
> core.worktree can be unset if it is interrupted halfway (because
> ensure-core-worktree is called adjacent to run-update-procedure, not
> inside of update-clone).
> 
> So I think it's better to just drop the previous hunk - it will
> disappear anyway in gc/submodule-update-part2.

Ah, this makes sense. Sure, will do.

 - Emily
Emily Shaffer March 15, 2022, 6:39 p.m. UTC | #10
On Thu, Mar 10, 2022 at 02:10:55PM -0800, Junio C Hamano wrote:
> 
> Glen Choo <chooglen@google.com> writes:
> 
> > Junio C Hamano <gitster@pobox.com> writes:
> >
> >> Emily Shaffer <emilyshaffer@google.com> writes:
> >>
> >>> @@ -2617,6 +2622,12 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
> >>>  
> >>>  	free(prefixed_path);
> >>>  
> >>> +	/*
> >>> +	 * This entry point is always called from a submodule, so this is a
> >>> +	 * good place to set a hint that this repo is a submodule.
> >>> +	 */
> >>> +	git_config_set("submodule.hasSuperproject", "true");
> >>> +
> >>>  	if (!oideq(&update_data.oid, &update_data.suboid) || update_data.force)
> >>>  		return do_run_update_procedure(&update_data);
> >>
> >> In Glen's update to rewrite "submodule update" in C, this part is
> >> replaced with a call to update_submodule2().  I am not sure what the
> >> current repository is at this point of the code with and without
> >> Glen's topic, but are we sure we are in a submodule we discovered?
> >
> > Rereading this, I realize you probably meant that this conflicts with
> > part1, not part2...
> >
> > At the end of part1, update_submodule2() is called from inside the
> > submodule (specifically from run_update_procedure()). So a good merge
> > conflict resolution would be to set the config _before_ calling
> > update_submodule2(). e.g.
> >
> > ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
> > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> > index bef9ab22d4..f53808d995 100644
> > --- a/builtin/submodule--helper.c
> > +++ b/builtin/submodule--helper.c
> > @@ -2672,6 +2677,11 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
> >                                             &update_data.update_strategy);
> >
> >         free(prefixed_path);
> > +       /*
> > +        * This entry point is always called from a submodule, so this is a
> > +        * good place to set a hint that this repo is a submodule.
> > +        */
> > +       git_config_set("submodule.hasSuperproject", "true");
> >         return update_submodule2(&update_data);
> >  }
> 
> That matched my tentative resolution I made last night, but what do
> you think about this part of the test added by the patch?
> 
> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
> index 11cccbb333..ec2397fc69 100755
> --- a/t/t7406-submodule-update.sh
> +++ b/t/t7406-submodule-update.sh
> @@ -1061,4 +1061,12 @@ test_expect_success 'submodule update --quiet passes quietness to fetch with a s
>  	)
>  '
>  
> +test_expect_success 'submodule update adds submodule.hasSuperproject to older repos' '
> +	(cd super &&
> +	 test_unconfig submodule.hasSuperproject &&
> +	 git submodule update &&
> +	 test_cmp_config -C submodule true --type=bool submodule.hasSuperproject
> +	)
> +'
> +
>  test_done
> 
> We go to "super", make sure that superproject does not have
> submodule.hasSuperproject set, run "git submodule update", and see
> if the configuration file in "submodule" subdirectory has the
> variable set.  It does not clear the variable from the submodule
> before starting, so the variable given to the submodule when it was
> cloned would be there, even if "git submodule update" failed to set
> it.
> 
> I am wondering if it should do something like the attached instead.
> 
> We
> 
>  * clear the variable from "super" and "super/submodule"
>    repositories;
> 
>  * run "git submodule update";
> 
>  * ensure that "git submodule update" did not touch "super/.git/config";

Yeah, this is a good idea, and indeed when I add this step the bug
pointed out downthread becomes clear. Thanks.

> 
>  * ensure that "git submodule update" added the variable to
>    "super/submodule/.git/config".
> 
> Clearing the variable from "super" is technically wrong because the
> repository is set up as a submodule of "recursivesuper" and if we
> had further tests, we should restore it in "super", but the point is
> that we are makng sure "git submodule update" sets the variable in
> the configuration file of the submodule, and not in the superproject's. 
> 
> With the conflict resolution above, this "corrected" test fails and
> shows that superproject's configuration file is updated after "git
> submodule update".
> 
> This series alone, without your topic, this "corrected" test fails,
> and that is where my "are we sure we are mucking with the
> configuration file in the submodule"? comes from.
> 
> diff --git c/t/t7406-submodule-update.sh w/t/t7406-submodule-update.sh
> index 000e055811..c9912bb242 100755
> --- c/t/t7406-submodule-update.sh
> +++ w/t/t7406-submodule-update.sh
> @@ -1083,4 +1083,16 @@ test_expect_success 'submodule update --filter sets partial clone settings' '
>  	test_cmp_config -C super-filter/submodule blob:none remote.origin.partialclonefilter
>  '
>  
> +test_expect_success 'submodule update adds submodule.hasSuperproject to older repos' '
> +	(cd super &&
> +	 test_unconfig submodule.hasSuperproject &&
> +	 test_unconfig -C submodule submodule.hasSuperproject &&
> +	 git submodule update &&
> +	 echo in super &&
> +	 test_cmp_config false --type=bool submodule.hasSuperproject &&
> +	 echo in submodule &&
> +	 test_cmp_config -C submodule true --type=bool submodule.hasSuperproject
> +	)
> +'
> +
>  test_done
Junio C Hamano March 15, 2022, 7:19 p.m. UTC | #11
Emily Shaffer <emilyshaffer@google.com> writes:

>> Clearing the variable from "super" is technically wrong because the
>> repository is set up as a submodule of "recursivesuper" and if we
>> had further tests, we should restore it in "super", but the point is
>> that we are makng sure "git submodule update" sets the variable in
>> the configuration file of the submodule, and not in the superproject's. 

If we wanted to be kosher about this, we could start the test with

    git config submodule.hassuperproject 1

in the "super" repository, clear the variable in the "submodule"
repository, before running the "git submodule update" step, which
(1) should not touch the "super" configuration and (2) should touch
the "submodule" configuration.

If we inspect in the "super" repository after "submodule update"

    value=$(git config submodule.hassuperproject) &&
    test "$value" = 1

I think we can tell if a buggy "submodule update" overwrites the
"super" configuration from "1" to "true".  And downstream tests
will take "1" as true just fine.

And of course, in "submodule", the variable after "submodule update"
must be set to true, which can be checked with

    value=$(git -C submodule config --type=bool submodule.hassuperproject) &&
    test "$value" = true

The trick depends on the hardcoded value to represent "true" in the
code this patch adds, but that is the canonical way to spell true in
the config, according to "git config --type=bool", so the dependency
may not be too bad.

Just a thought.
Emily Shaffer March 15, 2022, 8:48 p.m. UTC | #12
On Thu, Mar 10, 2022 at 03:53:17PM -0800, Glen Choo wrote:
> 
> Glen Choo <chooglen@google.com> writes:
> 
> > Junio C Hamano <gitster@pobox.com> writes:
> >
> >>> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> >>> index bef9ab22d4..f53808d995 100644
> >>> --- a/builtin/submodule--helper.c
> >>> +++ b/builtin/submodule--helper.c
> >>> @@ -2672,6 +2677,11 @@ static int run_update_procedure(int argc, const char **argv, const char *prefix)
> >>>                                             &update_data.update_strategy);
> >>>
> >>>         free(prefixed_path);
> >>> +       /*
> >>> +        * This entry point is always called from a submodule, so this is a
> >>> +        * good place to set a hint that this repo is a submodule.
> >>> +        */
> >>> +       git_config_set("submodule.hasSuperproject", "true");
> >>>         return update_submodule2(&update_data);
> >>>  }
> >>
> >> That matched my tentative resolution I made last night, but what do
> >> you think about this part of the test added by the patch?
> >>
> >> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
> >> index 11cccbb333..ec2397fc69 100755
> >> --- a/t/t7406-submodule-update.sh
> >> +++ b/t/t7406-submodule-update.sh
> >> @@ -1061,4 +1061,12 @@ test_expect_success 'submodule update --quiet passes quietness to fetch with a s
> >>  	)
> >>  '
> >>  
> >> +test_expect_success 'submodule update adds submodule.hasSuperproject to older repos' '
> >> +	(cd super &&
> >> +	 test_unconfig submodule.hasSuperproject &&
> >> +	 git submodule update &&
> >> +	 test_cmp_config -C submodule true --type=bool submodule.hasSuperproject
> >> +	)
> >> +'
> >> +
> >>  test_done
> >>
> >> We go to "super", make sure that superproject does not have
> >> submodule.hasSuperproject set, run "git submodule update", and see
> >> if the configuration file in "submodule" subdirectory has the
> >> variable set.  It does not clear the variable from the submodule
> >> before starting, so the variable given to the submodule when it was
> >> cloned would be there, even if "git submodule update" failed to set
> >> it.
> >>
> >> I am wondering if it should do something like the attached instead.
> >>
> >> We
> >>
> >>  * clear the variable from "super" and "super/submodule"
> >>    repositories;
> >>
> >>  * run "git submodule update";
> >>
> >>  * ensure that "git submodule update" did not touch "super/.git/config";
> >>
> >>  * ensure that "git submodule update" added the variable to
> >>    "super/submodule/.git/config".
> >>
> >> Clearing the variable from "super" is technically wrong because the
> >> repository is set up as a submodule of "recursivesuper" and if we
> >> had further tests, we should restore it in "super", but the point is
> >> that we are makng sure "git submodule update" sets the variable in
> >> the configuration file of the submodule, and not in the superproject's. 
> >
> > Yes, the test you've described is closer to what I thought the original
> > test was trying to do. Seeing this test pass gave me a false sense of
> > confidence hm..
> 
> Correction, seeing the _original_ test pass gave me false sense of
> confidence.
> 
> >> With the conflict resolution above, this "corrected" test fails and
> >> shows that superproject's configuration file is updated after "git
> >> submodule update".
> >>
> >> This series alone, without your topic, this "corrected" test fails,
> >> and that is where my "are we sure we are mucking with the
> >> configuration file in the submodule"? comes from.
> > - Set the config in the submodule even though we are running from the
> >   superproject (this is possible, ensure_core_worktree() does this).
> 
> If it helps, I was able to do this up by copying
> ensure_core_worktree(), and this passes the amended test.
> 
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
> 
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 4d02dd05ca..3bb7a65762 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -1838,11 +1838,6 @@ static int clone_submodule(struct module_clone_data *clone_data)
>     git_config_set_in_file(p, "submodule.alternateErrorStrategy",
>               error_strategy);
> 
> -	/*
> -	 * Teach the submodule that it's a submodule.
> -	 */
> -	git_config_set_in_file(p, "submodule.hasSuperproject", "true");
> -
>   free(sm_alternate);
>   free(error_strategy);
> 
> @@ -2560,6 +2555,20 @@ static int update_clone(int argc, const char **argv, const char *prefix)
>   return update_submodules(&suc);
> }
> 
> +static void set_hassuperproject(const char *sm_path)
> +{
> +	struct repository subrepo;
> +	char *cfg_file;
> +
> +	if (repo_submodule_init(&subrepo, the_repository, sm_path, null_oid()))
> +		die(_("could not get a repository handle for submodule '%s'"), sm_path);

Isn't the repo_submodule_init() fairly expensive? I think this is doing
a whole repo_init() call we would not otherwise be doing.... Is it good
enough to generate the config from sm_path, by using
strbuf_repo_worktree_path(), and simply be tolerant of the failure if
<sm-gitdir>/config doesn't exist?

Otherwise, this is a good workaround I think. Thanks.

 - Emily
Emily Shaffer March 15, 2022, 8:56 p.m. UTC | #13
On Tue, Mar 15, 2022 at 01:48:25PM -0700, Emily Shaffer wrote:
> > +static void set_hassuperproject(const char *sm_path)
> > +{
> > +	struct repository subrepo;
> > +	char *cfg_file;
> > +
> > +	if (repo_submodule_init(&subrepo, the_repository, sm_path, null_oid()))
> > +		die(_("could not get a repository handle for submodule '%s'"), sm_path);
> 
> Isn't the repo_submodule_init() fairly expensive? I think this is doing
> a whole repo_init() call we would not otherwise be doing.... Is it good
> enough to generate the config from sm_path, by using
> strbuf_repo_worktree_path(), and simply be tolerant of the failure if
> <sm-gitdir>/config doesn't exist?

Ah, I was misreading the implementation of repo_submodule_init() and I
see now that won't work. I guess it is fine to just invoke
repo_submodule_init() then, unless someone has another idea.
Glen Choo March 15, 2022, 9:19 p.m. UTC | #14
Emily Shaffer <emilyshaffer@google.com> writes:

> On Tue, Mar 15, 2022 at 01:48:25PM -0700, Emily Shaffer wrote:
>> > +static void set_hassuperproject(const char *sm_path)
>> > +{
>> > +	struct repository subrepo;
>> > +	char *cfg_file;
>> > +
>> > +	if (repo_submodule_init(&subrepo, the_repository, sm_path, null_oid()))
>> > +		die(_("could not get a repository handle for submodule '%s'"), sm_path);
>> 
>> Isn't the repo_submodule_init() fairly expensive? I think this is doing
>> a whole repo_init() call we would not otherwise be doing.... Is it good
>> enough to generate the config from sm_path, by using
>> strbuf_repo_worktree_path(), and simply be tolerant of the failure if
>> <sm-gitdir>/config doesn't exist?
>
> Ah, I was misreading the implementation of repo_submodule_init() and I
> see now that won't work. I guess it is fine to just invoke
> repo_submodule_init() then, unless someone has another idea.

Yes, it's difficult to avoid calling repo_submodule_init() because it's
hard to get the gitdir using just the path to the submodule in the
working tree (sm_path).

Are we particular about avoiding calls to repo_submodule_init()? I don't
recall hearing this as an objection before. If so, I'll keep this in
mind as I work on more submodule things.

As an aside, ensure_core_worktree() already calls repo_submodule_init(),
so this wouldn't be the first time "submodule update" calls
repo_submodule_init(), and a potential optimization might be to cache
the result in between invocations of repo_submodule_init().
diff mbox series

Patch

diff --git a/Documentation/config/submodule.txt b/Documentation/config/submodule.txt
index ee454f8126..99d5260b8e 100644
--- a/Documentation/config/submodule.txt
+++ b/Documentation/config/submodule.txt
@@ -91,3 +91,9 @@  submodule.alternateErrorStrategy::
 	`ignore`, `info`, `die`. Default is `die`. Note that if set to `ignore`
 	or `info`, and if there is an error with the computed alternate, the
 	clone proceeds as if no alternate was specified.
+
+submodule.hasSuperproject::
+	Indicates whether this repository is a submodule. If this config is set
+	to 'true', Git may traverse the filesystem above this submodule in order
+	to identify the superproject. It is set automatically during submodule
+	creation, update, and 'git submodule absorbgitdir'.
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index c5d3fc3817..eda9ed550e 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1839,6 +1839,11 @@  static int clone_submodule(struct module_clone_data *clone_data)
 		git_config_set_in_file(p, "submodule.alternateErrorStrategy",
 				       error_strategy);
 
+	/*
+	 * Teach the submodule that it's a submodule.
+	 */
+	git_config_set_in_file(p, "submodule.hasSuperproject", "true");
+
 	free(sm_alternate);
 	free(error_strategy);
 
@@ -2617,6 +2622,12 @@  static int run_update_procedure(int argc, const char **argv, const char *prefix)
 
 	free(prefixed_path);
 
+	/*
+	 * This entry point is always called from a submodule, so this is a
+	 * good place to set a hint that this repo is a submodule.
+	 */
+	git_config_set("submodule.hasSuperproject", "true");
+
 	if (!oideq(&update_data.oid, &update_data.suboid) || update_data.force)
 		return do_run_update_procedure(&update_data);
 
diff --git a/submodule.c b/submodule.c
index c689070524..aafbd628ad 100644
--- a/submodule.c
+++ b/submodule.c
@@ -2097,6 +2097,8 @@  static void relocate_single_git_dir_into_superproject(const char *path)
 	char *old_git_dir = NULL, *real_old_git_dir = NULL, *real_new_git_dir = NULL;
 	struct strbuf new_gitdir = STRBUF_INIT;
 	const struct submodule *sub;
+	struct config_set sub_cs;
+	struct strbuf config_path = STRBUF_INIT, sb = STRBUF_INIT;
 
 	if (submodule_uses_worktrees(path))
 		die(_("relocate_gitdir for submodule '%s' with "
@@ -2127,6 +2129,16 @@  static void relocate_single_git_dir_into_superproject(const char *path)
 
 	relocate_gitdir(path, real_old_git_dir, real_new_git_dir);
 
+	strbuf_addf(&config_path, "%s/config", real_new_git_dir);
+	git_configset_init(&sub_cs);
+	git_configset_add_file(&sub_cs, config_path.buf);
+
+	git_config_set_in_file(config_path.buf, "submodule.hasSuperproject",
+			       "true");
+
+	git_configset_clear(&sub_cs);
+	strbuf_release(&config_path);
+	strbuf_release(&sb);
 	free(old_git_dir);
 	free(real_old_git_dir);
 	free(real_new_git_dir);
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 40cf8d89aa..53c8bf699d 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -115,6 +115,10 @@  inspect() {
 	git -C "$sub_dir" rev-parse HEAD >head-sha1 &&
 	git -C "$sub_dir" update-index --refresh &&
 	git -C "$sub_dir" diff-files --exit-code &&
+
+	# Ensure that submodule.hasSuperproject is set.
+	test_cmp_config -C "$sub_dir" true --type=bool "submodule.hasSuperproject"
+
 	git -C "$sub_dir" clean -n -d -x >untracked
 }
 
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 11cccbb333..ec2397fc69 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -1061,4 +1061,12 @@  test_expect_success 'submodule update --quiet passes quietness to fetch with a s
 	)
 '
 
+test_expect_success 'submodule update adds submodule.hasSuperproject to older repos' '
+	(cd super &&
+	 test_unconfig submodule.hasSuperproject &&
+	 git submodule update &&
+	 test_cmp_config -C submodule true --type=bool submodule.hasSuperproject
+	)
+'
+
 test_done
diff --git a/t/t7412-submodule-absorbgitdirs.sh b/t/t7412-submodule-absorbgitdirs.sh
index 1cfa150768..4c33a98efa 100755
--- a/t/t7412-submodule-absorbgitdirs.sh
+++ b/t/t7412-submodule-absorbgitdirs.sh
@@ -30,7 +30,9 @@  test_expect_success 'absorb the git dir' '
 	git status >actual.1 &&
 	git -C sub1 rev-parse HEAD >actual.2 &&
 	test_cmp expect.1 actual.1 &&
-	test_cmp expect.2 actual.2
+	test_cmp expect.2 actual.2 &&
+
+	test_cmp_config -C sub1 true --type=bool submodule.hasSuperproject
 '
 
 test_expect_success 'absorbing does not fail for deinitialized submodules' '
@@ -61,7 +63,9 @@  test_expect_success 'absorb the git dir in a nested submodule' '
 	git status >actual.1 &&
 	git -C sub1/nested rev-parse HEAD >actual.2 &&
 	test_cmp expect.1 actual.1 &&
-	test_cmp expect.2 actual.2
+	test_cmp expect.2 actual.2 &&
+
+	test_cmp_config -C sub1/nested true --type=bool submodule.hasSuperproject
 '
 
 test_expect_success 're-setup nested submodule' '
@@ -130,4 +134,46 @@  test_expect_success 'absorbing fails for a submodule with multiple worktrees' '
 	test_i18ngrep "not supported" error
 '
 
+test_expect_success 'absorbgitdirs works when called from a superproject worktree' '
+	# set up a worktree of the superproject
+	git worktree add wt &&
+	(
+	cd wt &&
+
+	# create a new unembedded git dir
+	git init sub4 &&
+	test_commit -C sub4 first &&
+	git submodule add ./sub4 &&
+	test_tick &&
+
+	# absorb the git dir
+	git submodule absorbgitdirs sub4 &&
+
+	# make sure the submodule noted the superproject
+	test_cmp_config -C sub4 true --type=bool submodule.hasSuperproject
+	)
+'
+
+test_expect_success 'absorbgitdirs works with a submodule with worktree config' '
+	# reuse the worktree of the superproject
+	(
+	cd wt &&
+
+	# create a new unembedded git dir
+	git init sub5 &&
+	test_commit -C sub5 first &&
+	git submodule add ./sub5 &&
+	test_tick &&
+
+	# turn on worktree configs for submodule
+	git -C sub5 config extensions.worktreeConfig true &&
+
+	# absorb the git dir
+	git submodule absorbgitdirs sub5 &&
+
+	# make sure the submodule noted the superproject
+	test_cmp_config -C sub5 true --type=bool submodule.hasSuperproject
+	)
+'
+
 test_done