pull: only pass '--recurse-submodules' to subcommands

Message ID pull.1262.git.git.1652138854255.gitgitgadget@gmail.com (mailing list archive)
State Superseded

Commit Message

Glen Choo May 9, 2022, 11:27 p.m. UTC
From: Glen Choo <chooglen@google.com>

Fix a bug in "git pull" where `submodule.recurse` is preferred over
`fetch.recurseSubmodules` (Documentation/config/fetch.txt says that
`fetch.recurseSubmodules` should be preferred). Do this by passing the
value of the "--recurse-submodules" CLI option to the underlying fetch,
instead of passing a value that combines the CLI option and config
variables.

In other words, this bug occurred because builtin/pull.c is conflating
two similar-sounding, but different concepts:

- Whether "git pull" itself should care about submodules e.g. whether it
  should update the submodule worktrees after performing a merge.
- The value of "--recurse-submodules" to pass to the underlying "git
  fetch".

Thus, when `submodule.recurse` is set, the underlying "git fetch" gets
invoked with "--recurse-submodules", overriding the value of
`fetch.recurseSubmodules`.

An alternative (and more obvious) approach to fix the bug would be to
teach "git pull" to understand `fetch.recurseSubmodules`, but the
proposed solution works better because:

- We don't maintain two identical config-parsing implementations in "git
  pull" and "git fetch".
- It works better with other commands invoked by "git pull" e.g. "git
  merge" won't accidentally respect `fetch.recurseSubmodules`.

Reported-by: Huang Zou <huang.zou@schrodinger.com>
Helped-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
---
    pull: only pass '--recurse-submodules' to subcommands
    
    Thanks Huang Zou for the report [1], and Philippe Blain for the initial
    investigation.
    
    This patch fixes the original bug, but not in the 'obvious' way of
    teaching "git pull" to parse fetch.recurseSubmodules. Instead, "git
    pull" now propagates its value of "--recurse-submodules" to "git fetch"
    (ignoring any config values), and leaves the config parsing to "git
    fetch".
    
    I think this works better because we get a nice separation of "config
    that git pull cares about" and "config that its subprocesses care about",
    and as a result:
    
     * We don't maintain two identical config-parsing implementations in
       "git pull" and "git fetch".
     * It works better with other commands invoked by "git pull" e.g. "git
       merge" won't accidentally respect fetch.recurseSubmodules.
    
    PS I'm having a hard time writing today, let me know how the commit
    message/cover letter can be improved :)
    
    [1]
    https://lore.kernel.org/git/CAFnZ=JNE_Sa3TsKghBPj1d0cz3kc6o91Ogj-op8o6qK8t9hPgg@mail.gmail.com
    
    In-Reply-To:
    CAFnZ=JNE_Sa3TsKghBPj1d0cz3kc6o91Ogj-op8o6qK8t9hPgg@mail.gmail.com

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1262%2Fchooglen%2Fpull%2Ffetch-recurse-submodules-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1262/chooglen/pull/fetch-recurse-submodules-v1
Pull-Request: https://github.com/git/git/pull/1262

 builtin/pull.c            | 10 +++++++---
 t/t5572-pull-submodule.sh | 14 ++++++++++++++
 2 files changed, 21 insertions(+), 3 deletions(-)


base-commit: e8005e4871f130c4e402ddca2032c111252f070a

Comments

Junio C Hamano May 10, 2022, 12:09 a.m. UTC | #1
"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

>     This patch fixes the original bug, but not in the 'obvious' way of
>     teaching "git pull" to parse fetch.recurseSubmodules. Instead, "git
>     pull" now propagates its value of "--recurse-submodules" to "git fetch"
>     (ignoring any config values), and leaves the config parsing to "git
>     fetch".

OK.  So the sub-git that is run in submodules will always see a
command line option, and what its configuration file says does not
matter?

>     I think this works better because we get a nice separation of "config
>     that git pull cares about" and "config that its subprocesses care about",
>     and as a result:
>     
>      * We don't maintain two identical config-parsing implementations in
>        "git pull" and "git fetch".
>      * It works better with other commands invoked by "git pull" e.g. "git
>        merge" won't accidentally respect fetch.recurseSubmodules.
>     
>     PS I'm having a hard time writing today, let me know how the commit
>     message/cover letter can be improved :)

OK.

> diff --git a/builtin/pull.c b/builtin/pull.c
> index 4d667abc19d..01155ba67b2 100644
> --- a/builtin/pull.c
> +++ b/builtin/pull.c
> @@ -72,6 +72,7 @@ static const char * const pull_usage[] = {
>  static int opt_verbosity;
>  static char *opt_progress;
>  static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
> +static int recurse_submodules_cli = RECURSE_SUBMODULES_DEFAULT;

This ...

>  /* Options passed to git-merge or git-rebase */
>  static enum rebase_type opt_rebase = -1;
> @@ -120,7 +121,7 @@ static struct option pull_options[] = {
>  		N_("force progress reporting"),
>  		PARSE_OPT_NOARG),
>  	OPT_CALLBACK_F(0, "recurse-submodules",
> -		   &recurse_submodules, N_("on-demand"),
> +		   &recurse_submodules_cli, N_("on-demand"),
>  		   N_("control for recursive fetching of submodules"),
>  		   PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),

... is where we keep track of what value we got from the command
line.  OK.

> @@ -536,8 +537,8 @@ static int run_fetch(const char *repo, const char **refspecs)
>  		strvec_push(&args, opt_tags);
>  	if (opt_prune)
>  		strvec_push(&args, opt_prune);
> -	if (recurse_submodules != RECURSE_SUBMODULES_DEFAULT)
> -		switch (recurse_submodules) {
> +	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)

The fact that the variable is different from _DEFAULT is a sure sign
that we got something from the command line, because there is no way
for the command line option to set the variable to _DEFAULT (in
other words, _DEFAULT is not really a default, it is a sign that it
is not yet set to any value).  OK.

> +		switch (recurse_submodules_cli) {
>  		case RECURSE_SUBMODULES_ON:
>  			strvec_push(&args, "--recurse-submodules=on");
>  			break;

OK, so the net effect is that we only strvec_push() a command line
option to underlying "fetch" when we got a command line option.  It
does not matter what "recurse_submodules" variable is set to.  The
variable can be set via the configuration mechanism.  _cli one is
different.

OK.  And the underlying "git fetch" will read its configuration as
needed anyway (if we do not do these strvec_push() here).

Sounds very sensible.  FWIW, despite what you said earlier, I find
this "if we have command line override, pass it down, otherwise they
know how to read and interpret configuration on their own" a very
sensible and intuitive approach.

Very nicely done.
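
The rule described above ("pass the option down only when it came from the command line") can be sketched in isolation as follows. This is a minimal standalone illustration, not the code in builtin/pull.c: the enum and the helper name fetch_recurse_arg() are hypothetical stand-ins, and only the "--recurse-submodules=on" spelling is visible in the quoted hunk; the other two strings are assumed for the cases the hunk elides.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for Git's RECURSE_SUBMODULES_* constants.
 * RECURSE_UNSET plays the role of RECURSE_SUBMODULES_DEFAULT,
 * i.e. "nothing was given on the command line".
 */
enum recurse_mode {
	RECURSE_UNSET,
	RECURSE_OFF,
	RECURSE_ON,
	RECURSE_ON_DEMAND
};

/*
 * Return the argument "git pull" would append to the "git fetch"
 * command line, or NULL when the user gave no option. In the NULL
 * case nothing is appended and fetch consults its own configuration.
 */
const char *fetch_recurse_arg(enum recurse_mode cli)
{
	switch (cli) {
	case RECURSE_ON:
		return "--recurse-submodules=on";
	case RECURSE_OFF:
		return "--no-recurse-submodules"; /* assumed spelling */
	case RECURSE_ON_DEMAND:
		return "--recurse-submodules=on-demand"; /* assumed spelling */
	default:
		return NULL;
	}
}
```

Note that the helper takes only the command-line value; whatever the configuration says never reaches it, which is the point of the separate _cli variable.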

> @@ -1001,6 +1002,9 @@ int cmd_pull(int argc, const char **argv, const char *prefix)
>  
>  	argc = parse_options(argc, argv, prefix, pull_options, pull_usage, 0);
>  
> +	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
> +		recurse_submodules = recurse_submodules_cli;

This is a small fallout from the separation of the variables.
Again, _DEFAULT is not "the default behaviour whatever it is", but
is a signal "This was not set at all", and that makes this addition
correct.  At some point, we may want to rename the _DEFAULT to
_UNSPECIFIED or something for readability, but it does not have to
be a part of this fix.
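
To make the "unset sentinel" reading concrete, here is a small standalone C sketch of the merge step. It assumes the hypothetical _UNSPECIFIED spelling floated above, and the enum and the function name pull_effective_mode() are illustrative only; they are not taken from the Git sources.

```c
/*
 * Hypothetical enum; RECURSE_UNSPECIFIED corresponds to what the
 * patch calls RECURSE_SUBMODULES_DEFAULT ("not set at all").
 */
enum recurse_mode {
	RECURSE_UNSPECIFIED,
	RECURSE_OFF,
	RECURSE_ON,
	RECURSE_ON_DEMAND
};

/*
 * Value that "git pull" itself acts on (e.g. whether to update
 * submodule worktrees): a command-line value wins when present,
 * otherwise the value that came from configuration is kept.
 */
enum recurse_mode pull_effective_mode(enum recurse_mode from_config,
				      enum recurse_mode from_cli)
{
	if (from_cli != RECURSE_UNSPECIFIED)
		return from_cli;
	return from_config;
}
```

The comparison against the sentinel is only safe because the option parser can never store the sentinel itself, which is the property the paragraph above relies on.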

> diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
> index fa6b4cca65c..65aaa7927fb 100755
> --- a/t/t5572-pull-submodule.sh
> +++ b/t/t5572-pull-submodule.sh
> @@ -107,6 +107,20 @@ test_expect_success " --[no-]recurse-submodule and submodule.recurse" '
>  	test_path_is_file super/sub/merge_strategy_4.t
>  '
>  
> +test_expect_success "fetch.recurseSubmodules option triggers recursive fetch (but not recursive update)" '
> +	test_commit -C child merge_strategy_5 &&
> +	git -C parent submodule update --remote &&
> +	git -C parent add sub &&
> +	git -C parent commit -m "update submodule" &&
> +
> +	git -C super -c fetch.recursesubmodules=true pull --no-rebase &&
> +	# Check that the submodule commit was fetched
> +	sub_oid=$(git -C super rev-parse FETCH_HEAD:sub) &&
> +	git -C super/sub cat-file -e $sub_oid &&
> +	# Check that the submodule worktree did not update
> +	! test_path_is_file super/sub/merge_strategy_5.t
> +'
> +
>  test_expect_success 'pull --rebase --recurse-submodules (remote superproject submodule changes, local submodule changes)' '
>  	# This tests the following scenario :
>  	# - local submodule has new commits
>
> base-commit: e8005e4871f130c4e402ddca2032c111252f070a

Thanks, will queue.
Junio C Hamano May 10, 2022, 12:44 a.m. UTC | #2
"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Glen Choo <chooglen@google.com>
>
> Fix a bug in "git pull" where `submodule.recurse` is preferred over
> `fetch.recurseSubmodules` (Documentation/config/fetch.txt says that
> `fetch.recurseSubmodules` should be preferred.). Do this by passing the
> value of the "--recurse-submodules" CLI option to the underlying fetch,
> instead of passing a value that combines the CLI option and config
> variables.
> ...

Do we know if this ever worked correctly or it once used to work but
we broke it?

>  builtin/pull.c            | 10 +++++++---
>  t/t5572-pull-submodule.sh | 14 ++++++++++++++
>  2 files changed, 21 insertions(+), 3 deletions(-)

The reason I ask is because applying this patch to maint-2.35 and
then reverting the builtin/pull.c part still allows the new test in
t5572 this patch introduces to pass.  So either it used to work
without the change to builtin/pull.c back in v2.35.x days (but in a
newer codebase we need the change to builtin/pull.c), or the new
test is not testing the alleged breakage at all.

It seems that applying only the test part to 'master' does not detect
any test failure, so perhaps the test is faulty?

> diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
> index fa6b4cca65c..65aaa7927fb 100755
> --- a/t/t5572-pull-submodule.sh
> +++ b/t/t5572-pull-submodule.sh
> @@ -107,6 +107,20 @@ test_expect_success " --[no-]recurse-submodule and submodule.recurse" '
>  	test_path_is_file super/sub/merge_strategy_4.t
>  '
>  
> +test_expect_success "fetch.recurseSubmodules option triggers recursive fetch (but not recursive update)" '
> +	test_commit -C child merge_strategy_5 &&
> +	git -C parent submodule update --remote &&
> +	git -C parent add sub &&
> +	git -C parent commit -m "update submodule" &&
> +
> +	git -C super -c fetch.recursesubmodules=true pull --no-rebase &&
> +	# Check that the submodule commit was fetched
> +	sub_oid=$(git -C super rev-parse FETCH_HEAD:sub) &&
> +	git -C super/sub cat-file -e $sub_oid &&
> +	# Check that the submodule worktree did not update
> +	! test_path_is_file super/sub/merge_strategy_5.t
> +'
> +
>  test_expect_success 'pull --rebase --recurse-submodules (remote superproject submodule changes, local submodule changes)' '
>  	# This tests the following scenario :
>  	# - local submodule has new commits
>
> base-commit: e8005e4871f130c4e402ddca2032c111252f070a
Philippe Blain May 10, 2022, 1:28 p.m. UTC | #3
Hi Glen,

On 2022-05-09 at 19:27, Glen Choo via GitGitGadget wrote:
> From: Glen Choo <chooglen@google.com>

First, regarding the commit message title (I never know where to comment
on that since I can't quote it :P)

> Re: [PATCH] pull: only pass '--recurse-submodules' to subcommands

I understand the intent, i.e. "only pass the CLI flag, not any config, to subcommands"
but only because I already know what the patch is about. It could be
read to mean "pass --recurse-submodules only to subcommands, and not to 
something else". Since this is really a bug that affects the underlying 
'git fetch', maybe something like this ?

    pull: pass '--recurse-submodules' to 'fetch' from CLI, not config

> 
> Fix a bug in "git pull" where `submodule.recurse` is preferred over
> `fetch.recurseSubmodules` 

here I would add "for the underlying 'git fetch'"

> (Documentation/config/fetch.txt says that
> `fetch.recurseSubmodules` should be preferred.). Do this by passing the
> value of the "--recurse-submodules" CLI option to the underlying fetch,
> instead of passing a value that combines the CLI option and config
> variables.
> 
> In other words, this bug occurred because builtin/pull.c is conflating
> two similar-sounding, but different concepts:
> 
> - Whether "git pull" itself should care about submodules e.g. whether it
>   should update the submodule worktrees after performing a merge.

nit: "or rebase".

> - The value of "--recurse-submodules" to pass to the underlying "git
>   fetch".
> 
> Thus, when `submodule.recurse` is set, the underlying "git fetch" gets
> invoked with "--recurse-submodules", overriding the value of
> `fetch.recurseSubmodules`.

the wording is a little bit misleading here, as submodule.recurse could
be set to 'false', and then 'git fetch' will be invoked with '--recurse-submodules=false'.

> An alternative (and more obvious) approach to fix the bug would be to
> teach "git pull" to understand `fetch.recurseSubmodules`, but the
> proposed solution works better because:
> 
> - We don't maintain two identical config-parsing implementations in "git
>   pull" and "git fetch".
> - It works better with other commands invoked by "git pull" e.g. "git
>   merge" won't accidentally respect `fetch.recurseSubmodules`.

I'm not sure of the meaning of the second bullet, since "git merge" should
never perform a fetch ?... 

> 
> Reported-by: Huang Zou <huang.zou@schrodinger.com>
> Helped-by: Philippe Blain <levraiphilippeblain@gmail.com>
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
>     pull: only pass '--recurse-submodules' to subcommands
>     
>     Thanks Huang Zou for the report [1], and Philippe Blain for the initial
>     investigation.
>     
>     This patch fixes the original bug, but not in the 'obvious' way of
>     teaching "git pull" to parse fetch.recurseSubmodules. Instead, "git
>     pull" now propagates its value of "--recurse-submodules" to "git fetch"
>     (ignoring any config values), and leaves the config parsing to "git
>     fetch".
>     
>     I think this works better because we get a nice separation of "config
>     that git pull cares about" and "config that its subprocesses care about",
>     and as a result:
>     
>      * We don't maintain two identical config-parsing implementations in
>        "git pull" and "git fetch".
>      * It works better with other commands invoked by "git pull" e.g. "git
>        merge" won't accidentally respect fetch.recurseSubmodules.
>     
>     PS I'm having a hard time writing today, let me know how the commit
>     message/cover letter can be improved :)
>     
>     [1]
>     https://lore.kernel.org/git/CAFnZ=JNE_Sa3TsKghBPj1d0cz3kc6o91Ogj-op8o6qK8t9hPgg@mail.gmail.com
>     
>     In-Reply-To:
>     CAFnZ=JNE_Sa3TsKghBPj1d0cz3kc6o91Ogj-op8o6qK8t9hPgg@mail.gmail.com
> 
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1262%2Fchooglen%2Fpull%2Ffetch-recurse-submodules-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1262/chooglen/pull/fetch-recurse-submodules-v1
> Pull-Request: https://github.com/git/git/pull/1262
> 
>  builtin/pull.c            | 10 +++++++---
>  t/t5572-pull-submodule.sh | 14 ++++++++++++++
>  2 files changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin/pull.c b/builtin/pull.c
> index 4d667abc19d..01155ba67b2 100644
> --- a/builtin/pull.c
> +++ b/builtin/pull.c
> @@ -72,6 +72,7 @@ static const char * const pull_usage[] = {
>  static int opt_verbosity;
>  static char *opt_progress;
>  static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
> +static int recurse_submodules_cli = RECURSE_SUBMODULES_DEFAULT;
>  
>  /* Options passed to git-merge or git-rebase */
>  static enum rebase_type opt_rebase = -1;
> @@ -120,7 +121,7 @@ static struct option pull_options[] = {
>  		N_("force progress reporting"),
>  		PARSE_OPT_NOARG),
>  	OPT_CALLBACK_F(0, "recurse-submodules",
> -		   &recurse_submodules, N_("on-demand"),
> +		   &recurse_submodules_cli, N_("on-demand"),
>  		   N_("control for recursive fetching of submodules"),
>  		   PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
>  
> @@ -536,8 +537,8 @@ static int run_fetch(const char *repo, const char **refspecs)
>  		strvec_push(&args, opt_tags);
>  	if (opt_prune)
>  		strvec_push(&args, opt_prune);
> -	if (recurse_submodules != RECURSE_SUBMODULES_DEFAULT)
> -		switch (recurse_submodules) {
> +	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
> +		switch (recurse_submodules_cli) {
>  		case RECURSE_SUBMODULES_ON:
>  			strvec_push(&args, "--recurse-submodules=on");
>  			break;
> @@ -1001,6 +1002,9 @@ int cmd_pull(int argc, const char **argv, const char *prefix)
>  
>  	argc = parse_options(argc, argv, prefix, pull_options, pull_usage, 0);
>  
> +	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
> +		recurse_submodules = recurse_submodules_cli;
> +

I agree with you and Junio that this implementation is very clear
and concise. Nice!

>  	if (cleanup_arg)
>  		/*
>  		 * this only checks the validity of cleanup_arg; we don't need
> diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
> index fa6b4cca65c..65aaa7927fb 100755
> --- a/t/t5572-pull-submodule.sh
> +++ b/t/t5572-pull-submodule.sh
> @@ -107,6 +107,20 @@ test_expect_success " --[no-]recurse-submodule and submodule.recurse" '
>  	test_path_is_file super/sub/merge_strategy_4.t
>  '
>  
> +test_expect_success "fetch.recurseSubmodules option triggers recursive fetch (but not recursive update)" '
> +	test_commit -C child merge_strategy_5 &&
> +	git -C parent submodule update --remote &&
> +	git -C parent add sub &&
> +	git -C parent commit -m "update submodule" &&
> +
> +	git -C super -c fetch.recursesubmodules=true pull --no-rebase &&
> +	# Check that the submodule commit was fetched
> +	sub_oid=$(git -C super rev-parse FETCH_HEAD:sub) &&
> +	git -C super/sub cat-file -e $sub_oid &&
> +	# Check that the submodule worktree did not update
> +	! test_path_is_file super/sub/merge_strategy_5.t
> +'
> +

OK so here we test the following situation:

fetch.recurseSubmodules  submodule.recurse  --recurse-submodules  fetch result  pull result
-----------------------  -----------------  --------------------  ------------  ------------
true                     unset              absent                recursive     not recursive

I think this is a good start, but since the default behaviour 
of Git is the following:

fetch.recurseSubmodules  submodule.recurse  --recurse-submodules  fetch result  pull result
-----------------------  -----------------  --------------------  ------------  ------------
default: on-demand       default: false     unset                 on-demand     not recursive

we end up testing something not very far from the default behaviour. I think
this answers Junio's question as to why the test does not fail when applied
without the fix. So I think for completeness, we should add more tests,
including (maybe) the configuration that Huang reported:


fetch.recurseSubmodules  submodule.recurse  --recurse-submodules  fetch result    pull result
-----------------------  -----------------  --------------------  --------------  ------------
false                    true               absent                not recursive*  recursive

* the submodule will be fetched by 'git submodule update', not by the 'git fetch' run by 'git pull'

and probably also this configuration that I think is what Huang wanted to achieve:

fetch.recurseSubmodules  submodule.recurse  --recurse-submodules  fetch result   pull result
-----------------------  -----------------  --------------------  -------------  ------------
on-demand                false              absent                on-demand      recursive

and probably this one also:

fetch.recurseSubmodules  submodule.recurse  --recurse-submodules  fetch result   pull result
-----------------------  -----------------  --------------------  -------------  ------------
on-demand                true               absent                on-demand      recursive

and maybe some others with an explicit '--recurse-submodules' flag on the CLI.

I should also mention that I think there is another bug, this one in the 'git fetch' config
parsing for 'fetch.recurseSubmodules' and 'submodule.recurse'. Even though 'fetch.recurseSubmodules'
is documented as overriding 'submodule.recurse', this is not actually enforced
in 'git_fetch_config'; currently it's a "last one wins" situation, so the actual behaviour
depends on which config is encountered last in the config file(s)...
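
The difference between "last one wins" and the documented precedence can be modelled with a short standalone C sketch. Everything in it is hypothetical (the struct, the enum, and both function names); it is not the code of 'git_fetch_config', only a toy config callback that is fed key/value pairs in file order.

```c
#include <string.h>

/* Hypothetical stand-in for the RECURSE_SUBMODULES_* values. */
enum recurse_mode { RECURSE_UNSET, RECURSE_OFF, RECURSE_ON, RECURSE_ON_DEMAND };

struct fetch_config {
	enum recurse_mode recurse;
	int fetch_key_seen; /* set once fetch.recurseSubmodules was read */
};

/* Both keys write the same field, so whichever is read last wins. */
void apply_last_one_wins(struct fetch_config *cfg, const char *key,
			 enum recurse_mode value)
{
	if (!strcmp(key, "submodule.recurse") ||
	    !strcmp(key, "fetch.recursesubmodules"))
		cfg->recurse = value;
}

/*
 * One possible way to enforce the documented rule: once the
 * fetch-specific key has been seen, the generic key is ignored,
 * regardless of the order in which the keys are encountered.
 */
void apply_with_precedence(struct fetch_config *cfg, const char *key,
			   enum recurse_mode value)
{
	if (!strcmp(key, "fetch.recursesubmodules")) {
		cfg->recurse = value;
		cfg->fetch_key_seen = 1;
	} else if (!strcmp(key, "submodule.recurse") && !cfg->fetch_key_seen) {
		cfg->recurse = value;
	}
}
```

Feeding fetch.recurseSubmodules=false followed by submodule.recurse=true leaves the first model with recursion on and the second with recursion off, which is the order dependence described above.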

Thanks for taking this on,

Philippe.
Glen Choo May 10, 2022, 6:27 p.m. UTC | #4
Thanks Philippe and Junio, both! 

Philippe Blain <levraiphilippeblain@gmail.com> writes:

>>  	if (cleanup_arg)
>>  		/*
>>  		 * this only checks the validity of cleanup_arg; we don't need
>> diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
>> index fa6b4cca65c..65aaa7927fb 100755
>> --- a/t/t5572-pull-submodule.sh
>> +++ b/t/t5572-pull-submodule.sh
>> @@ -107,6 +107,20 @@ test_expect_success " --[no-]recurse-submodule and submodule.recurse" '
>>  	test_path_is_file super/sub/merge_strategy_4.t
>>  '
>>  
>> +test_expect_success "fetch.recurseSubmodules option triggers recursive fetch (but not recursive update)" '
>> +	test_commit -C child merge_strategy_5 &&
>> +	git -C parent submodule update --remote &&
>> +	git -C parent add sub &&
>> +	git -C parent commit -m "update submodule" &&
>> +
>> +	git -C super -c fetch.recursesubmodules=true pull --no-rebase &&
>> +	# Check that the submodule commit was fetched
>> +	sub_oid=$(git -C super rev-parse FETCH_HEAD:sub) &&
>> +	git -C super/sub cat-file -e $sub_oid &&
>> +	# Check that the submodule worktree did not update
>> +	! test_path_is_file super/sub/merge_strategy_5.t
>> +'
>> +
>
> OK so here we test the following situation:
>
> fetch.recurseSubmodules  submodule.recurse  --recurse-submodules  fetch result  pull result
> -----------------------  -----------------  --------------------  ------------  ------------
> true                     unset              absent                recursive     not recursive
>
> I think this is a good start, but since the default behaviour 
> of Git is the following:
>
> fetch.recurseSubmodules  submodule.recurse  --recurse-submodules  fetch result  pull result
> -----------------------  -----------------  --------------------  ------------  ------------
> default: on-demand       default: false     unset                 on-demand     not recursive
>
> we end up testing something not very far from the default behaviour.

Yes, I made a silly mistake with the test.. (I could've
sworn it made sense at the time).

- fetch.recurseSubmodules=true always worked correctly when
  submodule.recurse is not set because "git pull" realizes that the
  behavior is unspecified, so "git fetch" doesn't receive a
  "--recurse-submodules" and does the right thing.

  For completeness, we probably still want this test (even though it has
  always passed).

- I didn't test the actual buggy behavior, which is the precedence of
  the two config values.

  This would be a new test that tests something like
  fetch.recurseSubmodules=foo and submodule.recurse=bar. This should be
  enough to test precedence; I don't think we need to test more
  combinations of values.
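
Why the first test passes with or without the fix can be shown with a tiny standalone C model. It is a sketch under stated assumptions: the enum and both function names are hypothetical, and the "old" function assumes that the single pre-patch variable was filled from submodule.recurse first and then overwritten by the command line.

```c
/* Hypothetical stand-in for the RECURSE_SUBMODULES_* values. */
enum recurse_mode { RECURSE_UNSET, RECURSE_OFF, RECURSE_ON, RECURSE_ON_DEMAND };

/*
 * Before the patch: pull forwarded a flag to fetch whenever the
 * combined value (submodule.recurse config, overridden by the CLI)
 * was set. Returns 1 if a flag is forwarded, 0 otherwise.
 */
int old_pull_passes_flag(enum recurse_mode submodule_recurse_cfg,
			 enum recurse_mode cli)
{
	enum recurse_mode combined = submodule_recurse_cfg;

	if (cli != RECURSE_UNSET)
		combined = cli;
	return combined != RECURSE_UNSET;
}

/* After the patch: only the command line decides. */
int new_pull_passes_flag(enum recurse_mode cli)
{
	return cli != RECURSE_UNSET;
}
```

With submodule.recurse unset and no command-line option, both versions forward nothing, so fetch.recurseSubmodules is honoured either way. The two only diverge once submodule.recurse is set, which is the case a precedence test has to cover.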

I hope you both didn't waste too much time on this :)
Glen Choo May 10, 2022, 6:43 p.m. UTC | #5
Philippe Blain <levraiphilippeblain@gmail.com> writes:

> Hi Glen,
>
> On 2022-05-09 at 19:27, Glen Choo via GitGitGadget wrote:
>> From: Glen Choo <chooglen@google.com>
>
> First, regarding the commit message title (I never know where to comment
> on that since I can't quote it :P)
>
>> Re: [PATCH] pull: only pass '--recurse-submodules' to subcommands
>
> I understand the intent, i.e. "only pass the CLI flag, not any config, to subcommands"
> but only because I already know what the patch is about. It could be
> read to mean "pass --recurse-submodules only to subcommands, and not to 
> something else". Since this is really a bug that affects the underlying 
> 'git fetch', maybe something like this ?
>
>     pull: pass '--recurse-submodules' to 'fetch' from CLI, not config
>
>> 
>> Fix a bug in "git pull" where `submodule.recurse` is preferred over
>> `fetch.recurseSubmodules` 
>
> here I would add "for the underlying 'git fetch'"
>
> (Documentation/config/fetch.txt says that
>> `fetch.recurseSubmodules` should be preferred.). Do this by passing the
>> value of the "--recurse-submodules" CLI option to the underlying fetch,
>> instead of passing a value that combines the CLI option and config
>> variables.
>> 
>> In other words, this bug occurred because builtin/pull.c is conflating
>> two similar-sounding, but different concepts:
>> 
>> - Whether "git pull" itself should care about submodules e.g. whether it
>>   should update the submodule worktrees after performing a merge.
>
> nit: "or rebase".
>
>> - The value of "--recurse-submodules" to pass to the underlying "git
>>   fetch".
>> 
>> Thus, when `submodule.recurse` is set, the underlying "git fetch" gets
>> invoked with "--recurse-submodules", overriding the value of
>> `fetch.recurseSubmodules`.
>
> the wording is a little bit misleading here, as submodule.recurse could
> be set to 'false', and then 'git fetch' will be invoked with '--recurse-submodules=false'.

Thanks for the wording suggestions :) I'll see what I can incorporate.
 
>> An alternative (and more obvious) approach to fix the bug would be to
>> teach "git pull" to understand `fetch.recurseSubmodules`, but the
>> proposed solution works better because:
>> 
>> - We don't maintain two identical config-parsing implementations in "git
>>   pull" and "git fetch".
>> - It works better with other commands invoked by "git pull" e.g. "git
>>   merge" won't accidentally respect `fetch.recurseSubmodules`.
>
> I'm not sure of the meaning of the second bullet, since "git merge" should
> never perform a fetch ?... 

Ah, I'm describing a hypothetical issue with the 'obvious' (aka a
literal reading of the docs) approach of teaching "git pull" to handle
`fetch.recurseSubmodules`.

You are correct, "git merge" should never perform a fetch, so it
shouldn't care about `fetch.recurseSubmodules`. But a careless 'fix'
might be to copy what "git fetch" does with its config, e.g.


	if (!strcmp(k, "submodule.recurse")) {
		int r = git_config_bool(k, v) ?
			RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
		recurse_submodules = r;
	} else if (!strcmp(k, "fetch.recursesubmodules")) {
		recurse_submodules = parse_fetch_recurse_submodules_arg(k, v);
		return 0;
	}

which might make the internal "git merge" suddenly recurse into
submodules. Of course, this can be fixed by using a fetch-specific
variable, like:

	if (!strcmp(k, "submodule.recurse")) {
		int r = git_config_bool(k, v) ?
			RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
		recurse_submodules = r;
	}
	if (!strcmp(k, "fetch.recursesubmodules")) {
		fetch_recurse_submodules = parse_fetch_recurse_submodules_arg(k, v);
		return 0;
	}

	static int run_fetch(const char *repo, const char **refspecs)
	{
		/* ... */
		if (fetch_recurse_submodules != RECURSE_SUBMODULES_DEFAULT)
			/*
			 * This is actually wrong wrt the docs, but assume that we
			 * could combine the two values here correctly.
			 */
			recurse_submodules = fetch_recurse_submodules;
		if (recurse_submodules != RECURSE_SUBMODULES_DEFAULT)
			switch (recurse_submodules) {
			/* ... */
			}
	}

but then we'd have to add more variables if we have
`merge.recurseSubmodules`, or `rebase.recurseSubmodules` etc.

So the easiest way is to just tell "git pull" to stop assuming that it
knows what the subcommands want :)

Patch

diff --git a/builtin/pull.c b/builtin/pull.c
index 4d667abc19d..01155ba67b2 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -72,6 +72,7 @@  static const char * const pull_usage[] = {
 static int opt_verbosity;
 static char *opt_progress;
 static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
+static int recurse_submodules_cli = RECURSE_SUBMODULES_DEFAULT;
 
 /* Options passed to git-merge or git-rebase */
 static enum rebase_type opt_rebase = -1;
@@ -120,7 +121,7 @@  static struct option pull_options[] = {
 		N_("force progress reporting"),
 		PARSE_OPT_NOARG),
 	OPT_CALLBACK_F(0, "recurse-submodules",
-		   &recurse_submodules, N_("on-demand"),
+		   &recurse_submodules_cli, N_("on-demand"),
 		   N_("control for recursive fetching of submodules"),
 		   PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
 
@@ -536,8 +537,8 @@  static int run_fetch(const char *repo, const char **refspecs)
 		strvec_push(&args, opt_tags);
 	if (opt_prune)
 		strvec_push(&args, opt_prune);
-	if (recurse_submodules != RECURSE_SUBMODULES_DEFAULT)
-		switch (recurse_submodules) {
+	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
+		switch (recurse_submodules_cli) {
 		case RECURSE_SUBMODULES_ON:
 			strvec_push(&args, "--recurse-submodules=on");
 			break;
@@ -1001,6 +1002,9 @@  int cmd_pull(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix, pull_options, pull_usage, 0);
 
+	if (recurse_submodules_cli != RECURSE_SUBMODULES_DEFAULT)
+		recurse_submodules = recurse_submodules_cli;
+
 	if (cleanup_arg)
 		/*
 		 * this only checks the validity of cleanup_arg; we don't need
diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh
index fa6b4cca65c..65aaa7927fb 100755
--- a/t/t5572-pull-submodule.sh
+++ b/t/t5572-pull-submodule.sh
@@ -107,6 +107,20 @@  test_expect_success " --[no-]recurse-submodule and submodule.recurse" '
 	test_path_is_file super/sub/merge_strategy_4.t
 '
 
+test_expect_success "fetch.recurseSubmodules option triggers recursive fetch (but not recursive update)" '
+	test_commit -C child merge_strategy_5 &&
+	git -C parent submodule update --remote &&
+	git -C parent add sub &&
+	git -C parent commit -m "update submodule" &&
+
+	git -C super -c fetch.recursesubmodules=true pull --no-rebase &&
+	# Check that the submodule commit was fetched
+	sub_oid=$(git -C super rev-parse FETCH_HEAD:sub) &&
+	git -C super/sub cat-file -e $sub_oid &&
+	# Check that the submodule worktree did not update
+	! test_path_is_file super/sub/merge_strategy_5.t
+'
+
 test_expect_success 'pull --rebase --recurse-submodules (remote superproject submodule changes, local submodule changes)' '
 	# This tests the following scenario :
 	# - local submodule has new commits