Message ID | 20200128221736.9217-1-emilyshaffer@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] clone: pass --single-branch during --recurse-submodules | expand |
On Tue, Jan 28, 2020 at 02:17:36PM -0800, Emily Shaffer wrote: > Previously, performing "git clone --recurse-submodules --single-branch" > resulted in submodules cloning all branches even though the superproject > cloned only one branch. Pipe --single-branch through the submodule > helper framework to make it to 'clone' later on. This makes sense to me, bearing in mind that I'm not at all a good person to point out subtleties with submodules that could bite us. > As discussed in the thread for v1 of this patch, in cases when the > submodule commit referenced by the specified superproject branch isn't > the same as the HEAD of the submodule repo known by the server side, > this still works in kind of a non-obvious way. In these cases, first we > fetch the single branch that is the ancestor of the server's HEAD; then > we fetch the commit needed by the superproject (and its ancestry). So > while this change prevents us from fetching *all* branches on clone, it > doesn't necessarily limit us to a single branch as described. Is it worth adding a test that we do the right thing here? Not so much to prove that it works now, but to protect us against future changes. It seems like the sort of thing that could get subtly broken. The patch looks mostly good to me (my, that was a lot of plumbing through that option); here are a few minor comments: > -update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--] [<path>...]:: > +update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--single-branch] [--] [<path>...]:: This line is horrendously long. Not new in your patch, but I wonder if the time might have come to break it up. > +--single-branch:: > + This option is only valid for the update command. > + Clone only one branch during update, HEAD or --branch. For some reason my brain insists on parsing this second sentence as a 3-item list without an Oxford comma. I wonder if it would be more clear as: Clone only one branch during update: HEAD or one specified by --branch. or similar. > #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \ > SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, 0, \ > - NULL, NULL, NULL, \ > + NULL, NULL, NULL, 0,\ > NULL, 0, 0, 0, NULL, 0, 0, 1} Wow. Also not new in your patch, but I think we're moving towards the use of C99 named initializers, which would make this a bit less daunting (all of the NULL/0 items could be omitted!). > +test_expect_success 'clone with --single-branch' ' > + test_when_finished "rm -rf super_clone" && > + git clone --recurse-submodules --single-branch "file://$pwd/." super_clone && > + ( > + cd super_clone/sub && > + git branch -a >branches && > + test_must_fail grep other branches > + ) > +' Don't use test_must_fail with non-Git commands; you can just say "! grep". We usually try to avoid scripting around git-branch output (although I find it pretty unlikely that future changes would break this particular case). git-for-each-ref would be a better pick, but I wonder if: git rev-parse --verify origin/master && test_must_fail git rev-parse --verify origin/other might express the expectation more clearly. -Peff
On Thu, Jan 30, 2020 at 05:23:03AM -0500, Jeff King wrote: > On Tue, Jan 28, 2020 at 02:17:36PM -0800, Emily Shaffer wrote: Ouch. I forgot about this review for some time. Sorry :) > > > Previously, performing "git clone --recurse-submodules --single-branch" > > resulted in submodules cloning all branches even though the superproject > > cloned only one branch. Pipe --single-branch through the submodule > > helper framework to make it to 'clone' later on. > > This makes sense to me, bearing in mind that I'm not at all a good > person to point out subtleties with submodules that could bite us. I wonder if it makes sense to ship this to our submodule-using masses here for a little while and see how it works / whether anybody yells? This might be too small for that kind of thing. > > > As discussed in the thread for v1 of this patch, in cases when the > > submodule commit referenced by the specified superproject branch isn't > > the same as the HEAD of the submodule repo known by the server side, > > this still works in kind of a non-obvious way. In these cases, first we > > fetch the single branch that is the ancestor of the server's HEAD; then > > we fetch the commit needed by the superproject (and its ancestry). So > > while this change prevents us from fetching *all* branches on clone, it > > doesn't necessarily limit us to a single branch as described. > > Is it worth adding a test that we do the right thing here? Not so much > to prove that it works now, but to protect us against future changes. It > seems like the sort of thing that could get subtly broken. What did you have in mind beyond the test I added already? > > The patch looks mostly good to me (my, that was a lot of plumbing > through that option); here are a few minor comments: > > > -update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--] [<path>...]:: > > +update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--single-branch] [--] [<path>...]:: > > This line is horrendously long. Not new in your patch, but I wonder if > the time might have come to break it up. I dug around in Asciidoc doc and couldn't find a good way to do so. The trailing :: means this command listing is done as a "definition list", and I just didn't see any way to multiline an entry for such a thing. :( > > > +--single-branch:: > > + This option is only valid for the update command. > > + Clone only one branch during update, HEAD or --branch. > > For some reason my brain insists on parsing this second sentence as a > 3-item list without an Oxford comma. I wonder if it would be more clear > as: > > Clone only one branch during update: HEAD or one specified by > --branch. > > or similar. Took it verbatim, I agree. > > > #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \ > > SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, 0, \ > > - NULL, NULL, NULL, \ > > + NULL, NULL, NULL, 0,\ > > NULL, 0, 0, 0, NULL, 0, 0, 1} > > Wow. Also not new in your patch, but I think we're moving towards the > use of C99 named initializers, which would make this a bit less > daunting (all of the NULL/0 items could be omitted!). Hrm. Yeah, I'll add a quick patch before this one to do so. It's pretty gross :) > > +test_expect_success 'clone with --single-branch' ' > > + test_when_finished "rm -rf super_clone" && > > + git clone --recurse-submodules --single-branch "file://$pwd/." super_clone && > > + ( > > + cd super_clone/sub && > > + git branch -a >branches && > > + test_must_fail grep other branches > > + ) > > +' > > Don't use test_must_fail with non-Git commands; you can just say "! grep". > > We usually try to avoid scripting around git-branch output (although I ooof > find it pretty unlikely that future changes would break this particular > case). git-for-each-ref would be a better pick, but I wonder if: > > git rev-parse --verify origin/master && > test_must_fail git rev-parse --verify origin/other > > might express the expectation more clearly. Sure, done. Sorry again for the long wait, and thanks for the effort on the review. New revision coming momentarily. - Emily
On Thu, Feb 20, 2020 at 06:53:06PM -0800, Emily Shaffer wrote: > On Thu, Jan 30, 2020 at 05:23:03AM -0500, Jeff King wrote: > > On Tue, Jan 28, 2020 at 02:17:36PM -0800, Emily Shaffer wrote: > > Ouch. I forgot about this review for some time. Sorry :) That's OK. I forgot about, too. :) > > This makes sense to me, bearing in mind that I'm not at all a good > > person to point out subtleties with submodules that could bite us. > > I wonder if it makes sense to ship this to our submodule-using masses > here for a little while and see how it works / whether anybody yells? > This might be too small for that kind of thing. Certainly putting it in front of more users gives us more confidence. But I have a feeling the gotcha here would be that it does what _some_ set of users wants, but not so much for others. And your audience may not be a representative sample. So certainly it doesn't hurt, but I'll leave it you whether you think it's worthwhile. I'm having trouble even coming up with a devil's advocate example for the "others" in this case, though. Getting general exposure in "next" and then "master" leading up to the release might also help, but I get the impression that those audiences aren't necessarily comprehensive, either. > > > As discussed in the thread for v1 of this patch, in cases when the > > > submodule commit referenced by the specified superproject branch isn't > > > the same as the HEAD of the submodule repo known by the server side, > > > this still works in kind of a non-obvious way. In these cases, first we > > > fetch the single branch that is the ancestor of the server's HEAD; then > > > we fetch the commit needed by the superproject (and its ancestry). So > > > while this change prevents us from fetching *all* branches on clone, it > > > doesn't necessarily limit us to a single branch as described. > > > > Is it worth adding a test that we do the right thing here? Not so much > > to prove that it works now, but to protect us against future changes. It > > seems like the sort of thing that could get subtly broken. > > What did you have in mind beyond the test I added already? I think your test just covers the "happy path" that what we want to fetch is on the tip of that single branch. But do we test a case where the commit we need for a submodule isn't on the HEAD branch of that submodule, and we "clone --single-branch --recurse-submodules" the superproject and it works? I.e., from what you describe above it should work, but it might be nice to make sure it continues to do so. > > > -update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--] [<path>...]:: > > > +update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--single-branch] [--] [<path>...]:: > > > > This line is horrendously long. Not new in your patch, but I wonder if > > the time might have come to break it up. > > I dug around in Asciidoc doc and couldn't find a good way to do so. The > trailing :: means this command listing is done as a "definition list", > and I just didn't see any way to multiline an entry for such a thing. :( Hmm, yeah. Personally I think it would be better as: update [options] [--] [<path>...]:: ... The supported options are: --init:: ...here's what init does... but I guess some of them are repeated between commands, which would need addressing. Anyway, it's clear this is way beyond the scope of your patch, so let's forget about it for now. -Peff
diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt index 5232407f68..10c42b752a 100644 --- a/Documentation/git-submodule.txt +++ b/Documentation/git-submodule.txt @@ -133,7 +133,7 @@ If you really want to remove a submodule from the repository and commit that use linkgit:git-rm[1] instead. See linkgit:gitsubmodules[7] for removal options. -update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--] [<path>...]:: +update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--single-branch] [--] [<path>...]:: + -- Update the registered submodules to match what the superproject @@ -430,6 +430,10 @@ options carefully. Clone new submodules in parallel with as many jobs. Defaults to the `submodule.fetchJobs` option. +--single-branch:: + This option is only valid for the update command. + Clone only one branch during update, HEAD or --branch. + <path>...:: Paths to submodule(s). When specified this will restrict the command to only operate on the submodules found at the specified paths. diff --git a/builtin/clone.c b/builtin/clone.c index 0fc89ae2b9..ec749a86c1 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -832,6 +832,9 @@ static int checkout(int submodule_progress) argv_array_push(&args, "--no-fetch"); } + if (option_single_branch) + argv_array_push(&args, "--single-branch"); + err = run_command_v_opt(args.argv, RUN_GIT_CMD); argv_array_clear(&args); } diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index c72931ecd7..50c9e23db1 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -1225,7 +1225,7 @@ static int module_deinit(int argc, const char **argv, const char *prefix) static int clone_submodule(const char *path, const char *gitdir, const char *url, const char *depth, struct string_list *reference, int dissociate, - int quiet, int progress) + int quiet, int progress, int single_branch) { struct child_process cp = CHILD_PROCESS_INIT; @@ -1247,6 +1247,8 @@ static int clone_submodule(const char *path, const char *gitdir, const char *url argv_array_push(&cp.args, "--dissociate"); if (gitdir && *gitdir) argv_array_pushl(&cp.args, "--separate-git-dir", gitdir, NULL); + if (single_branch) + argv_array_push(&cp.args, "--single-branch"); argv_array_push(&cp.args, "--"); argv_array_push(&cp.args, url); @@ -1373,6 +1375,7 @@ static int module_clone(int argc, const char **argv, const char *prefix) struct string_list reference = STRING_LIST_INIT_NODUP; int dissociate = 0, require_init = 0; char *sm_alternate = NULL, *error_strategy = NULL; + int single_branch = 0; struct option module_clone_options[] = { OPT_STRING(0, "prefix", &prefix, @@ -1400,12 +1403,15 @@ static int module_clone(int argc, const char **argv, const char *prefix) N_("force cloning progress")), OPT_BOOL(0, "require-init", &require_init, N_("disallow cloning into non-empty directory")), + OPT_BOOL(0, "single-branch", &single_branch, + N_("clone only one branch, HEAD or --branch")), OPT_END() }; const char *const git_submodule_helper_usage[] = { N_("git submodule--helper clone [--prefix=<path>] [--quiet] " "[--reference <repository>] [--name <name>] [--depth <depth>] " + "[--single-branch] " "--url <url> --path <path>"), NULL }; @@ -1438,7 +1444,7 @@ static int module_clone(int argc, const char **argv, const char *prefix) prepare_possible_alternates(name, &reference); if (clone_submodule(path, sm_gitdir, url, depth, &reference, dissociate, - quiet, progress)) + quiet, progress, single_branch)) die(_("clone of '%s' into submodule path '%s' failed"), url, path); } else { @@ -1562,6 +1568,7 @@ struct submodule_update_clone { const char *depth; const char *recursive_prefix; const char *prefix; + int single_branch; /* to be consumed by git-submodule.sh */ struct update_clone_data *update_clone; @@ -1578,7 +1585,7 @@ struct submodule_update_clone { }; #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \ SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, 0, \ - NULL, NULL, NULL, \ + NULL, NULL, NULL, 0,\ NULL, 0, 0, 0, NULL, 0, 0, 1} @@ -1718,6 +1725,8 @@ static int prepare_to_clone_next_submodule(const struct cache_entry *ce, argv_array_push(&child->args, "--dissociate"); if (suc->depth) argv_array_push(&child->args, suc->depth); + if (suc->single_branch) + argv_array_push(&child->args, "--single-branch"); cleanup: strbuf_reset(&displaypath_sb); @@ -1897,6 +1906,8 @@ static int update_clone(int argc, const char **argv, const char *prefix) N_("force cloning progress")), OPT_BOOL(0, "require-init", &suc.require_init, N_("disallow cloning into non-empty directory")), + OPT_BOOL(0, "single-branch", &suc.single_branch, + N_("clone only one branch, HEAD or --branch")), OPT_END() }; diff --git a/git-submodule.sh b/git-submodule.sh index aaa1809d24..95f73ab775 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -10,7 +10,7 @@ USAGE="[--quiet] [--cached] or: $dashless [--quiet] status [--cached] [--recursive] [--] [<path>...] or: $dashless [--quiet] init [--] [<path>...] or: $dashless [--quiet] deinit [-f|--force] (--all| [--] <path>...) - or: $dashless [--quiet] update [--init] [--remote] [-N|--no-fetch] [-f|--force] [--checkout|--merge|--rebase] [--[no-]recommend-shallow] [--reference <repository>] [--recursive] [--] [<path>...] + or: $dashless [--quiet] update [--init] [--remote] [-N|--no-fetch] [-f|--force] [--checkout|--merge|--rebase] [--[no-]recommend-shallow] [--reference <repository>] [--recursive] [--single-branch] [--] [<path>...] or: $dashless [--quiet] set-branch (--default|--branch <branch>) [--] <path> or: $dashless [--quiet] set-url [--] <path> <newurl> or: $dashless [--quiet] summary [--cached|--files] [--summary-limit <n>] [commit] [--] [<path>...] @@ -47,6 +47,7 @@ custom_name= depth= progress= dissociate= +single_branch= die_if_unmatched () { @@ -526,6 +527,9 @@ cmd_update() --jobs=*) jobs=$1 ;; + --single-branch) + single_branch=1 + ;; --) shift break @@ -555,6 +559,7 @@ cmd_update() ${dissociate:+"--dissociate"} \ ${depth:+--depth "$depth"} \ ${require_init:+--require-init} \ + ${single_branch:+--single-branch} \ $recommend_shallow \ $jobs \ -- \ diff --git a/t/t5617-clone-submodules-remote.sh b/t/t5617-clone-submodules-remote.sh index 37fcce9c40..67f2b48659 100755 --- a/t/t5617-clone-submodules-remote.sh +++ b/t/t5617-clone-submodules-remote.sh @@ -14,7 +14,8 @@ test_expect_success 'setup' ' cd sub && git init && test_commit subcommit1 && - git tag sub_when_added_to_super + git tag sub_when_added_to_super && + git branch other ) && git submodule add "file://$pwd/sub" sub && git commit -m "add submodule" && @@ -51,4 +52,14 @@ test_expect_success 'check the default is --no-remote-submodules' ' ) ' +test_expect_success 'clone with --single-branch' ' + test_when_finished "rm -rf super_clone" && + git clone --recurse-submodules --single-branch "file://$pwd/." super_clone && + ( + cd super_clone/sub && + git branch -a >branches && + test_must_fail grep other branches + ) +' + test_done
Previously, performing "git clone --recurse-submodules --single-branch" resulted in submodules cloning all branches even though the superproject cloned only one branch. Pipe --single-branch through the submodule helper framework to make it to 'clone' later on. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> --- Removed references to --branch since v1. As discussed in the thread for v1 of this patch, in cases when the submodule commit referenced by the specified superproject branch isn't the same as the HEAD of the submodule repo known by the server side, this still works in kind of a non-obvious way. In these cases, first we fetch the single branch that is the ancestor of the server's HEAD; then we fetch the commit needed by the superproject (and its ancestry). So while this change prevents us from fetching *all* branches on clone, it doesn't necessarily limit us to a single branch as described. To limit submodules to a single branch on clone, we need to know which branch we want for each submodule; unfortunately it seems that just propagating --branch doesn't necessarily help as there's not much guarantee that the superproject branch name matches the submodule branch name, or that all submodule branch names are the same. So, we need to give a little more thought there. - Emily Documentation/git-submodule.txt | 6 +++++- builtin/clone.c | 3 +++ builtin/submodule--helper.c | 17 ++++++++++++++--- git-submodule.sh | 7 ++++++- t/t5617-clone-submodules-remote.sh | 13 ++++++++++++- 5 files changed, 40 insertions(+), 6 deletions(-)