Message ID | fcece09546cbdb5f1bcd0d0c5aaa3a54e9d3b852.1640727143.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Sparse checkout: fix bug with worktree of bare repo | expand |
On Tue, Dec 28, 2021 at 4:32 PM Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote: > When adding a new worktree, it is reasonable to expect that we want to > use the current set of sparse-checkout settings for that new worktree. > This is particularly important for repositories where the worktree would > become too large to be useful. This is even more important when using > partial clone as well, since we want to avoid downloading the missing > blobs for files that should not be written to the new worktree. > > The only way to create such a worktree without this intermediate step of > expanding the full worktree is to copy the sparse-checkout patterns and > config settings during 'git worktree add'. Each worktree has its own > sparse-checkout patterns, and the default behavior when the > sparse-checkout file is missing is to include all paths at HEAD. Thus, > we need to have patterns from somewhere, they might as well be the > current worktree's patterns. These are then modified independently in > the future. > > In addition to the sparse-checkout file, copy the worktree config file > if worktree config is enabled and the file exists. This will copy over > any important settings to ensure the new worktree behaves the same as > the current one. This is not a proper review. I just happened to very quickly scan my eyes over this patch without even having looked at any of the others, nor have I read the v3 cover letter closely yet. Nevertheless, while skimming this patch, an issue jumped out at me... > Signed-off-by: Derrick Stolee <dstolee@microsoft.com> > --- > diff --git a/builtin/worktree.c b/builtin/worktree.c > @@ -336,6 +336,47 @@ static int add_worktree(const char *path, const char *refname, > + /* > + * If we are using worktree config, then copy all currenct config > + * values from the current worktree into the new one, that way the > + * new worktree behaves the same as this one. > + */ s/currenct/current/ > + if (repository_format_worktree_config) { > + char *from_file = git_pathdup("config.worktree"); > + char *to_file = xstrfmt("%s/worktrees/%s/config.worktree", > + realpath.buf, name); > + > + if (file_exists(from_file)) { > + if (safe_create_leading_directories(to_file) || > + copy_file(to_file, from_file, 0666)) > + error(_("failed to copy worktree config from '%s' to '%s'"), > + from_file, to_file); > + } I presume that you lifted this idea from [1] in which I offhandedly mentioned that one possible way to implement Elijah's desire to copy sparse-checkout configuration when a new worktree is created would be to simply copy the existing worktree-specific configuration to the new worktree. Unfortunately, a direct implementation of that idea suffers the same problem which started this entire thread. Namely: % git clone --bare <url>/bare.git % cd bare.git/ % git worktree init-worktree-config % git worktree add -d ../foo Preparing worktree (detached HEAD a0df8ce) HEAD is now at a0df8ce gobbledygook % cd ../foo/ % git sparse-checkout init fatal: this operation must be run in a work tree % git config --get --show-origin --show-scope core.bare worktree file:.../bare.git/worktrees/foo/config.worktree true The problem is that for a bare repository, after `git worktree init-worktree-config`, "bare.git/config.worktree" contains the repo-specific `core.bare=true` setting, so copying "bare.git/config.worktree" to "bare.git/worktrees/<id>/config.worktree" verbatim has undesired consequences. The obvious way to work around this problem is to (again) special-case `core.bare` and `core.worktree` to remove them when copying the worktree-specific configuration. Whether or not that is the best solution or even desirable is a different question. (I haven't thought it through enough to have an opinion.) [1]: https://lore.kernel.org/git/CAPig+cRYKKGA1af4hV0fz_nZWNG=zMgAziuAimDxWTz6L8u3Tg@mail.gmail.com/
On 12/29/2021 1:37 AM, Eric Sunshine wrote: > On Tue, Dec 28, 2021 at 4:32 PM Derrick Stolee via GitGitGadget > <gitgitgadget@gmail.com> wrote: >> When adding a new worktree, it is reasonable to expect that we want to >> use the current set of sparse-checkout settings for that new worktree. >> This is particularly important for repositories where the worktree would >> become too large to be useful. This is even more important when using >> partial clone as well, since we want to avoid downloading the missing >> blobs for files that should not be written to the new worktree. >> >> The only way to create such a worktree without this intermediate step of >> expanding the full worktree is to copy the sparse-checkout patterns and >> config settings during 'git worktree add'. Each worktree has its own >> sparse-checkout patterns, and the default behavior when the >> sparse-checkout file is missing is to include all paths at HEAD. Thus, >> we need to have patterns from somewhere, they might as well be the >> current worktree's patterns. These are then modified independently in >> the future. >> >> In addition to the sparse-checkout file, copy the worktree config file >> if worktree config is enabled and the file exists. This will copy over >> any important settings to ensure the new worktree behaves the same as >> the current one. > > This is not a proper review. I just happened to very quickly scan my > eyes over this patch without even having looked at any of the others, > nor have I read the v3 cover letter closely yet. Nevertheless, while > skimming this patch, an issue jumped out at me... > >> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> >> --- >> diff --git a/builtin/worktree.c b/builtin/worktree.c >> @@ -336,6 +336,47 @@ static int add_worktree(const char *path, const char *refname, >> + /* >> + * If we are using worktree config, then copy all currenct config >> + * values from the current worktree into the new one, that way the >> + * new worktree behaves the same as this one. >> + */ > > s/currenct/current/ > >> + if (repository_format_worktree_config) { >> + char *from_file = git_pathdup("config.worktree"); >> + char *to_file = xstrfmt("%s/worktrees/%s/config.worktree", >> + realpath.buf, name); >> + >> + if (file_exists(from_file)) { >> + if (safe_create_leading_directories(to_file) || >> + copy_file(to_file, from_file, 0666)) >> + error(_("failed to copy worktree config from '%s' to '%s'"), >> + from_file, to_file); >> + } > > I presume that you lifted this idea from [1] in which I offhandedly > mentioned that one possible way to implement Elijah's desire to copy > sparse-checkout configuration when a new worktree is created would be > to simply copy the existing worktree-specific configuration to the new > worktree. Unfortunately, a direct implementation of that idea suffers > the same problem which started this entire thread. Namely: > > % git clone --bare <url>/bare.git > % cd bare.git/ > % git worktree init-worktree-config > % git worktree add -d ../foo > Preparing worktree (detached HEAD a0df8ce) > HEAD is now at a0df8ce gobbledygook > % cd ../foo/ > % git sparse-checkout init > fatal: this operation must be run in a work tree > % git config --get --show-origin --show-scope core.bare > worktree file:.../bare.git/worktrees/foo/config.worktree true > > The problem is that for a bare repository, after `git worktree > init-worktree-config`, "bare.git/config.worktree" contains the > repo-specific `core.bare=true` setting, so copying > "bare.git/config.worktree" to > "bare.git/worktrees/<id>/config.worktree" verbatim has undesired > consequences. It is certainly unfortunate that this can happen when core.bare or core.worktree are set in the config.worktree of the bare repo. The thing we are doing here is trying to create a worktree that exactly matches the current worktree (even if it is bare or redirected to a different working directory), but since we don't actually support a "bare" worktree this does not work. > The obvious way to work around this problem is to (again) special-case > `core.bare` and `core.worktree` to remove them when copying the > worktree-specific configuration. Whether or not that is the best > solution or even desirable is a different question. (I haven't thought > it through enough to have an opinion.) It makes sense to special case these two settings since we want to allow creating a working worktree from a bare repo, even if it has worktree config stating that it is bare. As far as the implementation goes, we could do the copy and then unset those two values in the new file. That's an easy enough change. I'll wait for more feedback on the overall ideas (and names of things like the init-worktree-config subcommand). Thanks, -Stolee
On Wed, Dec 29, 2021 at 9:31 AM Derrick Stolee <stolee@gmail.com> wrote: > > On 12/29/2021 1:37 AM, Eric Sunshine wrote: > > On Tue, Dec 28, 2021 at 4:32 PM Derrick Stolee via GitGitGadget > > <gitgitgadget@gmail.com> wrote: > > The obvious way to work around this problem is to (again) special-case > > `core.bare` and `core.worktree` to remove them when copying the > > worktree-specific configuration. Whether or not that is the best > > solution or even desirable is a different question. (I haven't thought > > it through enough to have an opinion.) > > It makes sense to special case these two settings since we want to > allow creating a working worktree from a bare repo, even if it has > worktree config stating that it is bare. Agreed. > As far as the implementation goes, we could do the copy and then > unset those two values in the new file. That's an easy enough change. > I'll wait for more feedback on the overall ideas (and names of things > like the init-worktree-config subcommand). What value does the init-worktree-config subcommand provide; why shouldn't we just get rid of it? I know Eric was strongly suggesting it, but he was thinking in terms of always doing that full switchover step, or never doing it. Both extremes had the potential to cause user-visible bugs, and thus he suggested providing a command to allow users to pick their poison. I provided a suggestion avoiding both extremes that doesn't have that pick-your-poison approach, so I don't see why forcing users into this extra step makes any sense. But perhaps I missed something. Is there a usecase for users to explicitly use this?
On 12/29/2021 2:51 PM, Elijah Newren wrote: > On Wed, Dec 29, 2021 at 9:31 AM Derrick Stolee <stolee@gmail.com> wrote: >> >> On 12/29/2021 1:37 AM, Eric Sunshine wrote: >>> On Tue, Dec 28, 2021 at 4:32 PM Derrick Stolee via GitGitGadget >>> <gitgitgadget@gmail.com> wrote: > >>> The obvious way to work around this problem is to (again) special-case >>> `core.bare` and `core.worktree` to remove them when copying the >>> worktree-specific configuration. Whether or not that is the best >>> solution or even desirable is a different question. (I haven't thought >>> it through enough to have an opinion.) >> >> It makes sense to special case these two settings since we want to >> allow creating a working worktree from a bare repo, even if it has >> worktree config stating that it is bare. > > Agreed. > >> As far as the implementation goes, we could do the copy and then >> unset those two values in the new file. That's an easy enough change. > >> I'll wait for more feedback on the overall ideas (and names of things >> like the init-worktree-config subcommand). > > What value does the init-worktree-config subcommand provide; why > shouldn't we just get rid of it? > > I know Eric was strongly suggesting it, but he was thinking in terms > of always doing that full switchover step, or never doing it. Both > extremes had the potential to cause user-visible bugs, and thus he > suggested providing a command to allow users to pick their poison. I > provided a suggestion avoiding both extremes that doesn't have that > pick-your-poison approach, so I don't see why forcing users into this > extra step makes any sense. > > But perhaps I missed something. Is there a usecase for users to > explicitly use this? I think the motivation is that worktree config is something that is harder to set up than to just run a 'git config' command, and we should guide users into a best practice for using it. The documentation becomes "run this command to enable it". It also provides a place to update the steps if we were to change something in the future around worktree config, but I'm guessing the ship has sailed and backwards compatibility will keep us from introducing a new setting that would need to be added here. Thanks, -Stolee
On Wed, Dec 29, 2021 at 1:39 PM Derrick Stolee <stolee@gmail.com> wrote: > > On 12/29/2021 2:51 PM, Elijah Newren wrote: > > On Wed, Dec 29, 2021 at 9:31 AM Derrick Stolee <stolee@gmail.com> wrote: > >> > >> On 12/29/2021 1:37 AM, Eric Sunshine wrote: > >>> On Tue, Dec 28, 2021 at 4:32 PM Derrick Stolee via GitGitGadget > >>> <gitgitgadget@gmail.com> wrote: > > > >>> The obvious way to work around this problem is to (again) special-case > >>> `core.bare` and `core.worktree` to remove them when copying the > >>> worktree-specific configuration. Whether or not that is the best > >>> solution or even desirable is a different question. (I haven't thought > >>> it through enough to have an opinion.) > >> > >> It makes sense to special case these two settings since we want to > >> allow creating a working worktree from a bare repo, even if it has > >> worktree config stating that it is bare. > > > > Agreed. > > > >> As far as the implementation goes, we could do the copy and then > >> unset those two values in the new file. That's an easy enough change. > > > >> I'll wait for more feedback on the overall ideas (and names of things > >> like the init-worktree-config subcommand). > > > > What value does the init-worktree-config subcommand provide; why > > shouldn't we just get rid of it? > > > > I know Eric was strongly suggesting it, but he was thinking in terms > > of always doing that full switchover step, or never doing it. Both > > extremes had the potential to cause user-visible bugs, and thus he > > suggested providing a command to allow users to pick their poison. I > > provided a suggestion avoiding both extremes that doesn't have that > > pick-your-poison approach, so I don't see why forcing users into this > > extra step makes any sense. > > > > But perhaps I missed something. Is there a usecase for users to > > explicitly use this? > > I think the motivation is that worktree config is something that is > harder to set up than to just run a 'git config' command, and we > should guide users into a best practice for using it. The > documentation becomes "run this command to enable it". Okay, but that's an answer to a different question -- namely, "if users want/need to explicitly set it up, why should we have a command?" Your answer here is a very good answer to that question, but you've assumed the "if". My question was on the "if": (Why) Do users need or want to explicitly set it up? Secondarily, if users want to set it up explicitly, is the work here really sufficient to help guide them? In particular, I discovered and started using extensions.worktreeConfig without ever looking at the relevant portions of git-worktree.txt (the references in git-config.txt never mentioned them). I also pushed this usage to others, including even to you with `git-sparse-checkout`, and no reviewer on this list ever caught it or informed me of the `proper` additional guidelines found in git-worktree.txt until this thread started. So, relying on folks to read git-worktree.txt for this config item feels a bit weak to me. Granted, your new command will be much more likely to be read since it appears near the top of git-worktree.txt, but I just don't think that's enough. The references to extensions.worktreeConfig in git-config.txt should reference any special command or extended steps if we expect users to manually configure it (whether via explicit new subcommand or via also playing with other config settings). Anyway, if we think users want to set it up explicitly, and we address the discoverability problem above, then I'd vote for "independent-config" or "private-config" (or _maybe_ "migrate-config"). Because: * no sense repeating the word `worktree` in `git worktree init-worktree-config`. It's redundant. * The words "independent" or "private" suggest what it does and why users might want to use the new subcommand. * It's not an "init": * The documentation makes no attempt to impose a temporal order of using this command before `git worktree add`. (Would we even want to?) * As per my recommendation elsewhere, this step just isn't needed for the vast majority of users (i.e. those with non-bare clones who leave core.worktree alone). * ...and it's also not needed for other (core.bare=true or core.worktree set) users since `git worktree add` will automatically run this config migration for them And, actually, with the name "independent-config" or "private-config", I might be answering my own question. It's a name that speaks to why users might want it, so my objection to the new command is rapidly diminishing. > It also provides a place to update the steps if we were to change > something in the future around worktree config, but I'm guessing > the ship has sailed and backwards compatibility will keep us from > introducing a new setting that would need to be added here. Yeah, the best time to force a config change is probably when we introduce a new command with it. If we had forced core.repositoryFormatVersion=1 in order to read extensions.* at the time that extensions.* was introduced, that would have been fine (When we tried it later, we had to revert it for backward compatibility reasons.). If we had forced extensions.worktreeConfig=true when introducing git-worktree, that would have been fine. We did force extensions.worktreeConfig=true when we introduced git-sparse-checkout, which was good, though we localized it to sparse-checkout and avoided adding it to worktree usage in general at that point. The other good time to force a config change is when we have cases where behavior is already broken anyway. For example, the (core.bare=true or core.worktree is set) case that started this thread. But in such cases, we have to localize the forced-config change to the cases that are "broken anyway" unless we can verify it won't break other cases. I'm not sure that this new subcommand will ever fall into either of these categories, so I also agree that we'll be unlikely to ever change it.
On Wed, Dec 29, 2021 at 2:52 PM Elijah Newren <newren@gmail.com> wrote: > On Wed, Dec 29, 2021 at 9:31 AM Derrick Stolee <stolee@gmail.com> wrote: > > I'll wait for more feedback on the overall ideas (and names of things > > like the init-worktree-config subcommand). > > What value does the init-worktree-config subcommand provide; why > shouldn't we just get rid of it? > > I know Eric was strongly suggesting it, but he was thinking in terms > of always doing that full switchover step, or never doing it. Both > extremes had the potential to cause user-visible bugs, and thus he > suggested providing a command to allow users to pick their poison. I > provided a suggestion avoiding both extremes that doesn't have that > pick-your-poison approach, so I don't see why forcing users into this > extra step makes any sense. Right. The minimally invasive, minimally dangerous approach you outlined at the very bottom of [1] obviates the need for `init-worktree-config`. We still want the underlying function for `git worktree add` to call, but a user-facing command providing the same functionality becomes much less meaningful since enabling per-worktree configuration involves no more than simply setting `extension.worktreeConfig=true` in all cases. So, I can't think of any reason to add `init-worktree-config` presently (if ever). [1]: https://lore.kernel.org/git/CABPp-BHuO3B366uJuODMQo-y449p8cAMVn0g2MTcO5di3Xa7Zg@mail.gmail.com/
On Wed, Dec 29, 2021 at 5:45 PM Elijah Newren <newren@gmail.com> wrote: > On Wed, Dec 29, 2021 at 1:39 PM Derrick Stolee <stolee@gmail.com> wrote: > > I think the motivation is that worktree config is something that is > > harder to set up than to just run a 'git config' command, and we > > should guide users into a best practice for using it. The > > documentation becomes "run this command to enable it". > > Okay, but that's an answer to a different question -- namely, "if > users want/need to explicitly set it up, why should we have a > command?" Your answer here is a very good answer to that question, > but you've assumed the "if". My question was on the "if": (Why) Do > users need or want to explicitly set it up? > > Secondarily, if users want to set it up explicitly, is the work here > really sufficient to help guide them? In particular, I discovered and > started using extensions.worktreeConfig without ever looking at the > relevant portions of git-worktree.txt (the references in > git-config.txt never mentioned them). I also pushed this usage to > others, including even to you with `git-sparse-checkout`, and no > reviewer on this list ever caught it or informed me of the `proper` > additional guidelines found in git-worktree.txt until this thread > started. So, relying on folks to read git-worktree.txt for this > config item feels a bit weak to me. Granted, your new command will be > much more likely to be read since it appears near the top of > git-worktree.txt, but I just don't think that's enough. The > references to extensions.worktreeConfig in git-config.txt should > reference any special command or extended steps if we expect users to > manually configure it (whether via explicit new subcommand or via also > playing with other config settings). Agreed about it being a good idea to update git-config.txt to mention the extra bookkeeping related to `extensions.worktreeConfig=1` (though it doesn't necessarily need to be done by this series). > Anyway, if we think users want to set it up explicitly, and we address > the discoverability problem above, then I'd vote for > "independent-config" or "private-config" (or _maybe_ > "migrate-config"). Because: > * no sense repeating the word `worktree` in `git worktree > init-worktree-config`. It's redundant. > * The words "independent" or "private" suggest what it does and why > users might want to use the new subcommand. > * It's not an "init": I had some similar thoughts/objections to the name `init-worktree-config` (not that I was able to come up with anything better). Thanks for enumerating them here. Anyhow, as you say below, the new subcommand isn't really needed. > * The documentation makes no attempt to impose a temporal order of > using this command before `git worktree add`. (Would we even want > to?) Aside from possible(?) sparse-checkout issues, I don't think there is a reason to impose temporal order in general. > * As per my recommendation elsewhere, this step just isn't needed > for the vast majority of users (i.e. those with non-bare clones who > leave core.worktree alone). > * ...and it's also not needed for other (core.bare=true or > core.worktree set) users since `git worktree add` will automatically > run this config migration for them Your enumeration at the very end of [1] pretty well convinced me that we don't need this command; certainly not at present, and perhaps never. > And, actually, with the name "independent-config" or "private-config", > I might be answering my own question. It's a name that speaks to why > users might want it, so my objection to the new command is rapidly > diminishing. Perhaps some day it might make sense to add such a subcommand (more suitably named), but with your proposal in [1] it's hard to justify adding it now, and it certainly doesn't need to be part of this patch series. [1]: https://lore.kernel.org/git/CABPp-BHuO3B366uJuODMQo-y449p8cAMVn0g2MTcO5di3Xa7Zg@mail.gmail.com/
diff --git a/builtin/worktree.c b/builtin/worktree.c index 937ee6fc38b..bca49a55f13 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -336,6 +336,47 @@ static int add_worktree(const char *path, const char *refname, strbuf_addf(&sb, "%s/commondir", sb_repo.buf); write_file(sb.buf, "../.."); + /* + * If the current worktree has sparse-checkout enabled, then copy + * the sparse-checkout patterns from the current worktree. + */ + if (core_apply_sparse_checkout) { + char *from_file = git_pathdup("info/sparse-checkout"); + char *to_file = xstrfmt("%s/worktrees/%s/info/sparse-checkout", + realpath.buf, name); + + if (file_exists(from_file)) { + if (safe_create_leading_directories(to_file) || + copy_file(to_file, from_file, 0666)) + error(_("failed to copy '%s' to '%s'; sparse-checkout may not work correctly"), + from_file, to_file); + } + + free(from_file); + free(to_file); + } + + /* + * If we are using worktree config, then copy all currenct config + * values from the current worktree into the new one, that way the + * new worktree behaves the same as this one. + */ + if (repository_format_worktree_config) { + char *from_file = git_pathdup("config.worktree"); + char *to_file = xstrfmt("%s/worktrees/%s/config.worktree", + realpath.buf, name); + + if (file_exists(from_file)) { + if (safe_create_leading_directories(to_file) || + copy_file(to_file, from_file, 0666)) + error(_("failed to copy worktree config from '%s' to '%s'"), + from_file, to_file); + } + + free(from_file); + free(to_file); + } + strvec_pushf(&child_env, "%s=%s", GIT_DIR_ENVIRONMENT, sb_git.buf); strvec_pushf(&child_env, "%s=%s", GIT_WORK_TREE_ENVIRONMENT, path); cp.git_cmd = 1; diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh index 15403158c49..ce84819e1f5 100755 --- a/t/t1091-sparse-checkout-builtin.sh +++ b/t/t1091-sparse-checkout-builtin.sh @@ -180,6 +180,53 @@ test_expect_success 'set enables config' ' ' test_expect_success 'set enables worktree config, if enabled' ' + git init worktree-patterns && + ( + cd worktree-patterns && + test_commit test file && + mkdir dir dir2 && + test_commit dir dir/file && + test_commit dir2 dir2/file && + + # By initializing the worktree config here... + git worktree init-worktree-config && + + # This set command places config values in worktree- + # specific config... + git sparse-checkout set --cone dir && + + # Which must be copied, along with the sparse-checkout + # patterns, here. + git worktree add --detach ../worktree-patterns2 + ) && + test_cmp_config -C worktree-patterns true core.sparseCheckout && + test_cmp_config -C worktree-patterns2 true core.sparseCheckout && + test_cmp_config -C worktree-patterns true core.sparseCheckoutCone && + test_cmp_config -C worktree-patterns2 true core.sparseCheckoutCone && + test_cmp worktree-patterns/.git/info/sparse-checkout \ + worktree-patterns/.git/worktrees/worktree-patterns2/info/sparse-checkout && + + ls worktree-patterns >expect && + ls worktree-patterns2 >actual && + test_cmp expect actual && + + # Double check that the copy works from a non-main worktree. + ( + cd worktree-patterns2 && + git sparse-checkout set dir2 && + git worktree add --detach ../worktree-patterns3 + ) && + test_cmp_config -C worktree-patterns3 true core.sparseCheckout && + test_cmp_config -C worktree-patterns3 true core.sparseCheckoutCone && + test_cmp worktree-patterns/.git/worktrees/worktree-patterns2/info/sparse-checkout \ + worktree-patterns/.git/worktrees/worktree-patterns3/info/sparse-checkout && + + ls worktree-patterns2 >expect && + ls worktree-patterns3 >actual && + test_cmp expect actual +' + +test_expect_success 'worktree add copies sparse-checkout patterns' ' git init worktree-config && ( cd worktree-config && @@ -547,11 +594,11 @@ test_expect_success 'interaction with submodules' ' test_expect_success 'different sparse-checkouts with worktrees' ' git -C repo worktree add --detach ../worktree && - check_files worktree "a deep folder1 folder2" && + check_files worktree "a folder1" && git -C worktree sparse-checkout init --cone && - git -C repo sparse-checkout set folder1 && + git -C repo sparse-checkout set folder1 folder2 && git -C worktree sparse-checkout set deep/deeper1 && - check_files repo a folder1 && + check_files repo a folder1 folder2 && check_files worktree a deep '