diff mbox series

[v6,2/6] worktree: create init_worktree_config()

Message ID 5d0cc242d92c68bf239f9e17eab9c80ec6b2d469.1644269583.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 615a84ad788b26260c5b053ace2d5720ea5f05c5
Headers show
Series Sparse checkout: fix bug with worktree of bare repo | expand

Commit Message

Derrick Stolee Feb. 7, 2022, 9:32 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

Upgrading a repository to use extensions.worktreeConfig is non-trivial.
There are several steps involved, including moving some config settings
from the common config file to the main worktree's config.worktree file.
The previous change updated the documentation with all of these details.

Commands such as 'git sparse-checkout set' upgrade the repository to use
extensions.worktreeConfig without following these steps, causing some
user pain in some special cases.

Create a helper method, init_worktree_config(), that will be used in a
later change to fix this behavior within 'git sparse-checkout set'. The
method is carefully documented in worktree.h.

Note that we do _not_ upgrade the repository format version to 1 during
this process. The worktree config extension must be considered by Git
and third-party tools even if core.repositoryFormatVersion is 0 for
historical reasons documented in 11664196ac ("Revert
"check_repository_format_gently(): refuse extensions for old
repositories"", 2020-07-15). This is a special case for this extension,
and newer extensions (such as extensions.objectFormat) still need to
upgrade the repository format version.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 worktree.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 worktree.h | 21 ++++++++++++++++
 2 files changed, 94 insertions(+)

Comments

Junio C Hamano Feb. 8, 2022, 10:09 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +static int move_config_setting(const char *key, const char *value,
> +			       const char *from_file, const char *to_file)
> +{
> +	if (git_config_set_in_file_gently(to_file, key, value))
> +		return error(_("unable to set %s in '%s'"), key, to_file);
> +	if (git_config_set_in_file_gently(from_file, key, NULL))
> +		return error(_("unable to unset %s in '%s'"), key, from_file);
> +	return 0;
> +}

Interesting.

The verb "move" in its name made me expect a "get (and remove)
whatever value(s) defined out of the old file, and set them
identically in the new file" sequence, but that is not what is done
here.  "set to this new single value in the new file and unset from
the old one".

I can see the need to say "move it only when its value is X",
so having the caller to extract the value before deciding to call
the function (hence not "moving from old") does make sense, but then
the function is misnamed---it is not "moving", it is doing something
else.

> +int init_worktree_config(struct repository *r)
> +{
> +	int res = 0;
> +	int bare = 0;
> +	struct config_set cs = { { 0 } };
> +	const char *core_worktree;
> +	char *common_config_file;
> +	char *main_worktree_file;
> +
> +	/*
> +	 * If the extension is already enabled, then we can skip the
> +	 * upgrade process.
> +	 */
> +	if (repository_format_worktree_config)
> +		return 0;

OK.

> +	if ((res = git_config_set_gently("extensions.worktreeConfig", "true")))
> +		return error(_("failed to set extensions.worktreeConfig setting"));

OK.

> +	common_config_file = xstrfmt("%s/config", r->commondir);
> +	main_worktree_file = xstrfmt("%s/config.worktree", r->commondir);
> +
> +	git_configset_init(&cs);
> +	git_configset_add_file(&cs, common_config_file);
> +
> +	/*
> +	 * If core.bare is true in the common config file, then we need to
> +	 * move it to the main worktree's config file or it will break all
> +	 * worktrees. If it is false, then leave it in place because it
> +	 * _could_ be negating a global core.bare=true.
> +	 */

Is the assumption that the secondary worktrees are never bare, but
the primary one could be (iow, adding worktrees to a bare repository
would leave the original bare repository as the primary "worktree"
that does not have "working tree")?  I am trying to see what downsides
it tries to avoid by not moving the core.bare==false setting.  Shouldn't
core.bare be set to false when "worktree add" creates a new one anyway,
if the secondaries are never bare?

> +	if (!git_configset_get_bool(&cs, "core.bare", &bare) && bare) {
> +		if ((res = move_config_setting("core.bare", "true",
> +					       common_config_file,
> +					       main_worktree_file)))
> +			goto cleanup;
> +	}

> +	/*
> +	 * If core.worktree is set, then the main worktree is located
> +	 * somewhere different than the parent of the common Git dir.

OK.  We do not want to share the working tree for the primary worktree
among secondary worktrees.  For the primary, common and uncommon are
the same, so it may not matter, but mention of "common Git dir" here
may confuse readers?  Unless overridden by the config, the parent of
the git dir is the root of the working tree, no?

> +	 * Relocate that value to avoid breaking all worktrees with this
> +	 * upgrade to worktree config.
> +	 */

And if it is not set, then working tree of each worktree is the
parent of the per-worktree Git dir, so they will automatically
become separate, which makes sense.

> +	if (!git_configset_get_value(&cs, "core.worktree", &core_worktree)) {
> +		if ((res = move_config_setting("core.worktree", core_worktree,
> +					       common_config_file,
> +					       main_worktree_file)))
> +			goto cleanup;
> +	}
> +
> +	/*
> +	 * Ensure that we use worktree config for the remaining lifetime
> +	 * of the current process.
> +	 */
> +	repository_format_worktree_config = 1;
> +
> +cleanup:
> +	git_configset_clear(&cs);
> +	free(common_config_file);
> +	free(main_worktree_file);
> +	return res;
> +}
> diff --git a/worktree.h b/worktree.h
> index 9e06fcbdf3d..e9e839926b0 100644
> --- a/worktree.h
> +++ b/worktree.h
> @@ -183,4 +183,25 @@ void strbuf_worktree_ref(const struct worktree *wt,
>  			 struct strbuf *sb,
>  			 const char *refname);
>  
> +/**
> + * Enable worktree config for the first time. This will make the following
> + * adjustments:
> + *
> + * 1. Add extensions.worktreeConfig=true in the common config file.
> + *
> + * 2. If the common config file has a core.worktree value, then that value
> + *    is moved to the main worktree's config.worktree file.
> + *
> + * 3. If the common config file has a core.bare enabled, then that value
> + *    is moved to the main worktree's config.worktree file.
> + *
> + * If extensions.worktreeConfig is already true, then this method
> + * terminates early without any of the above steps. The existing config
> + * arrangement is assumed to be intentional.
> + *
> + * Returns 0 on success. Reports an error message and returns non-zero
> + * if any of these steps fail.
> + */
> +int init_worktree_config(struct repository *r);
> +
>  #endif
Derrick Stolee Feb. 9, 2022, 2:21 a.m. UTC | #2
On 2/8/2022 5:09 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> +static int move_config_setting(const char *key, const char *value,
>> +			       const char *from_file, const char *to_file)
>> +{
>> +	if (git_config_set_in_file_gently(to_file, key, value))
>> +		return error(_("unable to set %s in '%s'"), key, to_file);
>> +	if (git_config_set_in_file_gently(from_file, key, NULL))
>> +		return error(_("unable to unset %s in '%s'"), key, from_file);
>> +	return 0;
>> +}
> 
> Interesting.
> 
> The verb "move" in its name made me expect a "get (and remove)
> whatever value(s) defined out of the old file, and set them
> identically in the new file" sequence, but that is not what is done
> here.  "set to this new single value in the new file and unset from
> the old one".

I think this "copy into the worktree-specific config, then remove
from the common file" is an important sequence of events in case a
concurrent process comes in and reads the two config files in the
intermediate state and does not see the config value anywhere.

But perhaps that's not actually what you are concerned about,
because you're saying that the 'value' being provided does not
actually guarantee that we are moving the setting.

> I can see the need to say "move it only when its value is X",
> so having the caller to extract the value before deciding to call
> the function (hence not "moving from old") does make sense, but then
> the function is misnamed---it is not "moving", it is doing something
> else.

I think the end state is correct for all uses here, since we only
run this after checking to see if the config value exists in the
'from_file', so 'value' is correct (and this is a static method,
not a generally-useful method for config.h).

Perhaps a "write_in_new_and_remove_from_old()" would be a better,
if verbose, name. I struggle to find a less cumbersome name, and
"move" seems to match the intent pretty well in the context of its
use.

>> +	common_config_file = xstrfmt("%s/config", r->commondir);
>> +	main_worktree_file = xstrfmt("%s/config.worktree", r->commondir);
>> +
>> +	git_configset_init(&cs);
>> +	git_configset_add_file(&cs, common_config_file);
>> +
>> +	/*
>> +	 * If core.bare is true in the common config file, then we need to
>> +	 * move it to the main worktree's config file or it will break all
>> +	 * worktrees. If it is false, then leave it in place because it
>> +	 * _could_ be negating a global core.bare=true.
>> +	 */
> 
> Is the assumption that the secondary worktrees are never bare, but
> the primary one could be (iow, adding worktrees to a bare repository
> would leave the original bare repository as the primary "worktree"
> that does not have "working tree")?  I am trying to see what downsides
> it tries to avoid by not moving the core.bare==false setting.  Shouldn't
> core.bare be set to false when "worktree add" creates a new one anyway,
> if the secondaries are never bare?

Secondary worktrees cannot be bare. If Git interprets the worktree config
to have core.bare=true in a secondary worktree, it errors out.

You seem to be suggesting that we should explicitly write core.bare=false
into each of the worktree-specific config files. Is that right? This move
is effectively the same, since 'false' is the default.

>> +	if (!git_configset_get_bool(&cs, "core.bare", &bare) && bare) {
>> +		if ((res = move_config_setting("core.bare", "true",
>> +					       common_config_file,
>> +					       main_worktree_file)))
>> +			goto cleanup;
>> +	}
> 
>> +	/*
>> +	 * If core.worktree is set, then the main worktree is located
>> +	 * somewhere different than the parent of the common Git dir.
> 
> OK.  We do not want to share the working tree for the primary worktree
> among secondary worktrees.  For the primary, common and uncommon are
> the same, so it may not matter, but mention of "common Git dir" here
> may confuse readers?  Unless overridden by the config, the parent of
> the git dir is the root of the working tree, no?

Here, the verbal gymnastics are somewhat necessary because secondary
worktrees have a .git _file_, not a git directory, so using "common
Git dir" is a way to explicitly reference the Git dir. And the
strangeness here is exactly that core.worktree can change this working
tree to be something other than the parent of the (common) Git dir.

>> +	 * Relocate that value to avoid breaking all worktrees with this
>> +	 * upgrade to worktree config.
>> +	 */
> 
> And if it is not set, then working tree of each worktree is the
> parent of the per-worktree Git dir, so they will automatically
> become separate, which makes sense.

Thanks,
-Stolee
Elijah Newren Feb. 9, 2022, 4:43 p.m. UTC | #3
On Tue, Feb 8, 2022 at 2:09 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > +static int move_config_setting(const char *key, const char *value,
> > +                            const char *from_file, const char *to_file)
> > +{
> > +     if (git_config_set_in_file_gently(to_file, key, value))
> > +             return error(_("unable to set %s in '%s'"), key, to_file);
> > +     if (git_config_set_in_file_gently(from_file, key, NULL))
> > +             return error(_("unable to unset %s in '%s'"), key, from_file);
> > +     return 0;
> > +}
>
> Interesting.
>
> The verb "move" in its name made me expect a "get (and remove)
> whatever value(s) defined out of the old file, and set them
> identically in the new file" sequence, but that is not what is done
> here.  "set to this new single value in the new file and unset from
> the old one".
>
> I can see the need to say "move it only when its value is X",
> so having the caller to extract the value before deciding to call
> the function (hence not "moving from old") does make sense, but then
> the function is misnamed---it is not "moving", it is doing something
> else.
>
> > +int init_worktree_config(struct repository *r)
> > +{
> > +     int res = 0;
> > +     int bare = 0;
> > +     struct config_set cs = { { 0 } };
> > +     const char *core_worktree;
> > +     char *common_config_file;
> > +     char *main_worktree_file;
> > +
> > +     /*
> > +      * If the extension is already enabled, then we can skip the
> > +      * upgrade process.
> > +      */
> > +     if (repository_format_worktree_config)
> > +             return 0;
>
> OK.
>
> > +     if ((res = git_config_set_gently("extensions.worktreeConfig", "true")))
> > +             return error(_("failed to set extensions.worktreeConfig setting"));
>
> OK.
>
> > +     common_config_file = xstrfmt("%s/config", r->commondir);
> > +     main_worktree_file = xstrfmt("%s/config.worktree", r->commondir);
> > +
> > +     git_configset_init(&cs);
> > +     git_configset_add_file(&cs, common_config_file);
> > +
> > +     /*
> > +      * If core.bare is true in the common config file, then we need to
> > +      * move it to the main worktree's config file or it will break all
> > +      * worktrees. If it is false, then leave it in place because it
> > +      * _could_ be negating a global core.bare=true.
> > +      */
>
> Is the assumption that the secondary worktrees are never bare, but
> the primary one could be (iow, adding worktrees to a bare repository
> would leave the original bare repository as the primary "worktree"
> that does not have "working tree")?

Yes, and in fact that was the case which generated the original bug
report -- a bare clone where the affected individual started using
`git worktree add` to create some non-primary worktrees (and then also
used sparse-checkout in some of them).

>  I am trying to see what downsides
> it tries to avoid by not moving the core.bare==false setting.  Shouldn't
> core.bare be set to false when "worktree add" creates a new one anyway,
> if the secondaries are never bare?

Moving the core.bare==false setting might make sense.  In the previous
discussions, we tried to hypothesize about usage of old git clients
and non-git clients (jgit, etc.) on the same repos, and didn't know if
some of those would break if they couldn't find a `core.bare` setting
anywhere (since they wouldn't know to look in config.worktree).  We
needed to migrate core.bare=true to avoid an incorrect value affecting
all worktrees (and thus we figured it was worth the risk of breaking
older git/non-git clients because having older clients be broken is
better than having all clients including current git be broken), but
the same wasn't true for core.bare=false.  That said, we don't
actively know of any such clients that would be hurt by such a
migration.
Junio C Hamano Feb. 9, 2022, 5:34 p.m. UTC | #4
Derrick Stolee <stolee@gmail.com> writes:

> On 2/8/2022 5:09 PM, Junio C Hamano wrote:
>> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> 
>>> +static int move_config_setting(const char *key, const char *value,
>>> +			       const char *from_file, const char *to_file)
>>> +{
>>> +	if (git_config_set_in_file_gently(to_file, key, value))
>>> +		return error(_("unable to set %s in '%s'"), key, to_file);
>>> +	if (git_config_set_in_file_gently(from_file, key, NULL))
>>> +		return error(_("unable to unset %s in '%s'"), key, from_file);
>>> +	return 0;
>>> +}
>> 
>> Interesting.
>> 
>> The verb "move" in its name made me expect a "get (and remove)
>> whatever value(s) defined out of the old file, and set them
>> identically in the new file" sequence, but that is not what is done
>> here.  "set to this new single value in the new file and unset from
>> the old one".
>
> I think this "copy into the worktree-specific config, then remove
> from the common file" is an important sequence of events in case a
> concurrent process comes in and reads the two config files in the
> intermediate state and does not see the config value anywhere.
>
> But perhaps that's not actually what you are concerned about,
> because you're saying that the 'value' being provided does not
> actually guarantee that we are moving the setting.

Yes.  "Why are we _ignoring_ what is in the old file when we claim
to be _moving_?" was the question I had upon seeing this function.

>> I can see the need to say "move it only when its value is X",
>> so having the caller to extract the value before deciding to call
>> the function (hence not "moving from old") does make sense, but then
>> the function is misnamed---it is not "moving", it is doing something
>> else.

> I think the end state is correct for all uses here, since we only
> run this after checking to see if the config value exists in the
> 'from_file', so 'value' is correct (and this is a static method,
> not a generally-useful method for config.h).

As long as this is used on a single-valued "last one wins" variable,
the callers and this helper taken together will do the right thing.

> Perhaps a "write_in_new_and_remove_from_old()" would be a better,
> if verbose, name. I struggle to find a less cumbersome name, and
> "move" seems to match the intent pretty well in the context of its
> use.

The name is fine as long as the requirement for the caller is made
clear.  A short comment to help the next reader from having to ask
the same question before the helper may be sufficient.

>> Is the assumption that the secondary worktrees are never bare, but
>> the primary one could be (iow, adding worktrees to a bare repository
>> would leave the original bare repository as the primary "worktree"
>> that does not have "working tree")?  I am trying to see what downsides
>> it tries to avoid by not moving the core.bare==false setting.  Shouldn't
>> core.bare be set to false when "worktree add" creates a new one anyway,
>> if the secondaries are never bare?
>
> Secondary worktrees cannot be bare. If Git interprets the worktree config
> to have core.bare=true in a secondary worktree, it errors out.
>
> You seem to be suggesting that we should explicitly write core.bare=false
> into each of the worktree-specific config files. Is that right? This move
> is effectively the same, since 'false' is the default.

Unless there is a lower-precedence configuration file that we have
to override, yes, not writing core.bare=false upon "worktree add" is
fine.  I simply do not know if we need to do something special in
order to defeat /etc/gitconfig or $HOME/.gitconfig with the repository
or the worktree specific configuration file.

> Here, the verbal gymnastics are somewhat necessary because secondary
> worktrees have a .git _file_, not a git directory, so using "common
> Git dir" is a way to explicitly reference the Git dir. And the
> strangeness here is exactly that core.worktree can change this working
> tree to be something other than the parent of the (common) Git dir.

OK.  The .git _file_ is our moral equivalent to a symbolic link, and
I forgot about that.

I also wonder if we should do something like what we do for refs
(i.e. the API knows which refs are per-worktree and which are
global, so the callers do not have to care and just can say things
like "update HEAD to this value", and "give me the value of
refs/bisect/good") when repo_set_config*() is called, but that is
outside the scope of this step, which is about one-time migration.

As the code for migration go, I think I am happy with what it wants
to do and how it does it.

Thanks.
diff mbox series

Patch

diff --git a/worktree.c b/worktree.c
index 6f598dcfcdf..5292c94b3d9 100644
--- a/worktree.c
+++ b/worktree.c
@@ -5,6 +5,7 @@ 
 #include "worktree.h"
 #include "dir.h"
 #include "wt-status.h"
+#include "config.h"
 
 void free_worktrees(struct worktree **worktrees)
 {
@@ -826,3 +827,75 @@  int should_prune_worktree(const char *id, struct strbuf *reason, char **wtpath,
 	*wtpath = path;
 	return 0;
 }
+
+static int move_config_setting(const char *key, const char *value,
+			       const char *from_file, const char *to_file)
+{
+	if (git_config_set_in_file_gently(to_file, key, value))
+		return error(_("unable to set %s in '%s'"), key, to_file);
+	if (git_config_set_in_file_gently(from_file, key, NULL))
+		return error(_("unable to unset %s in '%s'"), key, from_file);
+	return 0;
+}
+
+int init_worktree_config(struct repository *r)
+{
+	int res = 0;
+	int bare = 0;
+	struct config_set cs = { { 0 } };
+	const char *core_worktree;
+	char *common_config_file;
+	char *main_worktree_file;
+
+	/*
+	 * If the extension is already enabled, then we can skip the
+	 * upgrade process.
+	 */
+	if (repository_format_worktree_config)
+		return 0;
+	if ((res = git_config_set_gently("extensions.worktreeConfig", "true")))
+		return error(_("failed to set extensions.worktreeConfig setting"));
+
+	common_config_file = xstrfmt("%s/config", r->commondir);
+	main_worktree_file = xstrfmt("%s/config.worktree", r->commondir);
+
+	git_configset_init(&cs);
+	git_configset_add_file(&cs, common_config_file);
+
+	/*
+	 * If core.bare is true in the common config file, then we need to
+	 * move it to the main worktree's config file or it will break all
+	 * worktrees. If it is false, then leave it in place because it
+	 * _could_ be negating a global core.bare=true.
+	 */
+	if (!git_configset_get_bool(&cs, "core.bare", &bare) && bare) {
+		if ((res = move_config_setting("core.bare", "true",
+					       common_config_file,
+					       main_worktree_file)))
+			goto cleanup;
+	}
+	/*
+	 * If core.worktree is set, then the main worktree is located
+	 * somewhere different than the parent of the common Git dir.
+	 * Relocate that value to avoid breaking all worktrees with this
+	 * upgrade to worktree config.
+	 */
+	if (!git_configset_get_value(&cs, "core.worktree", &core_worktree)) {
+		if ((res = move_config_setting("core.worktree", core_worktree,
+					       common_config_file,
+					       main_worktree_file)))
+			goto cleanup;
+	}
+
+	/*
+	 * Ensure that we use worktree config for the remaining lifetime
+	 * of the current process.
+	 */
+	repository_format_worktree_config = 1;
+
+cleanup:
+	git_configset_clear(&cs);
+	free(common_config_file);
+	free(main_worktree_file);
+	return res;
+}
diff --git a/worktree.h b/worktree.h
index 9e06fcbdf3d..e9e839926b0 100644
--- a/worktree.h
+++ b/worktree.h
@@ -183,4 +183,25 @@  void strbuf_worktree_ref(const struct worktree *wt,
 			 struct strbuf *sb,
 			 const char *refname);
 
+/**
+ * Enable worktree config for the first time. This will make the following
+ * adjustments:
+ *
+ * 1. Add extensions.worktreeConfig=true in the common config file.
+ *
+ * 2. If the common config file has a core.worktree value, then that value
+ *    is moved to the main worktree's config.worktree file.
+ *
+ * 3. If the common config file has a core.bare enabled, then that value
+ *    is moved to the main worktree's config.worktree file.
+ *
+ * If extensions.worktreeConfig is already true, then this method
+ * terminates early without any of the above steps. The existing config
+ * arrangement is assumed to be intentional.
+ *
+ * Returns 0 on success. Reports an error message and returns non-zero
+ * if any of these steps fail.
+ */
+int init_worktree_config(struct repository *r);
+
 #endif