Message ID | pull.1261.v2.git.git.1652485058.gitgitgadget@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | setup.c: make bare repo discovery optional | expand |
"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes: > * die()-ing is necessary if we're trying to flip the default value of > discovery.bare. We'd expect many bare repo users to be broken, and it's > more helpful to fail loudly than to silently ignore the bare repo. > > But in the long term, long after we've flipped the default and users know > that they need to opt into bare repo discovery, would it be a better UX > to just silently ignore the bare repo? Would a middle-ground of giving a warning() message help? Can it be loud and annoying enough to knudge the users to adjust without breaking the functionality? The longer-term default should be "cwd is allowed, but we do not bother going up from object/04 subdirectory of a bare repository", not "bare repositories should not be usable at all without GIT_DIR". > + Add a config variable, `discovery.bare`, that tells Git whether or not > + it should work with the bare repository it has discovered i.e. Git will > + die() if it discovers a bare repository, but it is not allowed by Missing comma before "i.e." > ++discovery.bare:: > ++ Specifies what kinds of directories Git can recognize as a bare > ++ repository when looking for the repository (aka repository > + discovery). This has no effect if repository discovery is not > + performed e.g. the path to the repository is set via `--git-dir` > + (see linkgit:git[1]). > ++ > +This config setting is only respected when specified in a system or global > +config, not when it is specified in a repository config or via the command > ++line option `-c discovery.bare=<value>`. ;-) > +++ > ++The currently supported values are `always` (Git always recognizes bare > ++repositories) and `never` (Git never recognizes bare repositories). > ++This defaults to `always`, but this default is likely to change. > +++ > ++If your workflow does not rely on bare repositories, it is recommended that > ++you set this value to `never`. This makes repository discovery easier to > ++reason about and prevents certain types of security and non-security > ++problems, such as: Hopefully "git fetch" over ssh:// and file:/// would run the other side with GIT_DIR explicitly set? As long as this recommendation does not break these use cases, I think we are OK, but I do not yet find these "problems, such as..." so convincing.
"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
> t/t0034-discovery-bare.sh | 69 +++++++++++++++++++++++
This number is already in use by an in-flight topic, if I am not
mistaken. Please make it a habit to always check your topic works
well when merged to 'next' and to 'seen'.
Thanks.
Junio C Hamano <gitster@pobox.com> writes: > "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> * die()-ing is necessary if we're trying to flip the default value of >> discovery.bare. We'd expect many bare repo users to be broken, and it's >> more helpful to fail loudly than to silently ignore the bare repo. >> >> But in the long term, long after we've flipped the default and users know >> that they need to opt into bare repo discovery, would it be a better UX >> to just silently ignore the bare repo? > > Would a middle-ground of giving a warning() message help? Can it be > loud and annoying enough to knudge the users to adjust without > breaking the functionality? Personally, when my tool changes its behavior, I would strongly prefer it to die than to "change behavior + warn". I'd feel more comfortable knowing that the tool did nothing as opposed to doing the wrong thing and only being informed after the fact. Also, I sometimes ignore warnings ;) When we _do_ transition away from die(), ignore + warning() sounds like a good first step. But if any of this flies in the face of the project's conventions, let me know as such. >> + Add a config variable, `discovery.bare`, that tells Git whether or not >> + it should work with the bare repository it has discovered i.e. Git will >> + die() if it discovers a bare repository, but it is not allowed by > > Missing comma before "i.e." Thanks. >> +++ >> ++The currently supported values are `always` (Git always recognizes bare >> ++repositories) and `never` (Git never recognizes bare repositories). >> ++This defaults to `always`, but this default is likely to change. >> +++ >> ++If your workflow does not rely on bare repositories, it is recommended that >> ++you set this value to `never`. This makes repository discovery easier to >> ++reason about and prevents certain types of security and non-security >> ++problems, such as: > > Hopefully "git fetch" over ssh:// and file:/// would run the other > side with GIT_DIR explicitly set? Ah, I'll check this and get back to you. > I do not yet > find these "problems, such as..." so convincing. What would be a convincing rationale to you? I'll capture that here. I'm assuming that you already have such an rationale in mind when you say that the longer-term default is that "we respect bare repositories only if they are the cwd.". I'm also assuming that this rationale is something other than embedded bare repos, because "cwd-only" does not protect against that. Perhaps "never" sounds better to folks who don't ever expect bare repositories and want to lock down the environment. Randall (cc-ed) suggests one such use case in [1]. (To Randall: Oops, I actually meant to cc you earlier, since you were the first to suggest a practical use case for never allowing bare repos. It must've slipped my mind). [1] https://lore.kernel.org/git/005d01d84ad0$782e8fc0$688baf40$@nexbridge.com.
On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote: > Thanks all for the comments on v1, I've expanded this series somewhat to > address them,... Please include a full cover letter with each version, so reviewers can respond to the full series goals. Your series here intends to start protecting against malicious embedded bare repositories by allowing users to opt-in to a more protected state. When the 'discovery.bare' option is set, then Git may die() on a bare repository that is discovered based on the current working directory (these protections are ignored if the user specifies the directory directly through --git-dir or $GIT_DIR). The 'discovery.bare' option has these values at the end of your series: * 'always' (default) allows all bare repos, matching the current behavior of Git. * 'never' avoids operating in bare repositories altogether. * 'cwd' operates in a bare repository only if the current directory is exactly the root of the bare repository. It is important that we keep 'always' as the default at first, because we do not want to introduce a breaking change without warning (at least for an issue like this that has been around for a long time). The 'never' option is a good one for very security-conscious users who really want to avoid problems. I don't anticipate that users who know about this option and set it themselves are the type that would fall for the social engineering required to attack using this vector, but I can imagine an IT department installing the value in system config across a fleet of machines. I find the 'cwd' option to not be valuable. It unblocks most existing users, but also almost completely removes the protection that the option was supposed to provide. I find neither the 'never' or 'cwd' options an acceptable choice for a future default. I also think that this protection is too rigid: it restricts _all_ bare repositories, not just embedded ones. There is no check to see if the parent directory of the bare repository is inside a non-bare repository. This leads to what I think would be a valuable replacement for the 'cwd' option: * 'no-embedded' allows non-embedded bare repositories. An _embedded bare repository_ is a bare repository whose parent directory is contained in the worktree of a non-bare Git repository. When in this mode, embedded bare repositories are not allowed unless the parent non-bare Git repository has a 'safe.embedded' config value storing the path to the current embedded bare repository. That was certainly difficult to write, but here it is as pseudo-code to hopefully remove some doubt as to how this might work: if repo is bare: if value == "always": return ALLOWED if value == "never": return FORBIDDEN; path = get_parent_repo() if !path: return ALLOWED if config_file_has_value("{path}/.git/config", "safe.embedded", repo): return ALLOWED return FORBIDDEN With this kind of option, we can protect users from these social engineering attacks while providing an opt-in protection for scenarios where embedded bare repos are currently being used (while also not breaking anyone using non-embedded bare repos). I think Taylor was mentioning something like this in his previous replies, perhaps even to the previous thread on this topic. This 'no-embedded' option is something that I could see as a potential new default, after it has proven itself in a released version of Git. There are performance drawbacks to checking the parent path for a Git repo, which is why it is only done when in "no-embedded" mode. I mentioned some other concerns in your PATCH 1 about how we are now adding the third use of read_very_early_config() and that we should probably refactor that before adding the third option, in order to avoid additional performance costs as well as it being difficult to audit which config options are only checked from these "protected" config files. Thanks, -Stolee
Glen Choo <chooglen@google.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes: >> >>> * die()-ing is necessary if we're trying to flip the default value of >>> discovery.bare. We'd expect many bare repo users to be broken, and it's >>> more helpful to fail loudly than to silently ignore the bare repo. >>> >>> But in the long term, long after we've flipped the default and users know >>> that they need to opt into bare repo discovery, would it be a better UX >>> to just silently ignore the bare repo? >> >> Would a middle-ground of giving a warning() message help? Can it be >> loud and annoying enough to knudge the users to adjust without >> breaking the functionality? > > Personally, when my tool changes its behavior, I would strongly prefer > it to die than to "change behavior + warn". I'd feel more comfortable > knowing that the tool did nothing as opposed to doing the wrong thing > and only being informed after the fact. Also, I sometimes ignore > warnings ;) Heh, personally I would try very hard not to change the behaviour without explicitly asked by the users with configuration or command line option. Flipping the default has traditionally been done in two or three phases. (1) We start by giving a loud and annoying warning to those who haven't configured and tell them the default *will* change, how to keep the current behaviour forever, and how to live in the future by adopting the future default early. (2) After a while, we flip the default. Those who haven't configured are given a notice that the default has changed, how to keep the old behaviour forever, and how to explicitly choose the same value as the default to squelch the notice. (3) After yet another while, we stop giving the notice. If we omitted (2), here is where we flip the default. Strictly speaking, we can have (1) in one release and then could directly jump to (3), but some distros may skip the releases that has (1), and (2) is an attempt to help users of such distros. >> Hopefully "git fetch" over ssh:// and file:/// would run the other >> side with GIT_DIR explicitly set? > > Ah, I'll check this and get back to you. > >> I do not yet >> find these "problems, such as..." so convincing. > > What would be a convincing rationale to you? I'll capture that here. That is a wrong question. You are the one pushing for castrating the bare repositories. > I'm assuming that you already have such an rationale in mind when you > say that the longer-term default is that "we respect bare repositories > only if they are the cwd.". I'm also assuming that this rationale is > something other than embedded bare repos, because "cwd-only" does not > protect against that. No, I do not have such a "different" rationale to justify the change proposed in this patch. I was saying that the claim "embedded bare repos are risky", backed by your two examples, did not sound all that serious a problem. Presented with a more serious brekage scenario, it may make the description more convincing. Thanks.
Junio C Hamano <gitster@pobox.com> writes: > Glen Choo <chooglen@google.com> writes: > >> Junio C Hamano <gitster@pobox.com> writes: >> >>> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes: >>> >>>> * die()-ing is necessary if we're trying to flip the default value of >>>> discovery.bare. We'd expect many bare repo users to be broken, and it's >>>> more helpful to fail loudly than to silently ignore the bare repo. >>>> >>>> But in the long term, long after we've flipped the default and users know >>>> that they need to opt into bare repo discovery, would it be a better UX >>>> to just silently ignore the bare repo? >>> >>> Would a middle-ground of giving a warning() message help? Can it be >>> loud and annoying enough to knudge the users to adjust without >>> breaking the functionality? >> >> Personally, when my tool changes its behavior, I would strongly prefer >> it to die than to "change behavior + warn". I'd feel more comfortable >> knowing that the tool did nothing as opposed to doing the wrong thing >> and only being informed after the fact. Also, I sometimes ignore >> warnings ;) > > Heh, personally I would try very hard not to change the behaviour > without explicitly asked by the users with configuration or command > line option. Flipping the default has traditionally been done in > two or three phases. > > (1) We start by giving a loud and annoying warning to those who > haven't configured and tell them the default *will* change, how > to keep the current behaviour forever, and how to live in the > future by adopting the future default early. > > (2) After a while, we flip the default. Those who haven't > configured are given a notice that the default has changed, how > to keep the old behaviour forever, and how to explicitly choose > the same value as the default to squelch the notice. > > (3) After yet another while, we stop giving the notice. If we > omitted (2), here is where we flip the default. > > Strictly speaking, we can have (1) in one release and then could > directly jump to (3), but some distros may skip the releases that > has (1), and (2) is an attempt to help users of such distros. Ah, that is very helpful. Thanks. It's pretty clear that I misunderstood what you meant by "giving a warning() message" - the warning() is there to prepare users in advance of the change; we don't actually want the warning() in the long term. For something as disruptive as discovering bare repos, having all of (1), (2) and (3) sounds appropriate. >>> Hopefully "git fetch" over ssh:// and file:/// would run the other >>> side with GIT_DIR explicitly set? >> >> Ah, I'll check this and get back to you. >> >>> I do not yet >>> find these "problems, such as..." so convincing. >> >> What would be a convincing rationale to you? I'll capture that here. > > That is a wrong question. You are the one pushing for castrating > the bare repositories. Let me clarify in case this wasn't received the way I intended. Earlier in the thread, you mentioned: The longer-term default should be "cwd is allowed, but we do not bother going up from object/04 subdirectory of a bare repository", [...] which I took to mean "Junio thinks that, by default, Git should stop walking up to find a bare repo, and thinks this is better because of rationale X.", and not, "Junio does not think that the default needs to change, but is just suggesting a better default than Glen's". If it is the former, then there is obviously some thought process here that is worth sharing. If it the latter, then I'm in favor of taking Stolee's suggestion to drop "cwd", since nobody else finds it useful enough. (I like the 'simplification' story, but not enough to push "cwd" through, especially since it does quite little security-wise.) >> I'm assuming that you already have such an rationale in mind when you >> say that the longer-term default is that "we respect bare repositories >> only if they are the cwd.". I'm also assuming that this rationale is >> something other than embedded bare repos, because "cwd-only" does not >> protect against that. > > No, I do not have such a "different" rationale to justify the change > proposed in this patch. I was saying that the claim "embedded bare > repos are risky", backed by your two examples, did not sound all > that serious a problem. Presented with a more serious brekage > scenario, it may make the description more convincing. Fair. I'll mull over this.
Glen Choo <chooglen@google.com> writes: > which I took to mean "Junio thinks that, by default, Git should stop > walking up to find a bare repo, and thinks this is better because of > rationale X." The X is "it would not break existing use case too badly, just to address a 'security' story whose severity is not so clearly expressed". > If it the latter, then I'm in favor of taking Stolee's suggestion to > drop "cwd", since nobody else finds it useful enough. (I like the > 'simplification' story, but not enough to push "cwd" through, especially > since it does quite little security-wise.) As long as you'll be there to answer the angry mob that complain loudly (and irritatingly enough, the only do so after a release is made to flip the default), I do not care too much either way ;-). Thanks.
On Mon, May 16, 2022 at 03:07:35PM -0400, Derrick Stolee wrote: > On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote: > > Thanks all for the comments on v1, I've expanded this series somewhat to > > address them,... > > Please include a full cover letter with each version, so reviewers > can respond to the full series goals. > > Your series here intends to start protecting against malicious > embedded bare repositories by allowing users to opt-in to a more > protected state. [...] Thanks for the summary, which I think will be especially helpful to others looking at this series for the very first time. > The 'never' option is a good one for very security-conscious > users who really want to avoid problems. I don't anticipate that > users who know about this option and set it themselves are the > type that would fall for the social engineering required to > attack using this vector, but I can imagine an IT department > installing the value in system config across a fleet of machines. When I first read this, I disagreed, since presumably that same crowd has legitimate bare repositories that they want to continue being able to operate in without having to pass `--git-dir` or `$GIT_DIR` in. In fact... > I also think that this protection is too rigid: it restricts > _all_ bare repositories, not just embedded ones. There is no check > to see if the parent directory of the bare repository is inside a > non-bare repository. ...this resonates quite a bit more with me. "never" isn't a good option unless you aren't a user of bare repositories _and_ don't have any embedded bare repositories (either at all, or any ones that you trust). > This leads to what I think would be a valuable replacement for > the 'cwd' option: > > * 'no-embedded' allows non-embedded bare repositories. An > _embedded bare repository_ is a bare repository whose parent > directory is contained in the worktree of a non-bare Git > repository. When in this mode, embedded bare repositories are > not allowed unless the parent non-bare Git repository has a > 'safe.embedded' config value storing the path to the current > embedded bare repository. > > That was certainly difficult to write, but here it is as > pseudo-code to hopefully remove some doubt as to how this might > work: > > if repo is bare: > if value == "always": > return ALLOWED > if value == "never": > return FORBIDDEN; This is indeed very similar to a proposal I had made upthread (which you note lower down in this email). One thing that's nice is that we only have to traverse up to the parent repo when in the "no-embedded" mode. That may be slow (since it's unbounded all the way up to the filesystem root or a ceiling directory, whichever we encounter first), but I think it's unavoidable if you need to distinguish between embedded and non-embedded bare repositories. > path = get_parent_repo() > > if !path: > return ALLOWED > > if config_file_has_value("{path}/.git/config", "safe.embedded", repo): > return ALLOWED > > return FORBIDDEN > > With this kind of option, we can protect users from these > social engineering attacks while providing an opt-in protection > for scenarios where embedded bare repos are currently being used > (while also not breaking anyone using non-embedded bare repos). > > I think Taylor was mentioning something like this in his previous > replies, perhaps even to the previous thread on this topic. Yep, see: https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/.a > This 'no-embedded' option is something that I could see as a > potential new default, after it has proven itself in a released > version of Git. I would be totally happy to see "no-embedded" become the default. It might be nice to issue a warning when the top-level config is unset, to give users a heads up about cases that may be broken, perhaps like: if repo is bare: switch (value) { case "always": return ALLOWED; case "never": return FORBIDDEN; case "no-embedded": # fallthrough case "": path = get_parent_repo() if !path return ALLOWED; if config_file_has_value("{path}/.git/config", "safe.embedded", repo) return ALLOWED; if value == "no-embedded": return FORBIDDEN; # otherwise, we're in an embedded bare repository with an unset # discovery.bare config. # # warn that this will break in the future... warning(_("%s is embedded within %s"), the_repository.path, path); advise(_("to allow discovery for this embedded repo, either run")); advise(_("")); advise(_(" $ git config --global discovery.bare always, or")); advise(_(" $ git -C '%s' config --local safe.embedded '%s'"), path, relpath(path, the_repository.path)); # ...but allow the invocation for now until the default is # changed. return ALLOWED; default: die(_("unrecognized value of discovery.bare: '%s'"), value); } ...where relpath is similar to Go's path/filepath.Rel function. With an appropriate deprecation period, I think we could even get away from the "continue executing, but don't read config+hooks", which in retrospect is more error-prone and difficult to reason about than I initially had given it credit for. Thanks, Taylor
Derrick Stolee <derrickstolee@github.com> writes: > * 'no-embedded' allows non-embedded bare repositories. An > _embedded bare repository_ is a bare repository whose parent > directory is contained in the worktree of a non-bare Git > repository. When in this mode, embedded bare repositories are > not allowed unless the parent non-bare Git repository has a > 'safe.embedded' config value storing the path to the current > embedded bare repository. Sounds sensible. I wonder how expensive this will be in practice, but the behaviour seems well thought out.
Thanks, I think this has advanced the conversation quite a bit. Derrick Stolee <derrickstolee@github.com> writes: > On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote: >> Thanks all for the comments on v1, I've expanded this series somewhat to >> address them,... > > Please include a full cover letter with each version, so reviewers > can respond to the full series goals. > > Your series here intends to start protecting against malicious > embedded bare repositories by allowing users to opt-in to a more > protected state. When the 'discovery.bare' option is set, then > Git may die() on a bare repository that is discovered based on > the current working directory (these protections are ignored if > the user specifies the directory directly through --git-dir or > $GIT_DIR). > > The 'discovery.bare' option has these values at the end of your > series: > > * 'always' (default) allows all bare repos, matching the current > behavior of Git. > > * 'never' avoids operating in bare repositories altogether. > > * 'cwd' operates in a bare repository only if the current directory > is exactly the root of the bare repository. My mistake, I should have prepared this summary myself. Thanks again. > It is important that we keep 'always' as the default at first, > because we do not want to introduce a breaking change without > warning (at least for an issue like this that has been around > for a long time). Yes. > The 'never' option is a good one for very security-conscious > users who really want to avoid problems. I don't anticipate that > users who know about this option and set it themselves are the > type that would fall for the social engineering required to > attack using this vector, but I can imagine an IT department > installing the value in system config across a fleet of machines. Yes. Setting the 'never' option in a system config is the use case that motivated this. > I find the 'cwd' option to not be valuable. It unblocks most > existing users, but also almost completely removes the protection > that the option was supposed to provide. Ok, I agree that it provides next-to-no protection. I'll drop it in this series; it's easy enough to reimplement if users really want it anyway. > This leads to what I think would be a valuable replacement for > the 'cwd' option: > > * 'no-embedded' allows non-embedded bare repositories. An > _embedded bare repository_ is a bare repository whose parent > directory is contained in the worktree of a non-bare Git > repository. When in this mode, embedded bare repositories are > not allowed unless the parent non-bare Git repository has a > 'safe.embedded' config value storing the path to the current > embedded bare repository. > > That was certainly difficult to write, but here it is as > pseudo-code to hopefully remove some doubt as to how this might > work: > > if repo is bare: > if value == "always": > return ALLOWED > if value == "never": > return FORBIDDEN; > > path = get_parent_repo() > > if !path: > return ALLOWED > > if config_file_has_value("{path}/.git/config", "safe.embedded", repo): > return ALLOWED > > return FORBIDDEN > > With this kind of option, we can protect users from these > social engineering attacks while providing an opt-in protection > for scenarios where embedded bare repos are currently being used > (while also not breaking anyone using non-embedded bare repos). [...] > This 'no-embedded' option is something that I could see as a > potential new default, after it has proven itself in a released > version of Git. I agree, this sounds like a good default that should work for most users. That said, I don't think I will implement it, and even if I do, it won't be in this series. I have serious doubts that I'd be able to deliver it in a reasonable amount of time (I tried preparing patches to this effect and failed [1]), and 'never' is sufficient for $DAYJOB's current needs. I would be very happy to see this come to fruition though. I have no objections to anyone preparing patches for this, and I'll gladly review those if that's helpful. [1] The specific trouble I had was figuring out whether or not the 'parent' repo was tracking the bare repo, since an untracked bare repo in the working tree isn't (in some sense) really "embedded" and it can't have come from a remote. But maybe the tracking check is unnecessary. We would break a few more users without it, but 'safe.embedded' is an easy enough way for a user to unbreak themselves. > I mentioned some other concerns in your PATCH 1 about how we > are now adding the third use of read_very_early_config() and that > we should probably refactor that before adding the third option, > in order to avoid additional performance costs as well as it > being difficult to audit which config options are only checked > from these "protected" config files. Makes sense. I'll ask about specifics on that subthread. > > Thanks, > -Stolee
Thanks all for the comments on v1, I've expanded this series somewhat to address them, notably: * The config value is now named discovery.bare and is an enum (instead of an allow-list). This hopefully moves us away from "bare repos are unsafe and need to be guarded against" and towards "bare repos can be made optional if it serves your needs better". * discovery.bare now causes git to die() instead of silently ignoring the bare repo. * Add an option that allows a bare repo if it is the CWD, since this is presumably a reasonable default for 99% of bare repo users [1]. = Questions/Concerns * die()-ing is necessary if we're trying to flip the default value of discovery.bare. We'd expect many bare repo users to be broken, and it's more helpful to fail loudly than to silently ignore the bare repo. But in the long term, long after we've flipped the default and users know that they need to opt into bare repo discovery, would it be a better UX to just silently ignore the bare repo? = Patch organization * Patch 1 introduces discovery.bare with allowed values [always|never]. * Patch 2 adds discover.bare=cwd, which is useful when users don't always set GIT_DIR e.g. their workflow really depends on it, they are in the midst of migration. = Series history Changes since v1: * Rename safe.barerepository to discovery.bare and make it die() * Move tests into t/t0034-discovery-bare.sh * Avoid unnecessary config reading by using a static variable * Add discovery.bare=cwd * Fix typos [1] I tried this 'cwd' setting on our test suite, with some pretty promising results. https://github.com/chooglen/git/actions/runs/2321914777 Out of the 8 failing scripts: * 6 are of the form "make sure we 'do the right thing' inside a subdirectory of a bare repo" (which typically means .git) e.g. t9903-bash-prompt.sh. We should be setting discovery.bare=always for these tests, so this is a non-issue. * t5323-pack-redundant.sh can be rewritten to -C into the root of the bare repo instead of a subdirectory. * t3310-notes-merge-manual-resolve.sh: not sure what the test is checking in particular, but I think this can be rewritten. IOW, I don't think we have any commands that require that CWD is a subdirectory of a bare repo, and we could use discovery.bare without much hassle. Glen Choo (2): setup.c: make bare repo discovery optional setup.c: learn discovery.bareRepository=cwd Documentation/config/discovery.txt | 26 +++++++++ setup.c | 89 ++++++++++++++++++++++++++++-- t/t0034-discovery-bare.sh | 69 +++++++++++++++++++++++ 3 files changed, 178 insertions(+), 6 deletions(-) create mode 100644 Documentation/config/discovery.txt create mode 100755 t/t0034-discovery-bare.sh base-commit: e8005e4871f130c4e402ddca2032c111252f070a Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1261%2Fchooglen%2Fsetup%2Fdisable-bare-repo-config-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1261/chooglen/setup/disable-bare-repo-config-v2 Pull-Request: https://github.com/git/git/pull/1261 Range-diff vs v1: 1: 3370258c4b3 ! 1: 22b10bf9da8 [RFC] setup.c: make bare repo discovery optional @@ Metadata Author: Glen Choo <chooglen@google.com> ## Commit message ## - [RFC] setup.c: make bare repo discovery optional + setup.c: make bare repo discovery optional - Add a config variable, `safe.barerepository`, that tells Git whether or - not to recognize bare repositories when it is trying to discover the - repository. This only affects repository discovery, thus it has no + Add a config variable, `discovery.bare`, that tells Git whether or not + it should work with the bare repository it has discovered i.e. Git will + die() if it discovers a bare repository, but it is not allowed by + `discovery.bare`. This only affects repository discovery, thus it has no effect if discovery was not done (e.g. `--git-dir` was passed). This is motivated by the fact that some workflows don't use bare @@ Commit message are probably very rare in practice, this lets users reduce the chance to zero. - This config is designed to be used like an allow-list, but it is not yet - clear what a good format for this allow-list would be. As such, this - patch limits the config value to a tri-state of [true|false|unset]: + This config is an enum of: - - [*|(unset)] recognize all bare repositories (like Git does today) - - (empty) recognize no bare repositories + - ["always"|(unset)]: always recognize bare repositories (like Git does + today) + - "never": never recognize bare repositories - and leaves the full format to be determined later. + More values are expected to be added later, and the default is expected + to change (i.e. to something other than "always"). [1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com [2]: I don't personally know anyone who does this as part of their @@ Commit message Signed-off-by: Glen Choo <chooglen@google.com> - ## Documentation/config/safe.txt ## + WIP setup.c: make discovery.bare die on failure + + Signed-off-by: Glen Choo <chooglen@google.com> + + ## Documentation/config/discovery.txt (new) ## @@ -+safe.barerepository:: -+ This config entry specifies directories that Git can recognize as -+ a bare repository when looking for the repository (aka repository ++discovery.bare:: ++ Specifies what kinds of directories Git can recognize as a bare ++ repository when looking for the repository (aka repository + discovery). This has no effect if repository discovery is not + performed e.g. the path to the repository is set via `--git-dir` + (see linkgit:git[1]). ++ -+It is recommended that you set this value so that Git will only use the bare -+repositories you intend it to. This prevents certain types of security and -+non-security problems, such as: -+ -+* `git clone`-ing a repository containing a maliciously bare repository -+ inside it. -+* Git recognizing a directory that isn't mean to be a bare repository, -+ but happens to look like one. -++ -+The currently supported values are `*` (Git recognizes all bare -+repositories) and the empty value (Git never recognizes bare repositories). -+Defaults to `*`. -++ +This config setting is only respected when specified in a system or global +config, not when it is specified in a repository config or via the command -+line option `-c safe.barerepository=<path>`. ++line option `-c discovery.bare=<value>`. +++ ++The currently supported values are `always` (Git always recognizes bare ++repositories) and `never` (Git never recognizes bare repositories). ++This defaults to `always`, but this default is likely to change. +++ ++If your workflow does not rely on bare repositories, it is recommended that ++you set this value to `never`. This makes repository discovery easier to ++reason about and prevents certain types of security and non-security ++problems, such as: + - safe.directory:: - These config entries specify Git-tracked directories that are - considered safe even if they are owned by someone other than the ++* `git clone`-ing a repository containing a malicious bare repository ++ inside it. ++* Git recognizing a directory that isn't meant to be a bare repository, ++ but happens to look like one. ## setup.c ## +@@ + static int inside_git_dir = -1; + static int inside_work_tree = -1; + static int work_tree_config_is_bogus; ++enum discovery_bare_config { ++ DISCOVERY_BARE_UNKNOWN = -1, ++ DISCOVERY_BARE_NEVER = 0, ++ DISCOVERY_BARE_ALWAYS, ++}; ++static enum discovery_bare_config discovery_bare_config = ++ DISCOVERY_BARE_UNKNOWN; + + static struct startup_info the_startup_info; + struct startup_info *startup_info = &the_startup_info; @@ setup.c: static int ensure_valid_ownership(const char *path) return data.is_safe; } -+/* -+ * This is similar to safe_directory_data, but only supports true/false. -+ */ -+struct safe_bare_repository_data { -+ int is_safe; -+}; -+ -+static int safe_bare_repository_cb(const char *key, const char *value, void *d) ++static int discovery_bare_cb(const char *key, const char *value, void *d) +{ -+ struct safe_bare_repository_data *data = d; -+ -+ if (strcmp(key, "safe.barerepository")) ++ if (strcmp(key, "discovery.bare")) + return 0; + -+ if (!value || !strcmp(value, "*")) { -+ data->is_safe = 1; ++ if (!strcmp(value, "never")) { ++ discovery_bare_config = DISCOVERY_BARE_NEVER; + return 0; + } -+ if (!*value) { -+ data->is_safe = 0; ++ if (!strcmp(value, "always")) { ++ discovery_bare_config = DISCOVERY_BARE_ALWAYS; + return 0; + } + return -1; +} + -+static int should_detect_bare(void) ++static int check_bare_repo_allowed(void) +{ -+ struct safe_bare_repository_data data; -+ -+ read_very_early_config(safe_bare_repository_cb, &data); ++ if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) { ++ read_very_early_config(discovery_bare_cb, NULL); ++ /* We didn't find a value; use the default. */ ++ if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) ++ discovery_bare_config = DISCOVERY_BARE_ALWAYS; ++ } ++ switch (discovery_bare_config) { ++ case DISCOVERY_BARE_NEVER: ++ return 0; ++ case DISCOVERY_BARE_ALWAYS: ++ return 1; ++ default: ++ BUG("invalid discovery_bare_config %d", discovery_bare_config); ++ } ++} + -+ return data.is_safe; ++static const char *discovery_bare_config_to_string(void) ++{ ++ switch (discovery_bare_config) { ++ case DISCOVERY_BARE_NEVER: ++ return "never"; ++ case DISCOVERY_BARE_ALWAYS: ++ return "always"; ++ default: ++ BUG("invalid discovery_bare_config %d", discovery_bare_config); ++ } +} + enum discovery_result { GIT_DIR_NONE = 0, GIT_DIR_EXPLICIT, +@@ setup.c: enum discovery_result { + GIT_DIR_HIT_CEILING = -1, + GIT_DIR_HIT_MOUNT_POINT = -2, + GIT_DIR_INVALID_GITFILE = -3, +- GIT_DIR_INVALID_OWNERSHIP = -4 ++ GIT_DIR_INVALID_OWNERSHIP = -4, ++ GIT_DIR_DISALLOWED_BARE = -5 + }; + + /* @@ setup.c: static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, - return GIT_DIR_DISCOVERED; } -- if (is_git_directory(dir->buf)) { -+ if (should_detect_bare() && is_git_directory(dir->buf)) { + if (is_git_directory(dir->buf)) { ++ if (!check_bare_repo_allowed()) ++ return GIT_DIR_DISALLOWED_BARE; if (!ensure_valid_ownership(dir->buf)) return GIT_DIR_INVALID_OWNERSHIP; strbuf_addstr(gitdir, "."); +@@ setup.c: const char *setup_git_directory_gently(int *nongit_ok) + } + *nongit_ok = 1; + break; ++ case GIT_DIR_DISALLOWED_BARE: ++ if (!nongit_ok) { ++ die(_("cannot use bare repository '%s' (discovery.bare is '%s')"), ++ dir.buf, ++ discovery_bare_config_to_string()); ++ } ++ *nongit_ok = 1; ++ break; + case GIT_DIR_NONE: + /* + * As a safeguard against setup_git_directory_gently_1 returning - ## t/t1510-repo-setup.sh ## -@@ t/t1510-repo-setup.sh: test_expect_success '#16e: bareness preserved by --bare' ' - ) - ' - -+# Test the tri-state of [(unset)|""|"*"]. -+test_expect_success '#16f: bare repo in worktree' ' -+ test_when_finished "git config --global --unset safe.barerepository" && -+ setup_repo 16f unset "" unset && + ## t/t0034-discovery-bare.sh (new) ## +@@ ++#!/bin/sh + -+ git init --bare 16f/default/bare && -+ git init --bare 16f/default/bare/bare && -+ try_case 16f/default/bare unset unset \ -+ . "(null)" "$here/16f/default/bare" "(null)" && -+ try_case 16f/default/bare/bare unset unset \ -+ . "(null)" "$here/16f/default/bare/bare" "(null)" && ++test_description='verify discovery.bare checks' + -+ git config --global safe.barerepository "*" && -+ git init --bare 16f/all/bare && -+ git init --bare 16f/all/bare/bare && -+ try_case 16f/all/bare unset unset \ -+ . "(null)" "$here/16f/all/bare" "(null)" && -+ try_case 16f/all/bare/bare unset unset \ -+ . "(null)" "$here/16f/all/bare/bare" "(null)" && ++. ./test-lib.sh + -+ git config --global safe.barerepository "" && -+ git init --bare 16f/never/bare && -+ git init --bare 16f/never/bare/bare && -+ try_case 16f/never/bare unset unset \ -+ ".git" "$here/16f" "$here/16f" "never/bare/" && -+ try_case 16f/never/bare/bare unset unset \ -+ ".git" "$here/16f" "$here/16f" "never/bare/bare/" -+' ++pwd="$(pwd)" + -+test_expect_success '#16g: inside .git with safe.barerepository' ' -+ test_when_finished "git config --global --unset safe.barerepository" && ++expect_allowed () { ++ git rev-parse --absolute-git-dir >actual && ++ echo "$pwd/outer-repo/bare-repo" >expected && ++ test_cmp expected actual ++} + -+ # Omit the "default" case; it is covered by 16a. ++expect_rejected () { ++ test_must_fail git rev-parse --absolute-git-dir 2>err && ++ grep "discovery.bare" err ++} + -+ git config --global safe.barerepository "*" && -+ setup_repo 16g/all unset "" unset && -+ mkdir -p 16g/all/.git/wt/sub && -+ try_case 16g/all/.git unset unset \ -+ . "(null)" "$here/16g/all/.git" "(null)" && -+ try_case 16g/all/.git/wt unset unset \ -+ "$here/16g/all/.git" "(null)" "$here/16g/all/.git/wt" "(null)" && -+ try_case 16g/all/.git/wt/sub unset unset \ -+ "$here/16g/all/.git" "(null)" "$here/16g/all/.git/wt/sub" "(null)" && ++test_expect_success 'setup bare repo in worktree' ' ++ git init outer-repo && ++ git init --bare outer-repo/bare-repo ++' ++ ++test_expect_success 'discovery.bare unset' ' ++ ( ++ cd outer-repo/bare-repo && ++ expect_allowed && ++ cd refs/ && ++ expect_allowed ++ ) ++' ++ ++test_expect_success 'discovery.bare=always' ' ++ git config --global discovery.bare always && ++ ( ++ cd outer-repo/bare-repo && ++ expect_allowed && ++ cd refs/ && ++ expect_allowed ++ ) ++' + -+ git config --global safe.barerepository "" && -+ setup_repo 16g/never unset "" unset && -+ mkdir -p 16g/never/.git/wt/sub && -+ try_case 16g/never/.git unset unset \ -+ ".git" "$here/16g/never" "$here/16g/never" ".git/" && -+ try_case 16g/never/.git/wt unset unset \ -+ ".git" "$here/16g/never" "$here/16g/never" ".git/wt/" && -+ try_case 16g/never/.git/wt/sub unset unset \ -+ ".git" "$here/16g/never" "$here/16g/never" ".git/wt/sub/" ++test_expect_success 'discovery.bare=never' ' ++ git config --global discovery.bare never && ++ ( ++ cd outer-repo/bare-repo && ++ expect_rejected && ++ cd refs/ && ++ expect_rejected ++ ) && ++ ( ++ GIT_DIR=outer-repo/bare-repo && ++ export GIT_DIR && ++ expect_allowed ++ ) +' + - test_expect_success '#17: GIT_WORK_TREE without explicit GIT_DIR is accepted (bare case)' ' - # Just like #16. - setup_repo 17a unset "" true && ++test_done -: ----------- > 2: 62070aab7eb setup.c: learn discovery.bareRepository=cwd