Message ID | 20240626123358.420292-2-flo@geekplace.eu (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | setup: support GIT_IGNORE_INSECURE_OWNER environment variable | expand |
Hi Florian On 26/06/2024 13:33, Florian Schmaus wrote: > Sometimes more flexibility to disable/ignore the ownership check, besides > the safe.directory configuration option, is required. > > For example, git-daemon running as nobody user, which typically has no > home directory. Therefore, we can not add the path to a user-global > configuration and adding the path to the system-wide configuration could > have negative security implications. > > Therefore, make the check configurable via an environment variable. An alternative would be to allow safe.directory to be specified on the command line with "git -c safe.directory='*' daemon ..." rather than adding a dedicated environment variable. Or you could set $HOME to a suitable directory when running "git daemon" and put the user-global config file there. That directory and config file only need to be readable by the user that "git daemon" is running under it can be owned by root or whoever else you want. Best Wishes Phillip > If the environment variable GIT_IGNORE_INSECURE_OWNER is set to true, > then ignore potentially insecure ownership of git-related path > components. > > Signed-off-by: Florian Schmaus <flo@geekplace.eu> > --- > setup.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/setup.c b/setup.c > index 3afa6fb09b28..da3f504fb536 100644 > --- a/setup.c > +++ b/setup.c > @@ -1278,6 +1278,14 @@ static int ensure_valid_ownership(const char *gitfile, > */ > git_protected_config(safe_directory_cb, &data); > > + if (data.is_safe) > + return data.is_safe; > + > + if (git_env_bool("GIT_IGNORE_INSECURE_OWNER", 0)) { > + warning("ignoring dubious ownership in repository at '%s' (GIT_IGNORE_INSECURE_OWNER set)", data.path); > + return 1; > + } > + > return data.is_safe; > } >
On Wednesday, June 26, 2024 9:12 AM, Phillip Wood wrote: >On 26/06/2024 13:33, Florian Schmaus wrote: >> Sometimes more flexibility to disable/ignore the ownership check, >> besides the safe.directory configuration option, is required. >> >> For example, git-daemon running as nobody user, which typically has no >> home directory. Therefore, we can not add the path to a user-global >> configuration and adding the path to the system-wide configuration >> could have negative security implications. >> >> Therefore, make the check configurable via an environment variable. > >An alternative would be to allow safe.directory to be specified on the command line >with "git -c safe.directory='*' daemon ..." rather than adding a dedicated >environment variable. Or you could set $HOME to a suitable directory when >running "git daemon" and put the user-global config file there. That directory and >config file only need to be readable by the user that "git daemon" is running under it >can be owned by root or whoever else you want. I agree with this alternative. Our CI build environment already has so many environment variables (not just from git but all dependencies and the CI environment itself) that we are pushing the 56Kb platform limit (not kidding). Reducing dependence on environment variables is, in my opinion, a good and necessary thing. >> If the environment variable GIT_IGNORE_INSECURE_OWNER is set to true, >> then ignore potentially insecure ownership of git-related path >> components. >> >> Signed-off-by: Florian Schmaus <flo@geekplace.eu> >> --- >> setup.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) >> >> diff --git a/setup.c b/setup.c >> index 3afa6fb09b28..da3f504fb536 100644 >> --- a/setup.c >> +++ b/setup.c >> @@ -1278,6 +1278,14 @@ static int ensure_valid_ownership(const char *gitfile, >> */ >> git_protected_config(safe_directory_cb, &data); >> >> + if (data.is_safe) >> + return data.is_safe; >> + >> + if (git_env_bool("GIT_IGNORE_INSECURE_OWNER", 0)) { >> + warning("ignoring dubious ownership in repository at '%s' >(GIT_IGNORE_INSECURE_OWNER set)", data.path); >> + return 1; >> + } >> + >> return data.is_safe; >> } >> Sincerely --Randall
On 26/06/2024 14:11, Phillip Wood wrote: > Hi Florian > > On 26/06/2024 13:33, Florian Schmaus wrote: >> Sometimes more flexibility to disable/ignore the ownership check, besides >> the safe.directory configuration option, is required. >> >> For example, git-daemon running as nobody user, which typically has no >> home directory. Therefore, we can not add the path to a user-global >> configuration and adding the path to the system-wide configuration could >> have negative security implications. >> >> Therefore, make the check configurable via an environment variable. > > An alternative would be to allow safe.directory to be specified on the > command line with "git -c safe.directory='*' daemon ..." rather than > adding a dedicated environment variable. To expand an this a little - a couple of times I've wanted to checkout a bare repository that is owned by a different user. It is a pain to have to add a new config setting just for a one-off checkout. Being able to adjust the config on the command line would be very useful in that case. > Or you could set $HOME to a > suitable directory when running "git daemon" and put the user-global > config file there. That directory and config file only need to be > readable by the user that "git daemon" is running under it can be owned > by root or whoever else you want. The advantage of this approach is that there are no changes needed to git, instead of setting GIT_IGNORE_INSECURE_OWNER one sets HOME to point to a suitable config file. I found this useful when I was debugging the issues with git-daemon earlier[1] Best Wishes Phillip [1] https://lore.kernel.org/git/834862fd-b579-438a-b9b3-5246bf27ce8a@gmail.com > Best Wishes > > Phillip > > >> If the environment variable GIT_IGNORE_INSECURE_OWNER is set to true, >> then ignore potentially insecure ownership of git-related path >> components. >> >> Signed-off-by: Florian Schmaus <flo@geekplace.eu> >> --- >> setup.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) >> >> diff --git a/setup.c b/setup.c >> index 3afa6fb09b28..da3f504fb536 100644 >> --- a/setup.c >> +++ b/setup.c >> @@ -1278,6 +1278,14 @@ static int ensure_valid_ownership(const char >> *gitfile, >> */ >> git_protected_config(safe_directory_cb, &data); >> + if (data.is_safe) >> + return data.is_safe; >> + >> + if (git_env_bool("GIT_IGNORE_INSECURE_OWNER", 0)) { >> + warning("ignoring dubious ownership in repository at '%s' >> (GIT_IGNORE_INSECURE_OWNER set)", data.path); >> + return 1; >> + } >> + >> return data.is_safe; >> } > >
Phillip Wood <phillip.wood123@gmail.com> writes: > To expand an this a little - a couple of times I've wanted to checkout > a bare repository that is owned by a different user. It is a pain to > have to add a new config setting just for a one-off checkout. Being > able to adjust the config on the command line would be very useful in > that case. True. As long as it is deemed safe to honor the one-off "git -c safe.directory=..." from the command line, for the purpose of this "I who am running this 'git' process hereby declare that I trust this and that repository", I think it would be the best solution for the "git daemon" use case. And it is much better than adding a one-off environment variable. After all, if your "git daemon" user does not have a $HOME set in its /etc/passwd entry, you cannot set such an environment variable in $HOME/.profile so somewhere in your "git daemon" invocation would have to be tweaked to have code snippet that sets and exports it *anyway*. You can tweak the "git" invocation to add the command line tweak "-c safe.directory=..." at the place you would have set and exported the variable, and using the well understood "git -c var=val" mechanism would be more appropriate. >> Or you could set $HOME to a suitable directory when running "git > ... > The advantage of this approach is that there are no changes needed to > git, instead of setting GIT_IGNORE_INSECURE_OWNER one sets HOME to > point to a suitable config file. I found this useful when I was > debugging the issues with git-daemon earlier[1] Yup, that sounds like a workable approach, if "git -c var=val" approach turns out to be inappropriate for security purposes for whatever reason. Thanks.
On 26/06/2024 16:19, rsbecker@nexbridge.com wrote: > On Wednesday, June 26, 2024 9:12 AM, Phillip Wood wrote: >> On 26/06/2024 13:33, Florian Schmaus wrote: >> An alternative would be to allow safe.directory to be specified on the command line >> with "git -c safe.directory='*' daemon ..." rather than adding a dedicated >> environment variable. Or you could set $HOME to a suitable directory when >> running "git daemon" and put the user-global config file there. That directory and >> config file only need to be readable by the user that "git daemon" is running under it >> can be owned by root or whoever else you want. > > I agree with this alternative. Our CI build environment already has so many environment variables (not just from git but all dependencies and the CI environment itself) that we are pushing the 56Kb platform limit (not kidding). Reducing dependence on environment variables is, in my opinion, a good and necessary thing. "git -c" still ends up setting an environment variable internally to ensure that the config setting is passed down to any child processes. The advantage is that we're re-using an existing mechanism rather than introducing a new environment variable. Best Wishes Phillip >>> If the environment variable GIT_IGNORE_INSECURE_OWNER is set to true, >>> then ignore potentially insecure ownership of git-related path >>> components. >>> >>> Signed-off-by: Florian Schmaus <flo@geekplace.eu> >>> --- >>> setup.c | 8 ++++++++ >>> 1 file changed, 8 insertions(+) >>> >>> diff --git a/setup.c b/setup.c >>> index 3afa6fb09b28..da3f504fb536 100644 >>> --- a/setup.c >>> +++ b/setup.c >>> @@ -1278,6 +1278,14 @@ static int ensure_valid_ownership(const char *gitfile, >>> */ >>> git_protected_config(safe_directory_cb, &data); >>> >>> + if (data.is_safe) >>> + return data.is_safe; >>> + >>> + if (git_env_bool("GIT_IGNORE_INSECURE_OWNER", 0)) { >>> + warning("ignoring dubious ownership in repository at '%s' >> (GIT_IGNORE_INSECURE_OWNER set)", data.path); >>> + return 1; >>> + } >>> + >>> return data.is_safe; >>> } >>> > > Sincerely > --Randall >
Thanks for all your replies. Much appreciated. On 26/06/2024 20.11, Junio C Hamano wrote: > Phillip Wood <phillip.wood123@gmail.com> writes: >> To expand an this a little - a couple of times I've wanted to checkout >> a bare repository that is owned by a different user. It is a pain to >> have to add a new config setting just for a one-off checkout. Being >> able to adjust the config on the command line would be very useful in >> that case. > > True. As long as it is deemed safe to honor the one-off "git -c > safe.directory=..." from the command line, for the purpose of this > "I who am running this 'git' process hereby declare that I trust > this and that repository", I think it would be the best solution > for the "git daemon" use case. How does one pass "-c safe.directory=…" to git-http-backend? I currently have an Apache config snippet like SetEnv GIT_PROJECT_ROOT /var/www/example.org/htdocs/git SetEnv GIT_HTTP_EXPORT_ALL ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ <Files "git-http-backend"> Require all granted AcceptPathInfo On </Files> to serve git repositories. Granted, the apache user has a home directory, so I am probably able to set save.directory via ~/.gitconfig. However, the point here is that git is often invoked indirectly, with no control over the command line arguments that are passed to it. On the other hand, one has usually control over the environment variables. I agree that "-c safe.directory=…" is preferable to GIT_IGNORE_INSECURE_OWNER. However, sometimes using "-c safe.directory=…" is cumbersome and maybe even impossible. One alternative to GIT_IGNORE_INSECURE_OWNER would be a generic GIT_EXTRA_ARGS environment variable. So one could set GIT_EXTRA_ARGS="-c safe.directory=…" Not saying that I like the idea, just pointing out this option. - Florian
On Wed, Jun 26, 2024 at 09:06:10PM +0200, Florian Schmaus wrote: > > True. As long as it is deemed safe to honor the one-off "git -c > > safe.directory=..." from the command line, for the purpose of this > > "I who am running this 'git' process hereby declare that I trust > > this and that repository", I think it would be the best solution > > for the "git daemon" use case. > > How does one pass "-c safe.directory=…" to git-http-backend? > > I currently have an Apache config snippet like > > SetEnv GIT_PROJECT_ROOT /var/www/example.org/htdocs/git > SetEnv GIT_HTTP_EXPORT_ALL > ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ > > <Files "git-http-backend"> > Require all granted > AcceptPathInfo On > </Files> > > to serve git repositories. > > Granted, the apache user has a home directory, so I am probably able to set > save.directory via ~/.gitconfig. > > However, the point here is that git is often invoked indirectly, with no > control over the command line arguments that are passed to it. On the other > hand, one has usually control over the environment variables. > > I agree that "-c safe.directory=…" is preferable to > GIT_IGNORE_INSECURE_OWNER. However, sometimes using "-c safe.directory=…" is > cumbersome and maybe even impossible. > > One alternative to GIT_IGNORE_INSECURE_OWNER would be a generic > GIT_EXTRA_ARGS environment variable. So one could set > > GIT_EXTRA_ARGS="-c safe.directory=…" > > Not saying that I like the idea, just pointing out this option. You can do: GIT_CONFIG_COUNT=1 GIT_CONFIG_KEY_0=safe.directory GIT_CONFIG_VALUE_0="*" It is a bit verbose, but it's a documented interface in git-config(1). Under the hood "git -c" uses a different, older mechanism, but we've not documented it nor promised that it will remain stable. -Peff
On 26/06/2024 19:11, Junio C Hamano wrote: > Phillip Wood <phillip.wood123@gmail.com> writes: > >> To expand an this a little - a couple of times I've wanted to checkout >> a bare repository that is owned by a different user. It is a pain to >> have to add a new config setting just for a one-off checkout. Being >> able to adjust the config on the command line would be very useful in >> that case. > > True. As long as it is deemed safe to honor the one-off "git -c > safe.directory=..." from the command line, for the purpose of this > "I who am running this 'git' process hereby declare that I trust > this and that repository", I think it would be the best solution > for the "git daemon" use case. This actually works already, the behavior was changed in 6061601d9f (safe.directory: use git_protected_config(), 2022-07-14). The reason I thought it didn't work was that I remember it failing on Debian bullseye a few months ago but that used an older version of git. There is some more rationale for the change in 779ea9303a7 (Documentation: define protected configuration, 2022-07-14) Best Wishes Phillip > And it is much better than adding a one-off environment variable. > After all, if your "git daemon" user does not have a $HOME set in > its /etc/passwd entry, you cannot set such an environment variable > in $HOME/.profile so somewhere in your "git daemon" invocation would > have to be tweaked to have code snippet that sets and exports it > *anyway*. You can tweak the "git" invocation to add the command > line tweak "-c safe.directory=..." at the place you would have set > and exported the variable, and using the well understood "git -c > var=val" mechanism would be more appropriate. > >>> Or you could set $HOME to a suitable directory when running "git >> ... >> The advantage of this approach is that there are no changes needed to >> git, instead of setting GIT_IGNORE_INSECURE_OWNER one sets HOME to >> point to a suitable config file. I found this useful when I was >> debugging the issues with git-daemon earlier[1] > > Yup, that sounds like a workable approach, if "git -c var=val" > approach turns out to be inappropriate for security purposes > for whatever reason. > > Thanks.
Phillip Wood <phillip.wood123@gmail.com> writes: > On 26/06/2024 19:11, Junio C Hamano wrote: >> Phillip Wood <phillip.wood123@gmail.com> writes: >> >>> To expand an this a little - a couple of times I've wanted to checkout >>> a bare repository that is owned by a different user. It is a pain to >>> have to add a new config setting just for a one-off checkout. Being >>> able to adjust the config on the command line would be very useful in >>> that case. >> True. As long as it is deemed safe to honor the one-off "git -c >> safe.directory=..." from the command line, for the purpose of this >> "I who am running this 'git' process hereby declare that I trust >> this and that repository", I think it would be the best solution >> for the "git daemon" use case. > > This actually works already, the behavior was changed in 6061601d9f > (safe.directory: use git_protected_config(), 2022-07-14). The reason I > thought it didn't work was that I remember it failing on Debian > bullseye a few months ago but that used an older version of git. There > is some more rationale for the change in 779ea9303a7 (Documentation: > define protected configuration, 2022-07-14) Thanks. So, does this more or less conclude the episode about how best to deal with the 2.45.1 regression that Florian's patch in this thread started? It seems that we already have enough mechanisms to help users tweak their existing set-up, so we may not need code changes, but I am wondering if we want to add a bit of documentation around safe.directory to tell them when it makes sense to set it, what value(s) they would want to set it to, etc. * For "git daemon" invocations, because we know the command is run after chdir to a directory with '.' specified as the repository, we recommend to have safe.directory=., either on the command line with "-c var=val" or in daemon user's ~/.gitconfig, in the "git-daemon" help page? We could recommend safe.directory=*, but they would mean the same thing in the context of running "git daemon". We may want to discuss who protects from whom with the safe.directory mechanism and git-daemon-export-ok mechanism. The former is "the daemon trusts that repositories won't harm the daemon user", while the latter is "the repository owner is OK for it to be published". Also optionally, we may update the code to take the absolute path of the repository before passing it to the safe.directory check. * For "http-backend" invocations, we should think about potential additions that would help users, similar to what I listed above for "git daemon". Having said all that, I do not think I mind GIT_SAFE_DIRECTORIES that is a ":" separated list of paths that is honored just like the multi-valued configuration variable safe.directory. Once an attacker can influence your environment variables, it already is game over, so trusting it does not make the attack surface any worse. As Peff explained, we can trigger the more general "git -c var=val" mechanism by exporting a set of environment variables, so such a specialized environment variable is not strictly needed, but it would make writing the "SetEnv" directive in apache configuration (and similar ones for other HTTP server implementations) slighly simpler and a lot more straight-forward.
On 27/06/2024 16:28, Junio C Hamano wrote: > Phillip Wood <phillip.wood123@gmail.com> writes: > >> On 26/06/2024 19:11, Junio C Hamano wrote: >>> Phillip Wood <phillip.wood123@gmail.com> writes: >>> >>>> To expand an this a little - a couple of times I've wanted to checkout >>>> a bare repository that is owned by a different user. It is a pain to >>>> have to add a new config setting just for a one-off checkout. Being >>>> able to adjust the config on the command line would be very useful in >>>> that case. >>> True. As long as it is deemed safe to honor the one-off "git -c >>> safe.directory=..." from the command line, for the purpose of this >>> "I who am running this 'git' process hereby declare that I trust >>> this and that repository", I think it would be the best solution >>> for the "git daemon" use case. >> >> This actually works already, the behavior was changed in 6061601d9f >> (safe.directory: use git_protected_config(), 2022-07-14). The reason I >> thought it didn't work was that I remember it failing on Debian >> bullseye a few months ago but that used an older version of git. There >> is some more rationale for the change in 779ea9303a7 (Documentation: >> define protected configuration, 2022-07-14) > > Thanks. > > So, does this more or less conclude the episode about how best to > deal with the 2.45.1 regression that Florian's patch in this thread > started? I think so yes > It seems that we already have enough mechanisms to help > users tweak their existing set-up, so we may not need code changes, > but I am wondering if we want to add a bit of documentation around > safe.directory to tell them when it makes sense to set it, what > value(s) they would want to set it to, etc. > > * For "git daemon" invocations, because we know the command is run > after chdir to a directory with '.' specified as the repository, > we recommend to have safe.directory=., either on the command line > with "-c var=val" or in daemon user's ~/.gitconfig, in the > "git-daemon" help page? We could recommend safe.directory=*, but > they would mean the same thing in the context of running "git > daemon". I think we'd be better to fix the safe.directory check as you suggest below if we can but failing that updating the documentation would certainly help. > We may want to discuss who protects from whom with the > safe.directory mechanism and git-daemon-export-ok mechanism. The > former is "the daemon trusts that repositories won't harm the > daemon user", while the latter is "the repository owner is OK for > it to be published". Yes that would be helpful > Also optionally, we may update the code to take the absolute path > of the repository before passing it to the safe.directory check. I think doing this would be more helpful than updating the documentation to recommend adding "safe.directory=.". If we do this we would also want to convert "//" -> "/" in the config keys as we've been forcing users to add paths like "/srv/git//my-repo" if the --base-path argument to git-daemon ended with a "/" > * For "http-backend" invocations, we should think about potential > additions that would help users, similar to what I listed above > for "git daemon". That sounds sensible. > Having said all that, I do not think I mind GIT_SAFE_DIRECTORIES > that is a ":" separated list of paths that is honored just like the > multi-valued configuration variable safe.directory. Once an > attacker can influence your environment variables, it already is > game over, so trusting it does not make the attack surface any > worse. Indeed in that case the attacker can influence the path that we read the protected config from by setting $HOME (and do far worse by setting $PATH) > As Peff explained, we can trigger the more general "git -c > var=val" mechanism by exporting a set of environment variables, so > such a specialized environment variable is not strictly needed, but > it would make writing the "SetEnv" directive in apache configuration > (and similar ones for other HTTP server implementations) slighly > simpler and a lot more straight-forward. Yes having to set all the GIT_CONFIG_* variables can be rather confusing Best Wishes Phillip
Phillip Wood <phillip.wood123@gmail.com> writes: >> We may want to discuss who protects from whom with the >> safe.directory mechanism and git-daemon-export-ok mechanism. The >> former is "the daemon trusts that repositories won't harm the >> daemon user", while the latter is "the repository owner is OK for >> it to be published". > > Yes that would be helpful OK, let's see if somebody volunteers for documentation updates in this area. > I think doing this would be more helpful than updating the > documentation to recommend adding "safe.directory=.". If we do this we > would also want to convert "//" -> "/" in the config keys as we've > been forcing users to add paths like "/srv/git//my-repo" if the > --base-path argument to git-daemon ended with a "/" OK, so the idea is to normalize both safe.directory and data->path (which might come from either worktree or gitdir) and then look for a match. path needs to be normalized because it can say '.' and '/srv/git//my-repo', and values of safe.directory need to be normalized because the users may have written '.' --- oops, relative to what directory do we normalize safe.directory values? That would not work. Let me retry. - Compare entries of safe.directory with data->path literally without normalization, as the user may have written in the configuration "safe.directory=.", expecting that data->path to be '.' (the git-daemon use case). - Normalize entries of safe.directory and data->path and then compare them, turning path="." (the git-daemon use case) into "/srv/git/my-repo" and a safe.directory entry "/srv/git//my-repo" user wrote into "/srv/git/my-repo", so that they match. Or we could treat "." on safe.directory as a synonym for "*" (i.e. "anything goes"), and compare all other cases only after normalization (which would save the cost of "literal" comparison for safe.directory entries that are not ".")? I may have missed some corner cases, but either of these would probably work. >> * For "http-backend" invocations, we should think about potential >> additions that would help users, similar to what I listed above >> for "git daemon". > > That sounds sensible. OK. >> Having said all that, I do not think I mind GIT_SAFE_DIRECTORIES >> that is a ":" separated list of paths that is honored just like the >> multi-valued configuration variable safe.directory. Once an >> attacker can influence your environment variables, it already is >> game over, so trusting it does not make the attack surface any >> worse. > > Indeed in that case the attacker can influence the path that we read > the protected config from by setting $HOME (and do far worse by > setting $PATH) > ... > Yes having to set all the GIT_CONFIG_* variables can be rather confusing OK. So an independent effort may be to introduce the said environment variable, and have it split at ':' and feed into the same machinery used to check paths against safe.directory configuration. It may need a minor refactoring to lift the current comparison logic that assumes it is _only_ driven by the git_config() callback, and we would probably want to define how these two sources of whitelist entries interact (who overrides what, is "an empty element clears all the previous entries" still true, etc.). So, I think we have three actionable items here. Thanks.
Hi Junio On 28/06/2024 17:48, Junio C Hamano wrote: > Phillip Wood <phillip.wood123@gmail.com> writes: > >> I think doing this would be more helpful than updating the >> documentation to recommend adding "safe.directory=.". If we do this we >> would also want to convert "//" -> "/" in the config keys as we've >> been forcing users to add paths like "/srv/git//my-repo" if the >> --base-path argument to git-daemon ended with a "/" > > OK, so the idea is to normalize both safe.directory and data->path > (which might come from either worktree or gitdir) and then look for > a match. path needs to be normalized because it can say '.' and > '/srv/git//my-repo', and values of safe.directory need to be > normalized because the users may have written '.' --- oops, relative > to what directory do we normalize safe.directory values? That would > not work. Let me retry. > > - Compare entries of safe.directory with data->path literally > without normalization, as the user may have written in the > configuration "safe.directory=.", expecting that data->path to be > '.' (the git-daemon use case). I'm not sure this is a good idea because it is not clear which directory the user wanted to mark as safe when they added a relative directory to safe.directory. In the case of git-daemon one needs both the absolute path to the repository and "." to be present in safe.directory so we can ignore "." and match the absolute path. > - Normalize entries of safe.directory and data->path and then > compare them, turning path="." (the git-daemon use case) into > "/srv/git/my-repo" and a safe.directory entry "/srv/git//my-repo" > user wrote into "/srv/git/my-repo", so that they match. We have several of normalization functions available: - normalize_path_copy() does a textual normalization which cleans up "//", "/./" and "/../". - absolute_pathdup() which prepends the current directory to relative paths attempting to use $PWD for the current directory where possible but does not expand symbolic links and does not clean up the path passed to it. - real_pathdup() which expands symbolic links One way forward would be to clean up the entries in safe.directory with normalize_path_copy() and compare them to the result of normalizing $git_dir with absolute_pathdup() followed by normalize_path_copy(). That will ensure that we're always comparing the safe.directory entries against an absolute path and both sides of the comparison are textually normalized. I'm not sure whether we'd be better to use absolute_pathdup() or real_pathdup() or if we'd be safer comparing the output of both against safe.directory if they give different results. If this sounds reasonable I'll try and put a patch together later this week. > Or we could treat "." on safe.directory as a synonym for "*" > (i.e. "anything goes"), and compare all other cases only after > normalization (which would save the cost of "literal" comparison for > safe.directory entries that are not ".")? Having more than one way to spell "*" sounds confusing. Assuming "." has only been added as a workaround for the current limitations in our safe.directory comparisons I think we'd be better to ignore it. Best Wishes Phillip > I may have missed some corner cases, but either of these would > probably work. > >>> * For "http-backend" invocations, we should think about potential >>> additions that would help users, similar to what I listed above >>> for "git daemon". >> >> That sounds sensible. > > OK. > >>> Having said all that, I do not think I mind GIT_SAFE_DIRECTORIES >>> that is a ":" separated list of paths that is honored just like the >>> multi-valued configuration variable safe.directory. Once an >>> attacker can influence your environment variables, it already is >>> game over, so trusting it does not make the attack surface any >>> worse. >> >> Indeed in that case the attacker can influence the path that we read >> the protected config from by setting $HOME (and do far worse by >> setting $PATH) >> ... >> Yes having to set all the GIT_CONFIG_* variables can be rather confusing > > OK. > > So an independent effort may be to introduce the said environment > variable, and have it split at ':' and feed into the same machinery > used to check paths against safe.directory configuration. It may > need a minor refactoring to lift the current comparison logic that > assumes it is _only_ driven by the git_config() callback, and we > would probably want to define how these two sources of whitelist > entries interact (who overrides what, is "an empty element clears > all the previous entries" still true, etc.). > > So, I think we have three actionable items here. > > Thanks. > >
Hi Phillip, On Wed, 26 Jun 2024, Phillip Wood wrote: > On 26/06/2024 14:11, Phillip Wood wrote: > > Hi Florian > > > > On 26/06/2024 13:33, Florian Schmaus wrote: > > > Sometimes more flexibility to disable/ignore the ownership check, besides > > > the safe.directory configuration option, is required. > > > > > > For example, git-daemon running as nobody user, which typically has no > > > home directory. Therefore, we can not add the path to a user-global > > > configuration and adding the path to the system-wide configuration could > > > have negative security implications. > > > > > > Therefore, make the check configurable via an environment variable. > > > > An alternative would be to allow safe.directory to be specified on the > > command line with "git -c safe.directory='*' daemon ..." rather than adding > > a dedicated environment variable. > > To expand an this a little - a couple of times I've wanted to checkout a bare > repository that is owned by a different user. It is a pain to have to add a > new config setting just for a one-off checkout. Being able to adjust the > config on the command line would be very useful in that case. It is somewhat surprising that this `-c safe.directory=*` method does _not_ work for local clones. To verify, I ran this: git init --bare other-user.git && sudo chown -R 9999.9999 other-user.git/ && git -c safe.directory=\* clone other-user.git/ This will complain about the dubious ownership and suggest to add the `safe.directory` setting to the user-wide config, ignoring the command-line config altogether. The reason is to be found in https://github.com/git/git/blob/v2.45.2/connect.c#L1462-L1464: /* remove repo-local variables from the environment */ for (var = local_repo_env; *var; var++) strvec_push(&conn->env, *var); The `local_repo_env` array _specifically_ lists `GIT_CONFIG_PARAMETERS` in https://github.com/git/git/blob/v2.45.2/environment.c#L129 to be removed from the environment when spawning the `git upload-pack` process. It was not originally listed, but added via https://lore.kernel.org/git/20100824064114.GA20724@burratino/, where the commit message does not really shed light into the question why this was desirable, and there is no discussion in that mail thread about this aspect of the patch, but at least the added test case reveals the intention in some sort of way: The `-c` option allows to specify `receive.denyDeletes`, and in the described scenario the idea was that it would only apply to the client side of a local `receive-pack` but not the "remote" one. As the example above illustrates, that patch might have been overly focused on one specific, particular scenario. Ciao, Johannes
Phillip Wood <phillip.wood123@gmail.com> writes: >> - Compare entries of safe.directory with data->path literally >> without normalization, as the user may have written in the >> configuration "safe.directory=.", expecting that data->path to be >> '.' (the git-daemon use case). > > I'm not sure this is a good idea because it is not clear which > directory the user wanted to mark as safe when they added a relative > directory to safe.directory. In the case of git-daemon one needs both > the absolute path to the repository and "." to be present in > safe.directory so we can ignore "." and match the absolute path. IOW, we do not bend over backwards to try to be backward compatible on this point? I can go with that, especially because it smells like (I haven't thought deeply about it yet, though) that approach can simplify the checks. >> - Normalize entries of safe.directory and data->path and then >> compare them, turning path="." (the git-daemon use case) into >> "/srv/git/my-repo" and a safe.directory entry "/srv/git//my-repo" >> user wrote into "/srv/git/my-repo", so that they match. > > We have several of normalization functions available: > > - normalize_path_copy() does a textual normalization which cleans up > "//", "/./" and "/../". > > - absolute_pathdup() which prepends the current directory to relative > paths attempting to use $PWD for the current directory where possible > but does not expand symbolic links and does not clean up the path > passed to it. > > - real_pathdup() which expands symbolic links > > One way forward would be to clean up the entries in safe.directory > with normalize_path_copy() and compare them to the result of > normalizing $git_dir with absolute_pathdup() followed by > normalize_path_copy(). That will ensure that we're always comparing > the safe.directory entries against an absolute path and both sides of > the comparison are textually normalized. I'm not sure whether we'd be > better to use absolute_pathdup() or real_pathdup() or if we'd be safer > comparing the output of both against safe.directory if they give > different results. If this sounds reasonable I'll try and put a patch > together later this week. Hmph. Are runtime-detected (not from $GIT_DIR environment and friends) gitdir and/or worktree paths always relative? If we let getcwd() involved in the process of turning them absolute, these paths may already have symbolic links "expanded", so we have no choice other than expanding symbolic links before using paths in safe.directory to compare with them. For example, the paths from safe.directory come from the end-user, and may say things like "/home/phillip/repos/*", because phillip and everybody else on the system think /home/$USER should be where their home directories ought to be, but that was based on the niceness of sysadmins to arrange "/home/phillip" to be a symbolic link to "/home1/phillip" because the "/home" partition has already run out of space to add more users. When gitdir and/or worktree paths are computed using getcwd(), they would be paths somewhere under the "/home1/phillip/repos/" directory. Letting real_pathdup() to deal with the symbolic links may be the only usable way in such a set-up available for us. But we will of course make tons more synonymous paths by using real_pathdup() to normalize both the safe.directory entries and the gitdir and/or worktree paths---are there security downsides in doing so?
On Mon, Jul 01, 2024 at 06:34:10PM +0200, Johannes Schindelin wrote: > The `local_repo_env` array _specifically_ lists `GIT_CONFIG_PARAMETERS` in > https://github.com/git/git/blob/v2.45.2/environment.c#L129 to be removed > from the environment when spawning the `git upload-pack` process. > > It was not originally listed, but added via > https://lore.kernel.org/git/20100824064114.GA20724@burratino/, where the > commit message does not really shed light into the question why this was > desirable, and there is no discussion in that mail thread about this > aspect of the patch, but at least the added test case reveals the > intention in some sort of way: The `-c` option allows to specify > `receive.denyDeletes`, and in the described scenario the idea was that it > would only apply to the client side of a local `receive-pack` but not the > "remote" one. As the example above illustrates, that patch might have > been overly focused on one specific, particular scenario. One reason we haven't loosened local_repo_env for the local transport is that it would make it inconsistent with all of the other transports. I.e., "git -c foo.bar=baz fetch" would still be ignored over ssh, https, and so on, because those transports don't pass environment variables. There's some more discussion from a similar case that came up a month ago: https://lore.kernel.org/git/20240529102307.GF1098944@coredump.intra.peff.net/ -Peff
Jeff King <peff@peff.net> writes: > On Mon, Jul 01, 2024 at 06:34:10PM +0200, Johannes Schindelin wrote: > >> The `local_repo_env` array _specifically_ lists `GIT_CONFIG_PARAMETERS` in >> https://github.com/git/git/blob/v2.45.2/environment.c#L129 to be removed >> from the environment when spawning the `git upload-pack` process. >> >> It was not originally listed, but added via >> https://lore.kernel.org/git/20100824064114.GA20724@burratino/, where the >> commit message does not really shed light into the question why this was >> desirable, and there is no discussion in that mail thread about this >> aspect of the patch, but at least the added test case reveals the >> intention in some sort of way: The `-c` option allows to specify >> `receive.denyDeletes`, and in the described scenario the idea was that it >> would only apply to the client side of a local `receive-pack` but not the >> "remote" one. As the example above illustrates, that patch might have >> been overly focused on one specific, particular scenario. > > One reason we haven't loosened local_repo_env for the local transport is > that it would make it inconsistent with all of the other transports. > I.e., "git -c foo.bar=baz fetch" would still be ignored over ssh, https, > and so on, because those transports don't pass environment variables. > > There's some more discussion from a similar case that came up a month > ago: > > https://lore.kernel.org/git/20240529102307.GF1098944@coredump.intra.peff.net/ Thanks. I wonder if there is a way to add this kind of pieces of information to old commits and discussion threads around it after the fact, and if it helps us (like Dscho who wondered why we decided if it is a good idea, and more importantly if we still think it is a good idea and why). ... and then goes back to see the original discussion thread, with the "bright idea" that I could just follow up on 14-year old discussion thread. Only to find that despite what Dscho said, the commit message does say why it is desirable ("to imitate remote transport well") already. So, I guess we do not really need to do such a post-annotation in this particular case, but I think after seeing somebody posting a message like the one I am responding to and finding it helpful, it would be helpful if somebody can post a message pointing at it as a response to the old thread that wants a post-annotation.
[+cc Eric for some possible public-inbox wisdom] On Mon, Jul 01, 2024 at 01:40:18PM -0700, Junio C Hamano wrote: > > There's some more discussion from a similar case that came up a month > > ago: > > > > https://lore.kernel.org/git/20240529102307.GF1098944@coredump.intra.peff.net/ > > Thanks. I wonder if there is a way to add this kind of pieces of > information to old commits and discussion threads around it after > the fact, and if it helps us (like Dscho who wondered why we decided > if it is a good idea, and more importantly if we still think it is a > good idea and why). > > ... and then goes back to see the original discussion thread, > with the "bright idea" that I could just follow up on 14-year > old discussion thread. Only to find that despite what Dscho > said, the commit message does say why it is desirable ("to > imitate remote transport well") already. > > So, I guess we do not really need to do such a post-annotation in > this particular case, but I think after seeing somebody posting a > message like the one I am responding to and finding it helpful, it > would be helpful if somebody can post a message pointing at it as a > response to the old thread that wants a post-annotation. Usually I find myself digging backwards in history, following links to old threads. But I guess what you are asking is how would somebody looking at old thread XYZ know that it was mentioned much later. And I think the solution is for the new thread to just link to the old one by message-id (i.e., the usual lore links). And then searching for that message-id in the archive could turn up the later threads. I don't know how well public-inbox handles that in practice, though: 1. Do things that look like message-ids get searched for in message bodies? I'd think so if you don't explicitly say "this is a message id". 2. It's really a multi-element search. If I have a thread with 10 messages, I'd really like to know of more recent threads that linked back to _any_ message in the thread. You'd probably have to feed them all manually. But in theory indexing could generate some kind of bidirectional "related" link. I don't often do this with message-ids, but I frequently do find other references by doing a full-text search for commit hashes, or phrases from commit subjects. I usually do so with my local notmuch archive, rather than using public-inbox, but I think you should be able to do phrase searches there, too. -Peff
Jeff King <peff@peff.net> wrote: > On Mon, Jul 01, 2024 at 01:40:18PM -0700, Junio C Hamano wrote: > > Thanks. I wonder if there is a way to add this kind of pieces of > > information to old commits and discussion threads around it after > > the fact, and if it helps us (like Dscho who wondered why we decided > > if it is a good idea, and more importantly if we still think it is a > > good idea and why). > > > > ... and then goes back to see the original discussion thread, > > with the "bright idea" that I could just follow up on 14-year > > old discussion thread. Only to find that despite what Dscho > > said, the commit message does say why it is desirable ("to > > imitate remote transport well") already. > > > > So, I guess we do not really need to do such a post-annotation in > > this particular case, but I think after seeing somebody posting a > > message like the one I am responding to and finding it helpful, it > > would be helpful if somebody can post a message pointing at it as a > > response to the old thread that wants a post-annotation. I think adding code comments referencing commit OIDs and URLs with Message-IDs can help. I've noticed many people (usually in other projects) haven't learned to use git (log|blame) :< > Usually I find myself digging backwards in history, following links to > old threads. But I guess what you are asking is how would somebody > looking at old thread XYZ know that it was mentioned much later. > > And I think the solution is for the new thread to just link to the old > one by message-id (i.e., the usual lore links). And then searching for > that message-id in the archive could turn up the later threads. I don't > know how well public-inbox handles that in practice, though: > > 1. Do things that look like message-ids get searched for in message > bodies? I'd think so if you don't explicitly say "this is a message > id". Yes, encapsulating via double-quotes to turn it into a phrase search may help if the Message-ID contains dashes and such. > 2. It's really a multi-element search. If I have a thread with 10 > messages, I'd really like to know of more recent threads that > linked back to _any_ message in the thread. You'd probably have to > feed them all manually. But in theory indexing could generate some > kind of bidirectional "related" link. You can also join a bunch of Message-IDs (or any other query) via `OR' elements to ensure you don't miss things. <https://xapian.org/docs/queryparser.html> has a lot more details on combining things which apply to notmuch, too. > I don't often do this with message-ids, but I frequently do find other > references by doing a full-text search for commit hashes, or phrases > from commit subjects. I usually do so with my local notmuch archive, > rather than using public-inbox, but I think you should be able to do > phrase searches there, too. Yeah. That's fairly easy to do for Junio's git <=> git@vger mirror. It's more challenging to make work (and presentable) for hundreds of kernel list archives and linux.git mirrors, especially on my 15 year old hardware, but progress is being made since I'm resorting to using C++ for Xapian :x
diff --git a/setup.c b/setup.c index 3afa6fb09b28..da3f504fb536 100644 --- a/setup.c +++ b/setup.c @@ -1278,6 +1278,14 @@ static int ensure_valid_ownership(const char *gitfile, */ git_protected_config(safe_directory_cb, &data); + if (data.is_safe) + return data.is_safe; + + if (git_env_bool("GIT_IGNORE_INSECURE_OWNER", 0)) { + warning("ignoring dubious ownership in repository at '%s' (GIT_IGNORE_INSECURE_OWNER set)", data.path); + return 1; + } + return data.is_safe; }
Sometimes more flexibility to disable/ignore the ownership check, besides the safe.directory configuration option, is required. For example, git-daemon running as nobody user, which typically has no home directory. Therefore, we can not add the path to a user-global configuration and adding the path to the system-wide configuration could have negative security implications. Therefore, make the check configurable via an environment variable. If the environment variable GIT_IGNORE_INSECURE_OWNER is set to true, then ignore potentially insecure ownership of git-related path components. Signed-off-by: Florian Schmaus <flo@geekplace.eu> --- setup.c | 8 ++++++++ 1 file changed, 8 insertions(+)