diff mbox series

[v3,2/3] safe.directory: normalize the configured path

Message ID 20240730011004.4030246-3-gitster@pobox.com (mailing list archive)
State Superseded
Headers show
Series safe.directory clean-up | expand

Commit Message

Junio C Hamano July 30, 2024, 1:10 a.m. UTC
The pathname of a repository comes from getcwd() and it could be a
path aliased via symbolic links, e.g., the real directory may be
/home/u/repository but a symbolic link /home/u/repo may point at it,
and the clone request may come as "git clone file:///home/u/repo/"

A request to check if /home/u/repository is safe would be rejected
if the safe.directory configuration allows /home/u/repo/ but not its
alias /home/u/repository/.  Normalize the paths configured for the
safe.directory configuration variable before comparing them with the
path being checked.

Two and a half things to note, compared to the previous step to
normalize the actual path of the suspected repository, are:

 - A configured safe.directory may be coming from .gitignore in the
   home directory that may be shared across machines.  The path
   meant to match with an entry may not necessarily exist on all of
   such machines, so not being able to convert them to real path on
   this machine is *not* a condition that is worthy of warning.
   Hence, we ignore a path that cannot be converted to a real path.

 - A configured safe.directory is essentially a random string that
   user throws at us, written completely unrelated to the directory
   the current process happens to be in.  Hence it makes little
   sense to give a non-absolute path.  Hence we ignore any
   non-absolute paths, except for ".".

 - The safe.directory set to "." was once advertised on the list as
   a valid workaround for the regression caused by the overly tight
   safe.directory check introduced in 2.45.1; we treat it to mean
   "if we are at the top level of a repository, it is OK".
   (cf. <834862fd-b579-438a-b9b3-5246bf27ce8a@gmail.com>).

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 setup.c                   | 37 ++++++++++++++++++++++---
 t/t0033-safe-directory.sh | 57 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 90 insertions(+), 4 deletions(-)

Comments

Jeff King July 30, 2024, 7:31 a.m. UTC | #1
On Mon, Jul 29, 2024 at 06:10:03PM -0700, Junio C Hamano wrote:

> Two and a half things to note, compared to the previous step to
> normalize the actual path of the suspected repository, are:
> 
>  - A configured safe.directory may be coming from .gitignore in the
>    home directory that may be shared across machines.  The path
>    meant to match with an entry may not necessarily exist on all of
>    such machines, so not being able to convert them to real path on
>    this machine is *not* a condition that is worthy of warning.
>    Hence, we ignore a path that cannot be converted to a real path.

Thanks, I think this addresses my concern. One small thing I noticed in
the patch:

> @@ -1236,14 +1236,43 @@ static int safe_directory_cb(const char *key, const char *value,
>  
>  		if (!git_config_pathname(&allowed, key, value)) {
>  			const char *check = allowed ? allowed : value;
> -			if (ends_with(check, "/*")) {
> -				size_t len = strlen(check);
> -				if (!fspathncmp(check, data->path, len - 1))
> +			char *to_free = NULL;
> +
> +			/*
> +			 * Setting safe.directory to a non-absolute path
> +			 * makes little sense---it won't be relative to
> +			 * the configuration file the item is defined in.
> +			 * Except for ".", which means "if we are at the top
> +			 * level of a repository, then it is OK", which is
> +			 * slightly tighter than "*" that allows discovery.
> +			 */
> +			if (!is_absolute_path(check) && strcmp(check, ".")) {
> +				warning(_("safe.directory '%s' not absolute"),
> +					check);
> +				goto next;
> +			}

This is_absolute_path() check is redundant, isn't it? If we are checking
for a literal ".", then we know the patch must be non-absolute.

-Peff
Jeff King July 30, 2024, 7:43 a.m. UTC | #2
On Mon, Jul 29, 2024 at 06:10:03PM -0700, Junio C Hamano wrote:

> @@ -1236,14 +1236,43 @@ static int safe_directory_cb(const char *key, const char *value,
>  
>  		if (!git_config_pathname(&allowed, key, value)) {
>  			const char *check = allowed ? allowed : value;
> -			if (ends_with(check, "/*")) {
> -				size_t len = strlen(check);
> -				if (!fspathncmp(check, data->path, len - 1))

BTW, one oddity I noticed in the existing code:

Under what circumstances will "allowed" be NULL in that ternary? I think
if git_config_pathname() returns non-zero, then we called
interpolate_path(). It can return NULL, but in that case
git_config_pathname() will die(). We might change that later, but then
I'd expect it to return non-zero. So I suspect the whole "check"
variable could just be dropped in favor of using "allowed".

Obviously not new in your patch, but maybe worth fixing while in the
area? I think it comes from an evil merge in b8bdb2f283 (Merge branch
'jc/safe-directory-leading-path', 2024-06-12).

-Peff
Junio C Hamano July 30, 2024, 4:03 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

>> +			if (!is_absolute_path(check) && strcmp(check, ".")) {
>> +				warning(_("safe.directory '%s' not absolute"),
>> +					check);
>> +				goto next;
>> +			}
>
> This is_absolute_path() check is redundant, isn't it? If we are checking
> for a literal ".", then we know the path must be non-absolute.

What I meant was "If it is not absolute, that is an error, but if
the thing is a dot, that is allowed as an exception".

Is the lack of "!" confusing, I wonder?   We could rewrite it to
make it more explicit:

	if (is_absolute_path(check) || !strcmp(check, ".")) {
		; /* OK */
	} else {
		warning(_("not absolute %s"), check);
		goto next;
	}

My earlier draft for v3 had the check for dot a lot earlier in the
function, i.e.

	-	} else if (!strcmp(value, "*")) {
	+	} else if (!strcmp(value, "*") || !strcmp(value, ".")) {
			data->is_safe = 1;

and this part said "If not absolute, that is an error" without
anything about dot.

But then I changed my mind and made it unsafe to do this:

	cd .git/refs && git -c safe.directory=. foo

as safe.directory=. means "A repository at the current directory of
the process is allowed" and the repository in this case is not at "."
but at "..", meaning "." is a lot stricter than "*".
Junio C Hamano July 30, 2024, 4:22 p.m. UTC | #4
Jeff King <peff@peff.net> writes:

> On Mon, Jul 29, 2024 at 06:10:03PM -0700, Junio C Hamano wrote:
>
>> @@ -1236,14 +1236,43 @@ static int safe_directory_cb(const char *key, const char *value,
>>  
>>  		if (!git_config_pathname(&allowed, key, value)) {
>>  			const char *check = allowed ? allowed : value;
>> -			if (ends_with(check, "/*")) {
>> -				size_t len = strlen(check);
>> -				if (!fspathncmp(check, data->path, len - 1))
>
> BTW, one oddity I noticed in the existing code:
>
> Under what circumstances will "allowed" be NULL in that ternary? I think
> if git_config_pathname() returns non-zero, then we called
> interpolate_path(). It can return NULL, but in that case
> git_config_pathname() will die(). We might change that later, but then
> I'd expect it to return non-zero. So I suspect the whole "check"
> variable could just be dropped in favor of using "allowed".
>
> Obviously not new in your patch, but maybe worth fixing while in the
> area? I think it comes from an evil merge in b8bdb2f283 (Merge branch
> 'jc/safe-directory-leading-path', 2024-06-12).

I think it deserves to be a separate change, probably a preliminary
clean-up, as it predates that by a few years, and goes back to the
initial introduction of the safe.directory feature.  The merge you
found had this bit:

diff --cc setup.c
index e47946d0e7,4c5de0960b..e112545f71
--- a/setup.c
+++ b/setup.c
@@@ -1230,13 -1176,21 +1230,20 @@@ static int safe_directory_cb(const cha
  	} else if (!strcmp(value, "*")) {
  		data->is_safe = 1;
  	} else {
- 		char *interpolated = NULL;
- 
- 		if (!git_config_pathname(&interpolated, key, value) &&
- 		    !fspathcmp(data->path, interpolated ? interpolated : value))
- 			data->is_safe = 1;
- 
- 		free(interpolated);
 ...


Notice that in the original, the code is prepared for the case where
interpolated is NULL when fspathcmp() needs to use it or value,
which is when git_config_pathname() returned 0/success.

It came from 8959555c (setup_git_directory(): add an owner check for
the top-level directory, 2022-03-02) that introduced the
safe.directory feature:

+		if (!git_config_pathname(&interpolated, key, value) &&
+		    !fspathcmp(data->path, interpolated ? interpolated : value))
+			data->is_safe = 1;

where it shared the same assumption.
Jeff King July 30, 2024, 8:08 p.m. UTC | #5
On Tue, Jul 30, 2024 at 09:03:35AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> +			if (!is_absolute_path(check) && strcmp(check, ".")) {
> >> +				warning(_("safe.directory '%s' not absolute"),
> >> +					check);
> >> +				goto next;
> >> +			}
> >
> > This is_absolute_path() check is redundant, isn't it? If we are checking
> > for a literal ".", then we know the path must be non-absolute.
> 
> What I meant was "If it is not absolute, that is an error, but if
> the thing is a dot, that is allowed as an exception".
> 
> Is the lack of "!" confusing, I wonder?   We could rewrite it to
> make it more explicit:

Oh, right, I totally misread it. You'd think after 25+ years of writing
C that I would be able to get the strcmp() return value right in my
head... :)

So yeah, it is doing the right thing.

> 	if (is_absolute_path(check) || !strcmp(check, ".")) {
> 		; /* OK */
> 	} else {
> 		warning(_("not absolute %s"), check);
> 		goto next;
> 	}

Hmm. Yeah, that probably would have softened my confusion, but it's also
kind of hard to read. I think what you wrote originally is just fine,
and I just mis-read it.

> My earlier draft for v3 had the check for dot a lot earlier in the
> function, i.e.
> 
> 	-	} else if (!strcmp(value, "*")) {
> 	+	} else if (!strcmp(value, "*") || !strcmp(value, ".")) {
> 			data->is_safe = 1;
> 
> and this part said "If not absolute, that is an error" without
> anything about dot.
> 
> But then I changed my mind and made it unsafe to do this:
> 
> 	cd .git/refs && git -c safe.directory=. foo
> 
> as safe.directory=. means "A repository at the current directory of
> the process is allowed" and the repository in this case is not at "."
> but at "..", meaning "." is a lot stricter than "*".

I could see arguments in either direction, and I don't have a strong
opinion between them.

-Peff
Jeff King July 30, 2024, 8:10 p.m. UTC | #6
On Tue, Jul 30, 2024 at 09:22:09AM -0700, Junio C Hamano wrote:

> > Obviously not new in your patch, but maybe worth fixing while in the
> > area? I think it comes from an evil merge in b8bdb2f283 (Merge branch
> > 'jc/safe-directory-leading-path', 2024-06-12).
> 
> I think it deserves to be a separate change, probably a preliminary
> clean-up, as it predates that by a few years, and goes back to the
> initial introduction of the safe.directory feature.  The merge you
> found had this bit:

Ah, yeah, I was busy looking at the assignments and didn't notice the
ternary in the function call. So yeah, it comes from that earlier
commit, but I think it was equally useless there.

And yes, fixing it definitely should be a separate commit.

-Peff
diff mbox series

Patch

diff --git a/setup.c b/setup.c
index 45bbbe329f..6012f3f011 100644
--- a/setup.c
+++ b/setup.c
@@ -1236,14 +1236,43 @@  static int safe_directory_cb(const char *key, const char *value,
 
 		if (!git_config_pathname(&allowed, key, value)) {
 			const char *check = allowed ? allowed : value;
-			if (ends_with(check, "/*")) {
-				size_t len = strlen(check);
-				if (!fspathncmp(check, data->path, len - 1))
+			char *to_free = NULL;
+
+			/*
+			 * Setting safe.directory to a non-absolute path
+			 * makes little sense---it won't be relative to
+			 * the configuration file the item is defined in.
+			 * Except for ".", which means "if we are at the top
+			 * level of a repository, then it is OK", which is
+			 * slightly tighter than "*" that allows discovery.
+			 */
+			if (!is_absolute_path(check) && strcmp(check, ".")) {
+				warning(_("safe.directory '%s' not absolute"),
+					check);
+				goto next;
+			}
+
+			/*
+			 * A .gitconfig in $HOME may be shared across
+			 * different machines and safe.directory entries
+			 * may or may not exist as paths on all of these
+			 * machines.  In other words, it is not a warning
+			 * worthy event when there is no such path on this
+			 * machine---the entry may be useful elsewhere.
+			 */
+			to_free = real_pathdup(check, 0);
+			if (!to_free)
+				goto next;
+			if (ends_with(to_free, "/*")) {
+				size_t len = strlen(to_free);
+				if (!fspathncmp(to_free, data->path, len - 1))
 					data->is_safe = 1;
-			} else if (!fspathcmp(data->path, check)) {
+			} else if (!fspathcmp(data->path, to_free)) {
 				data->is_safe = 1;
 			}
+			free(to_free);
 		}
+	next:
 		if (allowed != value)
 			free(allowed);
 	}
diff --git a/t/t0033-safe-directory.sh b/t/t0033-safe-directory.sh
index 07ac0f9a01..ea74657255 100755
--- a/t/t0033-safe-directory.sh
+++ b/t/t0033-safe-directory.sh
@@ -176,4 +176,61 @@  test_expect_success SYMLINKS 'checked leading paths are normalized' '
 	git -C repo/s/.git/ for-each-ref
 '
 
+test_expect_success SYMLINKS 'configured paths are normalized' '
+	test_when_finished "rm -rf repository; rm -f repo" &&
+	(
+		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
+		git config --global --unset-all safe.directory
+	) &&
+	git init repository &&
+	ln -s repository repo &&
+	(
+		cd repository &&
+		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
+		test_commit sample
+	) &&
+
+	(
+		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
+		git config --global safe.directory "$(pwd)/repo"
+	) &&
+	git -C repository for-each-ref &&
+	git -C repository/ for-each-ref &&
+	git -C repo for-each-ref &&
+	git -C repo/ for-each-ref &&
+	test_must_fail git -C repository/.git for-each-ref &&
+	test_must_fail git -C repository/.git/ for-each-ref &&
+	test_must_fail git -C repo/.git for-each-ref &&
+	test_must_fail git -C repo/.git/ for-each-ref
+'
+
+test_expect_success SYMLINKS 'configured leading paths are normalized' '
+	test_when_finished "rm -rf repository; rm -f repo" &&
+	(
+		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
+		git config --global --unset-all safe.directory
+	) &&
+	mkdir -p repository &&
+	git init repository/s &&
+	ln -s repository repo &&
+	(
+		cd repository/s &&
+		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
+		test_commit sample
+	) &&
+
+	(
+		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
+		git config --global safe.directory "$(pwd)/repo/*"
+	) &&
+	git -C repository/s for-each-ref &&
+	git -C repository/s/ for-each-ref &&
+	git -C repository/s/.git for-each-ref &&
+	git -C repository/s/.git/ for-each-ref &&
+	git -C repo/s for-each-ref &&
+	git -C repo/s/ for-each-ref &&
+	git -C repo/s/.git for-each-ref &&
+	git -C repo/s/.git/ for-each-ref
+'
+
 test_done