[v2] fast-import: disallow "." and ".." path components

Message ID pull.1831.v2.git.1732561248717.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 4a2790a257b314ab59f6f2e25f3d7ca120219922
Series [v2] fast-import: disallow "." and ".." path components

Commit Message

Elijah Newren Nov. 25, 2024, 7 p.m. UTC
From: Elijah Newren <newren@gmail.com>

If a user specified e.g.
   M 100644 :1 ../some-file
then fast-import previously would happily create a git history where
there is a tree in the top-level directory named "..", and with a file
inside that directory named "some-file".  The top-level ".." directory
causes problems.  While git checkout will die with errors and fsck will
report hasDotdot problems, the user is going to have problems trying to
remove the problematic file.  Simply avoid creating this bad history in
the first place.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
    fast-import: disallow "." and ".." path components
    
    Changes since v1:
    
     * make use of is_dot_or_dotdot() from dir.h
     * fix style issue

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1831%2Fnewren%2Fdisallow-dotdot-fast-import-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1831/newren/disallow-dotdot-fast-import-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1831

Range-diff vs v1:

 1:  86ea3df0351 ! 1:  447b6794a4a fast-import: disallow "." and ".." path components
     @@ builtin/fast-import.c: static int tree_content_set(
       		root->tree = t = grow_tree_content(t, t->entry_count);
       	e = new_tree_entry();
       	e->name = to_atom(p, n);
     -+	if (!strcmp(e->name->str_dat, ".") || !strcmp(e->name->str_dat, "..")) {
     ++	if (is_dot_or_dotdot(e->name->str_dat))
      +		die("path %s contains invalid component", p);
     -+	}
       	e->versions[0].mode = 0;
       	oidclr(&e->versions[0].oid, the_repository->hash_algo);
       	t->entries[t->entry_count++] = e;


 builtin/fast-import.c  |  2 ++
 t/t9300-fast-import.sh | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)


base-commit: 04eaff62f286226f501dd21f069e0e257aee11a6
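
For readers without the git tree at hand, the check described in the commit message can be sketched as a standalone program fragment. This is only an illustration under stated assumptions: component_is_dot_or_dotdot() and path_has_dot_component() are hypothetical names invented here, the first standing in for is_dot_or_dotdot() from dir.h, and the slash-splitting loop is not the code path fast-import itself uses.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for is_dot_or_dotdot() from git's dir.h.
 * It takes an explicit length so it can inspect one component of a
 * longer path without copying it.
 */
static int component_is_dot_or_dotdot(const char *name, size_t len)
{
	return (len == 1 && name[0] == '.') ||
	       (len == 2 && name[0] == '.' && name[1] == '.');
}

/*
 * Walk a slash-separated path and return 1 if any component is
 * exactly "." or "..", 0 otherwise.  Names that merely start with
 * dots, such as ".hidden" or "..foo", are not flagged.
 */
static int path_has_dot_component(const char *path)
{
	while (*path) {
		size_t len = strcspn(path, "/");

		if (component_is_dot_or_dotdot(path, len))
			return 1;
		path += len;
		if (*path == '/')
			path++;	/* step over the separator */
	}
	return 0;
}
```

With this sketch, the path from the commit message ("../some-file") is flagged, while an ordinary path such as "dir/file" is not.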

Comments

Patrick Steinhardt Nov. 26, 2024, 6:57 a.m. UTC | #1
On Mon, Nov 25, 2024 at 07:00:48PM +0000, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> If a user specified e.g.
>    M 100644 :1 ../some-file
> then fast-import previously would happily create a git history where
> there is a tree in the top-level directory named "..", and with a file
> inside that directory named "some-file".  The top-level ".." directory
> causes problems.  While git checkout will die with errors and fsck will
> report hasDotdot problems, the user is going to have problems trying to
> remove the problematic file.  Simply avoid creating this bad history in
> the first place.

Makes sense.

More generally this made me wonder whether we should maybe extract some
bits out of "fsck.c" so that we don't have to duplicate the checks done
there in git-fast-import(1). This would for example include checks for
".git" and its HFS/NTFS variants as well as tree entry length checks for
names longer than 4096 characters.

This of course does not have to be part of your patch, which looks good
to me.

Thanks!

Patrick

Kristoffer Haugsbakk Nov. 27, 2024, 8:28 a.m. UTC | #2
Hi.  I see that this is in `next` now so the following might
be irrelevant.

On Mon, Nov 25, 2024, at 20:00, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> [...]
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index 76d5c20f141..995ef76f9d6 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -1466,6 +1466,8 @@ static int tree_content_set(
>  		root->tree = t = grow_tree_content(t, t->entry_count);
>  	e = new_tree_entry();
>  	e->name = to_atom(p, n);
> +	if (is_dot_or_dotdot(e->name->str_dat))
> +		die("path %s contains invalid component", p);

Nit: single-quoting the path seems more common:

    $ git grep "\"path '%s'" ':!po/' | wc -l
    17
    $ git grep "\"path %s" ':!po/' | wc -l
    4

>  	e->versions[0].mode = 0;
>  	oidclr(&e->versions[0].oid, the_repository->hash_algo);
>  	t->entries[t->entry_count++] = e;
> diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
> index 6224f54d4d2..caf3dc003a0 100755
> --- a/t/t9300-fast-import.sh
> +++ b/t/t9300-fast-import.sh
> @@ -522,6 +522,26 @@ test_expect_success 'B: fail on invalid committer (5)' '
>  	test_must_fail git fast-import <input
>  '
>
> +test_expect_success 'B: fail on invalid file path' '
> +	cat >input <<-INPUT_END &&
> +	blob
> +	mark :1
> +	data <<EOF
> +	File contents
> +	EOF
> +
> +	commit refs/heads/badpath
> +	committer Name <email> $GIT_COMMITTER_DATE
> +	data <<COMMIT
> +	Commit Message
> +	COMMIT
> +	M 100644 :1 ../invalid-path

Maybe the test could be parameterized so that both `..` and `.` can
be tested?  Like in `test_path_eol_success`.

Jeff King Nov. 27, 2024, 2:24 p.m. UTC | #3
On Tue, Nov 26, 2024 at 07:57:57AM +0100, Patrick Steinhardt wrote:

> On Mon, Nov 25, 2024 at 07:00:48PM +0000, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@gmail.com>
> > 
> > If a user specified e.g.
> >    M 100644 :1 ../some-file
> > then fast-import previously would happily create a git history where
> > there is a tree in the top-level directory named "..", and with a file
> > inside that directory named "some-file".  The top-level ".." directory
> > causes problems.  While git checkout will die with errors and fsck will
> > report hasDotdot problems, the user is going to have problems trying to
> > remove the problematic file.  Simply avoid creating this bad history in
> > the first place.
> 
> Makes sense.
> 
> More generally this made me wonder whether we should maybe extract some
> bits out of "fsck.c" so that we don't have to duplicate the checks done
> there in git-fast-import(1). This would for example include checks for
> ".git" and its HFS/NTFS variants as well as tree entry length checks for
> names longer than 4096 characters.

I had the same thought, but I think the right code to be using is
verify_path(). That's what ultimately is used to let names into the
index from trees, from update-index, or from other tools like git-apply.

So I'd consider that authoritative, and fsck is mostly trying to follow
those rules while looking at only a single tree at a time. But
fast-import should have the whole path as a string, just like the index
code does.

-Peff

Junio C Hamano Nov. 27, 2024, 11:07 p.m. UTC | #4
Jeff King <peff@peff.net> writes:

> I had the same thought, but I think the right code to be using is
> verify_path(). That's what ultimately is used to let names into the
> index from trees, from update-index, or from other tools like git-apply.

Yeah, I agree that is the right helper to use.
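
To make the "whole path as a string" idea concrete, here is a rough sketch of a validator that looks at a complete path rather than a single tree entry. It is an assumption-laden illustration, not git's verify_path(): sketch_verify_path() and its helpers are hypothetical names, and the rules are limited to ones raised in this thread (empty components, ".", "..", and a plain ASCII case-insensitive ".git"). The HFS/NTFS variants and length limits Patrick mentions are deliberately left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the component is ".git" in any ASCII letter case. */
static int sketch_is_dotgit(const char *name, size_t len)
{
	static const char dotgit[] = ".git";
	size_t i;

	if (len != 4)
		return 0;
	for (i = 0; i < 4; i++)
		if (tolower((unsigned char)name[i]) != dotgit[i])
			return 0;
	return 1;
}

/* Return 1 if a single path component is acceptable, 0 otherwise. */
static int sketch_component_ok(const char *name, size_t len)
{
	if (len == 0)
		return 0;	/* empty component: "a//b", "/a", "a/" */
	if (len == 1 && name[0] == '.')
		return 0;	/* "." */
	if (len == 2 && name[0] == '.' && name[1] == '.')
		return 0;	/* ".." */
	if (sketch_is_dotgit(name, len))
		return 0;	/* ".git", ".GIT", ... */
	return 1;
}

/*
 * Hypothetical whole-path check: return 1 if every slash-separated
 * component of the path is acceptable, 0 otherwise.
 */
static int sketch_verify_path(const char *path)
{
	for (;;) {
		size_t len = strcspn(path, "/");

		if (!sketch_component_ok(path, len))
			return 0;
		path += len;
		if (!*path)
			return 1;
		path++;	/* step over the '/' */
	}
}
```

Because the function sees the full string, a caller could reject a bad path once, before any tree entries are created, instead of checking each component as it is inserted.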

Patch

diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 76d5c20f141..995ef76f9d6 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -1466,6 +1466,8 @@  static int tree_content_set(
 		root->tree = t = grow_tree_content(t, t->entry_count);
 	e = new_tree_entry();
 	e->name = to_atom(p, n);
+	if (is_dot_or_dotdot(e->name->str_dat))
+		die("path %s contains invalid component", p);
 	e->versions[0].mode = 0;
 	oidclr(&e->versions[0].oid, the_repository->hash_algo);
 	t->entries[t->entry_count++] = e;
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 6224f54d4d2..caf3dc003a0 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -522,6 +522,26 @@  test_expect_success 'B: fail on invalid committer (5)' '
 	test_must_fail git fast-import <input
 '
 
+test_expect_success 'B: fail on invalid file path' '
+	cat >input <<-INPUT_END &&
+	blob
+	mark :1
+	data <<EOF
+	File contents
+	EOF
+
+	commit refs/heads/badpath
+	committer Name <email> $GIT_COMMITTER_DATE
+	data <<COMMIT
+	Commit Message
+	COMMIT
+	M 100644 :1 ../invalid-path
+	INPUT_END
+
+	test_when_finished "git update-ref -d refs/heads/badpath" &&
+	test_must_fail git fast-import <input
+'
+
 ###
 ### series C
 ###