Message ID | pull.1831.v2.git.1732561248717.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 4a2790a257b314ab59f6f2e25f3d7ca120219922 |
Series | [v2] fast-import: disallow "." and ".." path components |
On Mon, Nov 25, 2024 at 07:00:48PM +0000, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
>
> If a user specified e.g.
>     M 100644 :1 ../some-file
> then fast-import previously would happily create a git history where
> there is a tree in the top-level directory named "..", and with a file
> inside that directory named "some-file". The top-level ".." directory
> causes problems. While git checkout will die with errors and fsck will
> report hasDotdot problems, the user is going to have problems trying to
> remove the problematic file. Simply avoid creating this bad history in
> the first place.

Makes sense.

More generally this made me wonder whether we should maybe extract some bits out of "fsck.c" so that we don't have to duplicate the checks done there in git-fast-import(1). This would for example include checks for ".git" and its HFS/NTFS variants as well as tree entry length checks for names longer than 4096 characters.

This of course does not have to be part of your patch, which looks good to me.

Thanks!

Patrick
Hi. I see that this is in `next` now so the following might be irrelevant.

On Mon, Nov 25, 2024, at 20:00, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> [...]
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index 76d5c20f141..995ef76f9d6 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -1466,6 +1466,8 @@ static int tree_content_set(
>  		root->tree = t = grow_tree_content(t, t->entry_count);
>  	e = new_tree_entry();
>  	e->name = to_atom(p, n);
> +	if (is_dot_or_dotdot(e->name->str_dat))
> +		die("path %s contains invalid component", p);

Nit: single-quoting the path seems more common:

    $ git grep "\"path '%s'" ':!po/' | wc -l
    17
    $ git grep "\"path %s" ':!po/' | wc -l
    4

>  	e->versions[0].mode = 0;
>  	oidclr(&e->versions[0].oid, the_repository->hash_algo);
>  	t->entries[t->entry_count++] = e;
> diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
> index 6224f54d4d2..caf3dc003a0 100755
> --- a/t/t9300-fast-import.sh
> +++ b/t/t9300-fast-import.sh
> @@ -522,6 +522,26 @@ test_expect_success 'B: fail on invalid committer (5)' '
>  	test_must_fail git fast-import <input
>  '
>
> +test_expect_success 'B: fail on invalid file path' '
> +	cat >input <<-INPUT_END &&
> +	blob
> +	mark :1
> +	data <<EOF
> +	File contents
> +	EOF
> +
> +	commit refs/heads/badpath
> +	committer Name <email> $GIT_COMMITTER_DATE
> +	data <<COMMIT
> +	Commit Message
> +	COMMIT
> +	M 100644 :1 ../invalid-path

Maybe the test could be parameterized so that both `..` and `.` can be tested? Like in `test_path_eol_success`.
On Tue, Nov 26, 2024 at 07:57:57AM +0100, Patrick Steinhardt wrote:
> On Mon, Nov 25, 2024 at 07:00:48PM +0000, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@gmail.com>
> >
> > If a user specified e.g.
> >     M 100644 :1 ../some-file
> > then fast-import previously would happily create a git history where
> > there is a tree in the top-level directory named "..", and with a file
> > inside that directory named "some-file". The top-level ".." directory
> > causes problems. While git checkout will die with errors and fsck will
> > report hasDotdot problems, the user is going to have problems trying to
> > remove the problematic file. Simply avoid creating this bad history in
> > the first place.
>
> Makes sense.
>
> More generally this made me wonder whether we should maybe extract some
> bits out of "fsck.c" so that we don't have to duplicate the checks done
> there in git-fast-import(1). This would for example include checks for
> ".git" and its HFS/NTFS variants as well as tree entry length checks for
> names longer than 4096 characters.

I had the same thought, but I think the right code to be using is verify_path(). That's what ultimately is used to let names into the index from trees, from update-index, or from other tools like git-apply.

So I'd consider that authoritative, and fsck is mostly trying to follow those rules while looking at only a single tree at a time. But fast-import should have the whole path as a string (just like the index code does).

-Peff
Jeff King <peff@peff.net> writes:

> I had the same thought, but I think the right code to be using is
> verify_path(). That's what ultimately is used to let names into the
> index from trees, from update-index, or from other tools like git-apply.

Yeah, I agree that is the right helper to use.
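
To make the whole-path idea above concrete, here is a minimal standalone sketch of validating a complete path string in one pass. It is not git's verify_path(); the names `sketch_verify_path` and `component_equals_icase` are hypothetical, and the rule set is an assumption chosen for illustration: it rejects empty components, ".", "..", and an ASCII case-insensitive ".git". It does not attempt the HFS/NTFS ".git" variants or the name-length checks mentioned earlier in the thread.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Compare a length-delimited path component with a string literal,
 * ignoring ASCII case. Hypothetical helper, not part of git.
 */
static bool component_equals_icase(const char *s, size_t len, const char *lit)
{
	size_t i;

	if (strlen(lit) != len)
		return false;
	for (i = 0; i < len; i++)
		if (tolower((unsigned char)s[i]) != tolower((unsigned char)lit[i]))
			return false;
	return true;
}

/*
 * Return true if every '/'-separated component of the path is
 * acceptable under the simplified rules described above.
 * Hypothetical stand-in for a verify_path()-style check.
 */
static bool sketch_verify_path(const char *path)
{
	const char *p = path;

	if (!*p)
		return false;	/* empty path */

	for (;;) {
		const char *slash = strchr(p, '/');
		size_t len = slash ? (size_t)(slash - p) : strlen(p);

		if (len == 0)
			return false;	/* leading, doubled, or trailing '/' */
		if (component_equals_icase(p, len, ".") ||
		    component_equals_icase(p, len, "..") ||
		    component_equals_icase(p, len, ".git"))
			return false;
		if (!slash)
			return true;	/* last component was fine */
		p = slash + 1;
	}
}
```

Because the caller holds the full path as one string, a single function like this can apply every rule at once, which is the property that makes a shared helper attractive compared with checking one tree entry at a time.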
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 76d5c20f141..995ef76f9d6 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -1466,6 +1466,8 @@ static int tree_content_set(
 		root->tree = t = grow_tree_content(t, t->entry_count);
 	e = new_tree_entry();
 	e->name = to_atom(p, n);
+	if (is_dot_or_dotdot(e->name->str_dat))
+		die("path %s contains invalid component", p);
 	e->versions[0].mode = 0;
 	oidclr(&e->versions[0].oid, the_repository->hash_algo);
 	t->entries[t->entry_count++] = e;
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 6224f54d4d2..caf3dc003a0 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -522,6 +522,26 @@ test_expect_success 'B: fail on invalid committer (5)' '
 	test_must_fail git fast-import <input
 '
 
+test_expect_success 'B: fail on invalid file path' '
+	cat >input <<-INPUT_END &&
+	blob
+	mark :1
+	data <<EOF
+	File contents
+	EOF
+
+	commit refs/heads/badpath
+	committer Name <email> $GIT_COMMITTER_DATE
+	data <<COMMIT
+	Commit Message
+	COMMIT
+	M 100644 :1 ../invalid-path
+	INPUT_END
+
+	test_when_finished "git update-ref -d refs/heads/badpath" &&
+	test_must_fail git fast-import <input
+'
+
 ###
 ### series C
 ###
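
For readers without the git tree at hand, the effect of the added check can be approximated by a small standalone sketch. The patch tests each newly created tree entry name with is_dot_or_dotdot(); the version below instead walks a whole '/'-separated path in one function, which is a simplification for illustration. The names `component_is_dot_or_dotdot` and `path_has_dot_component` are hypothetical and do not exist in git.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical stand-in for git's is_dot_or_dotdot(), operating on a
 * length-delimited component rather than a NUL-terminated string.
 */
static bool component_is_dot_or_dotdot(const char *s, size_t len)
{
	return (len == 1 && s[0] == '.') ||
	       (len == 2 && s[0] == '.' && s[1] == '.');
}

/*
 * Walk a '/'-separated path and report whether any component is
 * exactly "." or "..". Names that merely start with a dot, such as
 * ".gitignore" or "..a", are not flagged.
 */
static bool path_has_dot_component(const char *path)
{
	const char *p = path;

	while (*p) {
		const char *slash = strchr(p, '/');
		size_t len = slash ? (size_t)(slash - p) : strlen(p);

		if (component_is_dot_or_dotdot(p, len))
			return true;
		if (!slash)
			break;
		p = slash + 1;
	}
	return false;
}
```

A caller in the spirit of the patch would refuse the input when `path_has_dot_component()` returns true, for example for the `../invalid-path` entry used in the new test.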