Message ID | dc67464b6aec46c5b6ac0d9cb8549d48a35c5353.1712741871.git.thalia@archibald.dev (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | fast-import: tighten parsing of paths | expand |
Thalia Archibald <thalia@archibald.dev> writes: > NUL cannot appear in paths. Even disregarding filesystem path > limitations, the tree object format delimits with NUL, so such a path > cannot be encoded by Git. > > When a quoted path is unquoted, it could possibly contain NUL from > "\000". Forbid it so it isn't truncated. > > fast-import still has other issues with NUL, but those will be addressed > later. Later meaning outside the series, as 8/8 is not about that? Not a complaint, and if the way I interpreted is correct, then there is no need to update the above statement. Just double-checking. > Signed-off-by: Thalia Archibald <thalia@archibald.dev> > --- > Documentation/git-fast-import.txt | 1 + > builtin/fast-import.c | 2 ++ > t/t9300-fast-import.sh | 1 + > 3 files changed, 4 insertions(+) > > diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt > index db53b50268..edda30f90c 100644 > --- a/Documentation/git-fast-import.txt > +++ b/Documentation/git-fast-import.txt > @@ -660,6 +660,7 @@ and must be in canonical form. That is it must not: > > The root of the tree can be represented by an empty string as `<path>`. > > +`<path>` cannot contain NUL, either literally or escaped as `\000`. OK. > + if (strlen(sb->buf) != sb->len) > + die("NUL in %s: %s", field, command_buf.buf); Nice. !memchr(sb->buf, ch, sb->len) would be more general solution if we were looking for ch that is not NUL, but for checking NUL, what you wrote is the most natural to read. > } else { > if (include_spaces) > *endp = p + strlen(p); > diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh > index 5cde8f8d01..1e68426852 100755 > --- a/t/t9300-fast-import.sh > +++ b/t/t9300-fast-import.sh > @@ -3300,6 +3300,7 @@ test_path_base_fail () { > local change="$1" prefix="$2" field="$3" suffix="$4" > test_path_fail "$change" 'unclosed " in '"$field" "$prefix" '"hello.c' "$suffix" "Invalid $field" > test_path_fail "$change" "invalid escape in quoted $field" "$prefix" '"hello\xff"' "$suffix" "Invalid $field" > + test_path_fail "$change" "escaped NUL in quoted $field" "$prefix" '"hello\000"' "$suffix" "NUL in $field" > } > test_path_eol_quoted_fail () { > local change="$1" prefix="$2" field="$3"
diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt index db53b50268..edda30f90c 100644 --- a/Documentation/git-fast-import.txt +++ b/Documentation/git-fast-import.txt @@ -660,6 +660,7 @@ and must be in canonical form. That is it must not: The root of the tree can be represented by an empty string as `<path>`. +`<path>` cannot contain NUL, either literally or escaped as `\000`. It is recommended that `<path>` always be encoded using UTF-8. `filedelete` diff --git a/builtin/fast-import.c b/builtin/fast-import.c index 7a398dc975..98096b6fa7 100644 --- a/builtin/fast-import.c +++ b/builtin/fast-import.c @@ -2269,6 +2269,8 @@ static void parse_path(struct strbuf *sb, const char *p, const char **endp, if (*p == '"') { if (unquote_c_style(sb, p, endp)) die("Invalid %s: %s", field, command_buf.buf); + if (strlen(sb->buf) != sb->len) + die("NUL in %s: %s", field, command_buf.buf); } else { if (include_spaces) *endp = p + strlen(p); diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index 5cde8f8d01..1e68426852 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -3300,6 +3300,7 @@ test_path_base_fail () { local change="$1" prefix="$2" field="$3" suffix="$4" test_path_fail "$change" 'unclosed " in '"$field" "$prefix" '"hello.c' "$suffix" "Invalid $field" test_path_fail "$change" "invalid escape in quoted $field" "$prefix" '"hello\xff"' "$suffix" "Invalid $field" + test_path_fail "$change" "escaped NUL in quoted $field" "$prefix" '"hello\000"' "$suffix" "NUL in $field" } test_path_eol_quoted_fail () { local change="$1" prefix="$2" field="$3"
NUL cannot appear in paths. Even disregarding filesystem path limitations, the tree object format delimits with NUL, so such a path cannot be encoded by Git. When a quoted path is unquoted, it could possibly contain NUL from "\000". Forbid it so it isn't truncated. fast-import still has other issues with NUL, but those will be addressed later. Signed-off-by: Thalia Archibald <thalia@archibald.dev> --- Documentation/git-fast-import.txt | 1 + builtin/fast-import.c | 2 ++ t/t9300-fast-import.sh | 1 + 3 files changed, 4 insertions(+)