Message ID | af45c2337fbe2a59ac95aff3ce90a69d8c30416f.1544127439.git.steadmon@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add commit-graph fuzzer and fix buffer overflow | expand |
On Thu, Dec 06, 2018 at 12:20:54PM -0800, Josh Steadmon wrote: > diff --git a/commit-graph.c b/commit-graph.c > index 07dd410f3c..224a5f161e 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -165,10 +165,20 @@ struct commit_graph *parse_commit_graph(void *graph_map, int fd, > last_chunk_offset = 8; > chunk_lookup = data + 8; > for (i = 0; i < graph->num_chunks; i++) { > - uint32_t chunk_id = get_be32(chunk_lookup + 0); > - uint64_t chunk_offset = get_be64(chunk_lookup + 4); > + uint32_t chunk_id; > + uint64_t chunk_offset; > int chunk_repeated = 0; > > + if (chunk_lookup + GRAPH_CHUNKLOOKUP_WIDTH > > + data + graph_size) { > + error(_("chunk lookup table entry missing; graph file may be incomplete")); > + free(graph); > + return NULL; > + } Is it possible to overflow the addition here? E.g., if I'm on a 32-bit system and the truncated chunk appears right at the 4GB limit, in which case we wrap back around? I guess that's pretty implausible, since it would mean that the mmap is bumping up against the end of the address space. I didn't check, but I wouldn't be surprised if sane operating systems avoid allocating those addresses. But I think you could write this as: if (data + graph_size - chunk_lookup < GRAPH_CHUNKLOOKUP_WIDTH) to avoid overflow (we know that "data + graph_size" is sane because that's our mmap, and chunk_lookup is somewhere between "data" and "data + graph_size", so the result is between 0 and graph_size). I dunno. I think I've convinced myself it's a non-issue here, but it may be good to get in the habit of writing these sorts of offset checks in an overflow-proof order. -Peff
On 12/6/2018 3:20 PM, Josh Steadmon wrote: > + > +# usage: corrupt_and_zero_graph_then_verify <corrupt_position> <data> <zero_position> <string> > +# Manipulates the commit-graph file at <corrupt_position> by inserting the data, > +# then zeros the file starting at <zero_position>. Finally, runs > +# 'git commit-graph verify' and places the output in the file 'err'. Tests 'err' > +# for the given string. > +corrupt_and_zero_graph_then_verify() { This method is very similar to to 'corrupt_graph_and_verify()', the only difference being the zero_pos, which zeroes the graph. Could it instead be a modification of corrupt_graph_and_verify() where $4 is interpreted as zero_pos, and if it is blank we don't do the truncation? > +test_expect_success 'detect truncated graph' ' > + corrupt_and_zero_graph_then_verify $GRAPH_BYTE_CHUNK_COUNT "\xff" \ > + $GRAPH_CHUNK_LOOKUP_OFFSET "chunk lookup table entry missing" > +' > + Thanks for this! I think it's valuable to keep explicit tests around that were discovered from your fuzz tests. Specifically, I can repeat the test when I get around to the next file format. Thanks, -Stolee
diff --git a/commit-graph.c b/commit-graph.c index 07dd410f3c..224a5f161e 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -165,10 +165,20 @@ struct commit_graph *parse_commit_graph(void *graph_map, int fd, last_chunk_offset = 8; chunk_lookup = data + 8; for (i = 0; i < graph->num_chunks; i++) { - uint32_t chunk_id = get_be32(chunk_lookup + 0); - uint64_t chunk_offset = get_be64(chunk_lookup + 4); + uint32_t chunk_id; + uint64_t chunk_offset; int chunk_repeated = 0; + if (chunk_lookup + GRAPH_CHUNKLOOKUP_WIDTH > + data + graph_size) { + error(_("chunk lookup table entry missing; graph file may be incomplete")); + free(graph); + return NULL; + } + + chunk_id = get_be32(chunk_lookup + 0); + chunk_offset = get_be64(chunk_lookup + 4); + chunk_lookup += GRAPH_CHUNKLOOKUP_WIDTH; if (chunk_offset > graph_size - GIT_MAX_RAWSZ) { diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 5fe21db99f..2503cb0345 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -384,6 +384,29 @@ corrupt_graph_and_verify() { test_i18ngrep "$grepstr" err } + +# usage: corrupt_and_zero_graph_then_verify <corrupt_position> <data> <zero_position> <string> +# Manipulates the commit-graph file at <corrupt_position> by inserting the data, +# then zeros the file starting at <zero_position>. Finally, runs +# 'git commit-graph verify' and places the output in the file 'err'. Tests 'err' +# for the given string. +corrupt_and_zero_graph_then_verify() { + corrupt_pos=$1 + data="${2:-\0}" + zero_pos=$3 + grepstr=$4 + orig_size=$(stat --format=%s $objdir/info/commit-graph) + cd "$TRASH_DIRECTORY/full" && + test_when_finished mv commit-graph-backup $objdir/info/commit-graph && + cp $objdir/info/commit-graph commit-graph-backup && + printf "$data" | dd of="$objdir/info/commit-graph" bs=1 seek="$corrupt_pos" conv=notrunc && + truncate --size=$zero_pos $objdir/info/commit-graph && + truncate --size=$orig_size $objdir/info/commit-graph && + test_must_fail git commit-graph verify 2>test_err && + grep -v "^+" test_err >err && + test_i18ngrep "$grepstr" err +} + test_expect_success 'detect bad signature' ' corrupt_graph_and_verify 0 "\0" \ "graph signature" @@ -484,6 +507,11 @@ test_expect_success 'detect invalid checksum hash' ' "incorrect checksum" ' +test_expect_success 'detect truncated graph' ' + corrupt_and_zero_graph_then_verify $GRAPH_BYTE_CHUNK_COUNT "\xff" \ + $GRAPH_CHUNK_LOOKUP_OFFSET "chunk lookup table entry missing" +' + test_expect_success 'git fsck (checks commit-graph)' ' cd "$TRASH_DIRECTORY/full" && git fsck &&
fuzz-commit-graph identified a case where Git will read past the end of a buffer containing a commit graph if the graph's header has an incorrect chunk count. A simple bounds check in parse_commit_graph() prevents this. Signed-off-by: Josh Steadmon <steadmon@google.com> --- commit-graph.c | 14 ++++++++++++-- t/t5318-commit-graph.sh | 28 ++++++++++++++++++++++++++++ 2 files changed, 40 insertions(+), 2 deletions(-)