diff mbox series

[2/2] pack-bitmap.c: ensure pseudo-merge offset reads are bounded

Message ID 0a16399d14afd527f4db63f2a4a3b0a3cbf112f1.1717699237.git.me@ttaylorr.com (mailing list archive)
State New, archived
Headers show
Series [1/2] Documentation/technical/bitmap-format.txt: add missing position table | expand

Commit Message

Taylor Blau June 6, 2024, 6:41 p.m. UTC
After reading the pseudo-merge extension's metadata table, we allocate
an array to store information about each pseudo-merge, including its
byte offset within the .bitmap file itself.

This is done like so:

    pseudo_merge_ofs = index_end - 24 -
            (index->pseudo_merges.nr * sizeof(uint64_t));
    for (i = 0; i < index->pseudo_merges.nr; i++) {
            index->pseudo_merges.v[i].at = get_be64(pseudo_merge_ofs);
            pseudo_merge_ofs += sizeof(uint64_t);
    }

But if the pseudo-merge table is corrupt, we'll keep calling get_be64()
past the end of the pseudo-merge extension, potentially reading off the
end of the mmap'd region.

Prevent this by ensuring that we have at least `table_size - 24` many
bytes available to read (subtracting 24 as the length of the metadata
component).

This is sufficient to prevent us from reading off the end of the
pseudo-merge extension, and ensures that all of the get_be64() calls
below are in bounds.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Junio C Hamano June 6, 2024, 7:42 p.m. UTC | #1
Taylor Blau <me@ttaylorr.com> writes:

> After reading the pseudo-merge extension's metadata table, we allocate
> an array to store information about each pseudo-merge, including its
> byte offset within the .bitmap file itself.
>
> This is done like so:
>
>     pseudo_merge_ofs = index_end - 24 -
>             (index->pseudo_merges.nr * sizeof(uint64_t));
>     for (i = 0; i < index->pseudo_merges.nr; i++) {
>             index->pseudo_merges.v[i].at = get_be64(pseudo_merge_ofs);
>             pseudo_merge_ofs += sizeof(uint64_t);
>     }
>
> But if the pseudo-merge table is corrupt, we'll keep calling get_be64()
> past the end of the pseudo-merge extension, potentially reading off the
> end of the mmap'd region.
>
> Prevent this by ensuring that we have at least `table_size - 24` many
> bytes available to read (subtracting 24 as the length of the metadata
> component).
>
> This is sufficient to prevent us from reading off the end of the
> pseudo-merge extension, and ensures that all of the get_be64() calls
> below are in bounds.

Can table_size at this point be smaller than 24, which will allow
(table_size - 24) to be a huge number that st_mult() will
comfortably fit?

> Helped-by: Jeff King <peff@peff.net>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  pack-bitmap.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index 70230e2647..ad2635c025 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -238,6 +238,9 @@ static int load_bitmap_header(struct bitmap_index *index)
>  				index->pseudo_merges.commits_nr = get_be32(index_end - 20);
>  				index->pseudo_merges.nr = get_be32(index_end - 24);
>  
> +				if (st_mult(index->pseudo_merges.nr, sizeof(uint64_t)) > table_size - 24)
> +					return error(_("corrupted bitmap index file, pseudo-merge table too short"));
> +
>  				CALLOC_ARRAY(index->pseudo_merges.v,
>  					     index->pseudo_merges.nr);
Taylor Blau June 6, 2024, 10:25 p.m. UTC | #2
On Thu, Jun 06, 2024 at 12:42:59PM -0700, Junio C Hamano wrote:
> Can table_size at this point be smaller than 24, which will allow
> (table_size - 24) to be a huge number that st_mult() will
> comfortably fit?

It could be smaller than 24, but I think we're at the point of
diminishing returns here. The table_size field is read from the .bitmap
file itself, and we do some light bounds checking here:

    table_size = get_be64(index_end - 8);
    if (table_size > index_end - index->map - header_size)
        return error(_(...));

We could add another check to ensure that table_size is at least 24, but
I'm less concerned here for a couple of reasons:

  - Since we're reading off of the index_end, we know that all of our
    reads are within the .bitmap itself, so we're not reading outside of
    the memory-mapped region.

  - Checking that index->pseudo_merges.nr is a reasonable size also
    bounds reads, but more importantly IMHO prevents a large heap
    allocation via the CALLOC_ARRAY() below.

So I think we could check the table_size value, but I'm not sure we'd
gain very much by doing so.

Thanks,
Taylor
Junio C Hamano June 6, 2024, 10:35 p.m. UTC | #3
Taylor Blau <me@ttaylorr.com> writes:

> On Thu, Jun 06, 2024 at 12:42:59PM -0700, Junio C Hamano wrote:
>> Can table_size at this point be smaller than 24, which will allow
>> (table_size - 24) to be a huge number that st_mult() will
>> comfortably fit?
>
> It could be smaller than 24, but I think we're at the point of
> diminishing returns here.

I only meant to say that we could easily rewrite

	if (st_mult() > table_size - 24)

condition to

	if (st_add(st_mult(), 24) > table_size)

and we do not have to think if we have already checked table_size
before we reach this point of the control flow.
Taylor Blau June 6, 2024, 10:38 p.m. UTC | #4
On Thu, Jun 06, 2024 at 03:35:51PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > On Thu, Jun 06, 2024 at 12:42:59PM -0700, Junio C Hamano wrote:
> >> Can table_size at this point be smaller than 24, which will allow
> >> (table_size - 24) to be a huge number that st_mult() will
> >> comfortably fit?
> >
> > It could be smaller than 24, but I think we're at the point of
> > diminishing returns here.
>
> I only meant to say that we could easily rewrite
>
> 	if (st_mult() > table_size - 24)
>
> condition to
>
> 	if (st_add(st_mult(), 24) > table_size)
>
> and we do not have to think if we have already checked table_size
> before we reach this point of the control flow.

Ah. Thanks for the clarification. Yes, I think you could do so; I'm
happy to send another version if you like.

Thanks,
Taylor
Junio C Hamano June 14, 2024, 6:23 p.m. UTC | #5
Taylor Blau <me@ttaylorr.com> writes:

> On Thu, Jun 06, 2024 at 03:35:51PM -0700, Junio C Hamano wrote:
>> Taylor Blau <me@ttaylorr.com> writes:
>>
>> > On Thu, Jun 06, 2024 at 12:42:59PM -0700, Junio C Hamano wrote:
>> >> Can table_size at this point be smaller than 24, which will allow
>> >> (table_size - 24) to be a huge number that st_mult() will
>> >> comfortably fit?
>> >
>> > It could be smaller than 24, but I think we're at the point of
>> > diminishing returns here.
>>
>> I only meant to say that we could easily rewrite
>>
>> 	if (st_mult() > table_size - 24)
>>
>> condition to
>>
>> 	if (st_add(st_mult(), 24) > table_size)
>>
>> and we do not have to think if we have already checked table_size
>> before we reach this point of the control flow.
>
> Ah. Thanks for the clarification. Yes, I think you could do so; I'm
> happy to send another version if you like.

Let's get this thing unstuck; the other larger topic that this topic
builds upon is stuck for too long.
diff mbox series

Patch

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 70230e2647..ad2635c025 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -238,6 +238,9 @@  static int load_bitmap_header(struct bitmap_index *index)
 				index->pseudo_merges.commits_nr = get_be32(index_end - 20);
 				index->pseudo_merges.nr = get_be32(index_end - 24);
 
+				if (st_mult(index->pseudo_merges.nr, sizeof(uint64_t)) > table_size - 24)
+					return error(_("corrupted bitmap index file, pseudo-merge table too short"));
+
 				CALLOC_ARRAY(index->pseudo_merges.v,
 					     index->pseudo_merges.nr);