
[02/20] packfile.c: prevent overflow in `load_idx()`

Message ID d6902cd9e7f7f2a6b8044c8fb782a28c23e15600.1689205042.git.me@ttaylorr.com (mailing list archive)
State New, archived
Series guard object lookups against 32-bit overflow

Commit Message

Taylor Blau July 12, 2023, 11:37 p.m. UTC
Prevent an overflow when locating a pack's CRC offset when the number
of packed items is greater than (2^32-1)/hashsz by guarding the
computation with `st_mult()` and `st_add()`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 packfile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
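
To make the failure mode concrete, here is a minimal sketch of the arithmetic involved. The helper names (`checked_mult()`, `checked_add()`, `crc_offset_unchecked()`, `crc_offset_checked()`) are hypothetical stand-ins invented for illustration; unlike git's `st_mult()` and `st_add()`, they report overflow through a return value. The sketch assumes `nr` and `hashsz` are 32-bit unsigned values, as in the expression being patched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiply two size_t values, returning -1 on overflow. */
static int checked_mult(size_t a, size_t b, size_t *out)
{
	assert(out != NULL);
	if (b && a > SIZE_MAX / b)
		return -1;
	*out = a * b;
	return 0;
}

/* Hypothetical helper: add two size_t values, returning -1 on overflow. */
static int checked_add(size_t a, size_t b, size_t *out)
{
	assert(out != NULL);
	if (a > SIZE_MAX - b)
		return -1;
	*out = a + b;
	return 0;
}

/* The unguarded expression: the product wraps modulo 2^32. */
static uint32_t crc_offset_unchecked(uint32_t nr, uint32_t hashsz)
{
	return 8 + 4 * 256 + nr * hashsz;
}

/* The guarded form: operands are widened to size_t and every step is checked. */
static int crc_offset_checked(uint32_t nr, uint32_t hashsz, size_t *out)
{
	size_t table;

	if (checked_mult(nr, hashsz, &table))
		return -1;
	return checked_add(8 + 4 * 256, table, out);
}
```

With `nr = 0x10000000` and `hashsz = 20`, the unguarded form wraps to 1073742856, while the guarded form yields 5368710152 where `size_t` is 64 bits wide and reports an error where it is 32 bits wide.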

Comments

Phillip Wood July 13, 2023, 8:21 a.m. UTC | #1
Hi Taylor

On 13/07/2023 00:37, Taylor Blau wrote:
> Prevent an overflow when locating a pack's CRC offset when the number
> of packed items is greater than 2^32-1/hashsz by guarding the
> computation with an `st_mult()`.
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>   packfile.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/packfile.c b/packfile.c
> index 89220f0e03..70acf1694b 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -186,7 +186,7 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
>   		     */
>   		    (sizeof(off_t) <= 4))
>   			return error("pack too large for current definition of off_t in %s", path);
> -		p->crc_offset = 8 + 4 * 256 + nr * hashsz;
> +		p->crc_offset = st_add(8 + 4 * 256, st_mult(nr, hashsz));

p->crc_offset is a uint32_t so we're still prone to truncation here 
unless we change the crc_offset member of struct packed_git to be a 
size_t. I haven't checked if the other users of crc_offset would need 
adjusting if we change its type.

Best Wishes

Phillip

>   	}
>   
>   	p->index_version = version;
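
The truncation described above can be sketched in isolation. The struct and function names below (`pack_narrow`, `store_narrow()`, `fits_in_u32()`) are hypothetical and model only the one field under discussion; the point is that a correctly computed wide offset is still reduced modulo 2^32 when it is stored in a `uint32_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cut-down struct: only the field under discussion is modelled. */
struct pack_narrow {
	uint32_t crc_offset;
};

/* Returns nonzero when the value survives a round trip through uint32_t. */
static int fits_in_u32(uint64_t v)
{
	return v <= UINT32_MAX;
}

/* Store a wide offset in the narrow field and return what was actually kept. */
static uint32_t store_narrow(uint64_t offset)
{
	struct pack_narrow p;

	/* The conversion is well defined but keeps only the low 32 bits. */
	p.crc_offset = (uint32_t)offset;
	assert(fits_in_u32(offset) || p.crc_offset != offset);
	return p.crc_offset;
}
```

For example, storing 5368710152 keeps only 1073742856, so the overflow check on the right-hand side of the assignment does not help unless the field is widened too.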
Taylor Blau July 13, 2023, 2:24 p.m. UTC | #2
On Thu, Jul 13, 2023 at 09:21:55AM +0100, Phillip Wood wrote:
> > diff --git a/packfile.c b/packfile.c
> > index 89220f0e03..70acf1694b 100644
> > --- a/packfile.c
> > +++ b/packfile.c
> > @@ -186,7 +186,7 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
> >   		     */
> >   		    (sizeof(off_t) <= 4))
> >   			return error("pack too large for current definition of off_t in %s", path);
> > -		p->crc_offset = 8 + 4 * 256 + nr * hashsz;
> > +		p->crc_offset = st_add(8 + 4 * 256, st_mult(nr, hashsz));
>
> p->crc_offset is a uint32_t so we're still prone to truncation here unless
> we change the crc_offset member of struct packed_git to be a size_t. I
> haven't checked if the other users of crc_offset would need adjusting if we
> change its type.

Thanks for spotting. Luckily, this should be a straightforward change:

    $ git grep crc_offset
    builtin/index-pack.c:	idx1 = (((const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset))
    object-store-ll.h:	uint32_t crc_offset;
    packfile.c:		p->crc_offset = st_add(8 + 4 * 256, st_mult(nr, hashsz));

The single usage in index-pack is OK, so we only need to change its type
to a size_t.

I could see an argument that this should be an off_t, since it is an
offset into a file. But since we memory map the whole thing anyway, I
think we are equally OK to treat it as a pointer offset. A similar
argument is made in f86f769550e (compute pack .idx byte offsets using
size_t, 2020-11-13), so I am content to leave this as a size_t.

Thanks,
Taylor
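
As an illustration of why a `size_t` suits an offset into a memory-mapped index, here is a small sketch. The names `read_be32()` and `crc_at()` are hypothetical, and the sketch decodes bytes by hand rather than reproducing the index-pack code; it assumes the table entries are 4-byte big-endian values starting at `crc_offset`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 4-byte big-endian value without relying on alignment. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Read the i-th 4-byte entry of the table that starts crc_offset bytes
 * into the mapped data. The offset is plain pointer arithmetic on the
 * mapping, so size_t is wide enough for anything that fits in memory.
 */
static uint32_t crc_at(const void *index_data, size_t crc_offset, size_t i)
{
	const unsigned char *base = index_data;

	assert(index_data != NULL);
	return read_be32(base + crc_offset + 4 * i);
}
```

Because the offset is only ever added to a pointer into the mapping, widening the member from `uint32_t` to `size_t` requires no change at a call site of this shape.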
Junio C Hamano July 13, 2023, 4:14 p.m. UTC | #3
Phillip Wood <phillip.wood123@gmail.com> writes:

>> -		p->crc_offset = 8 + 4 * 256 + nr * hashsz;
>> +		p->crc_offset = st_add(8 + 4 * 256, st_mult(nr, hashsz));
>
> p->crc_offset is a uint32_t so we're still prone to truncation here

Good eyes.  Thanks.
Taylor Blau July 14, 2023, 12:54 a.m. UTC | #4
On Thu, Jul 13, 2023 at 10:24:53AM -0400, Taylor Blau wrote:
> On Thu, Jul 13, 2023 at 09:21:55AM +0100, Phillip Wood wrote:
> > p->crc_offset is a uint32_t so we're still prone to truncation here unless
> > we change the crc_offset member of struct packed_git to be a size_t. I
> > haven't checked if the other users of crc_offset would need adjusting if we
> > change its type.
>
> Thanks for spotting. Luckily, this should be a straightforward change:

Here's a replacement patch which changes the type of `crc_offset`. If
there end up being other review comments, I'll fold this into the next
round.

--- 8< ---
Subject: [PATCH] packfile.c: prevent overflow in `load_idx()`

Prevent an overflow when locating a pack's CRC offset when the number
of packed items is greater than (2^32-1)/hashsz by guarding the
computation with `st_mult()` and `st_add()`.

Note that to avoid truncating the result, the `crc_offset` member must
itself become a `size_t`. The only usage of this variable (besides the
assignment in `load_idx()`) is in `read_v2_anomalous_offsets()` in the
index-pack code. There we use the `crc_offset` as a pointer offset, so
we are already equipped to handle the type change.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 object-store-ll.h | 2 +-
 packfile.c        | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/object-store-ll.h b/object-store-ll.h
index e8f22cdb1b..26a3895c82 100644
--- a/object-store-ll.h
+++ b/object-store-ll.h
@@ -106,7 +106,7 @@ struct packed_git {
 	const void *index_data;
 	size_t index_size;
 	uint32_t num_objects;
-	uint32_t crc_offset;
+	size_t crc_offset;
 	struct oidset bad_objects;
 	int index_version;
 	time_t mtime;
diff --git a/packfile.c b/packfile.c
index 89220f0e03..70acf1694b 100644
--- a/packfile.c
+++ b/packfile.c
@@ -186,7 +186,7 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
 		     */
 		    (sizeof(off_t) <= 4))
 			return error("pack too large for current definition of off_t in %s", path);
-		p->crc_offset = 8 + 4 * 256 + nr * hashsz;
+		p->crc_offset = st_add(8 + 4 * 256, st_mult(nr, hashsz));
 	}

 	p->index_version = version;
--
2.41.0.329.g0a1adfae833
--- >8 ---

Thanks,
Taylor
Phillip Wood July 14, 2023, 9:55 a.m. UTC | #5
On 13/07/2023 15:24, Taylor Blau wrote:
> On Thu, Jul 13, 2023 at 09:21:55AM +0100, Phillip Wood wrote:
>>> diff --git a/packfile.c b/packfile.c
>>> index 89220f0e03..70acf1694b 100644
>>> --- a/packfile.c
>>> +++ b/packfile.c
>>> @@ -186,7 +186,7 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
>>>    		     */
>>>    		    (sizeof(off_t) <= 4))
>>>    			return error("pack too large for current definition of off_t in %s", path);
>>> -		p->crc_offset = 8 + 4 * 256 + nr * hashsz;
>>> +		p->crc_offset = st_add(8 + 4 * 256, st_mult(nr, hashsz));
>>
>> p->crc_offset is a uint32_t so we're still prone to truncation here unless
>> we change the crc_offset member of struct packed_git to be a size_t. I
>> haven't checked if the other users of crc_offset would need adjusting if we
>> change its type.
> 
> Thanks for spotting. Luckily, this should be a straightforward change:
> 
>      $ git grep crc_offset
>      builtin/index-pack.c:	idx1 = (((const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset))
>      object-store-ll.h:	uint32_t crc_offset;
>      packfile.c:		p->crc_offset = st_add(8 + 4 * 256, st_mult(nr, hashsz));
> 
> The single usage in index-pack is OK, so we only need to change its type
> to a size_t.

That's good; it is nice that it is such a simple change.

> I could see an argument that this should be an off_t, since it is an
> offset into a file. But since we memory map the whole thing anyway, I
> think we are equally OK to treat it as a pointer offset. A similar
> argument is made in f86f769550e (compute pack .idx byte offsets using
> size_t, 2020-11-13), so I am content to leave this as a size_t.


If we're already using size_t where one could argue in favor of using
off_t, I think that makes sense.

Best Wishes

Phillip
Phillip Wood July 14, 2023, 9:56 a.m. UTC | #6
On 14/07/2023 01:54, Taylor Blau wrote:
> On Thu, Jul 13, 2023 at 10:24:53AM -0400, Taylor Blau wrote:
>> On Thu, Jul 13, 2023 at 09:21:55AM +0100, Phillip Wood wrote:
>>> p->crc_offset is a uint32_t so we're still prone to truncation here unless
>>> we change the crc_offset member of struct packed_git to be a size_t. I
>>> haven't checked if the other users of crc_offset would need adjusting if we
>>> change its type.
>>
>> Thanks for spotting. Luckily, this should be a straightforward change:
> 
> Here's a replacement patch which changes the type of `crc_offset`. If
> there end up being other review comments, I'll fold this into the next
> round.
> 
> --- 8< ---
> Subject: [PATCH] packfile.c: prevent overflow in `load_idx()`
> 
> Prevent an overflow when locating a pack's CRC offset when the number
> of packed items is greater than 2^32-1/hashsz by guarding the
> computation with an `st_mult()`.
> 
> Note that to avoid truncating the result, the `crc_offset` member must
> itself become a `size_t`. The only usage of this variable (besides the
> assignment in `load_idx()`) is in `read_v2_anomalous_offsets()` in the
> index-pack code. There we use the `crc_offset` as a pointer offset, so
> we are already equipped to handle the type change.

Thanks for adding that explanation; this version looks good to me.

Best Wishes

Phillip

> Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>   object-store-ll.h | 2 +-
>   packfile.c        | 2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/object-store-ll.h b/object-store-ll.h
> index e8f22cdb1b..26a3895c82 100644
> --- a/object-store-ll.h
> +++ b/object-store-ll.h
> @@ -106,7 +106,7 @@ struct packed_git {
>   	const void *index_data;
>   	size_t index_size;
>   	uint32_t num_objects;
> -	uint32_t crc_offset;
> +	size_t crc_offset;
>   	struct oidset bad_objects;
>   	int index_version;
>   	time_t mtime;
> diff --git a/packfile.c b/packfile.c
> index 89220f0e03..70acf1694b 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -186,7 +186,7 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
>   		     */
>   		    (sizeof(off_t) <= 4))
>   			return error("pack too large for current definition of off_t in %s", path);
> -		p->crc_offset = 8 + 4 * 256 + nr * hashsz;
> +		p->crc_offset = st_add(8 + 4 * 256, st_mult(nr, hashsz));
>   	}
> 
>   	p->index_version = version;
> --
> 2.41.0.329.g0a1adfae833
> --- >8 ---
> 
> Thanks,
> Taylor
Junio C Hamano July 14, 2023, 4:29 p.m. UTC | #7
Taylor Blau <me@ttaylorr.com> writes:

> On Thu, Jul 13, 2023 at 10:24:53AM -0400, Taylor Blau wrote:
>> On Thu, Jul 13, 2023 at 09:21:55AM +0100, Phillip Wood wrote:
>> > p->crc_offset is a uint32_t so we're still prone to truncation here unless
>> > we change the crc_offset member of struct packed_git to be a size_t. I
>> > haven't checked if the other users of crc_offset would need adjusting if we
>> > change its type.
>>
>> Thanks for spotting. Luckily, this should be a straightforward change:
>
> Here's a replacement patch which changes the type of `crc_offset`. If
> there end up being other review comments, I'll fold this into the next
> round.

The code change to use st_add() and st_mult() is the same as before,
and the type of the .crc_offset member changes, neither of which is
unexpected.

In the meantime I will replace the copy of [2/20] I have with this
one.

Thanks, both.

> --- 8< ---
> Subject: [PATCH] packfile.c: prevent overflow in `load_idx()`
>
> Prevent an overflow when locating a pack's CRC offset when the number
> of packed items is greater than 2^32-1/hashsz by guarding the
> computation with an `st_mult()`.
>
> Note that to avoid truncating the result, the `crc_offset` member must
> itself become a `size_t`. The only usage of this variable (besides the
> assignment in `load_idx()`) is in `read_v2_anomalous_offsets()` in the
> index-pack code. There we use the `crc_offset` as a pointer offset, so
> we are already equipped to handle the type change.
>
> Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  object-store-ll.h | 2 +-
>  packfile.c        | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/object-store-ll.h b/object-store-ll.h
> index e8f22cdb1b..26a3895c82 100644
> --- a/object-store-ll.h
> +++ b/object-store-ll.h
> @@ -106,7 +106,7 @@ struct packed_git {
>  	const void *index_data;
>  	size_t index_size;
>  	uint32_t num_objects;
> -	uint32_t crc_offset;
> +	size_t crc_offset;
>  	struct oidset bad_objects;
>  	int index_version;
>  	time_t mtime;
> diff --git a/packfile.c b/packfile.c
> index 89220f0e03..70acf1694b 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -186,7 +186,7 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
>  		     */
>  		    (sizeof(off_t) <= 4))
>  			return error("pack too large for current definition of off_t in %s", path);
> -		p->crc_offset = 8 + 4 * 256 + nr * hashsz;
> +		p->crc_offset = st_add(8 + 4 * 256, st_mult(nr, hashsz));
>  	}
>
>  	p->index_version = version;
> --
> 2.41.0.329.g0a1adfae833
> --- >8 ---
>
> Thanks,
> Taylor
Patch

diff --git a/packfile.c b/packfile.c
index 89220f0e03..70acf1694b 100644
--- a/packfile.c
+++ b/packfile.c
@@ -186,7 +186,7 @@  int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
 		     */
 		    (sizeof(off_t) <= 4))
 			return error("pack too large for current definition of off_t in %s", path);
-		p->crc_offset = 8 + 4 * 256 + nr * hashsz;
+		p->crc_offset = st_add(8 + 4 * 256, st_mult(nr, hashsz));
 	}
 
 	p->index_version = version;