diff mbox series

packfile: avoid overflowing shift during decode

Message ID 20211110234033.3144165-1-jonathantanmy@google.com (mailing list archive)
State Accepted
Commit 89c37479dbd6e4350bb6d6fa5049c388629518d5
Headers show
Series packfile: avoid overflowing shift during decode | expand

Commit Message

Jonathan Tan Nov. 10, 2021, 11:40 p.m. UTC
unpack_object_header_buffer() attempts to protect against overflowing
left shifts, but the limit of the shift amount should not be the size of
the variable being shifted. It should be the size minus the size of its
contents. Fix that accordingly.

This was noticed at $DAYJOB by a fuzzer running internally.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
In next, d6a09e795d ("odb: guard against data loss checking out a huge
file", 2021-11-03) (merged as fe5160a170 ("Merge branch
'mc/clean-smudge-with-llp64' into next", 2021-11-03)) ameliorates this
situation by dying if the left shift overflows, but this patch is still
worthwhile as it makes a bad header be reported as a bad header, not a
fatal left shift overflow.
---
 packfile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Junio C Hamano Nov. 11, 2021, 1:58 a.m. UTC | #1
Jonathan Tan <jonathantanmy@google.com> writes:

> situation by dying if the left shift overflows, but this patch is still
> worthwhile as it makes a bad header be reported as a bad header, not a
> fatal left shift overflow.

Hmph, I am not sure if I like to see it put that way, but I think
the problem is not anything new.

Even when we have a packfile that is in a good shape, the current
machine (especially its wordsize) may not be capable of extracting
the object we are looking at from the packstream, when the object is
larger than the current machine's ulong can express.  So it may be
an indication that your machine cannot use the packed object, but
may not be an indication that there is anything _wrong_ in the
object header.

    Side note.  I suspect that this should already be the case; you
    can pack a large object whose size do not fit in u32 on a box
    whose ulong is u64, and you wouldn't be able to use such a
    packfile on a box where ulong is u32.  There is no "bad object
    header" in the pack stream per-se, but we cannot use it there.

After all, the reason why we chose to use the varint encoding is
because we do not have to change the file format when we extend
beyond the architectures of prevailing word size of the day.

Having said all that, I think the patch is correct.  We later use
the number "shift" as the shift count like so

	size += (c & 0x7f) << shift;

In order for the uppermost bit from (c & 0x7f) to be still inside
"size", it is not sufficient that "shift" does not exceed the
bit-size of "size".  We need to subtract the bitscanreverse of
(0x7f), which is 7, which is exactly what your patch does.

Looking good.  Thanks.


>  packfile.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/packfile.c b/packfile.c
> index 89402cfc69..972c327e29 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -1068,7 +1068,7 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
>  	size = c & 15;
>  	shift = 4;
>  	while (c & 0x80) {
> -		if (len <= used || bitsizeof(long) <= shift) {
> +		if (len <= used || (bitsizeof(long) - 7) <= shift) {
>  			error("bad object header");
>  			size = used = 0;
>  			break;
Marc Strapetz Jan. 10, 2022, 11:22 p.m. UTC | #2
On 11/11/2021 02:58, Junio C Hamano wrote:
> Jonathan Tan <jonathantanmy@google.com> writes:
> 
>> diff --git a/packfile.c b/packfile.c
>> index 89402cfc69..972c327e29 100644
>> --- a/packfile.c
>> +++ b/packfile.c
>> @@ -1068,7 +1068,7 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
>>   	size = c & 15;
>>   	shift = 4;
>>   	while (c & 0x80) {
>> -		if (len <= used || bitsizeof(long) <= shift) {
>> +		if (len <= used || (bitsizeof(long) - 7) <= shift) {

This seems to cause troubles now for 32-bit systems (in my case Git for 
Windows 32-Bit): `shift` will go through 4, 11, 18 and for 25 it finally 
errors out. This means that objects >= 32MB can't be processed anymore. 
The condition should probably be changed to:

+		if (len <= used || (bitsizeof(long) - 7) < shift) {

This still ensures that the shift can never overflow and on 32-bit 
systems restores the maximum size of 4G with a final shift of 127<<25 
(the old condition `bitsizeof(long) <= shift` was perfectly valid for 
32-bit systems).

> Even when we have a packfile that is in a good shape, the current
> machine (especially its wordsize) may not be capable of extracting
> the object we are looking at from the packstream, when the object is
> larger than the current machine's ulong can express.  So it may be
> an indication that your machine cannot use the packed object, but
> may not be an indication that there is anything _wrong_ in the
> object header.

I was actually quite confused by the error message "bad object header". 
It made me investigate all sorts of other things before thoroughly 
stepping through this loop.

-Marc
Junio C Hamano Jan. 12, 2022, 8:06 p.m. UTC | #3
Marc Strapetz <marc.strapetz@syntevo.com> writes:

> On 11/11/2021 02:58, Junio C Hamano wrote:
>> Jonathan Tan <jonathantanmy@google.com> writes:
>> 
>>> diff --git a/packfile.c b/packfile.c
>>> index 89402cfc69..972c327e29 100644
>>> --- a/packfile.c
>>> +++ b/packfile.c
>>> @@ -1068,7 +1068,7 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
>>>   	size = c & 15;
>>>   	shift = 4;
>>>   	while (c & 0x80) {
>>> -		if (len <= used || bitsizeof(long) <= shift) {
>>> +		if (len <= used || (bitsizeof(long) - 7) <= shift) {
>
> This seems to cause troubles now for 32-bit systems (in my case Git
> for Windows 32-Bit): `shift` will go through 4, 11, 18 and for 25 it
> finally errors out. This means that objects >= 32MB can't be processed
> anymore. The condition should probably be changed to:
>
> +		if (len <= used || (bitsizeof(long) - 7) < shift) {
>
> This still ensures that the shift can never overflow and on 32-bit
> systems restores the maximum size of 4G with a final shift of 127<<25 
> (the old condition `bitsizeof(long) <= shift` was perfectly valid for
> 32-bit systems).

Jonathan?
Junio C Hamano Jan. 12, 2022, 8:12 p.m. UTC | #4
Junio C Hamano <gitster@pobox.com> writes:

> Marc Strapetz <marc.strapetz@syntevo.com> writes:
>
>> On 11/11/2021 02:58, Junio C Hamano wrote:
>>> Jonathan Tan <jonathantanmy@google.com> writes:
>>> 
>>>> diff --git a/packfile.c b/packfile.c
>>>> index 89402cfc69..972c327e29 100644
>>>> --- a/packfile.c
>>>> +++ b/packfile.c
>>>> @@ -1068,7 +1068,7 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
>>>>   	size = c & 15;
>>>>   	shift = 4;
>>>>   	while (c & 0x80) {
>>>> -		if (len <= used || bitsizeof(long) <= shift) {
>>>> +		if (len <= used || (bitsizeof(long) - 7) <= shift) {
>>
>> This seems to cause troubles now for 32-bit systems (in my case Git
>> for Windows 32-Bit): `shift` will go through 4, 11, 18 and for 25 it
>> finally errors out. This means that objects >= 32MB can't be processed
>> anymore. The condition should probably be changed to:
>>
>> +		if (len <= used || (bitsizeof(long) - 7) < shift) {
>>
>> This still ensures that the shift can never overflow and on 32-bit
>> systems restores the maximum size of 4G with a final shift of 127<<25 
>> (the old condition `bitsizeof(long) <= shift` was perfectly valid for
>> 32-bit systems).
>
> Jonathan?


----- >8 --------- >8 --------- >8 --------- >8 --------- >8 -----
Date: Wed, 12 Jan 2022 12:11:42 -0800
Subject: [PATCH] packfile: fix off-by-one error in decoding logic

shift count being exactly at 7-bit smaller than the long is OK; on
32-bit architecture, shift count starts at 4 and goes through 11, 18
and 25, at which point the guard triggers one iteration too early.

Reported-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 packfile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/packfile.c b/packfile.c
index d3820c780b..667e21ce97 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1067,7 +1067,7 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
 	size = c & 15;
 	shift = 4;
 	while (c & 0x80) {
-		if (len <= used || (bitsizeof(long) - 7) <= shift) {
+		if (len <= used || (bitsizeof(long) - 7) < shift) {
 			error("bad object header");
 			size = used = 0;
 			break;
Jonathan Tan Jan. 12, 2022, 8:27 p.m. UTC | #5
Junio C Hamano <gitster@pobox.com> writes:
> Marc Strapetz <marc.strapetz@syntevo.com> writes:
> 
> > On 11/11/2021 02:58, Junio C Hamano wrote:
> >> Jonathan Tan <jonathantanmy@google.com> writes:
> >> 
> >>> diff --git a/packfile.c b/packfile.c
> >>> index 89402cfc69..972c327e29 100644
> >>> --- a/packfile.c
> >>> +++ b/packfile.c
> >>> @@ -1068,7 +1068,7 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
> >>>   	size = c & 15;
> >>>   	shift = 4;
> >>>   	while (c & 0x80) {
> >>> -		if (len <= used || bitsizeof(long) <= shift) {
> >>> +		if (len <= used || (bitsizeof(long) - 7) <= shift) {
> >
> > This seems to cause troubles now for 32-bit systems (in my case Git
> > for Windows 32-Bit): `shift` will go through 4, 11, 18 and for 25 it
> > finally errors out. This means that objects >= 32MB can't be processed
> > anymore. The condition should probably be changed to:
> >
> > +		if (len <= used || (bitsizeof(long) - 7) < shift) {
> >
> > This still ensures that the shift can never overflow and on 32-bit
> > systems restores the maximum size of 4G with a final shift of 127<<25 
> > (the old condition `bitsizeof(long) <= shift` was perfectly valid for
> > 32-bit systems).
> 
> Jonathan?

This analysis makes sense - not sure how I missed that. 0x7f (the number
being shifted) is 7 bits, so it can safely be shifted 25 bits.

The original condition of `bitsizeof(long) <= shift` works for 32-bit
but not for 64-bit (4, 11, 18, 25, 32, 39, 46, 53, 60) since shifting
0x7f, a 7-bit value, by 60 bits would result in overflow, so we still
need to subtract 7. I agree that the inequality should be `<`, not `<=`.
diff mbox series

Patch

diff --git a/packfile.c b/packfile.c
index 89402cfc69..972c327e29 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1068,7 +1068,7 @@  unsigned long unpack_object_header_buffer(const unsigned char *buf,
 	size = c & 15;
 	shift = 4;
 	while (c & 0x80) {
-		if (len <= used || bitsizeof(long) <= shift) {
+		if (len <= used || (bitsizeof(long) - 7) <= shift) {
 			error("bad object header");
 			size = used = 0;
 			break;