[RFC] unpack-trees: watch for out-of-range index position

Message-Id: <20200108023127.219429-1-emilyshaffer@google.com>
State: New, archived
Return-Path: <SRS0=ZiOn=25=vger.kernel.org=git-owner@kernel.org>
Date: Tue, 7 Jan 2020 18:31:27 -0800
Mime-Version: 1.0
Subject: [RFC PATCH] unpack-trees: watch for out-of-range index position
From: Emily Shaffer <emilyshaffer@google.com>
To: git@vger.kernel.org
Cc: Emily Shaffer <emilyshaffer@google.com>
Content-Type: text/plain; charset="UTF-8"
Sender: git-owner@vger.kernel.org
Precedence: bulk

Emily Shaffer Jan. 8, 2020, 2:31 a.m. UTC

index_pos_by_traverse_info() can segfault when the index file contains a
tree extension but no blobs within that tree. Specifically, if the
name_entry passed into index_pos_by_traverse_info() has no blobs inside
AND sorts later than all blobs currently in the index file, the function
reads past the end of the index cache. For example, consider an index
file which looks something like this:

  aaa#0
  bbb/aaa#0
  [Extensions]
  TREE: zzz

In this example, 'index_name_pos(..., "zzz/", ...)' will return '-3',
indicating that "zzz/" could be inserted at position 2. However, when
the checks which validate that insertion position look for a blob at
that position beginning with "zzz/", the index cache is accessed out of
range, causing a segfault.
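
To make the return-value encoding concrete, here is a minimal sketch in
C. It is not Git's code: toy_name_pos() is a hypothetical stand-in for
index_name_pos() that searches a plain sorted array with strcmp(), and
the other helper names are likewise made up for illustration. It shows
how a "not found" result decodes to an insertion slot, and why that slot
has to be compared against the entry count before it is used as an array
index.

```c
#include <assert.h>
#include <string.h>

/*
 * Toy stand-in for index_name_pos() (hypothetical, not Git's code):
 * binary-search a sorted array of names. Returns the position if the
 * name is found, otherwise -1 - (insertion slot).
 */
static int toy_name_pos(const char **names, int nr, const char *name)
{
	int first = 0, last = nr;

	while (first < last) {
		int mid = first + (last - first) / 2;
		int cmp = strcmp(name, names[mid]);

		if (!cmp)
			return mid;
		if (cmp < 0)
			last = mid;
		else
			first = mid + 1;
	}
	return -first - 1;
}

/* Decode a "not found" result back into the insertion slot. */
static int toy_insertion_slot(int pos)
{
	assert(pos < 0);
	return -pos - 1;
}

/* A slot equal to nr is one past the end and must not be dereferenced. */
static int toy_slot_in_range(int slot, int nr)
{
	return slot >= 0 && slot < nr;
}
```

With an array holding just "aaa" and "bbb/aaa", looking up "zzz/"
decodes to a slot equal to the number of entries, so toy_slot_in_range()
returns 0 and the caller knows not to dereference it.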

This kind of index state is not typically generated during user
operations, and is in fact an edge case of the state already being
checked for in the conditional where the new check was added. However,
since the message on the BUG() line is ambiguous, add some additional
context to help Git developers debug the failure later. When we know the
name of the dir we were trying to look up, it becomes possible to examine
the index file in a hex util to determine what went wrong; the position
gives a hint about where to start looking.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
This issue came in via a bugreport from a user who had done some nasty
things like deleting various files in .git/ (and then couldn't remember
how they had done it). The concern was primarily that a segfault is ugly
and scary, and possibly dangerous; I didn't see much problem with
checking for index-out-of-range if the result is a fatal error
regardless.

The index file they shared for reproduction was very similar to the one
that I proposed in the commit message. However, although I had a repo
where I could reproduce the problem, the user wasn't sure how they had
gotten there, so I struggled to work out how to produce these exact
conditions.
It seems like during normal operation the index shouldn't learn about a
tree extension where it doesn't know any blobs (in fact, I've become
irritated before about being unable to stage/commit only directory
structure :) ).

I also didn't find any test cases looking for the BUG() as it exists
now; I guess that's because BUG()s are supposed to be unreachable during
normal operation (or else they'd be a die()). So, it's marked RFC only
because I couldn't think of a way to reliably reproduce or test this
change.

 - Emily

 unpack-trees.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Jeff King Jan. 8, 2020, 7:15 a.m. UTC | #1

On Tue, Jan 07, 2020 at 06:31:27PM -0800, Emily Shaffer wrote:

> This issue came in via a bugreport from a user who had done some nasty
> things like deleting various files in .git/ (and then couldn't remember
> how they had done it). The concern was primarily that a segfault is ugly
> and scary, and possibly dangerous; I didn't see much problem with
> checking for index-out-of-range if the result is a fatal error
> regardless.
>
> [...]
>  	if (pos >= 0)
>  		BUG("This is a directory and should not exist in index");
>  	pos = -pos - 1;
> -	if (!starts_with(o->src_index->cache[pos]->name, name.buf) ||
> +	if (pos >= o->src_index->cache_nr ||
> +	    !starts_with(o->src_index->cache[pos]->name, name.buf) ||
>  	    (pos > 0 && starts_with(o->src_index->cache[pos-1]->name, name.buf)))
> -		BUG("pos must point at the first entry in this directory");
> +		BUG("pos %d doesn't point to the first entry of %s in index",
> +		    pos, name.buf);

The new condition you added looks correct to me. I suspect this BUG()
should not be a BUG() at all, though. It's not necessarily a logic error
inside Git, but as you showed it could indicate corrupt data we read
from disk. The same is probably true of the "pos >= 0" condition checked
above.

It's mostly an academic distinction, though, as I think it would be
pretty reasonable for now to just die() here (eventually, though, we
might want to turn it into an error return).
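
As a rough illustration of the error-return idea, the sketch below
reworks the same conditional so that it reports failure to the caller
instead of aborting. The function and helper names
(toy_first_entry_in_dir, toy_starts_with) and the plain string array are
assumptions made for the example; the real code works on
o->src_index->cache and would need its callers adapted.

```c
#include <assert.h>
#include <string.h>

/* Local stand-in for Git's starts_with(); hypothetical helper. */
static int toy_starts_with(const char *str, const char *prefix)
{
	return !strncmp(str, prefix, strlen(prefix));
}

/*
 * Hypothetical error-returning variant of the lookup. raw_pos is the
 * raw result of a name search for the directory prefix `dir`. Returns
 * the position of the first entry under `dir`, or -1 if the array
 * contents do not match what the caller was promised.
 */
static int toy_first_entry_in_dir(const char **names, int nr,
				  int raw_pos, const char *dir)
{
	int pos;

	assert(nr >= 0);
	if (raw_pos >= 0)
		return -1; /* a directory itself must not be an entry */
	pos = -raw_pos - 1;
	if (pos >= nr ||
	    !toy_starts_with(names[pos], dir) ||
	    (pos > 0 && toy_starts_with(names[pos - 1], dir)))
		return -1; /* out of range, or not the first entry of dir */
	return pos;
}
```

The range check comes first in the condition on purpose: the
short-circuit evaluation of || keeps names[pos] from being read when pos
is past the end.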

-Peff

Junio C Hamano Jan. 8, 2020, 5:30 p.m. UTC | #2

Jeff King <peff@peff.net> writes:

> On Tue, Jan 07, 2020 at 06:31:27PM -0800, Emily Shaffer wrote:
>
>> This issue came in via a bugreport from a user who had done some nasty
>> things like deleting various files in .git/ (and then couldn't remember
>> how they had done it). The concern was primarily that a segfault is ugly
>> and scary, and possibly dangerous; I didn't see much problem with
>> checking for index-out-of-range if the result is a fatal error
>> regardless.
>>
>> [...]
>>  	if (pos >= 0)
>>  		BUG("This is a directory and should not exist in index");
>>  	pos = -pos - 1;
>> -	if (!starts_with(o->src_index->cache[pos]->name, name.buf) ||
>> +	if (pos >= o->src_index->cache_nr ||
>> +	    !starts_with(o->src_index->cache[pos]->name, name.buf) ||
>>  	    (pos > 0 && starts_with(o->src_index->cache[pos-1]->name, name.buf)))
>> -		BUG("pos must point at the first entry in this directory");
>> +		BUG("pos %d doesn't point to the first entry of %s in index",
>> +		    pos, name.buf);
>
> The new condition you added looks correct to me. I suspect this BUG()
> should not be a BUG() at all, though. It's not necessarily a logic error
> inside Git, but as you showed it could indicate corrupt data we read
> from disk. The same is probably true of the "pos >= 0" condition checked
> above.

It does not sound like a BUG to me, either, but the new condition
does look correct to me, too.  We can turn it into die() later if
somebody truly cares ;-)

Thanks, both.  Will queue.


> It's mostly an academic distinction, though, as I think it would be
> pretty reasonable for now to just die() here (eventually, though, we
> might want to turn it into an error return).
>
> -Peff

Emily Shaffer Jan. 8, 2020, 7:38 p.m. UTC | #3

On Wed, Jan 08, 2020 at 09:30:36AM -0800, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
> > On Tue, Jan 07, 2020 at 06:31:27PM -0800, Emily Shaffer wrote:
> >
> >> This issue came in via a bugreport from a user who had done some nasty
> >> things like deleting various files in .git/ (and then couldn't remember
> >> how they had done it). The concern was primarily that a segfault is ugly
> >> and scary, and possibly dangerous; I didn't see much problem with
> >> checking for index-out-of-range if the result is a fatal error
> >> regardless.
> >>
> >> [...]
> >>  	if (pos >= 0)
> >>  		BUG("This is a directory and should not exist in index");
> >>  	pos = -pos - 1;
> >> -	if (!starts_with(o->src_index->cache[pos]->name, name.buf) ||
> >> +	if (pos >= o->src_index->cache_nr ||
> >> +	    !starts_with(o->src_index->cache[pos]->name, name.buf) ||
> >>  	    (pos > 0 && starts_with(o->src_index->cache[pos-1]->name, name.buf)))
> >> -		BUG("pos must point at the first entry in this directory");
> >> +		BUG("pos %d doesn't point to the first entry of %s in index",
> >> +		    pos, name.buf);
> >
> > The new condition you added looks correct to me. I suspect this BUG()
> > should not be a BUG() at all, though. It's not necessarily a logic error
> > inside Git, but as you showed it could indicate corrupt data we read
> > from disk. The same is probably true of the "pos >= 0" condition checked
> > above.
> 
> It does not sound like a BUG to me, either, but the new condition
> does look correct to me, too.  We can turn it into die() later if
> somebody truly cares ;-)
> 
> Thanks, both.  Will queue.

Thanks much for the quick turnaround. If I hear more noise I'll give it
a try with die() or error code instead, but for now I'll move on to the
next bug on my list. :)

 - Emily

Junio C Hamano Jan. 8, 2020, 8:35 p.m. UTC | #4

Emily Shaffer <emilyshaffer@google.com> writes:

>> > The new condition you added looks correct to me. I suspect this BUG()
>> > should not be a BUG() at all, though. It's not necessarily a logic error
>> > inside Git, but as you showed it could indicate corrupt data we read
>> > from disk. The same is probably true of the "pos >= 0" condition checked
>> > above.
>> 
>> It does not sound like a BUG to me, either, but the new condition
>> does look correct to me, too.  We can turn it into die() later if
>> somebody truly cares ;-)
>> 
>> Thanks, both.  Will queue.
>
> Thanks much for the quick turnaround. If I hear more noise I'll give it
> a try with die() or error code instead, but for now I'll move on to the
> next bug on my list. :)

By the way, it is somewhat sad that we proceeded that far in the
first place---such a corrupt on-disk index would have caused an
early die() if we did not get rid of the trailing-hash integrity
check.

Jeff King Jan. 9, 2020, 7:52 a.m. UTC | #5

On Wed, Jan 08, 2020 at 12:35:29PM -0800, Junio C Hamano wrote:

> >> It does not sound like a BUG to me, either, but the new condition
> >> does look correct to me, too.  We can turn it into die() later if
> >> somebody truly cares ;-)
> >> 
> >> Thanks, both.  Will queue.
> >
> > Thanks much for the quick turnaround. If I hear more noise I'll give it
> > a try with die() or error code instead, but for now I'll move on to the
> > next bug on my list. :)
> 
> By the way, it is somewhat sad that we proceeded that far in the
> first place---such a corrupt on-disk index would have caused an
> early die() if we did not get rid of the trailing-hash integrity
> check.

Perhaps. The integrity check only protects against an index that was
modified after the fact, not one that was generated by a buggy Git. I'm
not sure we know how the index that led to this patch got into this
state (though it sounds like Emily has a copy and could check the hash
on it), but the other cache-tree segfault I found recently was with an index
with an intact integrity hash.

So I think regardless of the trailing-hash check, we'd always want to be
defensive when reading on-disk data.

-Peff

Emily Shaffer Jan. 9, 2020, 10:46 p.m. UTC | #6

On Thu, Jan 09, 2020 at 02:52:50AM -0500, Jeff King wrote:
> On Wed, Jan 08, 2020 at 12:35:29PM -0800, Junio C Hamano wrote:
> 
> > >> It does not sound like a BUG to me, either, but the new condition
> > >> does look correct to me, too.  We can turn it into die() later if
> > >> somebody truly cares ;-)
> > >> 
> > >> Thanks, both.  Will queue.
> > >
> > > Thanks much for the quick turnaround. If I hear more noise I'll give it
> > > a try with die() or error code instead, but for now I'll move on to the
> > > next bug on my list. :)
> > 
> > By the way, it is somewhat sad that we proceeded that far in the
> > first place---such a corrupt on-disk index would have caused an
> > early die() if we did not get rid of the trailing-hash integrity
> > check.
> 
> Perhaps. The integrity check only protects against an index that was
> modified after the fact, not one that was generated by a buggy Git. I'm
> not sure we know how the index that led to this patch got into this
> state (though it sounds like Emily has a copy and could check the hash
> on it), but the other cache-tree segfault I found recently was with an index
> with an intact integrity hash.

Yeah, I can do that, although I'm not sure how. The index itself is very
small - it only contains one file and one tree extension - so I'll go
ahead and paste some poking and prodding, and if it's not what you
wanted then please let me know what else to run.

  $ g fsck --cache
  Checking object directories: 100% (256/256), done.
  Checking objects: 100% (20/20), done.
  broken link from  commit 153a9a100eae7fdba5989ce39a5dd1782075517f
                to  commit cca7ecaa5d8c398f41bfec7938cc6a526803579b
  broken link from  commit 7d6bb91e31d18eadfaf855a9fb7ad6ba81b8b6d9
                to  commit 03087a617bfe55f862cb1ef43273a2bd08e8b6d6
  missing commit 03087a617bfe55f862cb1ef43273a2bd08e8b6d6
  missing commit cca7ecaa5d8c398f41bfec7938cc6a526803579b
  dangling commit 5e2c635433bc46b13061b276e481f63b1f6642c8

  $ hexdump -C .git/index
  00000000  44 49 52 43 00 00 00 02  00 00 00 01 5d 89 5e 22  |DIRC........].^"|
  00000010  23 bf a3 c4 5d 89 5e 22  23 bf a3 c4 00 00 fe 02  |#...].^"#.......|
  00000020  02 c8 f5 83 00 00 81 a4  00 06 c1 dc 00 01 5f 53  |.............._S|
  00000030  00 00 06 b3 78 88 a4 f4  22 34 7d ad b0 c4 73 0f  |....x..."4}...s.|
  00000040  c5 bc f6 ea 1d 2d f0 3a  00 09 52 45 41 44 4d 45  |.....-.:..README|
  00000050  2e 6d 64 00 54 52 45 45  00 00 00 3a 00 31 37 20  |.md.TREE...:.17 |
  00000060  31 0a da 7f 67 25 40 7d  4e ce 9f d3 72 ce 4c e8  |1...g%@}N...r.L.|
  00000070  40 6d 5d ad e9 79 67 69  74 6c 69 6e 74 00 34 20  |@m]..ygitlint.4 |
  00000080  30 0a 93 63 25 17 69 e6  d6 92 78 97 55 4b 0f 8b  |0..c%.i...x.UK..|
  00000090  ff a0 e8 2d 6d 71 32 d1  69 fc f2 38 42 f8 5a 6e  |...-mq2.i..8B.Zn|
  000000a0  05 35 d6 94 41 c0 9f c7  ba 43                    |.5..A....C|
  000000aa

  $ find .git/objects -type f
  .git/objects/pack/pack-5e5d5e7c3cbd60a99b4c0295a2935885fbb235a1.idx
  .git/objects/pack/pack-5e5d5e7c3cbd60a99b4c0295a2935885fbb235a1.pack
  .git/objects/15/3a9a100eae7fdba5989ce39a5dd1782075517f
  .git/objects/5e/2c635433bc46b13061b276e481f63b1f6642c8

(By the way, cat-file barfs on both those loose objects, for the reason fsck
reveals.)
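
For reading the dump above by hand, a small C sketch of decoding the
index header may help. It assumes only the on-disk layout of the first
12 bytes (the "DIRC" signature, then a 32-bit big-endian version and
entry count); the struct and function names are hypothetical, not Git's.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian integer from a byte buffer. */
static uint32_t toy_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical container for the two header fields we decode. */
struct toy_index_header {
	uint32_t version;
	uint32_t entries;
};

/*
 * Parse the 12-byte index header: "DIRC", version, entry count.
 * Returns 0 on success, or -1 if the buffer is too short or does not
 * start with the expected signature.
 */
static int toy_parse_index_header(const unsigned char *buf, size_t len,
				  struct toy_index_header *out)
{
	assert(buf && out);
	if (len < 12 || memcmp(buf, "DIRC", 4))
		return -1;
	out->version = toy_get_be32(buf + 4);
	out->entries = toy_get_be32(buf + 8);
	return 0;
}
```

Fed the first 12 bytes of the dump (44 49 52 43 00 00 00 02 00 00 00 01),
this gives version 2 and one entry, which matches the single README.md
entry. The same big-endian helper reads the 4-byte length that follows
the "TREE" signature (00 00 00 3a, i.e. 58 bytes).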

I hope that's helpful.

 - Emily

Jeff King Jan. 10, 2020, 6:37 a.m. UTC | #7

On Thu, Jan 09, 2020 at 02:46:41PM -0800, Emily Shaffer wrote:

> > Perhaps. The integrity check only protects against an index that was
> > modified after the fact, not one that was generated by a buggy Git. I'm
> > not sure we know how the index that led to this patch got into this
> > state (though it sounds like Emily has a copy and could check the hash
> > on it), but the other cache-tree segfault I found recently was with an index
> > with an intact integrity hash.
> 
> Yeah, I can do that, although I'm not sure how. The index itself is very
> small - it only contains one file and one tree extension - so I'll go
> ahead and paste some poking and prodding, and if it's not what you
> wanted then please let me know what else to run.

I was thinking you would run something like:

  size=$(stat --format=%s "$file")
  actual=$(head -c $(($size-20)) "$file" | sha1sum | awk '{print $1}')
  expect=$(xxd -s -20 -g 20 -c 20 "$file" | awk '{print $2}')
  if test "$actual" = "$expect"; then
          echo "OK ($actual)"
  else
          echo "FAIL ($actual != $expect)"
  fi

to manually check the sha1. But...

>   $ g fsck --cache
>   Checking object directories: 100% (256/256), done.
>   Checking objects: 100% (20/20), done.
>   broken link from  commit 153a9a100eae7fdba5989ce39a5dd1782075517f
>                 to  commit cca7ecaa5d8c398f41bfec7938cc6a526803579b
>   broken link from  commit 7d6bb91e31d18eadfaf855a9fb7ad6ba81b8b6d9
>                 to  commit 03087a617bfe55f862cb1ef43273a2bd08e8b6d6
>   missing commit 03087a617bfe55f862cb1ef43273a2bd08e8b6d6
>   missing commit cca7ecaa5d8c398f41bfec7938cc6a526803579b
>   dangling commit 5e2c635433bc46b13061b276e481f63b1f6642c8

...fsck would have reported a problem there, since we explicitly kept
the check there in a33fc72fe9 (read-cache: force_verify_index_checksum,
2017-04-14).

And just to be double-sure, I used this:

>   $ hexdump -C .git/index
>   00000000  44 49 52 43 00 00 00 02  00 00 00 01 5d 89 5e 22  |DIRC........].^"|
>   00000010  23 bf a3 c4 5d 89 5e 22  23 bf a3 c4 00 00 fe 02  |#...].^"#.......|
>   00000020  02 c8 f5 83 00 00 81 a4  00 06 c1 dc 00 01 5f 53  |.............._S|
>   00000030  00 00 06 b3 78 88 a4 f4  22 34 7d ad b0 c4 73 0f  |....x..."4}...s.|
>   00000040  c5 bc f6 ea 1d 2d f0 3a  00 09 52 45 41 44 4d 45  |.....-.:..README|
>   00000050  2e 6d 64 00 54 52 45 45  00 00 00 3a 00 31 37 20  |.md.TREE...:.17 |
>   00000060  31 0a da 7f 67 25 40 7d  4e ce 9f d3 72 ce 4c e8  |1...g%@}N...r.L.|
>   00000070  40 6d 5d ad e9 79 67 69  74 6c 69 6e 74 00 34 20  |@m]..ygitlint.4 |
>   00000080  30 0a 93 63 25 17 69 e6  d6 92 78 97 55 4b 0f 8b  |0..c%.i...x.UK..|
>   00000090  ff a0 e8 2d 6d 71 32 d1  69 fc f2 38 42 f8 5a 6e  |...-mq2.i..8B.Zn|
>   000000a0  05 35 d6 94 41 c0 9f c7  ba 43                    |.5..A....C|
>   000000aa

to reconstruct the file and check its sha1, and indeed it is fine.

So this bogus index was probably actually created by Git, not an
after-the-fact byte corruption.

-Peff

Emily Shaffer Jan. 10, 2020, 11:07 p.m. UTC | #8

On Fri, Jan 10, 2020 at 01:37:41AM -0500, Jeff King wrote:
> On Thu, Jan 09, 2020 at 02:46:41PM -0800, Emily Shaffer wrote:
> 
> > > Perhaps. The integrity check only protects against an index that was
> > > modified after the fact, not one that was generated by a buggy Git. I'm
> > > not sure we know how the index that led to this patch got into this
> > > state (though it sounds like Emily has a copy and could check the hash
> > > on it), but the other cache-tree segfault I found recently was with an index
> > > with an intact integrity hash.
> > 
> > Yeah, I can do that, although I'm not sure how. The index itself is very
> > small - it only contains one file and one tree extension - so I'll go
> > ahead and paste some poking and prodding, and if it's not what you
> > wanted then please let me know what else to run.
> 
> I was thinking you would run something like:
> 
>   size=$(stat --format=%s "$file")
>   actual=$(head -c $(($size-20)) "$file" | sha1sum | awk '{print $1}')
>   expect=$(xxd -s -20 -g 20 -c 20 "$file" | awk '{print $2}')
>   if test "$actual" = "$expect"; then
>           echo "OK ($actual)"
>   else
>           echo "FAIL ($actual != $expect)"
>   fi
> 
> to manually check the sha1.

Unsurprising given your mail, yeah, this looks OK when I run it against
the repo in question.

> So this bogus index was probably actually created by Git, not an
> after-the-fact byte corruption.

Disappointingly, the repro repo we got was aggressively redacted - I
don't have any reflogs to look through and try and get a hint of what
happened, and I imagine the reporter has moved on with their life enough
that we can't get something useful from there now.

 - Emily
