Message ID | 20191115141541.11149-5-chriscool@tuxfamily.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Rewrite packfile reuse code | expand |
On Fri, Nov 15, 2019 at 03:15:36PM +0100, Christian Couder wrote: > From: Jeff King <peff@peff.net> > > We will no longer compute bitmap_git->reuse_objects in a > following commit, so we cannot rely on it anymore to > terminate the loop early; we have to iterate to the end. Hmm. In my original work from 2015 (which you never saw as individual commits), this came somewhere in the middle, after moving to per-object reuse. I think by dropping these hunks now, the state of the code at this point would mean that we might write the objects twice. We'd mark them as "reused" and send them as part of write_reused_pack(). But we'd also send them to pack-objects via the show_reachable_fn callback, and it would add them to the usual packing list. And indeed, t5310.10 fails at this point in the series with: Cloning into bare repository 'clone.git'... remote: Enumerating objects: 331, done. remote: Counting objects: 100% (331/331), done. remote: Compressing objects: 100% (111/111), done. remote: Total 662 (delta 108), reused 331 (delta 108), pack-reused 331 Receiving objects: 100% (662/662), 53.14 KiB | 17.71 MiB/s, done. Resolving deltas: 100% (216/216), done. fatal: The same object 00c1d3730931e66eb08dabe3a3c9fa16621d728a appears twice in the pack fatal: index-pack failed and then starts working again after the final patch. -Peff
On Mon, Dec 9, 2019 at 7:47 AM Jeff King <peff@peff.net> wrote: > I think by dropping these hunks now, the state of the code at this point > would mean that we might write the objects twice. We'd mark them as > "reused" and send them as part of write_reused_pack(). But we'd also > send them to pack-objects via the show_reachable_fn callback, and it > would add them to the usual packing list. > > And indeed, t5310.10 fails at this point in the series with: > > Cloning into bare repository 'clone.git'... > remote: Enumerating objects: 331, done. > remote: Counting objects: 100% (331/331), done. > remote: Compressing objects: 100% (111/111), done. > remote: Total 662 (delta 108), reused 331 (delta 108), pack-reused 331 > Receiving objects: 100% (662/662), 53.14 KiB | 17.71 MiB/s, done. > Resolving deltas: 100% (216/216), done. > fatal: The same object 00c1d3730931e66eb08dabe3a3c9fa16621d728a appears twice in the pack > fatal: index-pack failed > > and then starts working again after the final patch. Yeah, I thought I had tested this, but anyway I squashed this patch into the final patch to avoid failures.
diff --git a/pack-bitmap.c b/pack-bitmap.c index e07c798879..016d0319fc 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -622,7 +622,7 @@ static void show_objects_for_type( enum object_type object_type, show_reachable_fn show_reach) { - size_t pos = 0, i = 0; + size_t i = 0; uint32_t offset; struct ewah_iterator it; @@ -630,13 +630,15 @@ static void show_objects_for_type( struct bitmap *objects = bitmap_git->result; - if (bitmap_git->reuse_objects == bitmap_git->pack->num_objects) - return; - ewah_iterator_init(&it, type_filter); - while (i < objects->word_alloc && ewah_iterator_next(&filter, &it)) { + for (i = 0; i < objects->word_alloc && + ewah_iterator_next(&filter, &it); i++) { eword_t word = objects->words[i] & filter; + size_t pos = (i * BITS_IN_EWORD); + + if (!word) + continue; for (offset = 0; offset < BITS_IN_EWORD; ++offset) { struct object_id oid; @@ -648,9 +650,6 @@ static void show_objects_for_type( offset += ewah_bit_ctz64(word >> offset); - if (pos + offset < bitmap_git->reuse_objects) - continue; - entry = &bitmap_git->pack->revindex[pos + offset]; nth_packed_object_oid(&oid, bitmap_git->pack, entry->nr); @@ -659,9 +658,6 @@ static void show_objects_for_type( show_reach(&oid, object_type, 0, hash, bitmap_git->pack, entry->offset); } - - pos += BITS_IN_EWORD; - i++; } }