diff mbox series

[v2,06/23] pack-bitmap-write: support storing pseudo-merge commits

Message ID ee33a70324589a98c2239530b03cc2d7afbdfb9e.1714422410.git.me@ttaylorr.com (mailing list archive)
State Superseded
Headers show
Series pack-bitmap: pseudo-merge reachability bitmaps | expand

Commit Message

Taylor Blau April 29, 2024, 8:43 p.m. UTC
Prepare to write pseudo-merge bitmaps by annotating individual bitmapped
commits (which are represented by the `bitmapped_commit` structure) with
an extra bit indicating whether or not they are a pseudo-merge.

In subsequent commits, pseudo-merge bitmaps will be generated by
allocating a fake commit node with parents covering the full set of
commits represented by the pseudo-merge bitmap. These commits will be
added to the set of "selected" commits as usual, but will be written
specially instead of being included with the rest of the selected
commits.

Mechanically speaking, there are two parts of this change:

  - The bitmapped_commit struct gets a new bit indicating whether it is
    a pseudo-merge, or an ordinary commit selected for bitmaps.

  - A handful of changes to only write out the non-pseudo-merge commits
    when enumerating through the selected array (see the new
    `bitmap_writer_selected_nr()` function). Pseudo-merge commits appear
    after all non-pseudo-merge commits, so it is safe to enumerate
    through the selected array like so:

        for (i = 0; i < bitmap_writer_selected_nr(); i++)
          if (writer.selected[i].pseudo_merge)
            BUG("unexpected pseudo-merge");

    without encountering the BUG().

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap-write.c | 100 +++++++++++++++++++++++++++++---------------
 pack-bitmap.h       |   1 +
 2 files changed, 67 insertions(+), 34 deletions(-)

Comments

Patrick Steinhardt May 6, 2024, 11:52 a.m. UTC | #1
On Mon, Apr 29, 2024 at 04:43:15PM -0400, Taylor Blau wrote:
[snip]
> @@ -46,6 +48,11 @@ struct bitmap_writer {
>  
>  static struct bitmap_writer writer;
>  
> +static inline int bitmap_writer_selected_nr(void)
> +{
> +	return writer.selected_nr - writer.pseudo_merges_nr;
> +}

This function may use a comment to explain what its meaning actually is.
Like, `bitmap_writer_selected_nr()` is obviously not the same as the
`selected_nr` of the `bitmap_writer`, which is quite confusing. So why
do we subtract values and why are there two different `selected_nr`s?

[snip]
> diff --git a/pack-bitmap.h b/pack-bitmap.h
> index dae2d68a338..ca9acd2f735 100644
> --- a/pack-bitmap.h
> +++ b/pack-bitmap.h
> @@ -21,6 +21,7 @@ struct bitmap_disk_header {
>  	unsigned char checksum[GIT_MAX_RAWSZ];
>  };
>  
> +#define BITMAP_PSEUDO_MERGE (1u<<21)
>  #define NEEDS_BITMAP (1u<<22)

This flag is already used by "builtin/pack-objects.c", which may be fine.
But in any case, shouldn't we update "object.h" with both of these flags?

Patrick
Taylor Blau May 6, 2024, 6:48 p.m. UTC | #2
On Mon, May 06, 2024 at 01:52:56PM +0200, Patrick Steinhardt wrote:
> On Mon, Apr 29, 2024 at 04:43:15PM -0400, Taylor Blau wrote:
> [snip]
> > @@ -46,6 +48,11 @@ struct bitmap_writer {
> >
> >  static struct bitmap_writer writer;
> >
> > +static inline int bitmap_writer_selected_nr(void)
> > +{
> > +	return writer.selected_nr - writer.pseudo_merges_nr;
> > +}
>
> This function may use a comment to explain what its meaning actually is.
> Like, `bitmap_writer_selected_nr()` is obviously not the same as the
> `selected_nr` of the `bitmap_writer`, which is quite confusing. So why
> do we subtract values and why are there two different `selected_nr`s?

selected_nr is the total number of bitmaps we are writing (including
pseudo-merges), and writer.pseudo_merges_nr is the number of those
bitmaps which are pseudo-merges.

I renamed this function to bitmap_writer_nr_selected_commits() which
should clarify things, let me know if that works!

> [snip]
> > diff --git a/pack-bitmap.h b/pack-bitmap.h
> > index dae2d68a338..ca9acd2f735 100644
> > --- a/pack-bitmap.h
> > +++ b/pack-bitmap.h
> > @@ -21,6 +21,7 @@ struct bitmap_disk_header {
> >  	unsigned char checksum[GIT_MAX_RAWSZ];
> >  };
> >
> > +#define BITMAP_PSEUDO_MERGE (1u<<21)
> >  #define NEEDS_BITMAP (1u<<22)
>
> This flag is already used by "builtin/pack-objects.c", which may be fine.
> But in any case, shouldn't we update "object.h" with both of these flags?

I can't see where in builtin/pack-objects.c this flag is used. The table
in object.h says that bit 21 is used in:

  - list-objects-filter.c
  - builtin/index-pack.c
  - builtin/unpack-objects.c

But I think those are all fine. We don't call unpack-objects from the
bitmap writing paths, and the same is true of index-pack (since we're
writing the pack out directly).

list-objects-filter.c should also be OK, since I am 99% sure that these
two code paths do not collide, but even if they do, that field is only
set on tree objects from the list-objects-filter.c code path, and the
new bits in pack-bitmap.h are only set on commit objects.

Regardless, let me not forget to update the table in object.h! Thanks
for reminding me.

Thanks,
Taylor
Patrick Steinhardt May 10, 2024, 11:47 a.m. UTC | #3
On Mon, May 06, 2024 at 02:48:10PM -0400, Taylor Blau wrote:
> On Mon, May 06, 2024 at 01:52:56PM +0200, Patrick Steinhardt wrote:
> > On Mon, Apr 29, 2024 at 04:43:15PM -0400, Taylor Blau wrote:
> > [snip]
> > > @@ -46,6 +48,11 @@ struct bitmap_writer {
> > >
> > >  static struct bitmap_writer writer;
> > >
> > > +static inline int bitmap_writer_selected_nr(void)
> > > +{
> > > +	return writer.selected_nr - writer.pseudo_merges_nr;
> > > +}
> >
> > This function may use a comment to explain what its meaning actually is.
> > Like, `bitmap_writer_selected_nr()` is obviously not the same as the
> > `selected_nr` of the `bitmap_writer`, which is quite confusing. So why
> > do we subtract values and why are there two different `selected_nr`s?
> 
> selected_nr is the total number of bitmaps we are writing (including
> pseudo-merges), and writer.pseudo_merges_nr is the number of those
> bitmaps which are pseudo-merges.
> 
> I renamed this function to bitmap_writer_nr_selected_commits() which
> should clarify things, let me know if that works!

Yup, that's clearer, thanks.

> > [snip]
> > > diff --git a/pack-bitmap.h b/pack-bitmap.h
> > > index dae2d68a338..ca9acd2f735 100644
> > > --- a/pack-bitmap.h
> > > +++ b/pack-bitmap.h
> > > @@ -21,6 +21,7 @@ struct bitmap_disk_header {
> > >  	unsigned char checksum[GIT_MAX_RAWSZ];
> > >  };
> > >
> > > +#define BITMAP_PSEUDO_MERGE (1u<<21)
> > >  #define NEEDS_BITMAP (1u<<22)
> >
> > This flag is already used by "builtin/pack-objects.c", which may be fine.
> > But in any case, shouldn't we update "object.h" with both of these flags?
> 
> I can't see where in builtin/pack-objects.c this flag is used. The table
> in object.h says that bit 21 is used in:

Yeah, no idea. I must've been seeing ghosts here.

>   - list-objects-filter.c
>   - builtin/index-pack.c
>   - builtin/unpack-objects.c
> 
> But I think those are all fine. We don't call unpack-objects from the
> bitmap writing paths, and the same is true of index-pack (since we're
> writing the pack out directly).
> 
> list-objects-filter.c should also be OK, since I am 99% sure that these
> two code paths do not collide, but even if they do, that field is only
> set on tree objects from the list-objects-filter.c code path, and the
> new bits in pack-bitmap.h are only set on commit objects.

Okay, thanks for the explanation!

Patrick
Jeff King May 13, 2024, 6:42 p.m. UTC | #4
On Mon, Apr 29, 2024 at 04:43:15PM -0400, Taylor Blau wrote:

> diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> index 9bc41a9e145..fef02cd745a 100644
> --- a/pack-bitmap-write.c
> +++ b/pack-bitmap-write.c
> @@ -24,7 +24,7 @@ struct bitmapped_commit {
>  	struct ewah_bitmap *write_as;
>  	int flags;
>  	int xor_offset;
> -	uint32_t commit_pos;
> +	unsigned pseudo_merge : 1;
>  };

The addition of the bit flag here makes sense, but dropping commit_pos
caught me by surprise. But...it looks like that flag is simply unused
cruft even before this patch?

It might be worth noting that in the commit message, or better still,
pulling its removal out to a preparatory patch.

> +static inline int bitmap_writer_selected_nr(void)
> +{
> +	return writer.selected_nr - writer.pseudo_merges_nr;
> +}

OK, so now most spots should use this new function instead of looking at
writer.selected_nr directly. But if anybody accidentally uses the old
field directly, it is presumably disastrous. Is it worth renaming it to
make sure we caught all references?

The downside would be that spots which _do_ want the complete
selected_nr would need to be updated to use the new name. It doesn't
look like there are that many, though. OTOH, that means that it's also
easy to inspect them and see that you covered all of the relevant cases
(as far as I can see). I guess the biggest value in changing the field
name would be catching any topics in flight (or long-running forks).

-Peff
Taylor Blau May 13, 2024, 8:19 p.m. UTC | #5
On Mon, May 13, 2024 at 02:42:46PM -0400, Jeff King wrote:
> On Mon, Apr 29, 2024 at 04:43:15PM -0400, Taylor Blau wrote:
>
> > diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> > index 9bc41a9e145..fef02cd745a 100644
> > --- a/pack-bitmap-write.c
> > +++ b/pack-bitmap-write.c
> > @@ -24,7 +24,7 @@ struct bitmapped_commit {
> >  	struct ewah_bitmap *write_as;
> >  	int flags;
> >  	int xor_offset;
> > -	uint32_t commit_pos;
> > +	unsigned pseudo_merge : 1;
> >  };
>
> The addition of the bit flag here makes sense, but dropping commit_pos
> caught me by surprise. But...it looks like that flag is simply unused
> cruft even before this patch?
>
> It might be worth noting that in the commit message, or better still,
> pulling its removal out to a preparatory patch.

Hah, so this is a funny one :-).

I was following your suggestion to pull out the deletion into its own
patch[^1] and starting to dig out back-references to indicate why it was
safe to remove this field.

But the only reference to commit_pos is from 7cc8f971085 (pack-objects:
implement bitmap writing, 2013-12-21), which is the commit that added
this field in the first place. Looking at:

    $ git log -p -S commit_pos 7cc8f971085 -- pack-bitmap-write.c

doesn't really show us anything interesting, either.

But! There is an array called commit_positions, which I suspected was
for holding the values of commit_pos in the same order as they appear in
the writer.selected array.

So I think the right patch is something like this (which I'll put in the
next round of this series):

--- 8< ---
Subject: [PATCH] pack-bitmap-write.c: move commit_positions into commit_pos
 fields

In 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21), the
bitmapped_commit struct was introduced, including the 'commit_pos'
field, which has been unused ever since its introduction more than a
decade ago.

Instead, we have used the nearby `commit_positions` array leaving the
bitmapped_commit struct with an unused 4-byte field.

We could drop the `commit_pos` field as unused, and continue to store
the values in the auxiliary array. But we could also drop the array and
store the data for each bitmapped_commit struct inside of the structure
itself, which is what this patch does.

In any spot that we previously read `commit_positions[i]`, we can now
instead read `writer.selected[i].commit_pos`. There are a few spots that
need changing as a result:

  - write_selected_commits_v1() is a simple transformation, since we're
    just reading the field. As a result, the function no longer needs an
    explicit argument to pass the commit_positions array.

  - write_lookup_table() also no longer needs the explicit
    commit_positions array passed in as an argument. But it still needs
    to sort an array of indices into the writer.selected array to read
    them in commit_pos order, so table_cmp() is adjusted accordingly.

  - bitmap_writer_finish() no longer needs to allocate, populate, and
    free the commit_positions table. Instead, we can just write the data
    directly into each struct bitmapped_commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap-write.c | 42 ++++++++++++++++--------------------------
 1 file changed, 16 insertions(+), 26 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 473a0fa0d40..26f57e48804 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -679,9 +679,7 @@ static const struct object_id *oid_access(size_t pos, const void *table)
 	return &index[pos]->oid;
 }

-static void write_selected_commits_v1(struct hashfile *f,
-				      uint32_t *commit_positions,
-				      off_t *offsets)
+static void write_selected_commits_v1(struct hashfile *f, off_t *offsets)
 {
 	int i;

@@ -691,7 +689,7 @@ static void write_selected_commits_v1(struct hashfile *f,
 		if (offsets)
 			offsets[i] = hashfile_total(f);

-		hashwrite_be32(f, commit_positions[i]);
+		hashwrite_be32(f, stored->commit_pos);
 		hashwrite_u8(f, stored->xor_offset);
 		hashwrite_u8(f, stored->flags);

@@ -699,23 +697,20 @@ static void write_selected_commits_v1(struct hashfile *f,
 	}
 }

-static int table_cmp(const void *_va, const void *_vb, void *_data)
+static int table_cmp(const void *_va, const void *_vb)
 {
-	uint32_t *commit_positions = _data;
-	uint32_t a = commit_positions[*(uint32_t *)_va];
-	uint32_t b = commit_positions[*(uint32_t *)_vb];
+	struct bitmapped_commit *a = &writer.selected[*(uint32_t *)_va];
+	struct bitmapped_commit *b = &writer.selected[*(uint32_t *)_vb];

-	if (a > b)
+	if (a->commit_pos < b->commit_pos)
+		return -1;
+	else if (a->commit_pos > b->commit_pos)
 		return 1;
-	else if (a < b)
-		return -1;

 	return 0;
 }

-static void write_lookup_table(struct hashfile *f,
-			       uint32_t *commit_positions,
-			       off_t *offsets)
+static void write_lookup_table(struct hashfile *f, off_t *offsets)
 {
 	uint32_t i;
 	uint32_t *table, *table_inv;
@@ -731,7 +726,7 @@ static void write_lookup_table(struct hashfile *f,
 	 * bitmap corresponds to j'th bitmapped commit (among the selected
 	 * commits) in lex order of OIDs.
 	 */
-	QSORT_S(table, writer.selected_nr, table_cmp, commit_positions);
+	QSORT(table, writer.selected_nr, table_cmp);

 	/* table_inv helps us discover that relationship (i'th bitmap
 	 * to j'th commit by j = table_inv[i])
@@ -762,7 +757,7 @@ static void write_lookup_table(struct hashfile *f,
 			xor_row = 0xffffffff;
 		}

-		hashwrite_be32(f, commit_positions[table[i]]);
+		hashwrite_be32(f, writer.selected[table[i]].commit_pos);
 		hashwrite_be64(f, (uint64_t)offsets[table[i]]);
 		hashwrite_be32(f, xor_row);
 	}
@@ -798,7 +793,6 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 	static uint16_t flags = BITMAP_OPT_FULL_DAG;
 	struct strbuf tmp_file = STRBUF_INIT;
 	struct hashfile *f;
-	uint32_t *commit_positions = NULL;
 	off_t *offsets = NULL;
 	uint32_t i;

@@ -823,22 +817,19 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 	if (options & BITMAP_OPT_LOOKUP_TABLE)
 		CALLOC_ARRAY(offsets, index_nr);

-	ALLOC_ARRAY(commit_positions, writer.selected_nr);
-
 	for (i = 0; i < writer.selected_nr; i++) {
 		struct bitmapped_commit *stored = &writer.selected[i];
-		int commit_pos = oid_pos(&stored->commit->object.oid, index, index_nr, oid_access);
+		stored->commit_pos = oid_pos(&stored->commit->object.oid, index,
+					     index_nr, oid_access);

-		if (commit_pos < 0)
+		if (stored->commit_pos < 0)
 			BUG(_("trying to write commit not in index"));
-
-		commit_positions[i] = commit_pos;
 	}

-	write_selected_commits_v1(f, commit_positions, offsets);
+	write_selected_commits_v1(f, offsets);

 	if (options & BITMAP_OPT_LOOKUP_TABLE)
-		write_lookup_table(f, commit_positions, offsets);
+		write_lookup_table(f, offsets);

 	if (options & BITMAP_OPT_HASH_CACHE)
 		write_hash_cache(f, index, index_nr);
@@ -853,6 +844,5 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 		die_errno("unable to rename temporary bitmap file to '%s'", filename);

 	strbuf_release(&tmp_file);
-	free(commit_positions);
 	free(offsets);
 }

--
2.45.0.57.gee4186f79f3

--- >8 ---

> > +static inline int bitmap_writer_selected_nr(void)
> > +{
> > +	return writer.selected_nr - writer.pseudo_merges_nr;
> > +}
>
> OK, so now most spots should use this new function instead of looking at
> writer.selected_nr directly. But if anybody accidentally uses the old
> field directly, it is presumably disastrous. Is it worth renaming it to
> make sure we caught all references?

We only need to check within this file, since the bitmap_writer
structure definition is defined within the pack-bitmap-writer.c
compilation unit.

I took a careful look through the file, and am confident that we touched
all of the spots that needed attention.

Thanks,
Taylor

[^1]: If memory serves, that was my original intention when writing this
  series for the first time, but I must have forgotten when I was
  actually splitting out the individual patches and staged the removal
  alongside the rest of this change.
diff mbox series

Patch

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 9bc41a9e145..fef02cd745a 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -24,7 +24,7 @@  struct bitmapped_commit {
 	struct ewah_bitmap *write_as;
 	int flags;
 	int xor_offset;
-	uint32_t commit_pos;
+	unsigned pseudo_merge : 1;
 };
 
 struct bitmap_writer {
@@ -39,6 +39,8 @@  struct bitmap_writer {
 	struct bitmapped_commit *selected;
 	unsigned int selected_nr, selected_alloc;
 
+	uint32_t pseudo_merges_nr;
+
 	struct progress *progress;
 	int show_progress;
 	unsigned char pack_checksum[GIT_MAX_RAWSZ];
@@ -46,6 +48,11 @@  struct bitmap_writer {
 
 static struct bitmap_writer writer;
 
+static inline int bitmap_writer_selected_nr(void)
+{
+	return writer.selected_nr - writer.pseudo_merges_nr;
+}
+
 void bitmap_writer_init(struct repository *r)
 {
 	writer.bitmaps = kh_init_oid_map();
@@ -120,25 +127,30 @@  void bitmap_writer_build_type_index(struct packing_data *to_pack,
  * Compute the actual bitmaps
  */
 
-static inline void push_bitmapped_commit(struct commit *commit)
+static void bitmap_writer_push_bitmapped_commit(struct commit *commit,
+						unsigned pseudo_merge)
 {
-	int hash_ret;
-	khiter_t hash_pos;
-
 	if (writer.selected_nr >= writer.selected_alloc) {
 		writer.selected_alloc = (writer.selected_alloc + 32) * 2;
 		REALLOC_ARRAY(writer.selected, writer.selected_alloc);
 	}
 
-	hash_pos = kh_put_oid_map(writer.bitmaps, commit->object.oid, &hash_ret);
-	if (!hash_ret)
-		die(_("duplicate entry when writing bitmap index: %s"),
-		    oid_to_hex(&commit->object.oid));
-	kh_value(writer.bitmaps, hash_pos) = NULL;
+	if (!pseudo_merge) {
+		int hash_ret;
+		khiter_t hash_pos = kh_put_oid_map(writer.bitmaps,
+						   commit->object.oid,
+						   &hash_ret);
+
+		if (!hash_ret)
+			die(_("duplicate entry when writing bitmap index: %s"),
+			    oid_to_hex(&commit->object.oid));
+		kh_value(writer.bitmaps, hash_pos) = NULL;
+	}
 
 	writer.selected[writer.selected_nr].commit = commit;
 	writer.selected[writer.selected_nr].bitmap = NULL;
 	writer.selected[writer.selected_nr].flags = 0;
+	writer.selected[writer.selected_nr].pseudo_merge = pseudo_merge;
 
 	writer.selected_nr++;
 }
@@ -168,16 +180,20 @@  static void compute_xor_offsets(void)
 
 	while (next < writer.selected_nr) {
 		struct bitmapped_commit *stored = &writer.selected[next];
-
 		int best_offset = 0;
 		struct ewah_bitmap *best_bitmap = stored->bitmap;
 		struct ewah_bitmap *test_xor;
 
+		if (stored->pseudo_merge)
+			goto next;
+
 		for (i = 1; i <= MAX_XOR_OFFSET_SEARCH; ++i) {
 			int curr = next - i;
 
 			if (curr < 0)
 				break;
+			if (writer.selected[curr].pseudo_merge)
+				continue;
 
 			test_xor = ewah_pool_new();
 			ewah_xor(writer.selected[curr].bitmap, stored->bitmap, test_xor);
@@ -193,6 +209,7 @@  static void compute_xor_offsets(void)
 			}
 		}
 
+next:
 		stored->xor_offset = best_offset;
 		stored->write_as = best_bitmap;
 
@@ -205,7 +222,8 @@  struct bb_commit {
 	struct bitmap *commit_mask;
 	struct bitmap *bitmap;
 	unsigned selected:1,
-		 maximal:1;
+		 maximal:1,
+		 pseudo_merge:1;
 	unsigned idx; /* within selected array */
 };
 
@@ -243,17 +261,18 @@  static void bitmap_builder_init(struct bitmap_builder *bb,
 	revs.first_parent_only = 1;
 
 	for (i = 0; i < writer->selected_nr; i++) {
-		struct commit *c = writer->selected[i].commit;
-		struct bb_commit *ent = bb_data_at(&bb->data, c);
+		struct bitmapped_commit *bc = &writer->selected[i];
+		struct bb_commit *ent = bb_data_at(&bb->data, bc->commit);
 
 		ent->selected = 1;
 		ent->maximal = 1;
+		ent->pseudo_merge = bc->pseudo_merge;
 		ent->idx = i;
 
 		ent->commit_mask = bitmap_new();
 		bitmap_set(ent->commit_mask, i);
 
-		add_pending_object(&revs, &c->object, "");
+		add_pending_object(&revs, &bc->commit->object, "");
 	}
 
 	if (prepare_revision_walk(&revs))
@@ -430,8 +449,13 @@  static int fill_bitmap_commit(struct bb_commit *ent,
 		struct commit *c = prio_queue_get(queue);
 
 		if (old_bitmap && mapping) {
-			struct ewah_bitmap *old = bitmap_for_commit(old_bitmap, c);
+			struct ewah_bitmap *old;
 			struct bitmap *remapped = bitmap_new();
+
+			if (commit->object.flags & BITMAP_PSEUDO_MERGE)
+				old = NULL;
+			else
+				old = bitmap_for_commit(old_bitmap, c);
 			/*
 			 * If this commit has an old bitmap, then translate that
 			 * bitmap and add its bits to this one. No need to walk
@@ -450,12 +474,14 @@  static int fill_bitmap_commit(struct bb_commit *ent,
 		 * Mark ourselves and queue our tree. The commit
 		 * walk ensures we cover all parents.
 		 */
-		pos = find_object_pos(&c->object.oid, &found);
-		if (!found)
-			return -1;
-		bitmap_set(ent->bitmap, pos);
-		prio_queue_put(tree_queue,
-			       repo_get_commit_tree(the_repository, c));
+		if (!(c->object.flags & BITMAP_PSEUDO_MERGE)) {
+			pos = find_object_pos(&c->object.oid, &found);
+			if (!found)
+				return -1;
+			bitmap_set(ent->bitmap, pos);
+			prio_queue_put(tree_queue,
+				       repo_get_commit_tree(the_repository, c));
+		}
 
 		for (p = c->parents; p; p = p->next) {
 			pos = find_object_pos(&p->item->object.oid, &found);
@@ -483,6 +509,9 @@  static void store_selected(struct bb_commit *ent, struct commit *commit)
 
 	stored->bitmap = bitmap_to_ewah(ent->bitmap);
 
+	if (ent->pseudo_merge)
+		return;
+
 	hash_pos = kh_get_oid_map(writer.bitmaps, commit->object.oid);
 	if (hash_pos == kh_end(writer.bitmaps))
 		die(_("attempted to store non-selected commit: '%s'"),
@@ -612,7 +641,7 @@  void bitmap_writer_select_commits(struct commit **indexed_commits,
 
 	if (indexed_commits_nr < 100) {
 		for (i = 0; i < indexed_commits_nr; ++i)
-			push_bitmapped_commit(indexed_commits[i]);
+			bitmap_writer_push_bitmapped_commit(indexed_commits[i], 0);
 		return;
 	}
 
@@ -645,7 +674,7 @@  void bitmap_writer_select_commits(struct commit **indexed_commits,
 			}
 		}
 
-		push_bitmapped_commit(chosen);
+		bitmap_writer_push_bitmapped_commit(chosen, 0);
 
 		i += next + 1;
 		display_progress(writer.progress, i);
@@ -683,8 +712,11 @@  static void write_selected_commits_v1(struct hashfile *f,
 {
 	int i;
 
-	for (i = 0; i < writer.selected_nr; ++i) {
+	for (i = 0; i < bitmap_writer_selected_nr(); ++i) {
 		struct bitmapped_commit *stored = &writer.selected[i];
+		if (stored->pseudo_merge)
+			BUG("unexpected pseudo-merge among selected: %s",
+			    oid_to_hex(&stored->commit->object.oid));
 
 		if (offsets)
 			offsets[i] = hashfile_total(f);
@@ -718,10 +750,10 @@  static void write_lookup_table(struct hashfile *f,
 	uint32_t i;
 	uint32_t *table, *table_inv;
 
-	ALLOC_ARRAY(table, writer.selected_nr);
-	ALLOC_ARRAY(table_inv, writer.selected_nr);
+	ALLOC_ARRAY(table, bitmap_writer_selected_nr());
+	ALLOC_ARRAY(table_inv, bitmap_writer_selected_nr());
 
-	for (i = 0; i < writer.selected_nr; i++)
+	for (i = 0; i < bitmap_writer_selected_nr(); i++)
 		table[i] = i;
 
 	/*
@@ -729,16 +761,16 @@  static void write_lookup_table(struct hashfile *f,
 	 * bitmap corresponds to j'th bitmapped commit (among the selected
 	 * commits) in lex order of OIDs.
 	 */
-	QSORT_S(table, writer.selected_nr, table_cmp, commit_positions);
+	QSORT_S(table, bitmap_writer_selected_nr(), table_cmp, commit_positions);
 
 	/* table_inv helps us discover that relationship (i'th bitmap
 	 * to j'th commit by j = table_inv[i])
 	 */
-	for (i = 0; i < writer.selected_nr; i++)
+	for (i = 0; i < bitmap_writer_selected_nr(); i++)
 		table_inv[table[i]] = i;
 
 	trace2_region_enter("pack-bitmap-write", "writing_lookup_table", the_repository);
-	for (i = 0; i < writer.selected_nr; i++) {
+	for (i = 0; i < bitmap_writer_selected_nr(); i++) {
 		struct bitmapped_commit *selected = &writer.selected[table[i]];
 		uint32_t xor_offset = selected->xor_offset;
 		uint32_t xor_row;
@@ -809,7 +841,7 @@  void bitmap_writer_finish(struct pack_idx_entry **index,
 	memcpy(header.magic, BITMAP_IDX_SIGNATURE, sizeof(BITMAP_IDX_SIGNATURE));
 	header.version = htons(default_version);
 	header.options = htons(flags | options);
-	header.entry_count = htonl(writer.selected_nr);
+	header.entry_count = htonl(bitmap_writer_selected_nr());
 	hashcpy(header.checksum, writer.pack_checksum);
 
 	hashwrite(f, &header, sizeof(header) - GIT_MAX_RAWSZ + the_hash_algo->rawsz);
@@ -821,9 +853,9 @@  void bitmap_writer_finish(struct pack_idx_entry **index,
 	if (options & BITMAP_OPT_LOOKUP_TABLE)
 		CALLOC_ARRAY(offsets, index_nr);
 
-	ALLOC_ARRAY(commit_positions, writer.selected_nr);
+	ALLOC_ARRAY(commit_positions, bitmap_writer_selected_nr());
 
-	for (i = 0; i < writer.selected_nr; i++) {
+	for (i = 0; i < bitmap_writer_selected_nr(); i++) {
 		struct bitmapped_commit *stored = &writer.selected[i];
 		int commit_pos = oid_pos(&stored->commit->object.oid, index, index_nr, oid_access);
 
diff --git a/pack-bitmap.h b/pack-bitmap.h
index dae2d68a338..ca9acd2f735 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -21,6 +21,7 @@  struct bitmap_disk_header {
 	unsigned char checksum[GIT_MAX_RAWSZ];
 };
 
+#define BITMAP_PSEUDO_MERGE (1u<<21)
 #define NEEDS_BITMAP (1u<<22)
 
 /*