diff mbox series

[07/23] ewah: make bitmap growth less aggressive

Message ID c7db594fae4d0447a55a92e830475d9bc418ae7f.1605123652.git.me@ttaylorr.com (mailing list archive)
State Superseded
Headers show
Series pack-bitmap: bitmap generation improvements | expand

Commit Message

Taylor Blau Nov. 11, 2020, 7:42 p.m. UTC
From: Jeff King <peff@peff.net>

If you ask to set a bit in the Nth word and we haven't yet allocated
that many slots in our array, we'll increase the bitmap size to 2*N.
This means we might frequently end up with bitmaps that are twice the
necessary size (as soon as you ask for the biggest bit, we'll size up to
twice that).

But if we just allocate as many words as were asked for, we may not grow
fast enough. The worst case there is setting bit 0, then 1, etc. Each
time we grow we'd just extend by one more word, giving us linear
reallocations (and quadratic memory copies).

Let's combine those by allocating the maximum of:

 - what the caller asked for

 - a geometric increase in existing size; we'll switch to 3/2 instead of
   2 here. That's less aggressive and may help avoid fragmenting memory
   (N + 3N/2 > 9N/4, so old chunks can be reused as we scale up).

Our worst case is still 3/2N wasted bits (you set bit N-1, then setting
bit N causes us to grow by 3/2), but our average should be much better.

This isn't usually that big a deal, but it will matter as we shift the
reachability bitmap generation code to store more bitmaps in memory.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 ewah/bitmap.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Junio C Hamano Nov. 22, 2020, 8:32 p.m. UTC | #1
Taylor Blau <me@ttaylorr.com> writes:

>  - a geometric increase in existing size; we'll switch to 3/2 instead of
>    2 here. That's less aggressive and may help avoid fragmenting memory
>    (N + 3N/2 > 9N/4, so old chunks can be reused as we scale up).

I am sure this is something obvious to bitmap folks, but where does
9N/4 come from (I get that the left-hand-side of the comparison is
the memory necessary to hold both the old and the new copy while
reallocating the words[] array)?

Thanks.
Taylor Blau Nov. 23, 2020, 4:49 p.m. UTC | #2
On Sun, Nov 22, 2020 at 12:32:01PM -0800, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> >  - a geometric increase in existing size; we'll switch to 3/2 instead of
> >    2 here. That's less aggressive and may help avoid fragmenting memory
> >    (N + 3N/2 > 9N/4, so old chunks can be reused as we scale up).
>
> I am sure this is something obvious to bitmap folks, but where does
> 9N/4 come from (I get that the left-hand-side of the comparison is
> the memory necessary to hold both the old and the new copy while
> reallocating the words[] array)?

I thought that I was in the group of "bitmap folks", but since it's not
obvious to me either, I guess I'll have to hand in my bitmap folks
membership card ;).

Peff: where does 9N/4 come from? On a similar note: we could certainly
use ALLOC_GROW here, too, but it would change the behavior slightly (by
using alloc_nr()'s "add-16-first" behavior). Maybe we should be using
it, but I'll defer to your judgement.

> Thanks.

Thanks,
Taylor
Jeff King Nov. 24, 2020, 3 a.m. UTC | #3
On Mon, Nov 23, 2020 at 11:49:49AM -0500, Taylor Blau wrote:

> On Sun, Nov 22, 2020 at 12:32:01PM -0800, Junio C Hamano wrote:
> > Taylor Blau <me@ttaylorr.com> writes:
> >
> > >  - a geometric increase in existing size; we'll switch to 3/2 instead of
> > >    2 here. That's less aggressive and may help avoid fragmenting memory
> > >    (N + 3N/2 > 9N/4, so old chunks can be reused as we scale up).
> >
> > I am sure this is something obvious to bitmap folks, but where does
> > 9N/4 come from (I get that the left-hand-side of the comparison is
> > the memory necessary to hold both the old and the new copy while
> > reallocating the words[] array)?
> 
> I thought that I was in the group of "bitmap folks", but since it's not
> obvious to me either, I guess I'll have to hand in my bitmap folks
> membership card ;).
> 
> Peff: where does 9N/4 come from?

it is not a bitmap thing at all. We are growing a buffer, so if we
continually multiply it by 3/2, then our sequence of sizes is:

  - before growth: N
  - after 1 growth: 3N/2
  - after 2 growths: 9N/4

Meaning we can fit the third chunk into the memory vacated by the second
two. Whereas with a factor of, say 2:

  - before growth: N
  - after 1 growth: 2N
  - after 2 growth: 4N

which does not fit, and fragments your memory.

There's a slight lie there, which is that you'll typically still hold
the growth G-1 while doing growth G (after all, that is where you will
copy the data from). But it still works out that you eventually get to
use old chunks. The breakeven point is actually the golden ratio, but a)
it's irrational and b) it probably makes sense to give some slop for
malloc chunk overhead. 1.6 would probably be fine, too, though. :)

> On a similar note: we could certainly
> use ALLOC_GROW here, too, but it would change the behavior slightly (by
> using alloc_nr()'s "add-16-first" behavior). Maybe we should be using
> it, but I'll defer to your judgement.

That would be OK, modulo the measurement question I asked in the other
(wrong) part of the thread.

-Peff
Junio C Hamano Nov. 24, 2020, 8:11 p.m. UTC | #4
Jeff King <peff@peff.net> writes:

> On Mon, Nov 23, 2020 at 11:49:49AM -0500, Taylor Blau wrote:
>
>> On Sun, Nov 22, 2020 at 12:32:01PM -0800, Junio C Hamano wrote:
>> > Taylor Blau <me@ttaylorr.com> writes:
>> >
>> > >  - a geometric increase in existing size; we'll switch to 3/2 instead of
>> > >    2 here. That's less aggressive and may help avoid fragmenting memory
>> > >    (N + 3N/2 > 9N/4, so old chunks can be reused as we scale up).
>> >
>> > I am sure this is something obvious to bitmap folks, but where does
>> > 9N/4 come from (I get that the left-hand-side of the comparison is
>> > the memory necessary to hold both the old and the new copy while
>> > reallocating the words[] array)?
>> 
>> I thought that I was in the group of "bitmap folks", but since it's not
>> obvious to me either, I guess I'll have to hand in my bitmap folks
>> membership card ;).
>> 
>> Peff: where does 9N/4 come from?
>
> it is not a bitmap thing at all. We are growing a buffer, so if we
> continually multiply it by 3/2, then our sequence of sizes is:
>
>   - before growth: N
>   - after 1 growth: 3N/2
>   - after 2 growths: 9N/4

AH, OK.  I feel stupid not to have thought of this myself X-<.

Thanks.
diff mbox series

Patch

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index 7c1ecfa6fd..43a59d7fed 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -39,7 +39,9 @@  static void bitmap_grow(struct bitmap *self, size_t word_alloc)
 {
 	if (word_alloc > self->word_alloc) {
 		size_t old_size = self->word_alloc;
-		self->word_alloc = word_alloc * 2;
+		self->word_alloc = old_size * 3 / 2;
+		if (word_alloc > self->word_alloc)
+			self->word_alloc = word_alloc;
 		REALLOC_ARRAY(self->words, self->word_alloc);
 		memset(self->words + old_size, 0x0,
 			(self->word_alloc - old_size) * sizeof(eword_t));