Message ID | 36deaad366d66d10b96755dd6969bfe51123a2d4.1605123652.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | pack-bitmap: bitmap generation improvements | expand |
Taylor Blau <me@ttaylorr.com> writes: > When the buffer size is exactly 1, we fail to grow it properly, since > the integer truncation means that 1 * 3 / 2 = 1. This can cause a bad > write on the line below. When the buffer_size is exactly (alloc_size - 1), we can fit the new element at the last word in the buffer array, but we still grow. Is this because we anticipate that we would need to add more soon? > Bandaid this by first padding the buffer by 16, and then growing it. > This still allows old blocks to fit into new ones, but fixes the case > where the block size equals 1. Adding 16 unconditionally is not "to pad". If somebody really wants "to pad", a likely implementation would be that the size resulting from some computation (e.g. multiplying by 1.5) is round up to a multiple of some number, than rounding up the original number before multiplying it by 1.5, so the use of that verb in the explanation did not help me understand what is going on. Having said that, I see you used the word "bandaid" to signal that we shouldn't worry about this being optimal or even correct and we should be happy as long as it is not wrong ;-), but is there any reason behind this 16 (as opposed to picking, say, 8 or 31), or is that pulled out of thin air? I think this probably mimics what alloc_nr() computes for ALLOC_GROW(). I wonder why buffer_grow() cannot be built around ALLOC_GROW() instead? Nothing in the code is wrong per-se, but just what I noticed while re-reading the patch. Thanks. > Co-authored-by: Jeff King <peff@peff.net> > Signed-off-by: Jeff King <peff@peff.net> > Signed-off-by: Taylor Blau <me@ttaylorr.com> > --- > ewah/ewah_bitmap.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c > index d59b1afe3d..3fae04ad00 100644 > --- a/ewah/ewah_bitmap.c > +++ b/ewah/ewah_bitmap.c > @@ -45,7 +45,7 @@ static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size) > static inline void buffer_push(struct ewah_bitmap *self, eword_t value) > { > if (self->buffer_size + 1 >= self->alloc_size) > - buffer_grow(self, self->buffer_size * 3 / 2); > + buffer_grow(self, (self->buffer_size + 16) * 3 / 2); > > self->buffer[self->buffer_size++] = value; > }
On Sun, Nov 22, 2020 at 11:36:17AM -0800, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > When the buffer size is exactly 1, we fail to grow it properly, since > > the integer truncation means that 1 * 3 / 2 = 1. This can cause a bad > > write on the line below. > > When the buffer_size is exactly (alloc_size - 1), we can fit the new > element at the last word in the buffer array, but we still grow. Is > this because we anticipate that we would need to add more soon? Right; the check 'if (self->buffer_size + 1 >= self->alloc_size)' could probably be written as a strict inequality, but that check dates back to the original ewah implementation that Vicent added. But, that is not quite the point of this patch: instead we want to stop the integer math on that line from preventing us from growing the buffer. I think that this paragraph would be clarified by adding "and we need to grow" to the end of "when the buffer size is exactly 1". > > Bandaid this by first padding the buffer by 16, and then growing it. > > This still allows old blocks to fit into new ones, but fixes the case > > where the block size equals 1. > > Adding 16 unconditionally is not "to pad". If somebody really wants > "to pad", a likely implementation would be that the size resulting > from some computation (e.g. multiplying by 1.5) is round up to a > multiple of some number, than rounding up the original number before > multiplying it by 1.5, so the use of that verb in the explanation > did not help me understand what is going on. > > Having said that, I see you used the word "bandaid" to signal that > we shouldn't worry about this being optimal or even correct and we > should be happy as long as it is not wrong ;-), but is there any > reason behind this 16 (as opposed to picking, say, 8 or 31), or is > that pulled out of thin air? Any phrase that more accurately states what's going on is fine by me, but... > I think this probably mimics what alloc_nr() computes for ALLOC_GROW(). > I wonder why buffer_grow() cannot be built around ALLOC_GROW() instead? I think that we probably could just use ALLOC_GROW() as you suggest. Funny enough, reading through GitHub's chat logs, apparently this is something that Peff and I talked about. So, 16 probably came from alloc_nr(), but we probably stopped short of realizing that we could just use ALLOC_GROW as-is. So, maybe something along the lines of: diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c index 3fae04ad00..9effcc0877 100644 --- a/ewah/ewah_bitmap.c +++ b/ewah/ewah_bitmap.c @@ -19,6 +19,7 @@ #include "git-compat-util.h" #include "ewok.h" #include "ewok_rlw.h" +#include "cache.h" static inline size_t min_size(size_t a, size_t b) { @@ -33,12 +34,7 @@ static inline size_t max_size(size_t a, size_t b) static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size) { size_t rlw_offset = (uint8_t *)self->rlw - (uint8_t *)self->buffer; - - if (self->alloc_size >= new_size) - return; - - self->alloc_size = new_size; - REALLOC_ARRAY(self->buffer, self->alloc_size); + ALLOC_GROW(self->buffer, new_size, self->alloc_size); self->rlw = self->buffer + (rlw_offset / sizeof(eword_t)); } > Nothing in the code is wrong per-se, but just what I noticed while > re-reading the patch. > > Thanks. Thanks, Taylor
On Mon, Nov 23, 2020 at 11:22:04AM -0500, Taylor Blau wrote: > > I think this probably mimics what alloc_nr() computes for ALLOC_GROW(). > > I wonder why buffer_grow() cannot be built around ALLOC_GROW() instead? > > I think that we probably could just use ALLOC_GROW() as you suggest. > Funny enough, reading through GitHub's chat logs, apparently this is > something that Peff and I talked about. So, 16 probably came from > alloc_nr(), but we probably stopped short of realizing that we could > just use ALLOC_GROW as-is. That would probably be OK. It's a bit more aggressive, which could matter if you have a large number of very small bitmaps. My original goal of the "grow less aggressively" patch was to keep memory usage down, knowing that I was going to be holding a lot of bitmaps in memory at once. But even with micro-optimizations like this, it turned out to be far too big in practice (and hence Stolee's work on top to reduce the total number we hold at once). > @@ -33,12 +34,7 @@ static inline size_t max_size(size_t a, size_t b) > static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size) > { > size_t rlw_offset = (uint8_t *)self->rlw - (uint8_t *)self->buffer; > - > - if (self->alloc_size >= new_size) > - return; > - > - self->alloc_size = new_size; > - REALLOC_ARRAY(self->buffer, self->alloc_size); > + ALLOC_GROW(self->buffer, new_size, self->alloc_size); > self->rlw = self->buffer + (rlw_offset / sizeof(eword_t)); > } I think the real test would be measuring the peak heap of the series as you posted it in v2, and this version replacing this patch (and the "grow less aggressively" one) with ALLOC_GROW(). On something big, like repacking all of the torvalds/linux or git/git fork networks. If there's no appreciable difference, then definitely I think it's worth the simplicity of reusing ALLOC_GROW(). -Peff
On Mon, Nov 23, 2020 at 09:48:22PM -0500, Jeff King wrote: > > I think that we probably could just use ALLOC_GROW() as you suggest. > > Funny enough, reading through GitHub's chat logs, apparently this is > > something that Peff and I talked about. So, 16 probably came from > > alloc_nr(), but we probably stopped short of realizing that we could > > just use ALLOC_GROW as-is. > > That would probably be OK. It's a bit more aggressive, which could > matter if you have a large number of very small bitmaps. My original > goal of the "grow less aggressively" patch was to keep memory usage > down, knowing that I was going to be holding a lot of bitmaps in memory > at once. But even with micro-optimizations like this, it turned out to > be far too big in practice (and hence Stolee's work on top to reduce the > total number we hold at once). Oh, sorry, I was mixing this patch up with patches 6 and 7, which touch buffer_grow(). This is a totally separate spot, and this is a pure bug-fix. I think the main reason we didn't use ALLOC_GROW() here in the beginning is that the ewah code was originally designed to be a separate library (a port of the java ewah library), and didn't depend on Git code. These days we pull in xmalloc, etc, so we should be fine to use ALLOC_GROW(). Likewise... > I think the real test would be measuring the peak heap of the series as > you posted it in v2, and this version replacing this patch (and the > "grow less aggressively" one) with ALLOC_GROW(). On something big, like > repacking all of the torvalds/linux or git/git fork networks. > > If there's no appreciable difference, then definitely I think it's worth > the simplicity of reusing ALLOC_GROW(). All of this is nonsense (though it does apply to the question of using ALLOC_GROW() in bitmap_grow(), via patch 7). We have many fewer ewah bitmaps in memory at one time, so I don't think it's worth micro-managing a few extra bytes of growth. Using ALLOC_GROW() for this case would be fine. -Peff
On Mon, Nov 23, 2020 at 09:51:41PM -0500, Jeff King wrote: > On Mon, Nov 23, 2020 at 09:48:22PM -0500, Jeff King wrote: > > > > I think that we probably could just use ALLOC_GROW() as you suggest. > > > Funny enough, reading through GitHub's chat logs, apparently this is > > > something that Peff and I talked about. So, 16 probably came from > > > alloc_nr(), but we probably stopped short of realizing that we could > > > just use ALLOC_GROW as-is. > > > > That would probably be OK. It's a bit more aggressive, which could > > matter if you have a large number of very small bitmaps. My original > > goal of the "grow less aggressively" patch was to keep memory usage > > down, knowing that I was going to be holding a lot of bitmaps in memory > > at once. But even with micro-optimizations like this, it turned out to > > be far too big in practice (and hence Stolee's work on top to reduce the > > total number we hold at once). > > Oh, sorry, I was mixing this patch up with patches 6 and 7, which touch > buffer_grow(). This is a totally separate spot, and this is a pure > bug-fix. > > I think the main reason we didn't use ALLOC_GROW() here in the beginning > is that the ewah code was originally designed to be a separate library > (a port of the java ewah library), and didn't depend on Git code. > > These days we pull in xmalloc, etc, so we should be fine to use > ALLOC_GROW(). > > Likewise... > > > I think the real test would be measuring the peak heap of the series as > > you posted it in v2, and this version replacing this patch (and the > > "grow less aggressively" one) with ALLOC_GROW(). On something big, like > > repacking all of the torvalds/linux or git/git fork networks. > > > > If there's no appreciable difference, then definitely I think it's worth > > the simplicity of reusing ALLOC_GROW(). > > All of this is nonsense (though it does apply to the question of using > ALLOC_GROW() in bitmap_grow(), via patch 7). You and I timed this a week or two ago, but I only just returned to this topic today. Switching to ALLOC_GROW() doesn't affect the final memory usage at all, so I changed patch 7 up to use that instead of more or less open-coding alloc_nr(). > -Peff Thanks, Taylor
diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c index d59b1afe3d..3fae04ad00 100644 --- a/ewah/ewah_bitmap.c +++ b/ewah/ewah_bitmap.c @@ -45,7 +45,7 @@ static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size) static inline void buffer_push(struct ewah_bitmap *self, eword_t value) { if (self->buffer_size + 1 >= self->alloc_size) - buffer_grow(self, self->buffer_size * 3 / 2); + buffer_grow(self, (self->buffer_size + 16) * 3 / 2); self->buffer[self->buffer_size++] = value; }