Message ID | 1379af2e9d271b501ef3942398e7f159a9c77973.1656440978.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | pack-objects.h: remove outdated pahole results | expand |
On 6/28/2022 2:30 PM, Taylor Blau wrote: > The size and padding of `struct object_entry` is an important factor in > determining the memory usage of `pack-objects`. For this reason, > 3b13a5f263 (pack-objects: reorder members to shrink struct object_entry, > 2018-04-14) added a comment containing some information from pahole > indicating the size and padding of that struct. > > Unfortunately, this comment hasn't been updated since 9ac3f0e5b3 > (pack-objects: fix performance issues on packing large deltas, > 2018-07-22), despite the size of this struct changing many times since > that commit. ... > For that reason, eliminate the confusion by removing the comment > altogether. > - > - /* > - * pahole results on 64-bit linux (gcc and clang) > - * > - * size: 80, bit_padding: 9 bits > - * > - * and on 32-bit (gcc) > - * > - * size: 76, bit_padding: 9 bits > - */ > }; I will shed no tears over this being gone. Thanks! -Stolee
On 2022-06-28 at 18:30:20, Taylor Blau wrote: > The size and padding of `struct object_entry` is an important factor in > determining the memory usage of `pack-objects`. For this reason, > 3b13a5f263 (pack-objects: reorder members to shrink struct object_entry, > 2018-04-14) added a comment containing some information from pahole > indicating the size and padding of that struct. > > Unfortunately, this comment hasn't been updated since 9ac3f0e5b3 > (pack-objects: fix performance issues on packing large deltas, > 2018-07-22), despite the size of this struct changing many times since > that commit. > > To see just how often the size of object_entry changes, I skimmed the > first-parent history with this script: > > for sha in $(git rev-list --first-parent --reverse 9ac3f0e..) > do > echo -n "$sha " > git checkout -q $sha > make -s pack-objects.o 2>/dev/null > pahole -C object_entry pack-objects.o | sed -n \ > -e 's/\/\* size: \([0-9]*\).*/size \1/p' \ > -e 's/\/\*.*padding: \([0-9]*\).*/padding \1/p' | xargs > done | uniq -f1 > > In between each merge, the size of object_entry changes too often to > record every instance here. But the important merges (along with their > corresponding sizes and bit paddings) in chronological order are: > > ad635e82d6 (Merge branch 'nd/pack-objects-pack-struct', 2018-05-23) size 80 padding 4 > 29d9e3e2c4 (Merge branch 'nd/pack-deltify-regression-fix', 2018-08-22) size 80 padding 9 > 3ebdef2e1b (Merge branch 'jk/pack-delta-reuse-with-bitmap', 2018-09-17) size 80 padding 8 > 33e4ae9c50 (Merge branch 'bc/sha-256', 2019-01-29) size 96 padding 8 > > (indicating that the current size of the struct is 96 bytes, with 8 > padding bits). > > Even though this comment was written in a good spirit, it is updated > infrequently enough that is serves to confuse rather than to encourage I think you wanted to say, "that it serves:. > contributors to update the appropriate values when the modify the > definition of object_entry. > > For that reason, eliminate the confusion by removing the comment > altogether. > > Signed-off-by: Taylor Blau <me@ttaylorr.com> I agree with your rationale and that we should remove this.
On Tue, Jun 28, 2022 at 09:04:27PM +0000, brian m. carlson wrote: > > Even though this comment was written in a good spirit, it is updated > > infrequently enough that is serves to confuse rather than to encourage > > I think you wanted to say, "that it serves:. Oops. Thanks for pointing it out. > > contributors to update the appropriate values when the modify the > > definition of object_entry. > > > > For that reason, eliminate the confusion by removing the comment > > altogether. > > > > Signed-off-by: Taylor Blau <me@ttaylorr.com> > > I agree with your rationale and that we should remove this. I'm glad that you agree. Thanks, Taylor
On Tue, Jun 28, 2022 at 02:30:20PM -0400, Taylor Blau wrote: > Even though this comment was written in a good spirit, it is updated > infrequently enough that is serves to confuse rather than to encourage > contributors to update the appropriate values when the modify the > definition of object_entry. > > For that reason, eliminate the confusion by removing the comment > altogether. I agree the actual numbers aren't helping anybody. We _could_ leave a comment that says "we store a lot of these in memory; be careful of where and how you add new fields to avoid increasing the struct size". And then people can run "pahole" before and after their changes. But then that is also true of other structs (like "struct object"), and we do not bother there. So it probably is fine not to annotate this specifically. Speaking of which, I suspect quite a lot of memory could be saved if "pack-objects --revs" freed the object structs it allocates during its traversal. Unless we're generating bitmaps, I don't think they get used again after the initial packing list is generated. At peak you'd still be storing all of the object_entry structs alongside the objects as you finish the traversal, but it wouldn't overlap with any memory used for the delta search, and of course we'd be at that peak for a much smaller time. Not a blocker for your patch obviously, but maybe a fun experiment in an adjacent area. Possibly even an ambitious #leftoverbits opportunity. :) -Peff
On Fri, Jul 01, 2022 at 02:16:26PM -0400, Jeff King wrote: > On Tue, Jun 28, 2022 at 02:30:20PM -0400, Taylor Blau wrote: > > > Even though this comment was written in a good spirit, it is updated > > infrequently enough that is serves to confuse rather than to encourage > > contributors to update the appropriate values when the modify the > > definition of object_entry. > > > > For that reason, eliminate the confusion by removing the comment > > altogether. > > I agree the actual numbers aren't helping anybody. We _could_ leave a > comment that says "we store a lot of these in memory; be careful of > where and how you add new fields to avoid increasing the struct size". > And then people can run "pahole" before and after their changes. > > But then that is also true of other structs (like "struct object"), and > we do not bother there. So it probably is fine not to annotate this > specifically. We have such a comment at the very type of the block comment above `struct object_entry`'s definition: "The size of struct nearly determines pack-object's memory consumption. This struct is packed tight for that reason. When you add or reorder something in this struct, think a bit about this". thanks to Duy back in 3b13a5f263 (pack-objects: reorder members to shrink struct object_entry, 2018-04-14). > Speaking of which, I suspect quite a lot of memory could be saved if > "pack-objects --revs" freed the object structs it allocates during its > traversal. Unless we're generating bitmaps, I don't think they get used > again after the initial packing list is generated. At peak you'd > still be storing all of the object_entry structs alongside the objects > as you finish the traversal, but it wouldn't overlap with any memory > used for the delta search, and of course we'd be at that peak for a much > smaller time. > > Not a blocker for your patch obviously, but maybe a fun experiment in an > adjacent area. Possibly even an ambitious #leftoverbits opportunity. :) Challenge accepted! ;-) Thanks, Taylor
On Fri, Jul 01, 2022 at 03:48:19PM -0400, Taylor Blau wrote: > > I agree the actual numbers aren't helping anybody. We _could_ leave a > > comment that says "we store a lot of these in memory; be careful of > > where and how you add new fields to avoid increasing the struct size". > > And then people can run "pahole" before and after their changes. > > > > But then that is also true of other structs (like "struct object"), and > > we do not bother there. So it probably is fine not to annotate this > > specifically. > > We have such a comment at the very type of the block comment above > `struct object_entry`'s definition: > > "The size of struct nearly determines pack-object's memory > consumption. This struct is packed tight for that reason. When you > add or reorder something in this struct, think a bit about this". > > thanks to Duy back in 3b13a5f263 (pack-objects: reorder members to > shrink struct object_entry, 2018-04-14). Oh, indeed. Then I withdraw my (non-)complaint. :) -Peff
diff --git a/pack-objects.h b/pack-objects.h index 393b9db546..579476687c 100644 --- a/pack-objects.h +++ b/pack-objects.h @@ -116,16 +116,6 @@ struct object_entry { unsigned dfs_state:OE_DFS_STATE_BITS; unsigned depth:OE_DEPTH_BITS; unsigned ext_base:1; /* delta_idx points outside packlist */ - - /* - * pahole results on 64-bit linux (gcc and clang) - * - * size: 80, bit_padding: 9 bits - * - * and on 32-bit (gcc) - * - * size: 76, bit_padding: 9 bits - */ }; struct packing_data {
The size and padding of `struct object_entry` is an important factor in determining the memory usage of `pack-objects`. For this reason, 3b13a5f263 (pack-objects: reorder members to shrink struct object_entry, 2018-04-14) added a comment containing some information from pahole indicating the size and padding of that struct. Unfortunately, this comment hasn't been updated since 9ac3f0e5b3 (pack-objects: fix performance issues on packing large deltas, 2018-07-22), despite the size of this struct changing many times since that commit. To see just how often the size of object_entry changes, I skimmed the first-parent history with this script: for sha in $(git rev-list --first-parent --reverse 9ac3f0e..) do echo -n "$sha " git checkout -q $sha make -s pack-objects.o 2>/dev/null pahole -C object_entry pack-objects.o | sed -n \ -e 's/\/\* size: \([0-9]*\).*/size \1/p' \ -e 's/\/\*.*padding: \([0-9]*\).*/padding \1/p' | xargs done | uniq -f1 In between each merge, the size of object_entry changes too often to record every instance here. But the important merges (along with their corresponding sizes and bit paddings) in chronological order are: ad635e82d6 (Merge branch 'nd/pack-objects-pack-struct', 2018-05-23) size 80 padding 4 29d9e3e2c4 (Merge branch 'nd/pack-deltify-regression-fix', 2018-08-22) size 80 padding 9 3ebdef2e1b (Merge branch 'jk/pack-delta-reuse-with-bitmap', 2018-09-17) size 80 padding 8 33e4ae9c50 (Merge branch 'bc/sha-256', 2019-01-29) size 96 padding 8 (indicating that the current size of the struct is 96 bytes, with 8 padding bits). Even though this comment was written in a good spirit, it is updated infrequently enough that is serves to confuse rather than to encourage contributors to update the appropriate values when the modify the definition of object_entry. For that reason, eliminate the confusion by removing the comment altogether. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- Noticed this while reverting an old topic out of GitHub's fork, and realized that this comment was severely out-of-date. pack-objects.h | 10 ---------- 1 file changed, 10 deletions(-) -- 2.37.0.1.g1379af2e9d