diff mbox series

pack-objects.h: remove outdated pahole results

Message ID 1379af2e9d271b501ef3942398e7f159a9c77973.1656440978.git.me@ttaylorr.com (mailing list archive)
State New, archived
Headers show
Series pack-objects.h: remove outdated pahole results | expand

Commit Message

Taylor Blau June 28, 2022, 6:30 p.m. UTC
The size and padding of `struct object_entry` is an important factor in
determining the memory usage of `pack-objects`. For this reason,
3b13a5f263 (pack-objects: reorder members to shrink struct object_entry,
2018-04-14) added a comment containing some information from pahole
indicating the size and padding of that struct.

Unfortunately, this comment hasn't been updated since 9ac3f0e5b3
(pack-objects: fix performance issues on packing large deltas,
2018-07-22), despite the size of this struct changing many times since
that commit.

To see just how often the size of object_entry changes, I skimmed the
first-parent history with this script:

    for sha in $(git rev-list --first-parent --reverse 9ac3f0e..)
    do
      echo -n "$sha "
      git checkout -q $sha
      make -s pack-objects.o 2>/dev/null
      pahole -C object_entry pack-objects.o | sed -n \
        -e 's/\/\* size: \([0-9]*\).*/size \1/p' \
        -e 's/\/\*.*padding: \([0-9]*\).*/padding \1/p' | xargs
    done | uniq -f1

In between each merge, the size of object_entry changes too often to
record every instance here. But the important merges (along with their
corresponding sizes and bit paddings) in chronological order are:

    ad635e82d6 (Merge branch 'nd/pack-objects-pack-struct', 2018-05-23) size 80 padding 4
    29d9e3e2c4 (Merge branch 'nd/pack-deltify-regression-fix', 2018-08-22) size 80 padding 9
    3ebdef2e1b (Merge branch 'jk/pack-delta-reuse-with-bitmap', 2018-09-17) size 80 padding 8
    33e4ae9c50 (Merge branch 'bc/sha-256', 2019-01-29) size 96 padding 8

(indicating that the current size of the struct is 96 bytes, with 8
padding bits).

Even though this comment was written in a good spirit, it is updated
infrequently enough that is serves to confuse rather than to encourage
contributors to update the appropriate values when the modify the
definition of object_entry.

For that reason, eliminate the confusion by removing the comment
altogether.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Noticed this while reverting an old topic out of GitHub's fork, and
realized that this comment was severely out-of-date.

 pack-objects.h | 10 ----------
 1 file changed, 10 deletions(-)

--
2.37.0.1.g1379af2e9d

Comments

Derrick Stolee June 28, 2022, 8:03 p.m. UTC | #1
On 6/28/2022 2:30 PM, Taylor Blau wrote:
> The size and padding of `struct object_entry` is an important factor in
> determining the memory usage of `pack-objects`. For this reason,
> 3b13a5f263 (pack-objects: reorder members to shrink struct object_entry,
> 2018-04-14) added a comment containing some information from pahole
> indicating the size and padding of that struct.
> 
> Unfortunately, this comment hasn't been updated since 9ac3f0e5b3
> (pack-objects: fix performance issues on packing large deltas,
> 2018-07-22), despite the size of this struct changing many times since
> that commit.
...
> For that reason, eliminate the confusion by removing the comment
> altogether.

> -
> -	/*
> -	 * pahole results on 64-bit linux (gcc and clang)
> -	 *
> -	 *   size: 80, bit_padding: 9 bits
> -	 *
> -	 * and on 32-bit (gcc)
> -	 *
> -	 *   size: 76, bit_padding: 9 bits
> -	 */
>  };

I will shed no tears over this being gone. Thanks!

-Stolee
brian m. carlson June 28, 2022, 9:04 p.m. UTC | #2
On 2022-06-28 at 18:30:20, Taylor Blau wrote:
> The size and padding of `struct object_entry` is an important factor in
> determining the memory usage of `pack-objects`. For this reason,
> 3b13a5f263 (pack-objects: reorder members to shrink struct object_entry,
> 2018-04-14) added a comment containing some information from pahole
> indicating the size and padding of that struct.
> 
> Unfortunately, this comment hasn't been updated since 9ac3f0e5b3
> (pack-objects: fix performance issues on packing large deltas,
> 2018-07-22), despite the size of this struct changing many times since
> that commit.
> 
> To see just how often the size of object_entry changes, I skimmed the
> first-parent history with this script:
> 
>     for sha in $(git rev-list --first-parent --reverse 9ac3f0e..)
>     do
>       echo -n "$sha "
>       git checkout -q $sha
>       make -s pack-objects.o 2>/dev/null
>       pahole -C object_entry pack-objects.o | sed -n \
>         -e 's/\/\* size: \([0-9]*\).*/size \1/p' \
>         -e 's/\/\*.*padding: \([0-9]*\).*/padding \1/p' | xargs
>     done | uniq -f1
> 
> In between each merge, the size of object_entry changes too often to
> record every instance here. But the important merges (along with their
> corresponding sizes and bit paddings) in chronological order are:
> 
>     ad635e82d6 (Merge branch 'nd/pack-objects-pack-struct', 2018-05-23) size 80 padding 4
>     29d9e3e2c4 (Merge branch 'nd/pack-deltify-regression-fix', 2018-08-22) size 80 padding 9
>     3ebdef2e1b (Merge branch 'jk/pack-delta-reuse-with-bitmap', 2018-09-17) size 80 padding 8
>     33e4ae9c50 (Merge branch 'bc/sha-256', 2019-01-29) size 96 padding 8
> 
> (indicating that the current size of the struct is 96 bytes, with 8
> padding bits).
> 
> Even though this comment was written in a good spirit, it is updated
> infrequently enough that is serves to confuse rather than to encourage

I think you wanted to say, "that it serves:.

> contributors to update the appropriate values when the modify the
> definition of object_entry.
> 
> For that reason, eliminate the confusion by removing the comment
> altogether.
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>

I agree with your rationale and that we should remove this.
Taylor Blau June 28, 2022, 9:26 p.m. UTC | #3
On Tue, Jun 28, 2022 at 09:04:27PM +0000, brian m. carlson wrote:
> > Even though this comment was written in a good spirit, it is updated
> > infrequently enough that is serves to confuse rather than to encourage
>
> I think you wanted to say, "that it serves:.

Oops. Thanks for pointing it out.

> > contributors to update the appropriate values when the modify the
> > definition of object_entry.
> >
> > For that reason, eliminate the confusion by removing the comment
> > altogether.
> >
> > Signed-off-by: Taylor Blau <me@ttaylorr.com>
>
> I agree with your rationale and that we should remove this.

I'm glad that you agree.

Thanks,
Taylor
Jeff King July 1, 2022, 6:16 p.m. UTC | #4
On Tue, Jun 28, 2022 at 02:30:20PM -0400, Taylor Blau wrote:

> Even though this comment was written in a good spirit, it is updated
> infrequently enough that is serves to confuse rather than to encourage
> contributors to update the appropriate values when the modify the
> definition of object_entry.
> 
> For that reason, eliminate the confusion by removing the comment
> altogether.

I agree the actual numbers aren't helping anybody. We _could_ leave a
comment that says "we store a lot of these in memory; be careful of
where and how you add new fields to avoid increasing the struct size".
And then people can run "pahole" before and after their changes.

But then that is also true of other structs (like "struct object"), and
we do not bother there. So it probably is fine not to annotate this
specifically.

Speaking of which, I suspect quite a lot of memory could be saved if
"pack-objects --revs" freed the object structs it allocates during its
traversal. Unless we're generating bitmaps, I don't think they get used
again after the initial packing list is generated. At peak you'd
still be storing all of the object_entry structs alongside the objects
as you finish the traversal, but it wouldn't overlap with any memory
used for the delta search, and of course we'd be at that peak for a much
smaller time.

Not a blocker for your patch obviously, but maybe a fun experiment in an
adjacent area. Possibly even an ambitious #leftoverbits opportunity. :)

-Peff
Taylor Blau July 1, 2022, 7:48 p.m. UTC | #5
On Fri, Jul 01, 2022 at 02:16:26PM -0400, Jeff King wrote:
> On Tue, Jun 28, 2022 at 02:30:20PM -0400, Taylor Blau wrote:
>
> > Even though this comment was written in a good spirit, it is updated
> > infrequently enough that is serves to confuse rather than to encourage
> > contributors to update the appropriate values when the modify the
> > definition of object_entry.
> >
> > For that reason, eliminate the confusion by removing the comment
> > altogether.
>
> I agree the actual numbers aren't helping anybody. We _could_ leave a
> comment that says "we store a lot of these in memory; be careful of
> where and how you add new fields to avoid increasing the struct size".
> And then people can run "pahole" before and after their changes.
>
> But then that is also true of other structs (like "struct object"), and
> we do not bother there. So it probably is fine not to annotate this
> specifically.

We have such a comment at the very type of the block comment above
`struct object_entry`'s definition:

    "The size of struct nearly determines pack-object's memory
    consumption. This struct is packed tight for that reason. When you
    add or reorder something in this struct, think a bit about this".

thanks to Duy back in 3b13a5f263 (pack-objects: reorder members to
shrink struct object_entry, 2018-04-14).

> Speaking of which, I suspect quite a lot of memory could be saved if
> "pack-objects --revs" freed the object structs it allocates during its
> traversal. Unless we're generating bitmaps, I don't think they get used
> again after the initial packing list is generated. At peak you'd
> still be storing all of the object_entry structs alongside the objects
> as you finish the traversal, but it wouldn't overlap with any memory
> used for the delta search, and of course we'd be at that peak for a much
> smaller time.
>
> Not a blocker for your patch obviously, but maybe a fun experiment in an
> adjacent area. Possibly even an ambitious #leftoverbits opportunity. :)

Challenge accepted! ;-)

Thanks,
Taylor
Jeff King July 1, 2022, 9:05 p.m. UTC | #6
On Fri, Jul 01, 2022 at 03:48:19PM -0400, Taylor Blau wrote:

> > I agree the actual numbers aren't helping anybody. We _could_ leave a
> > comment that says "we store a lot of these in memory; be careful of
> > where and how you add new fields to avoid increasing the struct size".
> > And then people can run "pahole" before and after their changes.
> >
> > But then that is also true of other structs (like "struct object"), and
> > we do not bother there. So it probably is fine not to annotate this
> > specifically.
> 
> We have such a comment at the very type of the block comment above
> `struct object_entry`'s definition:
> 
>     "The size of struct nearly determines pack-object's memory
>     consumption. This struct is packed tight for that reason. When you
>     add or reorder something in this struct, think a bit about this".
> 
> thanks to Duy back in 3b13a5f263 (pack-objects: reorder members to
> shrink struct object_entry, 2018-04-14).

Oh, indeed. Then I withdraw my (non-)complaint. :)

-Peff
diff mbox series

Patch

diff --git a/pack-objects.h b/pack-objects.h
index 393b9db546..579476687c 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -116,16 +116,6 @@  struct object_entry {
 	unsigned dfs_state:OE_DFS_STATE_BITS;
 	unsigned depth:OE_DEPTH_BITS;
 	unsigned ext_base:1; /* delta_idx points outside packlist */
-
-	/*
-	 * pahole results on 64-bit linux (gcc and clang)
-	 *
-	 *   size: 80, bit_padding: 9 bits
-	 *
-	 * and on 32-bit (gcc)
-	 *
-	 *   size: 76, bit_padding: 9 bits
-	 */
 };

 struct packing_data {