mbox series

[0/1] blame: remove unnecessary use of get_commit_info()

Message ID 20210216163151.76307-1-rafaeloliveira.cs@gmail.com (mailing list archive)
Headers show
Series blame: remove unnecessary use of get_commit_info() | expand

Message

Rafael Silva Feb. 16, 2021, 4:31 p.m. UTC
This patch is a code cleanup and consequently a micro optimization in
determine_heat_line().

When `blame`ing with --color-by-age, the determine_line_heat() calls the
get_commit_info() to parse the commit's information.  However, it turns
out this is actually unnecessary because its caller also does the same.
We can instead, pass the already parsed `commit_info` and remove the
internal call to get_commit_info().

The performance optimization made by this patch is relatively small as
get_commit_info() doesn't perform any heavy-load operations. So, the
performance improvements is not that exciting, not at least until you
enter the ms/μs realm.  Nevertheless I thought the code cleanup is
still valid and the optimization is a bonus.


  ... digging into the the performance improvements, for those 
      who are curious ...

Running Git PERF suite in linux.git, I've got a subtle performance
improvement for some runs:

	# git.328c109303 - compiled git from commit 328c109303
	# git.blame-patched - compiled git from commit 328c109303 + this patch
        Test                                          git.328c109303    git.blame-patched
        -------------------------------------------------------------------------------------
        blame --color-by-age kernel/fork.c            1.96(1.81+0.15)   1.95(1.80+0.14) -0.5%
        blame --color-by-age kernel/sys.c             1.67(1.53+0.13)   1.66(1.52+0.14) -0.6%
        blame --color-by-age mm/slab.c                2.16(2.00+0.16)   2.15(1.99+0.15) -0.5%
        blame --color-by-age lib/packing.c            0.20(0.14+0.05)   0.20(0.14+0.05) +0.0%
        blame --color-by-age drivers/cdrom/cdrom.c    1.62(1.46+0.15)   1.62(1.46+0.15) +0.0%
        blame --color-by-age crypto/crypto_engine.c   0.37(0.29+0.06)   0.36(0.28+0.06) -2.7%
        blame --color-by-age net/ipv4/ip_forward.c    1.49(1.35+0.13)   1.48(1.34+0.13) -0.7%

To dig a little deeper, I enabled the Git's trace2 API to record every
call to the determine_line_heat() function:

        ...
+       trace2_region_enter("blame", "determine_line_heat", the_repository);
        determine_line_heat(ent, &default_color);
+       trace2_region_enter("blame", "determine_line_heat", the_repository);
        ...

Then, running `blame` for "kernel/fork.c` and _summing_ all the execution
time for every call (around 1.3k calls) resulted in 2.6x faster execution
(best out 3):

	git built from 328c109303 (The eighth batch, 2021-02-12) = 42ms
	git built from 328c109303 + this patch                   = 16ms

Of course, this is on the milliseconds :).

Lastly, to get a better picture of the performance change, I computed
all the determine_line_heat()'s execution time, for all 3 calls, into
a distribution graph.  The execution time is converted to microseconds
for ease of understanding:

        # git built from 328c109303 (The eighth batch, 2021-02-12)
     	Time in μs     0.0 ~  9.0  |
     	Time in μs    10.0 ~ 19.0  | *
     	Time in μs    20.0 ~ 29.0  | *****************
    	Time in μs    30.0 ~ 39.0  | *****************
     	Time in μs    40.0 ~ 49.0  | **
     	Time in μs    50.0 ~ 59.0  | *
     	Time in μs    60.0 ~ 69.0  | *
     	Time in μs    70.0 ~ 79.0  | *
     	Time in μs    80.0 ~ 89.0  | *
     	Time in μs    90.0 ~ 99.0  | *
     	Time in μs   100.0 ~ 109.0 | *
     	Time in μs   110.0 ~ 119.0 | *
     	Time in μs   120.0 ~ 129.0 |
     	Time in μs   130.0 ~ 139.0 | *
     	[N: 3915 | Median: 30.0 | Min: 19.0 | Max: 137.0]

     	# git built from 328c109303 + this patch
     	Time in μs     0.0 ~  9.0  |
     	Time in μs    10.0 ~ 19.0  | *************************************
     	Time in μs    20.0 ~ 29.0  | *
     	Time in μs    30.0 ~ 39.0  | *
     	Time in μs    40.0 ~ 49.0  | *
     	Time in μs    50.0 ~ 59.0  | *
     	Time in μs    60.0 ~ 69.0  | *
     	Time in μs    70.0 ~ 79.0  | *
     	Time in μs    80.0 ~ 89.0  | *
     	Time in μs    90.0 ~ 99.0  |
	        ... [ zero's omitted ] ...
     	Time in μs   150.0 ~ 159.0 |
     	Time in μs   160.0 ~ 169.0 | *
     	[N: 3915 | Median: 12.0 | Min: 10.0 | Max: 167.0]

	Note: A "*" represent hundreds counter. One "*" is used
	for any count between 1 and 99. (i.e: 23 = "*", 99 = "*",
	100 = "**", 199 = "**", 200 = "***", etc).

With the changes introduced by this patch, the determine_line_heat()
mostly finishes its execution between 10μs and 19μs compared with the
current implementation that mostly finished around 20μs and 39μs.

Rafael Silva (1):
  blame: remove unnecessary use of get_commit_info()

 builtin/blame.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

Comments

Derrick Stolee Feb. 16, 2021, 6:35 p.m. UTC | #1
On 2/16/2021 11:31 AM, Rafael Silva wrote:
> Running Git PERF suite in linux.git, I've got a subtle performance
> improvement for some runs:
> 
> 	# git.328c109303 - compiled git from commit 328c109303
> 	# git.blame-patched - compiled git from commit 328c109303 + this patch
>         Test                                          git.328c109303    git.blame-patched
>         -------------------------------------------------------------------------------------
>         blame --color-by-age kernel/fork.c            1.96(1.81+0.15)   1.95(1.80+0.14) -0.5%
>         blame --color-by-age kernel/sys.c             1.67(1.53+0.13)   1.66(1.52+0.14) -0.6%
>         blame --color-by-age mm/slab.c                2.16(2.00+0.16)   2.15(1.99+0.15) -0.5%
>         blame --color-by-age lib/packing.c            0.20(0.14+0.05)   0.20(0.14+0.05) +0.0%
>         blame --color-by-age drivers/cdrom/cdrom.c    1.62(1.46+0.15)   1.62(1.46+0.15) +0.0%
>         blame --color-by-age crypto/crypto_engine.c   0.37(0.29+0.06)   0.36(0.28+0.06) -2.7%
>         blame --color-by-age net/ipv4/ip_forward.c    1.49(1.35+0.13)   1.48(1.34+0.13) -0.7%

Have you updated the commit-graph with changed-path Bloom filters in
your copy of linux.git before running the perf tests? You might get
smaller numbers overall (both sides) but make the difference for this
patch be more pronounced:

	git commit-graph write --reachable --changed-paths

> To dig a little deeper, I enabled the Git's trace2 API to record every
> call to the determine_line_heat() function:
> 
>         ...
> +       trace2_region_enter("blame", "determine_line_heat", the_repository);
>         determine_line_heat(ent, &default_color);
> +       trace2_region_enter("blame", "determine_line_heat", the_repository);
>         ...
> 
> Then, running `blame` for "kernel/fork.c` and _summing_ all the execution
> time for every call (around 1.3k calls) resulted in 2.6x faster execution
> (best out 3):
> 
> 	git built from 328c109303 (The eighth batch, 2021-02-12) = 42ms
> 	git built from 328c109303 + this patch                   = 16ms

This is a good way to demonstrate the change. Definitely worthwhile for
demonstrating the value of the patch. I'll second Taylor's request that
this performance data goes in the commit message so we can see the details
in the future.

Thanks,
-Stolee
Junio C Hamano Feb. 16, 2021, 7:45 p.m. UTC | #2
Derrick Stolee <stolee@gmail.com> writes:

> ...
> This is a good way to demonstrate the change. Definitely worthwhile for
> demonstrating the value of the patch. I'll second Taylor's request that
> this performance data goes in the commit message so we can see the details
> in the future.

Thanks, all.  I do agree with you and Taylor that these numbers
deserve to be in the log message, not cover letter.
Rafael Silva Feb. 17, 2021, 2:42 p.m. UTC | #3
Derrick Stolee <stolee@gmail.com> writes:

> On 2/16/2021 11:31 AM, Rafael Silva wrote:
>> Running Git PERF suite in linux.git, I've got a subtle performance
>> improvement for some runs:
>> 
>> 	# git.328c109303 - compiled git from commit 328c109303
>> 	# git.blame-patched - compiled git from commit 328c109303 + this patch
>>         Test                                          git.328c109303    git.blame-patched
>>         -------------------------------------------------------------------------------------
>>         blame --color-by-age kernel/fork.c            1.96(1.81+0.15)   1.95(1.80+0.14) -0.5%
>>         blame --color-by-age kernel/sys.c             1.67(1.53+0.13)   1.66(1.52+0.14) -0.6%
>>         blame --color-by-age mm/slab.c                2.16(2.00+0.16)   2.15(1.99+0.15) -0.5%
>>         blame --color-by-age lib/packing.c            0.20(0.14+0.05)   0.20(0.14+0.05) +0.0%
>>         blame --color-by-age drivers/cdrom/cdrom.c    1.62(1.46+0.15)   1.62(1.46+0.15) +0.0%
>>         blame --color-by-age crypto/crypto_engine.c   0.37(0.29+0.06)   0.36(0.28+0.06) -2.7%
>>         blame --color-by-age net/ipv4/ip_forward.c    1.49(1.35+0.13)   1.48(1.34+0.13) -0.7%
>
> Have you updated the commit-graph with changed-path Bloom filters in
> your copy of linux.git before running the perf tests? You might get
> smaller numbers overall (both sides) but make the difference for this
> patch be more pronounced:
>
> 	git commit-graph write --reachable --changed-paths
>

Thanks for pointing this out. I didn't updated the commit-graph in my
copy of linux.git when I ran the perf tests and indeed with the
commit-graph updated, the execution time is faster now.

Here's the results (best out of 3):

        # git.328c109303 - compiled git from commit 328c109303
        # git.blame-patched - compiled git from commit 328c109303 + this patch
        Test                                          git.328c109303   git.blame-patched
        ------------------------------------------------------------------------------------
        blame --color-by-age kernel/fork.c            1.13(0.96+0.17)  1.12(0.96+0.15) -0.9%
        blame --color-by-age kernel/sys.c             1.00(0.81+0.18)  0.98(0.81+0.15) -2.0%
        blame --color-by-age mm/slab.c                1.51(1.33+0.18)  1.49(1.31+0.17) -1.3%
        blame --color-by-age lib/packing.c            0.13(0.05+0.07)  0.12(0.04+0.07) -7.7%
        blame --color-by-age drivers/cdrom/cdrom.c    0.62(0.48+0.12)  0.61(0.48+0.12) -1.6%
        blame --color-by-age crypto/crypto_engine.c   0.16(0.08+0.08)  0.16(0.07+0.08) +0.0%
        blame --color-by-age net/ipv4/ip_forward.c    0.43(0.30+0.13)  0.42(0.29+0.11) -2.3%

For the "kernel/fork.c" example, there is around 1.7x improvement. The
percentage comparison is also bigger than it was before the commit-graph
file was updated with the changed paths Bloom Filter as you expected.

For the "net/ipv4/ip_forward.c" there a 3.4x improvements which I
believe is the biggest one in my tests.