diff mbox series

[v2] pack-bitmap: remove trace2 region from hot path

Message ID pull.1365.v2.git.1664198277250.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 89a1ab8fb5766d0474ff72f176804508b0558f8a
Headers show
Series [v2] pack-bitmap: remove trace2 region from hot path | expand

Commit Message

Derrick Stolee Sept. 26, 2022, 1:17 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

The trace2 region around the call to lazy_bitmap_for_commit() in
bitmap_for_commit() was added in 28cd730680d (pack-bitmap: prepare to
read lookup table extension, 2022-08-14). While adding trace2 regions is
typically helpful for tracking performance, this method is called
possibly thousands of times as a commit walk explores commit history
looking for a matching bitmap. When trace2 output is enabled, this
region is emitted many times and performance is throttled by that
output.

For now, remove these regions entirely.

This is a critical path, and it would be valuable to measure that the
time spent in bitmap_for_commit() does not increase when using the
commit lookup table. The best way to do that would be to use a mechanism
that sums the time spent in a region and reports a single value at the
end of the process. This technique was introduced but not merged by [1]
so maybe this example presents some justification to revisit that
approach.

[1] https://lore.kernel.org/git/pull.1099.v2.git.1640720202.gitgitgadget@gmail.com/

To help with the 'git blame' output in this region, add a comment that
warns against adding a trace2 region. Delete a test from t5310 that used
that trace output to check that this lookup optimization was activated.
To create this kind of test again in the future, the stopwatch traces
mentioned earlier could be used as a signal that we activated this code
path.

Helpedy-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
    [For 2.38.0] pack-bitmap: remove trace2 region from hot path
    
    I noticed this while trying to backport the Abhradeep's lookup table
    work into GitHub's fork. Something went wrong in that process, causing
    this region to significantly slow down. It turns out that slow down does
    not reproduce on current 'master', which is good. I must have missed
    something while I was backporting.
    
    Regardless, the use of trace2_region_enter() here should be removed or
    replaced. For the sake of 2.38.0, this simple removal is safe enough.
    However, to really dig into what was happening I had to construct a
    rebase [2] of Jeff's trace2 stopwatch work, then apply changes on top
    [3] that could replace this region with trace2_timer_*() methods.
    
    [2]
    https://github.com/git/git/compare/master...derrickstolee:trace2-stopwatch
    [3]
    https://github.com/derrickstolee/git/compare/trace2-stopwatch...bitmap-trace2
    
    As a separate discussion, it might be worth revisiting that stopwatch
    work so we have it available as a tool when doing these kinds of
    investigations.
    
    
    Updates in v2
    =============
    
     * I removed the test that was checking for trace2 regions. In [3], I
       played around with using trace2_counter() to replace that check, so
       we could reinstate the test without the performance drop from using
       repeated trace2_region() calls.
     * I squashed in Junio's comment change.
    
    Thanks, -Stolee

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1365%2Fderrickstolee%2Fbitmap-remove-trace2-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1365/derrickstolee/bitmap-remove-trace2-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1365

Range-diff vs v1:

 1:  987827eaf90 ! 1:  8ef63c9e32f pack-bitmap: remove trace2 region from hot path
     @@ Commit message
      
          [1] https://lore.kernel.org/git/pull.1099.v2.git.1640720202.gitgitgadget@gmail.com/
      
     +    To help with the 'git blame' output in this region, add a comment that
     +    warns against adding a trace2 region. Delete a test from t5310 that used
     +    that trace output to check that this lookup optimization was activated.
     +    To create this kind of test again in the future, the stopwatch traces
     +    mentioned earlier could be used as a signal that we activated this code
     +    path.
     +
     +    Helpedy-by: Junio C Hamano <gitster@pobox.com>
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
      
       ## pack-bitmap.c ##
     @@ pack-bitmap.c: struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap
       			return NULL;
       
      -		trace2_region_enter("pack-bitmap", "reading_lookup_table", the_repository);
     ++		/* this is a fairly hot codepath - no trace2_region please */
       		/* NEEDSWORK: cache misses aren't recorded */
       		bitmap = lazy_bitmap_for_commit(bitmap_git, commit);
      -		trace2_region_leave("pack-bitmap", "reading_lookup_table", the_repository);
       		if (!bitmap)
       			return NULL;
       		return lookup_stored_bitmap(bitmap);
     +
     + ## t/t5310-pack-bitmaps.sh ##
     +@@ t/t5310-pack-bitmaps.sh: test_expect_success 'verify writing bitmap lookup table when enabled' '
     + 	grep "\"label\":\"writing_lookup_table\"" trace2
     + '
     + 
     +-test_expect_success 'lookup table is actually used to traverse objects' '
     +-	git repack -adb &&
     +-	GIT_TRACE2_EVENT="$(pwd)/trace3" \
     +-		git rev-list --use-bitmap-index --count --all &&
     +-	grep "\"label\":\"reading_lookup_table\"" trace3
     +-'
     +-
     + test_expect_success 'truncated bitmap fails gracefully (lookup table)' '
     + 	test_config pack.writebitmaphashcache false &&
     + 	git repack -adb &&


 pack-bitmap.c           | 3 +--
 t/t5310-pack-bitmaps.sh | 7 -------
 2 files changed, 1 insertion(+), 9 deletions(-)


base-commit: 1b3d6e17fe83eb6f79ffbac2f2c61bbf1eaef5f8

Comments

Ævar Arnfjörð Bjarmason Sept. 26, 2022, 3:01 p.m. UTC | #1
On Mon, Sep 26 2022, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <derrickstolee@github.com>
>
> The trace2 region around the call to lazy_bitmap_for_commit() in
> bitmap_for_commit() was added in 28cd730680d (pack-bitmap: prepare to
> read lookup table extension, 2022-08-14). While adding trace2 regions is
> typically helpful for tracking performance, this method is called
> possibly thousands of times as a commit walk explores commit history
> looking for a matching bitmap. When trace2 output is enabled, this
> region is emitted many times and performance is throttled by that
> output.
>
> For now, remove these regions entirely.
>
> This is a critical path, and it would be valuable to measure that the
> time spent in bitmap_for_commit() does not increase when using the
> commit lookup table. The best way to do that would be to use a mechanism
> that sums the time spent in a region and reports a single value at the
> end of the process. This technique was introduced but not merged by [1]
> so maybe this example presents some justification to revisit that
> approach.

Just getting rid of this seems like a good thing for now.

But aside: Yes, one way to mitigate this rather than removing the
tracing would be to make it really fast.

But just skimming pack-bitmap.c do we really need trace2 at the
granularity of a single commit? Looking at who calls bitmap_for_commit()
wouldn't something like this sketch-out be much more useful?:
	
	diff --git a/pack-bitmap.c b/pack-bitmap.c
	index 9d5205055a5..439aec220c7 100644
	--- a/pack-bitmap.c
	+++ b/pack-bitmap.c
	@@ -830,10 +830,8 @@ struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
	 		if (!bitmap_git->table_lookup)
	 			return NULL;
	 
	-		trace2_region_enter("pack-bitmap", "reading_lookup_table", the_repository);
	 		/* NEEDSWORK: cache misses aren't recorded */
	 		bitmap = lazy_bitmap_for_commit(bitmap_git, commit);
	-		trace2_region_leave("pack-bitmap", "reading_lookup_table", the_repository);
	 		if (!bitmap)
	 			return NULL;
	 		return lookup_stored_bitmap(bitmap);
	@@ -1042,6 +1040,7 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
	 	 * The ones without bitmaps in the index will be stored in the
	 	 * `not_mapped_list` for further processing.
	 	 */
	+	/* begin trace2 find roots? */
	 	while (roots) {
	 		struct object *object = roots->item;
	 		roots = roots->next;
	@@ -1055,6 +1054,7 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
	 		object_list_insert(object, &not_mapped);
	 	}
	 
	+	/* end trace2 find roots? */
	 	/*
	 	 * Best case scenario: We found bitmaps for all the roots,
	 	 * so the resulting `or` bitmap has the full reachability analysis
	@@ -1100,7 +1100,7 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
	 		incdata.base = base;
	 		incdata.seen = seen;
	 
	-		revs->include_check = should_include;
	+		revs->include_check = should_include; // Will call bitmap_for_commit()
	 		revs->include_check_obj = should_include_obj;
	 		revs->include_check_data = &incdata;
	 
	@@ -1110,9 +1110,11 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
	 		show_data.bitmap_git = bitmap_git;
	 		show_data.base = base;
	 
	+		/* begin trace2 rev list? */
	 		traverse_commit_list(revs,
	 				     show_commit, show_object,
	 				     &show_data);
	+		/* end trace2 rev list? */
	 
	 		revs->include_check = NULL;
	 		revs->include_check_obj = NULL;
	

This is *not* the same as stricking the tracing into
bitmap_for_commit(), but do we really need the tracing that far inside
our loop?
Derrick Stolee Sept. 26, 2022, 5:31 p.m. UTC | #2
On 9/26/2022 11:01 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Sep 26 2022, Derrick Stolee via GitGitGadget wrote:
> 
>> This is a critical path, and it would be valuable to measure that the
>> time spent in bitmap_for_commit() does not increase when using the
>> commit lookup table. The best way to do that would be to use a mechanism
>> that sums the time spent in a region and reports a single value at the
>> end of the process. This technique was introduced but not merged by [1]
>> so maybe this example presents some justification to revisit that
>> approach.
> 
> Just getting rid of this seems like a good thing for now.
> 
> But aside: Yes, one way to mitigate this rather than removing the
> tracing would be to make it really fast.
> 
> But just skimming pack-bitmap.c do we really need trace2 at the
> granularity of a single commit? Looking at who calls bitmap_for_commit()
> wouldn't something like this sketch-out be much more useful?:

The point of it being where it was to check that we hit the path
custom to the commit lookup extension, hence the test that is removed.

You are proposing a different region entirely, meant to measure the
cost of the entire walk. Maybe that's valuable, but not critical to
this topic.

If you're proposing the high-level traces as an alternative to my
prototype using the trace2 timers, then I'll point out that the timer
approach allows us to determine how much time is being spent checking
for bitmaps versus walking commits, which the regions you provided don't.

Thanks,
-Stolee
diff mbox series

Patch

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 9a208abc1fd..a2bbbbd811d 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -830,10 +830,9 @@  struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
 		if (!bitmap_git->table_lookup)
 			return NULL;
 
-		trace2_region_enter("pack-bitmap", "reading_lookup_table", the_repository);
+		/* this is a fairly hot codepath - no trace2_region please */
 		/* NEEDSWORK: cache misses aren't recorded */
 		bitmap = lazy_bitmap_for_commit(bitmap_git, commit);
-		trace2_region_leave("pack-bitmap", "reading_lookup_table", the_repository);
 		if (!bitmap)
 			return NULL;
 		return lookup_stored_bitmap(bitmap);
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index 7e50f8e7653..6d693eef82f 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -455,13 +455,6 @@  test_expect_success 'verify writing bitmap lookup table when enabled' '
 	grep "\"label\":\"writing_lookup_table\"" trace2
 '
 
-test_expect_success 'lookup table is actually used to traverse objects' '
-	git repack -adb &&
-	GIT_TRACE2_EVENT="$(pwd)/trace3" \
-		git rev-list --use-bitmap-index --count --all &&
-	grep "\"label\":\"reading_lookup_table\"" trace3
-'
-
 test_expect_success 'truncated bitmap fails gracefully (lookup table)' '
 	test_config pack.writebitmaphashcache false &&
 	git repack -adb &&