diff mbox series

[3/4] pack-bitmap.c: support 'tree:0' filtering

Message ID 87b21d72bb588f7366d928544aeaf4de68b027a7.1588633810.git.me@ttaylorr.com (mailing list archive)
State New, archived
Headers show
Series pack-bitmap: use bitmaps for traversals with '--filter=tree:0' | expand

Commit Message

Taylor Blau May 4, 2020, 11:12 p.m. UTC
In the previous patch, we made it easy to define other filters that
exclude all objects of a certain type. Use that in order to implement
bitmap-level filtering for the '--filter=tree:<n>' filter when 'n' is
equal to 0.

The general case is not helped by bitmaps, since for values of 'n > 0',
the object filtering machinery requires a full-blown tree traversal in
order to determine the depth of a given tree. Caching this is
non-obvious, too, since the same tree object can have a different depth
depending on the context (e.g., a tree was moved up in the directory
hierarchy between two commits).

But, the 'n = 0' case can be helped, and this patch does so. Running
p5310.11 in this tree and on master with the kernel, we can see that
this case is helped substantially:

  Test                                  master              this tree
  --------------------------------------------------------------------------------
  5310.11: rev-list count with tree:0   10.68(10.39+0.27)   0.06(0.04+0.01) -99.4%

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap.c                      | 25 ++++++++++++++++++++++++-
 t/perf/p5310-pack-bitmaps.sh       |  5 +++++
 t/t6113-rev-list-bitmap-filters.sh | 21 +++++++++++++++++++++
 3 files changed, 50 insertions(+), 1 deletion(-)

Comments

Junio C Hamano May 5, 2020, 5:25 a.m. UTC | #1
Taylor Blau <me@ttaylorr.com> writes:

> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index 3693c9e62f..195ee8cad0 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -749,7 +749,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
>  	eword_t mask;
>  	uint32_t i;
>  
> -	if (type != OBJ_BLOB)
> +	if (type != OBJ_BLOB && type != OBJ_TREE)
>  		BUG("filter_bitmap_exclude_type: unsupported type '%d'", type);

OK.  This is the same as the previous step, but why would we even
need this guard?  find_tip_objects() is equipped to find tips of any
object type, iterating on the bitmap for "type", or flipping the
bits in the to_filter bitmap, does not have any limitation to the
blob type in the previous step, and there is no limitation to the
blob or tree types after this step, either, no?

> @@ -867,6 +867,20 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git,
>  	bitmap_free(tips);
>  }
>  
> +static void filter_bitmap_tree_depth(struct bitmap_index *bitmap_git,
> +				     struct object_list *tip_objects,
> +				     struct bitmap *to_filter,
> +				     unsigned long limit)
> +{
> +	if (limit)
> +		BUG("filter_bitmap_tree_depth given non-zero limit");

This one does make sense, because the code to exclude all trees and
all blobs we have below won't be able to cull only trees at a given
depth or deeper.

> +	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
> +				   OBJ_TREE);
> +	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
> +				   OBJ_BLOB);

And these two are quite straight-forward.

> +}
> +
>  static int filter_bitmap(struct bitmap_index *bitmap_git,
>  			 struct object_list *tip_objects,
>  			 struct bitmap *to_filter,
> @@ -890,6 +904,15 @@ static int filter_bitmap(struct bitmap_index *bitmap_git,
>  		return 0;
>  	}
>  
> +	if (filter->choice == LOFC_TREE_DEPTH &&
> +	    filter->tree_exclude_depth == 0) {
> +		if (bitmap_git)
> +			filter_bitmap_tree_depth(bitmap_git, tip_objects,
> +						 to_filter,
> +						 filter->tree_exclude_depth);

I briefly wondered if it is cleaner to read if we hardcode 0 as the
last argument.  But if the helper function ever learns how to filter
by tree with non-zero depth, we can only tweak the if() condition
without changing the call, so the way you wrote it is the right way.

Thanks.
Taylor Blau May 5, 2020, 3:59 p.m. UTC | #2
On Mon, May 04, 2020 at 10:25:46PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > diff --git a/pack-bitmap.c b/pack-bitmap.c
> > index 3693c9e62f..195ee8cad0 100644
> > --- a/pack-bitmap.c
> > +++ b/pack-bitmap.c
> > @@ -749,7 +749,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
> >  	eword_t mask;
> >  	uint32_t i;
> >
> > -	if (type != OBJ_BLOB)
> > +	if (type != OBJ_BLOB && type != OBJ_TREE)
> >  		BUG("filter_bitmap_exclude_type: unsupported type '%d'", type);
>
> OK.  This is the same as the previous step, but why would we even
> need this guard?  find_tip_objects() is equipped to find tips of any
> object type, iterating on the bitmap for "type", or flipping the
> bits in the to_filter bitmap, does not have any limitation to the
> blob type in the previous step, and there is no limitation to the
> blob or tree types after this step, either, no?

I think we need some sort of guard here, since we could receive any
value of object_type, but you're right that this isn't the right one. It
should probably be something like:

  if (type < OBJ_COMMIT || type > OBJ_TAG)

to pick out the sentinel values like OBJ_BAD and OBJ_NONE, as well as
the pack-specific types, like OBJ_OFS_DELTA and so on.

I fixed this locally, and will resend it along with the rest of v2 in a
day or so. Thanks for a review :).

> > @@ -867,6 +867,20 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git,
> >  	bitmap_free(tips);
> >  }
> >
> > +static void filter_bitmap_tree_depth(struct bitmap_index *bitmap_git,
> > +				     struct object_list *tip_objects,
> > +				     struct bitmap *to_filter,
> > +				     unsigned long limit)
> > +{
> > +	if (limit)
> > +		BUG("filter_bitmap_tree_depth given non-zero limit");
>
> This one does make sense, because the code to exclude all trees and
> all blobs we have below won't be able to cull only trees at a given
> depth or deeper.
>
> > +	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
> > +				   OBJ_TREE);
> > +	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
> > +				   OBJ_BLOB);
>
> And these two are quite straight-forward.
>
> > +}
> > +
> >  static int filter_bitmap(struct bitmap_index *bitmap_git,
> >  			 struct object_list *tip_objects,
> >  			 struct bitmap *to_filter,
> > @@ -890,6 +904,15 @@ static int filter_bitmap(struct bitmap_index *bitmap_git,
> >  		return 0;
> >  	}
> >
> > +	if (filter->choice == LOFC_TREE_DEPTH &&
> > +	    filter->tree_exclude_depth == 0) {
> > +		if (bitmap_git)
> > +			filter_bitmap_tree_depth(bitmap_git, tip_objects,
> > +						 to_filter,
> > +						 filter->tree_exclude_depth);
>
> I briefly wondered if it is cleaner to read if we hardcode 0 as the
> last argument.  But if the helper function ever learns how to filter
> by tree with non-zero depth, we can only tweak the if() condition
> without changing the call, so the way you wrote it is the right way.
>
> Thanks.

Thanks,
Taylor
Junio C Hamano May 5, 2020, 6:20 p.m. UTC | #3
Taylor Blau <me@ttaylorr.com> writes:

> I think we need some sort of guard here, since we could receive any
> value of object_type, but you're right that this isn't the right one. It
> should probably be something like:
>
>   if (type < OBJ_COMMIT || type > OBJ_TAG)
>
> to pick out the sentinel values like OBJ_BAD and OBJ_NONE, as well as
> the pack-specific types, like OBJ_OFS_DELTA and so on.

Yeah, it looked strange to start checking for OBJ_BLOB and OBJ_TREE
in commits that starts passing these types to the function, while
the code in the function was prepared to take any valid type, so
using the above condition from the get-go would probably be a lot
more sensible.
diff mbox series

Patch

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 3693c9e62f..195ee8cad0 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -749,7 +749,7 @@  static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
 	eword_t mask;
 	uint32_t i;
 
-	if (type != OBJ_BLOB)
+	if (type != OBJ_BLOB && type != OBJ_TREE)
 		BUG("filter_bitmap_exclude_type: unsupported type '%d'", type);
 
 	/*
@@ -867,6 +867,20 @@  static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git,
 	bitmap_free(tips);
 }
 
+static void filter_bitmap_tree_depth(struct bitmap_index *bitmap_git,
+				     struct object_list *tip_objects,
+				     struct bitmap *to_filter,
+				     unsigned long limit)
+{
+	if (limit)
+		BUG("filter_bitmap_tree_depth given non-zero limit");
+
+	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
+				   OBJ_TREE);
+	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
+				   OBJ_BLOB);
+}
+
 static int filter_bitmap(struct bitmap_index *bitmap_git,
 			 struct object_list *tip_objects,
 			 struct bitmap *to_filter,
@@ -890,6 +904,15 @@  static int filter_bitmap(struct bitmap_index *bitmap_git,
 		return 0;
 	}
 
+	if (filter->choice == LOFC_TREE_DEPTH &&
+	    filter->tree_exclude_depth == 0) {
+		if (bitmap_git)
+			filter_bitmap_tree_depth(bitmap_git, tip_objects,
+						 to_filter,
+						 filter->tree_exclude_depth);
+		return 0;
+	}
+
 	/* filter choice not handled */
 	return -1;
 }
diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh
index 80c53edca7..75ccf9f4e3 100755
--- a/t/perf/p5310-pack-bitmaps.sh
+++ b/t/perf/p5310-pack-bitmaps.sh
@@ -53,6 +53,11 @@  test_perf 'rev-list count with blob:limit=1k' '
 		--filter=blob:limit=1k >/dev/null
 '
 
+test_perf 'rev-list count with tree:0' '
+	git rev-list --use-bitmap-index --count --objects --all \
+		--filter=tree:0 >/dev/null
+'
+
 test_perf 'simulated partial clone' '
 	git pack-objects --stdout --all --filter=blob:none </dev/null >/dev/null
 '
diff --git a/t/t6113-rev-list-bitmap-filters.sh b/t/t6113-rev-list-bitmap-filters.sh
index 145603f124..2b551e6fd0 100755
--- a/t/t6113-rev-list-bitmap-filters.sh
+++ b/t/t6113-rev-list-bitmap-filters.sh
@@ -53,4 +53,25 @@  test_expect_success 'blob:limit filter with specified blob' '
 	test_bitmap_traversal expect actual
 '
 
+test_expect_success 'tree:0 filter' '
+	git rev-list --objects --filter=tree:0 HEAD >expect &&
+	git rev-list --use-bitmap-index \
+		     --objects --filter=tree:0 HEAD >actual &&
+	test_bitmap_traversal expect actual
+'
+
+test_expect_success 'tree:0 filter with specified blob, tree' '
+	git rev-list --objects --filter=tree:0 HEAD HEAD:two.t >expect &&
+	git rev-list --use-bitmap-index \
+		     --objects --filter=tree:0 HEAD HEAD:two.t >actual &&
+	test_bitmap_traversal expect actual
+'
+
+test_expect_success 'tree:1 filter' '
+	git rev-list --objects --filter=tree:1 HEAD >expect &&
+	git rev-list --use-bitmap-index \
+		     --objects --filter=tree:1 HEAD >actual &&
+	test_cmp expect actual
+'
+
 test_done