Message ID | 87b21d72bb588f7366d928544aeaf4de68b027a7.1588633810.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | pack-bitmap: use bitmaps for traversals with '--filter=tree:0' | expand |
Taylor Blau <me@ttaylorr.com> writes: > diff --git a/pack-bitmap.c b/pack-bitmap.c > index 3693c9e62f..195ee8cad0 100644 > --- a/pack-bitmap.c > +++ b/pack-bitmap.c > @@ -749,7 +749,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git, > eword_t mask; > uint32_t i; > > - if (type != OBJ_BLOB) > + if (type != OBJ_BLOB && type != OBJ_TREE) > BUG("filter_bitmap_exclude_type: unsupported type '%d'", type); OK. This is the same as the previous step, but why would we even need this guard? find_tip_objects() is equipped to find tips of any object type, iterating on the bitmap for "type", or flipping the bits in the to_filter bitmap, does not have any limitation to the blob type in the previous step, and there is no limitation to the blob or tree types after this step, either, no? > @@ -867,6 +867,20 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git, > bitmap_free(tips); > } > > +static void filter_bitmap_tree_depth(struct bitmap_index *bitmap_git, > + struct object_list *tip_objects, > + struct bitmap *to_filter, > + unsigned long limit) > +{ > + if (limit) > + BUG("filter_bitmap_tree_depth given non-zero limit"); This one does make sense, because the code to exclude all trees and all blobs we have below won't be able to cull only trees at a given depth or deeper. > + filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter, > + OBJ_TREE); > + filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter, > + OBJ_BLOB); And these two are quite straight-forward. > +} > + > static int filter_bitmap(struct bitmap_index *bitmap_git, > struct object_list *tip_objects, > struct bitmap *to_filter, > @@ -890,6 +904,15 @@ static int filter_bitmap(struct bitmap_index *bitmap_git, > return 0; > } > > + if (filter->choice == LOFC_TREE_DEPTH && > + filter->tree_exclude_depth == 0) { > + if (bitmap_git) > + filter_bitmap_tree_depth(bitmap_git, tip_objects, > + to_filter, > + filter->tree_exclude_depth); I briefly wondered if it is cleaner to read if we hardcode 0 as the last argument. But if the helper function ever learns how to filter by tree with non-zero depth, we can only tweak the if() condition without changing the call, so the way you wrote it is the right way. Thanks.
On Mon, May 04, 2020 at 10:25:46PM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > diff --git a/pack-bitmap.c b/pack-bitmap.c > > index 3693c9e62f..195ee8cad0 100644 > > --- a/pack-bitmap.c > > +++ b/pack-bitmap.c > > @@ -749,7 +749,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git, > > eword_t mask; > > uint32_t i; > > > > - if (type != OBJ_BLOB) > > + if (type != OBJ_BLOB && type != OBJ_TREE) > > BUG("filter_bitmap_exclude_type: unsupported type '%d'", type); > > OK. This is the same as the previous step, but why would we even > need this guard? find_tip_objects() is equipped to find tips of any > object type, iterating on the bitmap for "type", or flipping the > bits in the to_filter bitmap, does not have any limitation to the > blob type in the previous step, and there is no limitation to the > blob or tree types after this step, either, no? I think we need some sort of guard here, since we could receive any value of object_type, but you're right that this isn't the right one. It should probably be something like: if (type < OBJ_COMMIT || type > OBJ_TAG) to pick out the sentinel values like OBJ_BAD and OBJ_NONE, as well as the pack-specific types, like OBJ_OFS_DELTA and so on. I fixed this locally, and will resend it along with the rest of v2 in a day or so. Thanks for a review :). > > @@ -867,6 +867,20 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git, > > bitmap_free(tips); > > } > > > > +static void filter_bitmap_tree_depth(struct bitmap_index *bitmap_git, > > + struct object_list *tip_objects, > > + struct bitmap *to_filter, > > + unsigned long limit) > > +{ > > + if (limit) > > + BUG("filter_bitmap_tree_depth given non-zero limit"); > > This one does make sense, because the code to exclude all trees and > all blobs we have below won't be able to cull only trees at a given > depth or deeper. > > > + filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter, > > + OBJ_TREE); > > + filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter, > > + OBJ_BLOB); > > And these two are quite straight-forward. > > > +} > > + > > static int filter_bitmap(struct bitmap_index *bitmap_git, > > struct object_list *tip_objects, > > struct bitmap *to_filter, > > @@ -890,6 +904,15 @@ static int filter_bitmap(struct bitmap_index *bitmap_git, > > return 0; > > } > > > > + if (filter->choice == LOFC_TREE_DEPTH && > > + filter->tree_exclude_depth == 0) { > > + if (bitmap_git) > > + filter_bitmap_tree_depth(bitmap_git, tip_objects, > > + to_filter, > > + filter->tree_exclude_depth); > > I briefly wondered if it is cleaner to read if we hardcode 0 as the > last argument. But if the helper function ever learns how to filter > by tree with non-zero depth, we can only tweak the if() condition > without changing the call, so the way you wrote it is the right way. > > Thanks. Thanks, Taylor
Taylor Blau <me@ttaylorr.com> writes: > I think we need some sort of guard here, since we could receive any > value of object_type, but you're right that this isn't the right one. It > should probably be something like: > > if (type < OBJ_COMMIT || type > OBJ_TAG) > > to pick out the sentinel values like OBJ_BAD and OBJ_NONE, as well as > the pack-specific types, like OBJ_OFS_DELTA and so on. Yeah, it looked strange to start checking for OBJ_BLOB and OBJ_TREE in commits that starts passing these types to the function, while the code in the function was prepared to take any valid type, so using the above condition from the get-go would probably be a lot more sensible.
diff --git a/pack-bitmap.c b/pack-bitmap.c index 3693c9e62f..195ee8cad0 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -749,7 +749,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git, eword_t mask; uint32_t i; - if (type != OBJ_BLOB) + if (type != OBJ_BLOB && type != OBJ_TREE) BUG("filter_bitmap_exclude_type: unsupported type '%d'", type); /* @@ -867,6 +867,20 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git, bitmap_free(tips); } +static void filter_bitmap_tree_depth(struct bitmap_index *bitmap_git, + struct object_list *tip_objects, + struct bitmap *to_filter, + unsigned long limit) +{ + if (limit) + BUG("filter_bitmap_tree_depth given non-zero limit"); + + filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter, + OBJ_TREE); + filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter, + OBJ_BLOB); +} + static int filter_bitmap(struct bitmap_index *bitmap_git, struct object_list *tip_objects, struct bitmap *to_filter, @@ -890,6 +904,15 @@ static int filter_bitmap(struct bitmap_index *bitmap_git, return 0; } + if (filter->choice == LOFC_TREE_DEPTH && + filter->tree_exclude_depth == 0) { + if (bitmap_git) + filter_bitmap_tree_depth(bitmap_git, tip_objects, + to_filter, + filter->tree_exclude_depth); + return 0; + } + /* filter choice not handled */ return -1; } diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh index 80c53edca7..75ccf9f4e3 100755 --- a/t/perf/p5310-pack-bitmaps.sh +++ b/t/perf/p5310-pack-bitmaps.sh @@ -53,6 +53,11 @@ test_perf 'rev-list count with blob:limit=1k' ' --filter=blob:limit=1k >/dev/null ' +test_perf 'rev-list count with tree:0' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=tree:0 >/dev/null +' + test_perf 'simulated partial clone' ' git pack-objects --stdout --all --filter=blob:none </dev/null >/dev/null ' diff --git a/t/t6113-rev-list-bitmap-filters.sh b/t/t6113-rev-list-bitmap-filters.sh index 145603f124..2b551e6fd0 100755 --- a/t/t6113-rev-list-bitmap-filters.sh +++ b/t/t6113-rev-list-bitmap-filters.sh @@ -53,4 +53,25 @@ test_expect_success 'blob:limit filter with specified blob' ' test_bitmap_traversal expect actual ' +test_expect_success 'tree:0 filter' ' + git rev-list --objects --filter=tree:0 HEAD >expect && + git rev-list --use-bitmap-index \ + --objects --filter=tree:0 HEAD >actual && + test_bitmap_traversal expect actual +' + +test_expect_success 'tree:0 filter with specified blob, tree' ' + git rev-list --objects --filter=tree:0 HEAD HEAD:two.t >expect && + git rev-list --use-bitmap-index \ + --objects --filter=tree:0 HEAD HEAD:two.t >actual && + test_bitmap_traversal expect actual +' + +test_expect_success 'tree:1 filter' ' + git rev-list --objects --filter=tree:1 HEAD >expect && + git rev-list --use-bitmap-index \ + --objects --filter=tree:1 HEAD >actual && + test_cmp expect actual +' + test_done
In the previous patch, we made it easy to define other filters that exclude all objects of a certain type. Use that in order to implement bitmap-level filtering for the '--filter=tree:<n>' filter when 'n' is equal to 0. The general case is not helped by bitmaps, since for values of 'n > 0', the object filtering machinery requires a full-blown tree traversal in order to determine the depth of a given tree. Caching this is non-obvious, too, since the same tree object can have a different depth depending on the context (e.g., a tree was moved up in the directory hierarchy between two commits). But, the 'n = 0' case can be helped, and this patch does so. Running p5310.11 in this tree and on master with the kernel, we can see that this case is helped substantially: Test master this tree -------------------------------------------------------------------------------- 5310.11: rev-list count with tree:0 10.68(10.39+0.27) 0.06(0.04+0.01) -99.4% Signed-off-by: Taylor Blau <me@ttaylorr.com> --- pack-bitmap.c | 25 ++++++++++++++++++++++++- t/perf/p5310-pack-bitmaps.sh | 5 +++++ t/t6113-rev-list-bitmap-filters.sh | 21 +++++++++++++++++++++ 3 files changed, 50 insertions(+), 1 deletion(-)