Message ID | dffdd794de3255719028a5f2c64d43048a5a5f60.1728505840.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | pack-bitmap: convert offset to ref deltas where possible | expand |
On Wed, Oct 09, 2024 at 04:31:31PM -0400, Taylor Blau wrote: > In a following commit, we will prepare to mark objects for reuse which > are stored as deltas, but whose base object wasn't included in the > output pack (e.g., because the client already has it). > > We can't, however, take advantage of that optimization when the > traversal removed objects from the result due to either (a) object > filtering, or (b) --unpacked. > > That's because in either case, we can't rely on the principle that "if > an object appears in the 'haves' bitmap, that the client already has > that object", since they may not have it as a result of filtering. I think this is a reasonable precaution, though it does make me wonder if the non-reuse code paths are so careful. That is, if we have object Y which is a delta against object X, but we know the other side _could_ have X (because it's reachable from some commit they claim to have), would we ever send a thin delta against X? I don't recall seeing any protections for that, though I also wouldn't be too surprised if it somehow just works out because we never figure out whether they have X or not. :-/ I wonder, though: should thin deltas just be turned off entirely when filtering is in play? Likewise for --unpacked (though in practice I think it would never be used with --thin, as it is about generating new on-disk packs). -Peff
On Fri, Oct 11, 2024 at 04:35:45AM -0400, Jeff King wrote: > I wonder, though: should thin deltas just be turned off entirely when > filtering is in play? > > Likewise for --unpacked (though in practice I think it would never be > used with --thin, as it is about generating new on-disk packs). They should; that's what this commit is setting us up to do in the next one, which does not allocate the "convert to REF_DELTAs" bitmap when filtering is in play. Thanks, Taylor
diff --git a/pack-bitmap.c b/pack-bitmap.c index 8326a20979e..67ea267ed2a 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -112,6 +112,12 @@ struct bitmap_index { /* "have" bitmap from the last performed walk */ struct bitmap *haves; + /* + * Whether the last performed walk had objects removed from + * 'result' due to object filtering. + */ + int filtered; + /* Version of the bitmap index */ unsigned int version; }; @@ -1684,6 +1690,8 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git, bitmap_unset(to_filter, pos); } + bitmap_git->filtered = 1; + bitmap_free(tips); } @@ -1776,6 +1784,8 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git, bitmap_unset(to_filter, pos); } + bitmap_git->filtered = 1; + bitmap_free(tips); } @@ -1892,6 +1902,8 @@ static void filter_packed_objects_from_bitmap(struct bitmap_index *bitmap_git, if (has_object_pack(&eindex->objects[i]->oid)) bitmap_unset(result, objects_nr + i); } + + bitmap_git->filtered = 1; } struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs,
In a following commit, we will prepare to mark objects for reuse which are stored as deltas, but whose base object wasn't included in the output pack (e.g., because the client already has it). We can't, however, take advantage of that optimization when the traversal removed objects from the result due to either (a) object filtering, or (b) --unpacked. That's because in either case, we can't rely on the principle that "if an object appears in the 'haves' bitmap, that the client already has that object", since they may not have it as a result of filtering. Keep track of whether or not we performed any object filtering during our traversal in preparation to actually implement thin delta conversion. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- pack-bitmap.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)