Message ID | a18baeb0b42994ebcb216df5fe69459ba9a33795.1624314293.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | multi-pack reachability bitmaps | expand |
On Mon, Jun 21 2021, Taylor Blau wrote: > + enum object_type bitmap_type = OBJ_NONE; > + int bitmaps_nr = 0; > + > + if (bitmap_get(tdata->commits, pos)) { > + bitmap_type = OBJ_COMMIT; > + bitmaps_nr++; > + } > + if (bitmap_get(tdata->trees, pos)) { > + bitmap_type = OBJ_TREE; > + bitmaps_nr++; > + } > + if (bitmap_get(tdata->blobs, pos)) { > + bitmap_type = OBJ_BLOB; > + bitmaps_nr++; > + } > + if (bitmap_get(tdata->tags, pos)) { > + bitmap_type = OBJ_TAG; > + bitmaps_nr++; > + } This made me wonder if this could be better with something like the HAS_MULTI_BITS() macro, but that trick probably can't be applied here in any shape or form :) > + > + if (!bitmap_type) > + die("object %s not found in type bitmaps", > + oid_to_hex(&obj->oid)); It feels a bit magical to use an enum and then assume to know the enum's values, I know we do "type < 0" all over the place, but I'd think "if (bitmap_type == OBJ_NONE)" would be better here.... > + > + if (bitmaps_nr > 1) > + die("object %s does not have a unique type", > + oid_to_hex(&obj->oid)); Or just check the bitmaps_nr instead: if (!bitmaps_nr) die("found none"); else if (bitmaps_nr > 1) ...; Just bikeshedding... > + > + if (bitmap_type != obj->type) > + die("object %s: real type %s, expected: %s", > + oid_to_hex(&obj->oid), > + type_name(obj->type), > + type_name(bitmap_type)); To argue against myself (sort of) about that "== OBJ_NONE" above, if we're not assuming that then it's sort of weird not to also assume that type_name(type) won't return a NULL in the case of OBJ_NONE, which it does (but this code guards against).
On Fri, Jun 25, 2021 at 01:02:56AM +0200, Ævar Arnfjörð Bjarmason wrote: > > On Mon, Jun 21 2021, Taylor Blau wrote: > > > + enum object_type bitmap_type = OBJ_NONE; > > + int bitmaps_nr = 0; > > + > > + if (bitmap_get(tdata->commits, pos)) { > > + bitmap_type = OBJ_COMMIT; > > + bitmaps_nr++; > > + } > > + if (bitmap_get(tdata->trees, pos)) { > > + bitmap_type = OBJ_TREE; > > + bitmaps_nr++; > > + } > > + if (bitmap_get(tdata->blobs, pos)) { > > + bitmap_type = OBJ_BLOB; > > + bitmaps_nr++; > > + } > > + if (bitmap_get(tdata->tags, pos)) { > > + bitmap_type = OBJ_TAG; > > + bitmaps_nr++; > > + } > > This made me wonder if this could be better with something like the > HAS_MULTI_BITS() macro, but that trick probably can't be applied here in > any shape or form :) Right; since we're looking at the same bit position in each of the type-level bitmaps, we can't just OR them together, since all of the bits are in the same place. And really, the object_type enum doesn't have values that tell us the type of an object by looking at just a single bit. So HAS_MULTI_BITS(OBJ_BLOB) would return "true", since OBJ_BLOB is 3. > > + > > + if (bitmap_type != obj->type) > > + die("object %s: real type %s, expected: %s", > > + oid_to_hex(&obj->oid), > > + type_name(obj->type), > > + type_name(bitmap_type)); > > To argue against myself (sort of) about that "== OBJ_NONE" above, if > we're not assuming that then it's sort of weird not to also assume that > type_name(type) won't return a NULL in the case of OBJ_NONE, which it > does (but this code guards against). I tend to agree. To restate what you're saying: by the time we get to the type_name(bitmap_type) we know that bitmap_type is non-zero, so we assume it's OK to call type_name() on it. Of course, the object_type_strings does handle the zero argument, so this is probably a little academic, but good to think through nonetheless. Thanks, Taylor
On Mon, Jun 21, 2021 at 06:24:59PM -0400, Taylor Blau wrote: > The special `--test-bitmap` mode of `git rev-list` is used to compare > the result of an object traversal with a bitmap to check its integrity. > This mode does not, however, assert that the types of reachable objects > are stored correctly. > > Harden this mode by teaching it to also check that each time an object's > bit is marked, the corresponding bit should be set in exactly one of the > type bitmaps (whose type matches the object's true type). Yep, makes sense, and the patch looks good. > +{ > + enum object_type bitmap_type = OBJ_NONE; > [...] > + > + if (!bitmap_type) > + die("object %s not found in type bitmaps", > + oid_to_hex(&obj->oid)); I think the suggestion to do: if (bitmap_type == OBJ_NONE) is reasonable here, as it assumes less about the enum. I do think OBJ_BAD and OBJ_NONE were chosen with these kind of numeric comparisons in mind, but there is no reason to rely on them in places we don't need to. -Peff
On Wed, Jul 21, 2021 at 05:45:12AM -0400, Jeff King wrote: > > +{ > > + enum object_type bitmap_type = OBJ_NONE; > > [...] > > + > > + if (!bitmap_type) > > + die("object %s not found in type bitmaps", > > + oid_to_hex(&obj->oid)); > > I think the suggestion to do: > > if (bitmap_type == OBJ_NONE) > > is reasonable here, as it assumes less about the enum. I do think > OBJ_BAD and OBJ_NONE were chosen with these kind of numeric comparisons > in mind, but there is no reason to rely on them in places we don't need > to. I had to double check your suggestion, because my first question was "what if bitmap_type is OBJ_BAD?" We can call type_name() on OBJ_BAD, but it will return NULL, and we use the return value in a format string unconditionally. So that would be a problem, but it's impossible for this to ever be OBJ_BAD, because we only set it based on the type bitmaps; so it's either a commit/tree/blob/tag, or none (but not bad). I took your suggestion, thanks. > -Peff Thanks, Taylor
diff --git a/pack-bitmap.c b/pack-bitmap.c index d90e1d9d8c..368fa59a42 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -1301,10 +1301,52 @@ void count_bitmap_commit_list(struct bitmap_index *bitmap_git, struct bitmap_test_data { struct bitmap_index *bitmap_git; struct bitmap *base; + struct bitmap *commits; + struct bitmap *trees; + struct bitmap *blobs; + struct bitmap *tags; struct progress *prg; size_t seen; }; +static void test_bitmap_type(struct bitmap_test_data *tdata, + struct object *obj, int pos) +{ + enum object_type bitmap_type = OBJ_NONE; + int bitmaps_nr = 0; + + if (bitmap_get(tdata->commits, pos)) { + bitmap_type = OBJ_COMMIT; + bitmaps_nr++; + } + if (bitmap_get(tdata->trees, pos)) { + bitmap_type = OBJ_TREE; + bitmaps_nr++; + } + if (bitmap_get(tdata->blobs, pos)) { + bitmap_type = OBJ_BLOB; + bitmaps_nr++; + } + if (bitmap_get(tdata->tags, pos)) { + bitmap_type = OBJ_TAG; + bitmaps_nr++; + } + + if (!bitmap_type) + die("object %s not found in type bitmaps", + oid_to_hex(&obj->oid)); + + if (bitmaps_nr > 1) + die("object %s does not have a unique type", + oid_to_hex(&obj->oid)); + + if (bitmap_type != obj->type) + die("object %s: real type %s, expected: %s", + oid_to_hex(&obj->oid), + type_name(obj->type), + type_name(bitmap_type)); +} + static void test_show_object(struct object *object, const char *name, void *data) { @@ -1314,6 +1356,7 @@ static void test_show_object(struct object *object, const char *name, bitmap_pos = bitmap_position(tdata->bitmap_git, &object->oid); if (bitmap_pos < 0) die("Object not in bitmap: %s\n", oid_to_hex(&object->oid)); + test_bitmap_type(tdata, object, bitmap_pos); bitmap_set(tdata->base, bitmap_pos); display_progress(tdata->prg, ++tdata->seen); @@ -1328,6 +1371,7 @@ static void test_show_commit(struct commit *commit, void *data) &commit->object.oid); if (bitmap_pos < 0) die("Object not in bitmap: %s\n", oid_to_hex(&commit->object.oid)); + test_bitmap_type(tdata, &commit->object, bitmap_pos); bitmap_set(tdata->base, bitmap_pos); display_progress(tdata->prg, ++tdata->seen); @@ -1375,6 +1419,10 @@ void test_bitmap_walk(struct rev_info *revs) tdata.bitmap_git = bitmap_git; tdata.base = bitmap_new(); + tdata.commits = ewah_to_bitmap(bitmap_git->commits); + tdata.trees = ewah_to_bitmap(bitmap_git->trees); + tdata.blobs = ewah_to_bitmap(bitmap_git->blobs); + tdata.tags = ewah_to_bitmap(bitmap_git->tags); tdata.prg = start_progress("Verifying bitmap entries", result_popcnt); tdata.seen = 0;