Message ID | 2b335c45ae7832b886ef9adccc2530e4ca53d267.1721250704.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | midx: incremental multi-pack indexes, part one | expand |
On Wed, Jul 17, 2024 at 05:12:25PM -0400, Taylor Blau wrote: > After that point, we only need to make two special considerations within > this function: > > - First, the pack_int_id returned to us by `nth_midxed_pack_int_id()` > is a position in the concatenated lexical order of packs, so we must > ensure that we subtract `m->num_packs_in_base` before accessing the > MIDX-local `packs` array. > > - Second, we must avoid translating the `pos` back to a MIDX-local > index, since we use it as an argument to `nth_midxed_offset()` which > expects a position relative to the concatenated lexical order of > objects. OK. I think this is correct, but this would be another place where we could use an nth_midxed_offset_one() function if we had one. My thinking was that we'd avoid walking back over the midx chain again. But I guess we don't actually do that, because our midx_for_object() will have overwritten our "m" variable, as well. So inside nth_midxed_offset_one() we'll immediately realize that the global position is inside the midx we passed in. A little extra arithmetic, but there's no pointer chasing. -Peff
On Thu, Aug 01, 2024 at 06:12:15AM -0400, Jeff King wrote: > On Wed, Jul 17, 2024 at 05:12:25PM -0400, Taylor Blau wrote: > > > After that point, we only need to make two special considerations within > > this function: > > > > - First, the pack_int_id returned to us by `nth_midxed_pack_int_id()` > > is a position in the concatenated lexical order of packs, so we must > > ensure that we subtract `m->num_packs_in_base` before accessing the > > MIDX-local `packs` array. > > > > - Second, we must avoid translating the `pos` back to a MIDX-local > > index, since we use it as an argument to `nth_midxed_offset()` which > > expects a position relative to the concatenated lexical order of > > objects. > > OK. I think this is correct, but this would be another place where we > could use an nth_midxed_offset_one() function if we had one. Yeah. > My thinking was that we'd avoid walking back over the midx chain again. > But I guess we don't actually do that, because our midx_for_object() > will have overwritten our "m" variable, as well. So inside > nth_midxed_offset_one() we'll immediately realize that the global > position is inside the midx we passed in. A little extra arithmetic, but > there's no pointer chasing. Yup, exactly. Thanks, Taylor
diff --git a/midx.c b/midx.c index f87df6bede..dea78474a2 100644 --- a/midx.c +++ b/midx.c @@ -406,14 +406,12 @@ int fill_midx_entry(struct repository *r, if (!bsearch_midx(oid, m, &pos)) return 0; - if (pos >= m->num_objects) - return 0; - + midx_for_object(&m, pos); pack_int_id = nth_midxed_pack_int_id(m, pos); if (prepare_midx_pack(r, m, pack_int_id)) return 0; - p = m->packs[pack_int_id]; + p = m->packs[pack_int_id - m->num_packs_in_base]; /* * We are about to tell the caller where they can locate the
In a similar fashion as previous commits, teach the `fill_midx_entry()` function to work in a incremental MIDX-aware fashion. This function, unlike others which accept an index into either the lexical order of objects or packs, takes in an object_id, and attempts to fill a caller-provided 'struct pack_entry' with the remaining pieces of information about that object from the MIDX. The function uses `bsearch_midx()` which fills out the frame-local 'pos' variable, recording the given object_id's lexical position within the MIDX chain, if found (if no matching object ID was found, we'll return immediately without filling out the `pack_entry` structure). Once given that position, we jump back through the `->base_midx` pointer to ensure that our `m` points at the MIDX layer which contains the given object_id (and not an ancestor or descendant of it in the chain). Note that we can drop the bounds check "if (pos >= m->num_objects)" because `midx_for_object()` performs this check for us. After that point, we only need to make two special considerations within this function: - First, the pack_int_id returned to us by `nth_midxed_pack_int_id()` is a position in the concatenated lexical order of packs, so we must ensure that we subtract `m->num_packs_in_base` before accessing the MIDX-local `packs` array. - Second, we must avoid translating the `pos` back to a MIDX-local index, since we use it as an argument to `nth_midxed_offset()` which expects a position relative to the concatenated lexical order of objects. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- midx.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)