Message ID | cover.1715166175.git.ps@pks.im (mailing list archive) |
---|---|
Headers | show |
Series | reftable: prepare for re-seekable iterators | expand |
Patrick Steinhardt <ps@pks.im> writes: > To address this inefficiency, the patch series at hand refactors the > reftable library such that creation of iterators and seeking on an > iterator are separate steps. This refactoring prepares us for reusing > iterators to perform multiple seeks, which in turn will allow us to > reuse internal data structures for subsequent seeks. ;-) > Note: this series does not yet go all the way to re-seekable iterators, > and there are no users yet. The patch series is complex enough as-is > already, so I decided to defer that to the next iteration. Thus, the > whole refactoring here should essentially be a large no-op that prepares > the infrastructure for re-seekable iterators. > > The series depends on pks/reftable-write-optim at fa74f32291 > (reftable/block: reuse compressed array, 2024-04-08). There is another topic on reftable to make write options tweakable, whose addition of reftable/dump and reader_print_blocks interface needs to be adjusted to this change, I think. reftable/reader.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git c/reftable/reader.c w/reftable/reader.c index 2ea830bdb6..fd516e01db 100644 --- c/reftable/reader.c +++ w/reftable/reader.c @@ -841,7 +841,7 @@ int reftable_reader_print_blocks(const char *tablename) }, }; struct reftable_block_source src = { 0 }; - struct table_iter ti = TABLE_ITER_INIT; + struct table_iter *ti; struct reftable_reader *r = NULL; size_t i; int err; @@ -854,11 +854,14 @@ int reftable_reader_print_blocks(const char *tablename) if (err < 0) goto done; + REFTABLE_ALLOC_ARRAY(ti, 1); + table_iter_init(ti, r); + printf("header:\n"); printf(" block_size: %d\n", r->block_size); for (i = 0; i < ARRAY_SIZE(sections); i++) { - err = reader_start(r, &ti, sections[i].type, 0); + err = table_iter_seek_start(ti, sections[i].type, 0); if (err < 0) goto done; if (err > 0) @@ -867,10 +870,10 @@ int reftable_reader_print_blocks(const char *tablename) printf("%s:\n", sections[i].name); while (1) { - printf(" - length: %u\n", ti.br.block_len); - printf(" restarts: %u\n", ti.br.restart_count); + printf(" - length: %u\n", ti->br.block_len); + printf(" restarts: %u\n", ti->br.restart_count); - err = table_iter_next_block(&ti); + err = table_iter_next_block(ti); if (err < 0) goto done; if (err > 0) @@ -880,6 +883,6 @@ int reftable_reader_print_blocks(const char *tablename) done: reftable_reader_free(r); - table_iter_close(&ti); + table_iter_close(ti); return err; }
Junio C Hamano <gitster@pobox.com> writes: > diff --git c/reftable/reader.c w/reftable/reader.c > index 2ea830bdb6..fd516e01db 100644 > --- c/reftable/reader.c > +++ w/reftable/reader.c > @@ -841,7 +841,7 @@ int reftable_reader_print_blocks(const char *tablename) > }, > }; > struct reftable_block_source src = { 0 }; > - struct table_iter ti = TABLE_ITER_INIT; > + struct table_iter *ti; > struct reftable_reader *r = NULL; > size_t i; > int err; ... an unusually early error could skip everything here ... > @@ -880,6 +883,6 @@ int reftable_reader_print_blocks(const char *tablename) > > done: > reftable_reader_free(r); > - table_iter_close(&ti); > + table_iter_close(ti); ... and cause this to break. In the version of the merge-fix I used, *ti is initialized to NULL and the _close() is called only when ti is not NULL. > return err; > }
On Wed, May 08, 2024 at 04:42:11PM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > To address this inefficiency, the patch series at hand refactors the > > reftable library such that creation of iterators and seeking on an > > iterator are separate steps. This refactoring prepares us for reusing > > iterators to perform multiple seeks, which in turn will allow us to > > reuse internal data structures for subsequent seeks. > > ;-) > > > Note: this series does not yet go all the way to re-seekable iterators, > > and there are no users yet. The patch series is complex enough as-is > > already, so I decided to defer that to the next iteration. Thus, the > > whole refactoring here should essentially be a large no-op that prepares > > the infrastructure for re-seekable iterators. > > > > The series depends on pks/reftable-write-optim at fa74f32291 > > (reftable/block: reuse compressed array, 2024-04-08). > > There is another topic on reftable to make write options tweakable, > whose addition of reftable/dump and reader_print_blocks interface > needs to be adjusted to this change, I think. True, I forgot to double check that one. Your proposed resolution isn't quite correct as we now leak the `ti` pointer -- `table_iter_close()` only closes the iterator and releases its underlying resources, but does not free the iterator itself. The below diff would be needed on top of what you currently have in `seen`. In any case though, I can also resend this topic with ps/reftable-write-options pulled in as a dependency. Please let me know your preference. Patrick -- >8 -- diff --git a/reftable/reader.c b/reftable/reader.c index 2d9630b7c2..83f6772e5d 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -841,8 +841,8 @@ int reftable_reader_print_blocks(const char *tablename) }, }; struct reftable_block_source src = { 0 }; - struct table_iter *ti = NULL; struct reftable_reader *r = NULL; + struct table_iter ti = {0}; size_t i; int err; @@ -854,14 +854,13 @@ int reftable_reader_print_blocks(const char *tablename) if (err < 0) goto done; - REFTABLE_ALLOC_ARRAY(ti, 1); - table_iter_init(ti, r); + table_iter_init(&ti, r); printf("header:\n"); printf(" block_size: %d\n", r->block_size); for (i = 0; i < ARRAY_SIZE(sections); i++) { - err = table_iter_seek_start(ti, sections[i].type, 0); + err = table_iter_seek_start(&ti, sections[i].type, 0); if (err < 0) goto done; if (err > 0) @@ -870,10 +869,10 @@ int reftable_reader_print_blocks(const char *tablename) printf("%s:\n", sections[i].name); while (1) { - printf(" - length: %u\n", ti->br.block_len); - printf(" restarts: %u\n", ti->br.restart_count); + printf(" - length: %u\n", ti.br.block_len); + printf(" restarts: %u\n", ti.br.restart_count); - err = table_iter_next_block(ti); + err = table_iter_next_block(&ti); if (err < 0) goto done; if (err > 0) @@ -883,7 +882,6 @@ int reftable_reader_print_blocks(const char *tablename) done: reftable_reader_free(r); - if (ti) - table_iter_close(ti); + table_iter_close(&ti); return err; }
Patrick Steinhardt <ps@pks.im> writes: > The below diff would be needed on top of what you currently have in > `seen`. In any case though, I can also resend this topic with > ps/reftable-write-options pulled in as a dependency. Please let me know > your preference. Is it "needed", in the sense that "without the fix what you posted is broken in such and such ways", or is it "I think it is niceR to have it on stack because this one instance does not have to be on heap"? To me, they look equivalent and I have no problems with the "nicer" variant, but your "needed" makes me wonder if I am missing some correctness invariants I am violating without realizing. Thanks.
On Fri, May 10, 2024 at 08:40:56AM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > The below diff would be needed on top of what you currently have in > > `seen`. In any case though, I can also resend this topic with > > ps/reftable-write-options pulled in as a dependency. Please let me know > > your preference. > > Is it "needed", in the sense that "without the fix what you posted > is broken in such and such ways", or is it "I think it is niceR to > have it on stack because this one instance does not have to be on > heap"? To me, they look equivalent and I have no problems with the > "nicer" variant, but your "needed" makes me wonder if I am missing > some correctness invariants I am violating without realizing. It's needed in the sense that your version leaks memory -- the `ti` pointer is never free'd. Other than that they are equivalent indeed. Patrick
Patrick Steinhardt <ps@pks.im> writes: > It's needed in the sense that your version leaks memory -- the `ti` > pointer is never free'd. Other than that they are equivalent indeed. Ah, OK, I somehow misread that _close() will free the resource, but it only frees the resources held by the shell and not the shell itself. Thanks.