Message ID | 41db7414e17201f85b476af5e0183e72de450310.1712209149.git.ps@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | reftable: optimize write performance | expand |
On Thu, Apr 4, 2024 at 7:48 AM Patrick Steinhardt <ps@pks.im> wrote: > There are two code paths which release memory of the reftable writer: > > - `reftable_writer_close()` releases internal state after it has > written data. > > - `reftable_writer_free()` releases the block that was written to and > the writer itself. The bifurcation is there so you can read the stats after closing the writer. The new method makes it harder to misuse, but now you have two ways to end a writer. Suggestion: drop reftable_writer_{free,close} from reftable-writer.h (rename to remove the reftable_ prefix because they are no longer considered public) and find another way to read out the stats. Either pass an optional reftable_writer_stats into the construction of the writer, return the stats from the close function, or drop stats altogether. IIRC They are only used in the unit tests.
On Thu, Apr 04, 2024 at 09:08:46AM +0200, Han-Wen Nienhuys wrote: > On Thu, Apr 4, 2024 at 7:48 AM Patrick Steinhardt <ps@pks.im> wrote: > > There are two code paths which release memory of the reftable writer: > > > > - `reftable_writer_close()` releases internal state after it has > > written data. > > > > - `reftable_writer_free()` releases the block that was written to and > > the writer itself. > > The bifurcation is there so you can read the stats after closing the > writer. The new method makes it harder to misuse, but now you have two > ways to end a writer. Suggestion: drop reftable_writer_{free,close} > from reftable-writer.h (rename to remove the reftable_ prefix because > they are no longer considered public) and find another way to read out > the stats. Either pass an optional reftable_writer_stats into the > construction of the writer, return the stats from the close function, > or drop stats altogether. IIRC They are only used in the unit tests. But even with these refactorings the stats remain intact after calling `reftable_writer_close()` or `reftable_writer_release()`, right? So it basically continues to work as expected. It might not be the cleanest way to handle this, but I think this patch is an improvement over the previous state because we plug a memory leak and deduplicate the cleanup logic. So I would suggest to defer your proposed refactorings to a later point, if you're okay with that. Thanks! Patrick
On Thu, Apr 4, 2024 at 9:32 AM Patrick Steinhardt <ps@pks.im> wrote: > > On Thu, Apr 04, 2024 at 09:08:46AM +0200, Han-Wen Nienhuys wrote: > > On Thu, Apr 4, 2024 at 7:48 AM Patrick Steinhardt <ps@pks.im> wrote: > > > There are two code paths which release memory of the reftable writer: > > > > > > - `reftable_writer_close()` releases internal state after it has > > > written data. > > > > > > - `reftable_writer_free()` releases the block that was written to and > > > the writer itself. > > > > The bifurcation is there so you can read the stats after closing the > > writer. The new method makes it harder to misuse, but now you have two > > ways to end a writer. Suggestion: drop reftable_writer_{free,close} > > from reftable-writer.h (rename to remove the reftable_ prefix because > > they are no longer considered public) and find another way to read out > > the stats. Either pass an optional reftable_writer_stats into the > > construction of the writer, return the stats from the close function, > > or drop stats altogether. IIRC They are only used in the unit tests. > > But even with these refactorings the stats remain intact after calling > `reftable_writer_close()` or `reftable_writer_release()`, right? So it > basically continues to work as expected. Right - I misinterpreted your change. > It might not be the cleanest way to handle this, but I think this patch > is an improvement over the previous state because we plug a memory leak > and deduplicate the cleanup logic. So I would suggest to defer your > proposed refactorings to a later point, if you're okay with that. yes. Please add reftable_writer_release to reftable-writer.h for consistency, though. Or remove the reftable_ prefix.
On Thu, Apr 04, 2024 at 11:00:46AM +0200, Han-Wen Nienhuys wrote: > On Thu, Apr 4, 2024 at 9:32 AM Patrick Steinhardt <ps@pks.im> wrote: > > On Thu, Apr 04, 2024 at 09:08:46AM +0200, Han-Wen Nienhuys wrote: [snip] > > It might not be the cleanest way to handle this, but I think this patch > > is an improvement over the previous state because we plug a memory leak > > and deduplicate the cleanup logic. So I would suggest to defer your > > proposed refactorings to a later point, if you're okay with that. > > yes. Please add reftable_writer_release to reftable-writer.h for > consistency, though. Or remove the reftable_ prefix. I've dropped the `reftable_` prefix locally. Will wait a bit for additional reviews though before sending out v3. Patrick
diff --git a/reftable/writer.c b/reftable/writer.c index d347ec4cc6..7b70c9b666 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -149,11 +149,21 @@ void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, w->max_update_index = max; } +static void reftable_writer_release(struct reftable_writer *w) +{ + if (w) { + reftable_free(w->block); + w->block = NULL; + block_writer_release(&w->block_writer_data); + w->block_writer = NULL; + writer_clear_index(w); + strbuf_release(&w->last_key); + } +} + void reftable_writer_free(struct reftable_writer *w) { - if (!w) - return; - reftable_free(w->block); + reftable_writer_release(w); reftable_free(w); } @@ -643,16 +653,13 @@ int reftable_writer_close(struct reftable_writer *w) } done: - /* free up memory. */ - block_writer_release(&w->block_writer_data); - writer_clear_index(w); - strbuf_release(&w->last_key); + reftable_writer_release(w); return err; } static void writer_clear_index(struct reftable_writer *w) { - for (size_t i = 0; i < w->index_len; i++) + for (size_t i = 0; w->index && i < w->index_len; i++) strbuf_release(&w->index[i].last_key); FREE_AND_NULL(w->index); w->index_len = 0;
There are two code paths which release memory of the reftable writer: - `reftable_writer_close()` releases internal state after it has written data. - `reftable_writer_free()` releases the block that was written to and the writer itself. Both code paths free different parts of the writer, and consequently the caller must make sure to call both. And while callers mostly do this already, this falls apart when a write failure causes the caller to skip calling `reftable_write_close()`. Introduce a new function `reftable_writer_release()` that releases all internal state and call it from both paths. Like this it is fine for the caller to not call `reftable_writer_close()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- reftable/writer.c | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-)