Message ID | 20241221-b4-pks-reftable-oom-fix-without-readers-v1-3-12db83a3267c@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | reftable: fix out-of-memory errors on NonStop | expand |
Patrick Steinhardt <ps@pks.im> writes: > Similar as the preceding commit, we may try to do a zero-sized > allocation when reloading a reftable stack that ain't got any tables. > It is implementation-defined whether malloc(3p) returns a NULL pointer > in that case or a zero-sized object. In case it does return a NULL > pointer though it causes us to think we have run into an out-of-memory > situation, and thus we return an error. > > Fix this by only allocating arrays when they have at least one entry. > Refactor the code so that we don't try to access those arrays in case > they are empty. > > Reported-by: Randall S. Becker <rsbecker@nexbridge.com> > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > reftable/stack.c | 44 +++++++++++++++++++++++--------------------- > 1 file changed, 23 insertions(+), 21 deletions(-) This somehow did not cleanly apply, so I whiggled it in manually. I probably wouldn't mixed the "size_t i" changes into this fix if I were doing it. To avoid "while (*names)" loop, I would have made it to "for (size_t j = 0; j < names_len; j++)" and kept the existing use of "i" intact, instead. And reintroducing for() scoped "i" three times did not seem to make it easier to read the result. I am not convinced we need to avoid "while (*names)", by the way. If "names" were NULL, names_length() would already have segfaulted anyway, and basics.c:parse_names(), when not returning NULL (which would have been caught by the sole caller of reload_once() as an error), makes sure it gives its caller a NULL-terminated array. But other than that, this seems to make sure that we avoid unnecessary work when cur_len or new_readers is zero and avoids asking for 0-sized allocation as a side effect of doing so, which is good. Will queue. Thanks.
Junio C Hamano <gitster@pobox.com> writes: >> reftable/stack.c | 44 +++++++++++++++++++++++--------------------- >> 1 file changed, 23 insertions(+), 21 deletions(-) > > This somehow did not cleanly apply, so I whiggled it in manually. That is because I usually ignore author-supplied base, and this time I used the tip of ps/reftable-alloc-failures topic, which brings all the text these four patches touch, in my initial attempt. Applying these on the author-supplied base (ff795a5c5e) yields the same tree as the result of merging my manual application of these four patches to ps/reftable-alloc-failures into the same base.
On December 21, 2024 12:57 PM, Junio C Hamano wrote: >Junio C Hamano <gitster@pobox.com> writes: > >>> reftable/stack.c | 44 +++++++++++++++++++++++--------------------- >>> 1 file changed, 23 insertions(+), 21 deletions(-) >> >> This somehow did not cleanly apply, so I whiggled it in manually. > >That is because I usually ignore author-supplied base, and this time I used the tip of >ps/reftable-alloc-failures topic, which brings all the text these four patches touch, in >my initial attempt. > >Applying these on the author-supplied base (ff795a5c5e) yields the same tree as >the result of merging my manual application of these four patches to ps/reftable- >alloc-failures into the same base. Ready to test this. Please let me know when and I will report results.
<rsbecker@nexbridge.com> writes: >>Applying these on the author-supplied base (ff795a5c5e) yields the same > tree as >>the result of merging my manual application of these four patches to > ps/reftable- >>alloc-failures into the same base. > > Ready to test this. Please let me know when and I will report results. If you want to start sooner $ git checkout -b test ff795a5c5ed2e2d07c688c217a615d89e3f5733b $ git am ... these four patches ... should give you the fix without anything else mixed in. I'll push out the usual four branches after integration testing, but it will be queued in 'seen' (just above the point that corresponds to 'next') first, before merging it to 'next' (and then down to 'master' before -rc1). Thanks.
On Sat, Dec 21, 2024 at 09:36:54AM -0800, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > Similar as the preceding commit, we may try to do a zero-sized > > allocation when reloading a reftable stack that ain't got any tables. > > It is implementation-defined whether malloc(3p) returns a NULL pointer > > in that case or a zero-sized object. In case it does return a NULL > > pointer though it causes us to think we have run into an out-of-memory > > situation, and thus we return an error. > > > > Fix this by only allocating arrays when they have at least one entry. > > Refactor the code so that we don't try to access those arrays in case > > they are empty. > > > > Reported-by: Randall S. Becker <rsbecker@nexbridge.com> > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > > --- > > reftable/stack.c | 44 +++++++++++++++++++++++--------------------- > > 1 file changed, 23 insertions(+), 21 deletions(-) > > This somehow did not cleanly apply, so I whiggled it in manually. > > I probably wouldn't mixed the "size_t i" changes into this fix if I > were doing it. To avoid "while (*names)" loop, I would have made it > to "for (size_t j = 0; j < names_len; j++)" and kept the existing > use of "i" intact, instead. And reintroducing for() scoped "i" > three times did not seem to make it easier to read the result. > > I am not convinced we need to avoid "while (*names)", by the way. > If "names" were NULL, names_length() would already have segfaulted > anyway, and basics.c:parse_names(), when not returning NULL (which > would have been caught by the sole caller of reload_once() as an > error), makes sure it gives its caller a NULL-terminated array. True. I was initially erring on the side of being overly cautious here because I didn't yet have the idea to adapt `reftable_malloc()` and `reftable_realloc()` so that they have the same behaviour as NonStop. Let me redo this change and reduce it to the minimum required one. Patrick
On December 21, 2024 1:31 PM, Junio C Hamano wrote: >To: rsbecker@nexbridge.com >Cc: 'Patrick Steinhardt' <ps@pks.im>; git@vger.kernel.org; 'Randall S. Becker' ><randall.becker@nexbridge.ca> >Subject: Re: [PATCH 3/4] reftable/stack: fix zero-sized allocation when there are no >readers > ><rsbecker@nexbridge.com> writes: > >>>Applying these on the author-supplied base (ff795a5c5e) yields the >>>same >> tree as >>>the result of merging my manual application of these four patches to >> ps/reftable- >>>alloc-failures into the same base. >> >> Ready to test this. Please let me know when and I will report results. > >If you want to start sooner > > $ git checkout -b test ff795a5c5ed2e2d07c688c217a615d89e3f5733b > $ git am ... these four patches ... > >should give you the fix without anything else mixed in. I'll push out the usual four >branches after integration testing, but it will be queued in 'seen' (just above the >point that corresponds to >'next') first, before merging it to 'next' (and then down to 'master' before -rc1). FYI: 'seen' looks better now. I am having issues in t0211-trace2-perf.sh with undefined SSL symbols (SSL_get0_group_name), but am able to use git init again.
On December 22, 2024 12:48 PM, I wrote: >On December 21, 2024 1:31 PM, Junio C Hamano wrote: >>To: rsbecker@nexbridge.com >>Cc: 'Patrick Steinhardt' <ps@pks.im>; git@vger.kernel.org; 'Randall S. Becker' >><randall.becker@nexbridge.ca> >>Subject: Re: [PATCH 3/4] reftable/stack: fix zero-sized allocation when >>there are no readers >> >><rsbecker@nexbridge.com> writes: >> >>>>Applying these on the author-supplied base (ff795a5c5e) yields the >>>>same >>> tree as >>>>the result of merging my manual application of these four patches to >>> ps/reftable- >>>>alloc-failures into the same base. >>> >>> Ready to test this. Please let me know when and I will report results. >> >>If you want to start sooner >> >> $ git checkout -b test ff795a5c5ed2e2d07c688c217a615d89e3f5733b >> $ git am ... these four patches ... >> >>should give you the fix without anything else mixed in. I'll push out >>the usual four branches after integration testing, but it will be >>queued in 'seen' (just above the point that corresponds to >>'next') first, before merging it to 'next' (and then down to 'master' before -rc1). > >FYI: 'seen' looks better now. I am having issues in t0211-trace2-perf.sh with >undefined SSL symbols (SSL_get0_group_name), but am able to use git init again. 'seen' looks good. Operator error on trace2-perf.sh - used the wrong version of OpenSSL. Thanks, Randall
On Sun, Dec 22, 2024 at 01:17:15PM -0500, rsbecker@nexbridge.com wrote: > On December 22, 2024 12:48 PM, I wrote: > >On December 21, 2024 1:31 PM, Junio C Hamano wrote: > >>To: rsbecker@nexbridge.com > >>Cc: 'Patrick Steinhardt' <ps@pks.im>; git@vger.kernel.org; 'Randall S. > Becker' > >><randall.becker@nexbridge.ca> > >>Subject: Re: [PATCH 3/4] reftable/stack: fix zero-sized allocation when > >>there are no readers > >> > >><rsbecker@nexbridge.com> writes: > >> > >>>>Applying these on the author-supplied base (ff795a5c5e) yields the > >>>>same > >>> tree as > >>>>the result of merging my manual application of these four patches to > >>> ps/reftable- > >>>>alloc-failures into the same base. > >>> > >>> Ready to test this. Please let me know when and I will report results. > >> > >>If you want to start sooner > >> > >> $ git checkout -b test ff795a5c5ed2e2d07c688c217a615d89e3f5733b > >> $ git am ... these four patches ... > >> > >>should give you the fix without anything else mixed in. I'll push out > >>the usual four branches after integration testing, but it will be > >>queued in 'seen' (just above the point that corresponds to > >>'next') first, before merging it to 'next' (and then down to 'master' > before -rc1). > > > >FYI: 'seen' looks better now. I am having issues in t0211-trace2-perf.sh > with > >undefined SSL symbols (SSL_get0_group_name), but am able to use git init > again. > > 'seen' looks good. Operator error on trace2-perf.sh - used the wrong version > of OpenSSL. Thanks for confirming! Patrick
Patrick Steinhardt <ps@pks.im> writes: >> >FYI: 'seen' looks better now. I am having issues in t0211-trace2-perf.sh >> with >> >undefined SSL symbols (SSL_get0_group_name), but am able to use git init >> again. >> >> 'seen' looks good. Operator error on trace2-perf.sh - used the wrong version >> of OpenSSL. > > Thanks for confirming! Thanks, all.
diff --git a/reftable/stack.c b/reftable/stack.c index 6ca21965d8e1135d986043113d465abd14cd532c..8328bfc58e9c207983ce88355d6110c40be3fac1 100644 --- a/reftable/stack.c +++ b/reftable/stack.c @@ -270,40 +270,42 @@ static int reftable_stack_reload_once(struct reftable_stack *st, int reuse_open) { size_t cur_len = !st->merged ? 0 : st->merged->readers_len; - struct reftable_reader **cur; + struct reftable_reader **cur = NULL; struct reftable_reader **reused = NULL; - struct reftable_reader **new_readers; + struct reftable_reader **new_readers = NULL; size_t reused_len = 0, reused_alloc = 0, names_len; size_t new_readers_len = 0; struct reftable_merged_table *new_merged = NULL; struct reftable_buf table_path = REFTABLE_BUF_INIT; int err = 0; - size_t i; - cur = stack_copy_readers(st, cur_len); - if (!cur) { - err = REFTABLE_OUT_OF_MEMORY_ERROR; - goto done; + if (cur_len) { + cur = stack_copy_readers(st, cur_len); + if (!cur) { + err = REFTABLE_OUT_OF_MEMORY_ERROR; + goto done; + } } names_len = names_length(names); - - new_readers = reftable_calloc(names_len, sizeof(*new_readers)); - if (!new_readers) { - err = REFTABLE_OUT_OF_MEMORY_ERROR; - goto done; + if (names_len) { + new_readers = reftable_calloc(names_len, sizeof(*new_readers)); + if (!new_readers) { + err = REFTABLE_OUT_OF_MEMORY_ERROR; + goto done; + } } - while (*names) { + for (size_t i = 0; i < names_len; i++) { struct reftable_reader *rd = NULL; - const char *name = *names++; + const char *name = names[i]; /* this is linear; we assume compaction keeps the number of tables under control so this is not quadratic. */ - for (i = 0; reuse_open && i < cur_len; i++) { - if (cur[i] && 0 == strcmp(cur[i]->name, name)) { - rd = cur[i]; - cur[i] = NULL; + for (size_t j = 0; reuse_open && j < cur_len; j++) { + if (cur[j] && 0 == strcmp(cur[j]->name, name)) { + rd = cur[j]; + cur[j] = NULL; /* * When reloading the stack fails, we end up @@ -357,7 +359,7 @@ static int reftable_stack_reload_once(struct reftable_stack *st, * file of such an open reader wouldn't have been possible to be * unlinked by the compacting process. */ - for (i = 0; i < cur_len; i++) { + for (size_t i = 0; i < cur_len; i++) { if (cur[i]) { const char *name = reader_name(cur[i]); @@ -388,11 +390,11 @@ static int reftable_stack_reload_once(struct reftable_stack *st, * happen on the successful case, because on the unsuccessful one we * decrement their refcount via `new_readers`. */ - for (i = 0; i < reused_len; i++) + for (size_t i = 0; i < reused_len; i++) reftable_reader_decref(reused[i]); done: - for (i = 0; i < new_readers_len; i++) + for (size_t i = 0; i < new_readers_len; i++) reftable_reader_decref(new_readers[i]); reftable_free(new_readers); reftable_free(reused);
Similar as the preceding commit, we may try to do a zero-sized allocation when reloading a reftable stack that ain't got any tables. It is implementation-defined whether malloc(3p) returns a NULL pointer in that case or a zero-sized object. In case it does return a NULL pointer though it causes us to think we have run into an out-of-memory situation, and thus we return an error. Fix this by only allocating arrays when they have at least one entry. Refactor the code so that we don't try to access those arrays in case they are empty. Reported-by: Randall S. Becker <rsbecker@nexbridge.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> --- reftable/stack.c | 44 +++++++++++++++++++++++--------------------- 1 file changed, 23 insertions(+), 21 deletions(-)