Message ID | 20221116105013.1777440-1-e@80x24.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | delta-islands: free island-related data after use | expand |
On Wed, Nov 16 2022, Eric Wong wrote: > On my use case involving 771 islands of Linux on kernel.org, > this reduces memory usage by around 25MB. The bulk of that > comes from free_remote_islands, since free_island_regexes only > saves around 40k. > > This memory is saved early in the memory-intensive pack process, > making it available for the remainder of the long process. > > Signed-off-by: Eric Wong <e@80x24.org> > --- > Note: I only noticed this when I screwed up up a pack.island RE, > ended up with hundreds of thousands of islands instead of 771, > and kept OOM-ing :x > > Memory savings were measured using the following patch which > relies on a patched LD_PRELOAD-based malloc debugger: > https://80x24.org/spew/20221116095404.3974691-1-e@80x24.org/raw FWIW SANITIZE=leak will find this if you stick a "remote_islands = NULL" and run e.g. t5320-delta-islands.sh, but maybe you needed this closer to production. Valgrind will also work, but of course be *much* slower. > Will try to hunt down more memory savings in the nearish future. > > delta-islands.c | 27 +++++++++++++++++++++++++++ > 1 file changed, 27 insertions(+) > > diff --git a/delta-islands.c b/delta-islands.c > index 26f9e99e1a..391e947cc6 100644 > --- a/delta-islands.c > +++ b/delta-islands.c > @@ -454,6 +454,31 @@ static void deduplicate_islands(struct repository *r) > free(list); > } > > +static void free_island_regexes(void) > +{ > + unsigned int i; > + > + for (i = 0; i < island_regexes_nr; i++) > + regfree(&island_regexes[i]); > + > + FREE_AND_NULL(island_regexes); > + island_regexes_alloc = island_regexes_nr = 0; > +} > + > +static void free_remote_islands(void) > +{ > + const char *island_name; > + struct remote_island *rl; > + > + kh_foreach(remote_islands, island_name, rl, { > + free((void *)island_name); > + oid_array_clear(&rl->oids); > + free(rl); > + }); > + kh_destroy_str(remote_islands); > + remote_islands = NULL; > +} > + > void load_delta_islands(struct repository *r, int progress) > { > island_marks = kh_init_oid_map(); > @@ -461,7 +486,9 @@ void load_delta_islands(struct repository *r, int progress) > > git_config(island_config_callback, NULL); > for_each_ref(find_island_for_ref, NULL); > + free_island_regexes(); > deduplicate_islands(r); > + free_remote_islands(); > > if (progress) > fprintf(stderr, _("Marked %d islands, done.\n"), island_counter); Perfect shouldn't be the enemy of the good & all that, but in this case it's not too much more effort to just give this data an appropriate lifetime instead of the global, I tried that out for just the "regex" part of this below. The free_remote_islands() seems to be similarly alive between "find_island_for_ref" and "deduplicate_islands". Your version also works, but the root cause of this sort of thing is these global lifetimes, which sometimes we do for a good reason, but in this case we don't. It's more lines, but e.g. your FREE_AND_NULL() and resetting "nr" and "alloc" makes one wonder how the data might be used outside fo that load_delta_islands() chain (which is needed, if we invoke it again), when we can just init a stack variable to hold it instead... diff --git a/delta-islands.c b/delta-islands.c index 26f9e99e1a9..ef86a91059c 100644 --- a/delta-islands.c +++ b/delta-islands.c @@ -312,29 +312,41 @@ void resolve_tree_islands(struct repository *r, free(todo); } -static regex_t *island_regexes; -static unsigned int island_regexes_alloc, island_regexes_nr; +struct island_config_data { + regex_t *rx; + size_t nr; + size_t alloc; +}; static const char *core_island_name; -static int island_config_callback(const char *k, const char *v, void *cb UNUSED) +static void island_config_data_release(struct island_config_data *icd) +{ + for (size_t i = 0; i < icd->nr; i++) + regfree(&icd->rx[i]); + free(icd->rx); +} + +static int island_config_callback(const char *k, const char *v, void *cb) { + struct island_config_data *data = cb; + if (!strcmp(k, "pack.island")) { struct strbuf re = STRBUF_INIT; if (!v) return config_error_nonbool(k); - ALLOC_GROW(island_regexes, island_regexes_nr + 1, island_regexes_alloc); + ALLOC_GROW(data->rx, data->nr + 1, data->alloc); if (*v != '^') strbuf_addch(&re, '^'); strbuf_addstr(&re, v); - if (regcomp(&island_regexes[island_regexes_nr], re.buf, REG_EXTENDED)) + if (regcomp(&data->rx[data->nr], re.buf, REG_EXTENDED)) die(_("failed to load island regex for '%s': %s"), k, re.buf); strbuf_release(&re); - island_regexes_nr++; + data->nr++; return 0; } @@ -365,8 +377,10 @@ static void add_ref_to_island(const char *island_name, const struct object_id *o } static int find_island_for_ref(const char *refname, const struct object_id *oid, - int flags UNUSED, void *data UNUSED) + int flags UNUSED, void *cb) { + struct island_config_data *data = cb; + /* * We should advertise 'ARRAY_SIZE(matches) - 2' as the max, * so we can diagnose below a config with more capture groups @@ -377,8 +391,8 @@ static int find_island_for_ref(const char *refname, const struct object_id *oid, struct strbuf island_name = STRBUF_INIT; /* walk backwards to get last-one-wins ordering */ - for (i = island_regexes_nr - 1; i >= 0; i--) { - if (!regexec(&island_regexes[i], refname, + for (i = data->nr - 1; i >= 0; i--) { + if (!regexec(&data->rx[i], refname, ARRAY_SIZE(matches), matches, 0)) break; } @@ -456,11 +470,14 @@ static void deduplicate_islands(struct repository *r) void load_delta_islands(struct repository *r, int progress) { + struct island_config_data data = { 0 }; + island_marks = kh_init_oid_map(); remote_islands = kh_init_str(); - git_config(island_config_callback, NULL); - for_each_ref(find_island_for_ref, NULL); + git_config(island_config_callback, &data); + for_each_ref(find_island_for_ref, &data); + island_config_data_release(&data); deduplicate_islands(r); if (progress)
On Wed, Nov 16, 2022 at 10:50:13AM +0000, Eric Wong wrote: > On my use case involving 771 islands of Linux on kernel.org, > this reduces memory usage by around 25MB. The bulk of that > comes from free_remote_islands, since free_island_regexes only > saves around 40k. > > This memory is saved early in the memory-intensive pack process, > making it available for the remainder of the long process. I think this works and is a reasonable thing to do. The non-obvious question is whether the island data is ever used later in the process. Certainly the island bitmaps themselves are, but the initial mapping of refs to islands doesn't. I do agree with Ævar that it would be a lot easier to confirm that if these variables were given a more appropriate scope (this was mostly just laziness on my part writing the initial delta-island code; so much of it has to stick around for the whole process and I didn't distinguish between the two). > Will try to hunt down more memory savings in the nearish future. Yes, you've probably noticed that pack-objects does not distinguish much between what is necessary for the various phases. A few obvious things to look at: 1. After the write phase, you can probably ditch the island bitmaps, too. In many repacks we're basically done then, but if you're generating bitmaps, that happens afterwards in the same process. 2. The object traversal for pack-objects is done in-process these days. But after it finishes, I suspect that we do not generally need those object structs anymore, because all of the book-keeping is done in the bit object_entry array in packing_data. The exception would be generating bitmaps, which does need to do some traversal (and I think may hang on to actual "struct commit" pointers). I also don't think we have any code that clears obj_hash or released the pooled object pointers. So it's probably pretty tricky, but I suspect would yield big savings, since it's a per-object cost on the order of 64 bytes (so ~640MB on the kernel). 3. During the bitmap phase I'm not sure if we still care about the object_entry struct in packing_data. It's the other big bytes-per-object user of memory. We need it all the way through the write phase for obvious reasons, but if we could ditch it for the bitmap phase, that may reduce peak memory during that phase. -Peff
diff --git a/delta-islands.c b/delta-islands.c index 26f9e99e1a..391e947cc6 100644 --- a/delta-islands.c +++ b/delta-islands.c @@ -454,6 +454,31 @@ static void deduplicate_islands(struct repository *r) free(list); } +static void free_island_regexes(void) +{ + unsigned int i; + + for (i = 0; i < island_regexes_nr; i++) + regfree(&island_regexes[i]); + + FREE_AND_NULL(island_regexes); + island_regexes_alloc = island_regexes_nr = 0; +} + +static void free_remote_islands(void) +{ + const char *island_name; + struct remote_island *rl; + + kh_foreach(remote_islands, island_name, rl, { + free((void *)island_name); + oid_array_clear(&rl->oids); + free(rl); + }); + kh_destroy_str(remote_islands); + remote_islands = NULL; +} + void load_delta_islands(struct repository *r, int progress) { island_marks = kh_init_oid_map(); @@ -461,7 +486,9 @@ void load_delta_islands(struct repository *r, int progress) git_config(island_config_callback, NULL); for_each_ref(find_island_for_ref, NULL); + free_island_regexes(); deduplicate_islands(r); + free_remote_islands(); if (progress) fprintf(stderr, _("Marked %d islands, done.\n"), island_counter);
On my use case involving 771 islands of Linux on kernel.org, this reduces memory usage by around 25MB. The bulk of that comes from free_remote_islands, since free_island_regexes only saves around 40k. This memory is saved early in the memory-intensive pack process, making it available for the remainder of the long process. Signed-off-by: Eric Wong <e@80x24.org> --- Note: I only noticed this when I screwed up up a pack.island RE, ended up with hundreds of thousands of islands instead of 771, and kept OOM-ing :x Memory savings were measured using the following patch which relies on a patched LD_PRELOAD-based malloc debugger: https://80x24.org/spew/20221116095404.3974691-1-e@80x24.org/raw Will try to hunt down more memory savings in the nearish future. delta-islands.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)