Message ID | cover.1716541556.git.ps@pks.im (mailing list archive) |
---|---|
Headers | show |
Series | Various memory leak fixes | expand |
Patrick Steinhardt <ps@pks.im> writes: > this is the second version of my patch series that fixes various memory > leaks in Git. Changes compared to v1: > > - t4153 and t7006 aren't marked as passing anymore. I thought they > pass because most of these tests were skipped because of a missing > TTY prerequisite both on my local machine, but also in our CI. > > - Add another patch to install the Perl IO:Pty module on Alpine and > Ubuntu. This fulfills the TTY prerequisite and thus surfaces the > memory leaks in both of the above tests. > > - Add another unit test for strvec that exercise replacing a string in > the strvec with a copy of itself. > > - A bunch of commit message improvements. Looking very good. This seems to reveal existing leaks when merged to 'seen'; other topics that are not in 'master' may be introducing these leaks. I'll see if a trial merge to 'next' is leak-free (in which case I'll merge it down to 'next') or there are other topics in 'next' that are leaking (in which case we'll play by ear---either mark the tests again as non-leak-free, or plug the leak if it seems trivial). https://github.com/git/git/actions/runs/9231313414/job/25400998823 says t1400-update-ref has many "stdin symref-update" things are failing. Also https://github.com/git/git/actions/runs/9231313414/job/25401102951 shows that t1460-refs-migrate fails on Windows. Thanks.
On Fri, May 24, 2024 at 07:10:09PM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > this is the second version of my patch series that fixes various memory > > leaks in Git. Changes compared to v1: > > > > - t4153 and t7006 aren't marked as passing anymore. I thought they > > pass because most of these tests were skipped because of a missing > > TTY prerequisite both on my local machine, but also in our CI. > > > > - Add another patch to install the Perl IO:Pty module on Alpine and > > Ubuntu. This fulfills the TTY prerequisite and thus surfaces the > > memory leaks in both of the above tests. > > > > - Add another unit test for strvec that exercise replacing a string in > > the strvec with a copy of itself. > > > > - A bunch of commit message improvements. > > Looking very good. This seems to reveal existing leaks when merged > to 'seen'; other topics that are not in 'master' may be introducing > these leaks. I'll see if a trial merge to 'next' is leak-free (in > which case I'll merge it down to 'next') or there are other topics > in 'next' that are leaking (in which case we'll play by ear---either > mark the tests again as non-leak-free, or plug the leak if it seems > trivial). > > https://github.com/git/git/actions/runs/9231313414/job/25400998823 > > says t1400-update-ref has many "stdin symref-update" things are > failing. Indeed. The following diff fixes the leak: diff --git a/builtin/update-ref.c b/builtin/update-ref.c index 7d2a419230..e54be9c429 100644 --- a/builtin/update-ref.c +++ b/builtin/update-ref.c @@ -130,6 +130,8 @@ static char *parse_next_arg(const char **next) if (arg.len) return strbuf_detach(&arg, NULL); + + strbuf_release(&arg); return NULL; } Karthik is out of office this week, so you may want to add this as a "SQUASH???" commit on top of his topic branch to make "seen" pass. > Also > > https://github.com/git/git/actions/runs/9231313414/job/25401102951 > > shows that t1460-refs-migrate fails on Windows. Hm, this one is curious. There are no leak logs at all, and the exit code is 139. Might be SIGSEGV, indicating that something else is going on here than a memory leak. I'll investigate. Patrick
Patrick Steinhardt <ps@pks.im> writes: > Indeed. The following diff fixes the leak: > > diff --git a/builtin/update-ref.c b/builtin/update-ref.c > index 7d2a419230..e54be9c429 100644 > --- a/builtin/update-ref.c > +++ b/builtin/update-ref.c > @@ -130,6 +130,8 @@ static char *parse_next_arg(const char **next) > > if (arg.len) > return strbuf_detach(&arg, NULL); > + > + strbuf_release(&arg); > return NULL; > } > > > Karthik is out of office this week, so you may want to add this as a > "SQUASH???" commit on top of his topic branch to make "seen" pass. Alright. Thanks. >> Also >> >> https://github.com/git/git/actions/runs/9231313414/job/25401102951 >> >> shows that t1460-refs-migrate fails on Windows. > > Hm, this one is curious. There are no leak logs at all, and the exit > code is 139. Might be SIGSEGV, indicating that something else is going > on here than a memory leak. Sorry, I wasn't clear enough. I do not suspect this is about leaks (and the failing job on Windows is not about leaks).
Junio C Hamano <gitster@pobox.com> writes: > Patrick Steinhardt <ps@pks.im> writes: > >> Indeed. The following diff fixes the leak: >> >> diff --git a/builtin/update-ref.c b/builtin/update-ref.c >> index 7d2a419230..e54be9c429 100644 >> --- a/builtin/update-ref.c >> +++ b/builtin/update-ref.c >> @@ -130,6 +130,8 @@ static char *parse_next_arg(const char **next) >> >> if (arg.len) >> return strbuf_detach(&arg, NULL); >> + >> + strbuf_release(&arg); >> return NULL; >> } >> >> >> Karthik is out of office this week, so you may want to add this as a >> "SQUASH???" commit on top of his topic branch to make "seen" pass. > > Alright. Thanks. But I am curious. How would this leak even be possible? arg.len==0 and arg.buf != strbuf_slopbuf would mean that somebody who touched arg (either parse_arg() call or strbuf_addstr() call) allocated arg.buf but ended up not adding any byte (or added some bytes but at the end rewound with "arg.len = 0"). Ahh, strbuf_addstr() does an unconditional strbuf_grow(). I wonder if the following patch is a good idea in general, then (not as a replacement for your fix, but as a general code hygiene---unless you know you need more memory, don't allocate). I have a suspicion that this is optimizing for a wrong case, though, so I'd probably refrain from committing it. If the caller often has an empty string, and if the caller feels like it is worth checking to omit such an extra "opportunistic" allocation, the caller can do so. strbuf.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git c/strbuf.c w/strbuf.c index 0d929e4e19..c0e58f5b95 100644 --- c/strbuf.c +++ w/strbuf.c @@ -308,6 +308,8 @@ void strbuf_remove(struct strbuf *sb, size_t pos, size_t len) void strbuf_add(struct strbuf *sb, const void *data, size_t len) { + if (!len) + return; strbuf_grow(sb, len); memcpy(sb->buf + sb->len, data, len); strbuf_setlen(sb, sb->len + len); @@ -315,6 +317,8 @@ void strbuf_add(struct strbuf *sb, const void *data, size_t len) void strbuf_addbuf(struct strbuf *sb, const struct strbuf *sb2) { + if (!sb2->len) + return; strbuf_grow(sb, sb2->len); memcpy(sb->buf + sb->len, sb2->buf, sb2->len); strbuf_setlen(sb, sb->len + sb2->len); @@ -337,6 +341,8 @@ const char *strbuf_join_argv(struct strbuf *buf, void strbuf_addchars(struct strbuf *sb, int c, size_t n) { + if (!n) + return; strbuf_grow(sb, n); memset(sb->buf + sb->len, c, n); strbuf_setlen(sb, sb->len + n); @@ -805,6 +811,8 @@ void strbuf_addstr_xml_quoted(struct strbuf *buf, const char *s) static void strbuf_add_urlencode(struct strbuf *sb, const char *s, size_t len, char_predicate allow_unencoded_fn) { + if (!len) + return; strbuf_grow(sb, len); while (len--) { char ch = *s++;
On Mon, May 27, 2024 at 10:38:59AM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > >> Also > >> > >> https://github.com/git/git/actions/runs/9231313414/job/25401102951 > >> > >> shows that t1460-refs-migrate fails on Windows. > > > > Hm, this one is curious. There are no leak logs at all, and the exit > > code is 139. Might be SIGSEGV, indicating that something else is going > > on here than a memory leak. > > Sorry, I wasn't clear enough. I do not suspect this is about leaks > (and the failing job on Windows is not about leaks). I figured this one out now: Windows of course refuses to remove some of the files because they are kept open. This shouldn't lead to a segfault itself. But we call `free_ref_cache()` on a partially initialized files ref store, nad that function didn't handle the case when it is being passed a `NULL` pointer. I've addressed this by releasing the reftable ref store before trying to remove it from disk so that all files are being closed. But it also surfaced another bug: our worktree code creates a copy of the main ref store by accident because we set up `wt->is_current` _after_ we have called `get_worktree_ref_store()`. I fixed all of this and will send a new version in a bit. Patrick
Patrick Steinhardt <ps@pks.im> writes: > On Fri, May 24, 2024 at 07:10:09PM -0700, Junio C Hamano wrote: >> Patrick Steinhardt <ps@pks.im> writes: >> >> > this is the second version of my patch series that fixes various memory >> > leaks in Git. Changes compared to v1: >> > >> > - t4153 and t7006 aren't marked as passing anymore. I thought they >> > pass because most of these tests were skipped because of a missing >> > TTY prerequisite both on my local machine, but also in our CI. >> > >> > - Add another patch to install the Perl IO:Pty module on Alpine and >> > Ubuntu. This fulfills the TTY prerequisite and thus surfaces the >> > memory leaks in both of the above tests. >> > >> > - Add another unit test for strvec that exercise replacing a string in >> > the strvec with a copy of itself. >> > >> > - A bunch of commit message improvements. >> >> Looking very good. This seems to reveal existing leaks when merged >> to 'seen'; other topics that are not in 'master' may be introducing >> these leaks. I'll see if a trial merge to 'next' is leak-free (in >> which case I'll merge it down to 'next') or there are other topics >> in 'next' that are leaking (in which case we'll play by ear---either >> mark the tests again as non-leak-free, or plug the leak if it seems >> trivial). >> >> https://github.com/git/git/actions/runs/9231313414/job/25400998823 >> >> says t1400-update-ref has many "stdin symref-update" things are >> failing. > > Indeed. The following diff fixes the leak: > > diff --git a/builtin/update-ref.c b/builtin/update-ref.c > index 7d2a419230..e54be9c429 100644 > --- a/builtin/update-ref.c > +++ b/builtin/update-ref.c > @@ -130,6 +130,8 @@ static char *parse_next_arg(const char **next) > > if (arg.len) > return strbuf_detach(&arg, NULL); > + > + strbuf_release(&arg); > return NULL; > } > > > Karthik is out of office this week, so you may want to add this as a > "SQUASH???" commit on top of his topic branch to make "seen" pass. > Thanks Patrick. Indeed I'm a bit slow on my responses, since I'm on vacation, but yeah, I too came about adding this as a fix. I'll mostly check for all tests and send a new version based on this series soon. >> Also >> >> https://github.com/git/git/actions/runs/9231313414/job/25401102951 >> >> shows that t1460-refs-migrate fails on Windows. > > Hm, this one is curious. There are no leak logs at all, and the exit > code is 139. Might be SIGSEGV, indicating that something else is going > on here than a memory leak. > > I'll investigate. > > Patrick
Hi, this is the second version of my patch series that fixes various memory leaks in Git. Changes compared to v1: - t4153 and t7006 aren't marked as passing anymore. I thought they pass because most of these tests were skipped because of a missing TTY prerequisite both on my local machine, but also in our CI. - Add another patch to install the Perl IO:Pty module on Alpine and Ubuntu. This fulfills the TTY prerequisite and thus surfaces the memory leaks in both of the above tests. - Add another unit test for strvec that exercise replacing a string in the strvec with a copy of itself. - A bunch of commit message improvements. Thanks! Patrick Patrick Steinhardt (21): ci: add missing dependency for TTY prereq t: mark a bunch of tests as leak-free transport-helper: fix leaking helper name strbuf: fix leak when `appendwholeline()` fails with EOF checkout: clarify memory ownership in `unique_tracking_name()` http: refactor code to clarify memory ownership config: clarify memory ownership in `git_config_pathname()` diff: refactor code to clarify memory ownership of prefixes convert: refactor code to clarify ownership of check_roundtrip_encoding builtin/log: stop using globals for log config builtin/log: stop using globals for format config config: clarify memory ownership in `git_config_string()` config: plug various memory leaks builtin/credential: clear credential before exit commit-reach: fix memory leak in `ahead_behind()` submodule: fix leaking memory for submodule entries strvec: add functions to replace and remove strings builtin/mv: refactor `add_slash()` to always return allocated strings builtin/mv duplicate string list memory builtin/mv: refactor to use `struct strvec` builtin/mv: fix leaks for submodule gitfile paths Makefile | 1 + alias.c | 6 +- attr.c | 2 +- attr.h | 2 +- builtin/blame.c | 2 +- builtin/checkout.c | 14 +- builtin/commit.c | 4 +- builtin/config.c | 2 +- builtin/credential.c | 2 + builtin/log.c | 708 ++++++++++-------- builtin/merge.c | 4 +- builtin/mv.c | 222 +++--- builtin/rebase.c | 2 +- builtin/receive-pack.c | 6 +- builtin/repack.c | 8 +- builtin/worktree.c | 20 +- checkout.c | 4 +- checkout.h | 6 +- ci/install-dependencies.sh | 4 +- commit-reach.c | 4 + config.c | 52 +- config.h | 10 +- convert.c | 30 +- convert.h | 2 +- delta-islands.c | 2 +- diff.c | 20 +- environment.c | 16 +- environment.h | 14 +- fetch-pack.c | 4 +- fsck.c | 4 +- fsmonitor-settings.c | 5 +- gpg-interface.c | 6 +- http.c | 50 +- imap-send.c | 12 +- mailmap.c | 4 +- mailmap.h | 4 +- merge-ll.c | 6 +- pager.c | 2 +- pretty.c | 14 +- promisor-remote.h | 2 +- remote.c | 20 +- remote.h | 8 +- sequencer.c | 2 +- setup.c | 6 +- strbuf.c | 4 +- strvec.c | 20 + strvec.h | 13 + submodule-config.c | 2 + t/t0300-credentials.sh | 2 + t/t0411-clone-from-partial.sh | 1 + t/t0610-reftable-basics.sh | 1 + t/t0611-reftable-httpd.sh | 1 + t/t1013-read-tree-submodule.sh | 1 + t/t1306-xdg-files.sh | 1 + t/t1350-config-hooks-path.sh | 1 + t/t1400-update-ref.sh | 2 + t/t2013-checkout-submodule.sh | 1 + t/t2024-checkout-dwim.sh | 1 + t/t2060-switch.sh | 1 + t/t2405-worktree-submodule.sh | 1 + t/t3007-ls-files-recurse-submodules.sh | 1 + t/t3203-branch-output.sh | 2 + t/t3415-rebase-autosquash.sh | 1 + t/t3426-rebase-submodule.sh | 1 + t/t3512-cherry-pick-submodule.sh | 1 + t/t3513-revert-submodule.sh | 1 + t/t3600-rm.sh | 1 + t/t3906-stash-submodule.sh | 1 + t/t4001-diff-rename.sh | 4 +- t/t4041-diff-submodule-option.sh | 1 + t/t4043-diff-rename-binary.sh | 1 + t/t4059-diff-submodule-not-initialized.sh | 1 + t/t4060-diff-submodule-option-diff-format.sh | 1 + t/t4120-apply-popt.sh | 1 + t/t4137-apply-submodule.sh | 1 + t/t4210-log-i18n.sh | 2 + t/t5563-simple-http-auth.sh | 1 + t/t5564-http-proxy.sh | 1 + t/t5581-http-curl-verbose.sh | 1 + t/t6006-rev-list-format.sh | 1 + t/t6041-bisect-submodule.sh | 1 + t/t6400-merge-df.sh | 1 + t/t6412-merge-large-rename.sh | 1 + t/t6426-merge-skip-unneeded-updates.sh | 1 + t/t6429-merge-sequence-rename-caching.sh | 1 + t/t6438-submodule-directory-file-conflicts.sh | 1 + t/t7001-mv.sh | 2 + t/t7005-editor.sh | 1 + t/t7102-reset.sh | 1 + t/t7112-reset-submodule.sh | 1 + t/t7417-submodule-path-url.sh | 1 + t/t7421-submodule-summary-add.sh | 1 + t/t7423-submodule-symlinks.sh | 1 + t/t9129-git-svn-i18n-commitencoding.sh | 1 - t/t9139-git-svn-non-utf8-commitencoding.sh | 1 - t/t9200-git-cvsexportcommit.sh | 1 + t/t9401-git-cvsserver-crlf.sh | 1 + t/t9600-cvsimport.sh | 1 + t/t9601-cvsimport-vendor-branch.sh | 1 + t/t9602-cvsimport-branches-tags.sh | 1 + t/t9603-cvsimport-patchsets.sh | 2 + t/t9604-cvsimport-timestamps.sh | 2 + t/unit-tests/t-strvec.c | 269 +++++++ t/unit-tests/test-lib.c | 13 + t/unit-tests/test-lib.h | 13 + transport-helper.c | 6 +- transport.c | 1 + upload-pack.c | 2 +- userdiff.h | 12 +- 109 files changed, 1151 insertions(+), 586 deletions(-) create mode 100644 t/unit-tests/t-strvec.c Range-diff against v1: -: ---------- > 1: 857b8b14ce ci: add missing dependency for TTY prereq 1: 0e9fa9ca73 ! 2: ceade7dbba t: mark a bunch of tests as leak-free @@ Commit message - t2405: Passes since 6741e917de (repository: avoid leaking `fsmonitor` data, 2024-04-12). - - t4153: Passes since 71c7916053 (apply: plug a leak in apply_data, - 2024-04-23). - - - t7006: Passes since at least Git v2.40. I did not care to go back - any further than that. - - t7423: Introduced via b20c10fd9b (t7423: add tests for symlinked submodule directories, 2024-01-28), passes since e8d0608944 (submodule: require the submodule path to contain directories only, @@ t/t2405-worktree-submodule.sh: test_description='Combination of submodules and m base_path=$(pwd -P) - ## t/t4153-am-resume-override-opts.sh ## -@@ - - test_description='git-am command-line options override saved options' - -+TEST_PASSES_SANITIZE_LEAK=true - . ./test-lib.sh - . "$TEST_DIRECTORY"/lib-terminal.sh - - - ## t/t7006-pager.sh ## -@@ - - test_description='Test automatic use of a pager.' - -+TEST_PASSES_SANITIZE_LEAK=true - . ./test-lib.sh - . "$TEST_DIRECTORY"/lib-pager.sh - . "$TEST_DIRECTORY"/lib-terminal.sh - ## t/t7423-submodule-symlinks.sh ## @@ 2: 05fbadbae2 ! 3: a96b5ac359 transport-helper: fix leaking helper name @@ Commit message ownership of the name, nor do we free it. The associated memory thus leaks. - Fix this memory leak and mark now-passing tests as leak free. + Fix this memory leak by freeing the string at the calling side in + `transport_get()`. `transport_helper_init()` now creates its own copy of + the string and thus can free it as required. + + An alterantive way to fix this would be to transfer ownership of the + string passed into `transport_helper_init()`, which would avoid the call + to xstrdup(1). But it does make for a more surprising calling convention + as we do not typically transfer ownership of strings like this. + + Mark now-passing tests as leak free. Signed-off-by: Patrick Steinhardt <ps@pks.im> 3: d76079797f = 4: 9dd8709d1b strbuf: fix leak when `appendwholeline()` fails with EOF 4: ffd9eb795f = 5: 6d4e9ce706 checkout: clarify memory ownership in `unique_tracking_name()` 5: d4bf3c1f95 = 6: 141cae2de1 http: refactor code to clarify memory ownership 6: 1702762772 = 7: ff5e761e55 config: clarify memory ownership in `git_config_pathname()` 7: 18dce492df ! 8: afe69c7303 diff: refactor code to clarify memory ownership of prefixes @@ Metadata ## Commit message ## diff: refactor code to clarify memory ownership of prefixes - The source end destination prefixes are tracked in a `const char *` + The source and destination prefixes are tracked in a `const char *` array, but may at times contain allocated strings. The result is that those strings may be leaking because we never free them. 8: 667eb3f8ff = 9: eb7fce55b0 convert: refactor code to clarify ownership of check_roundtrip_encoding 9: 11eed8cea7 = 10: ee2fcf388c builtin/log: stop using globals for log config 10: d8cd9a05f8 = 11: 3490ad3a02 builtin/log: stop using globals for format config 11: a857637e61 = 12: 6cfc28c7e2 config: clarify memory ownership in `git_config_string()` 12: b2f8878b55 = 13: 70e8e26513 config: plug various memory leaks 13: 23e2cf98b7 = 14: f1a1c43e76 builtin/credential: clear credential before exit 14: a11ce6a0ed = 15: 64b92156f8 commit-reach: fix memory leak in `ahead_behind()` 15: 24362604b2 = 16: cd8a992f08 submodule: fix leaking memory for submodule entries 16: c43c93db3b ! 17: 128e2eaf7a strvec: add functions to replace and remove strings @@ t/unit-tests/t-strvec.c (new) + strvec_clear(&vec); +} + ++static void t_replace_with_substring(void) ++{ ++ struct strvec vec = STRVEC_INIT; ++ strvec_pushl(&vec, "foo", NULL); ++ strvec_replace(&vec, 0, vec.v[0] + 1); ++ check_strvec(&vec, "oo", NULL); ++ strvec_clear(&vec); ++} ++ +static void t_remove_at_head(void) +{ + struct strvec vec = STRVEC_INIT; @@ t/unit-tests/t-strvec.c (new) + TEST(t_replace_at_head(), "replace at head"); + TEST(t_replace_in_between(), "replace in between"); + TEST(t_replace_at_tail(), "replace at tail"); ++ TEST(t_replace_with_substring(), "replace with substring"); + TEST(t_remove_at_head(), "remove at head"); + TEST(t_remove_in_between(), "remove in between"); + TEST(t_remove_at_tail(), "remove at tail"); 17: 97470398ad = 18: 1310b24fc2 builtin/mv: refactor `add_slash()` to always return allocated strings 18: 7a2e5e82cc = 19: d4fef9825a builtin/mv duplicate string list memory 19: b546ca4d62 = 20: 797cdb286a builtin/mv: refactor to use `struct strvec` 20: bba735388d = 21: 095469193c builtin/mv: fix leaks for submodule gitfile paths