Message ID | 70e8e2651306e9d221e5e472720a7610947580a7.1716541556.git.ps@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Various memory leak fixes | expand |
On Fri, May 24, 2024 at 12:04:12PM +0200, Patrick Steinhardt wrote: > Now that memory ownership rules around `git_config_string()` and > `git_config_pathname()` are clearer, it also got easier to spot that > the returned memory needs to be free'd. Plug a subset of those cases and > mark now-passing tests as leak free. Junio, this mail now uses UTF-8 as encoding. Does that fix the issues for you? Patrick
On Fri, May 24, 2024 at 12:04:12PM +0200, Patrick Steinhardt wrote: > diff --git a/alias.c b/alias.c > index 269892c356..4daafd9bda 100644 > --- a/alias.c > +++ b/alias.c > @@ -21,9 +21,11 @@ static int config_alias_cb(const char *key, const char *value, > return 0; > > if (data->alias) { > - if (!strcasecmp(p, data->alias)) > + if (!strcasecmp(p, data->alias)) { > + FREE_AND_NULL(data->v); > return git_config_string(&data->v, > key, value); > + } > } else if (data->list) { > string_list_append(data->list, p); > } IMHO this should be done automatically by git_config_string(). The current design is an accident waiting to happen, and in the long run every call is going to need this FREE_AND_NULL(). By doing it in the function the calling code is shorter, and there's no way we'll forget. I posted a series along those lines a month or so ago: https://lore.kernel.org/git/20240407005656.GA436890@coredump.intra.peff.net/ The catch is that you can't do this: const char *foo = "bar"; git_config_string(&foo, ...); So I introduced a new function that took a non-const pointer with the new behavior, with the idea that we'd eventually migrate everything over. It looks like you may have already done that migration earlier in your series, since the move to "char *" in the previous patch was OK. Though as a side note, sadly: char *foo = "bar"; does not produce an error or even a warning without -Wwrite-strings. I think in the long run we should enable that, but there's a little cleanup required to do so. The main reason I didn't follow up on that earlier series is that there was some discussion about maybe moving this stuff over to strbufs (after teaching it to handle literal initializers). But if you've managed to remove all of the cases that needed that, I think just sticking with "char *" is fine. The other issue raised in that thread is that many of these config variables are also passed to parse-options, which treats them as const strings (and we get no compiler support because it goes through a void pointer). So they may leak if we overwrite them, or in the unusual case that we load config after parsing options, we may try to free a non-heap string. The one we discussed was log's signature_file, and it looks like you split that to use two variables, which works. Junio suggested an OPT_FILENAME_DUP() option, which I'm also OK with. The main challenge to me is being sure we found all such spots (and not accidentally introducing new ones). But I don't have a good solution there. > @@ -1566,7 +1569,7 @@ static int git_default_core_config(const char *var, const char *value, > > if (!strcmp(var, "core.checkroundtripencoding")) { > FREE_AND_NULL(check_roundtrip_encoding); > - return git_config_string((const char **) &check_roundtrip_encoding, var, value); > + return git_config_string(&check_roundtrip_encoding, var, value); > } This should have lost its cast in the previous commit, no? Applying up to patch 12 and building with DEVELOPER=1 gets a warning. -Peff
On Sat, May 25, 2024 at 12:33:47AM -0400, Jeff King wrote: > On Fri, May 24, 2024 at 12:04:12PM +0200, Patrick Steinhardt wrote: > > > diff --git a/alias.c b/alias.c > > index 269892c356..4daafd9bda 100644 > > --- a/alias.c > > +++ b/alias.c > > @@ -21,9 +21,11 @@ static int config_alias_cb(const char *key, const char *value, > > return 0; > > > > if (data->alias) { > > - if (!strcasecmp(p, data->alias)) > > + if (!strcasecmp(p, data->alias)) { > > + FREE_AND_NULL(data->v); > > return git_config_string(&data->v, > > key, value); > > + } > > } else if (data->list) { > > string_list_append(data->list, p); > > } > > IMHO this should be done automatically by git_config_string(). The > current design is an accident waiting to happen, and in the long run > every call is going to need this FREE_AND_NULL(). By doing it in the > function the calling code is shorter, and there's no way we'll forget. In fact, I had this in my first iteration. But I didn't feel comfortable with it exactly due to the reasons you mention below, where often times we assign string constants as default values. This requires us to be extremely careful, also because we do not yet have `-Wwrite-strings` enabled as you mention. > I posted a series along those lines a month or so ago: > > https://lore.kernel.org/git/20240407005656.GA436890@coredump.intra.peff.net/ > > The catch is that you can't do this: > > const char *foo = "bar"; > git_config_string(&foo, ...); > > So I introduced a new function that took a non-const pointer with the > new behavior, with the idea that we'd eventually migrate everything > over. It looks like you may have already done that migration earlier in > your series, since the move to "char *" in the previous patch was OK. > > Though as a side note, sadly: > > char *foo = "bar"; > > does not produce an error or even a warning without -Wwrite-strings. I > think in the long run we should enable that, but there's a little > cleanup required to do so. Indeed, I had the exact same observation. I've already got a patch series that enables `-Wwrite-strings` and that adapts our codebase to compile cleanly with it. I'll send that series once this one here has landed. So my proposal would be to leave this patch as-is for the time being, but revisit it once both patch series have landed. Does that work for you? > The main reason I didn't follow up on that earlier series is that > there was some discussion about maybe moving this stuff over to > strbufs (after teaching it to handle literal initializers). But if > you've managed to remove all of the cases that needed that, I think > just sticking with "char *" is fine. I don't think I managed to hit every callsite yet that leaks memory. But I think it shouldn't be too bad, and especiall if we follow up this patch series with `FREE_AND_NULL()` on the out-parameter then this should be fine. > The other issue raised in that thread is that many of these config > variables are also passed to parse-options, which treats them as const > strings (and we get no compiler support because it goes through a void > pointer). So they may leak if we overwrite them, or in the unusual > case that we load config after parsing options, we may try to free a > non-heap string. The one we discussed was log's signature_file, and it > looks like you split that to use two variables, which works. Junio > suggested an OPT_FILENAME_DUP() option, which I'm also OK with. The > main challenge to me is being sure we found all such spots (and not > accidentally introducing new ones). But I don't have a good solution > there. Yup, I remember having some issues with `OPT_FILENAME()`, but to the best of my knowledge I've fixed all of them. > > @@ -1566,7 +1569,7 @@ static int git_default_core_config(const char > > *var, const char *value, > > > > if (!strcmp(var, "core.checkroundtripencoding")) { > > FREE_AND_NULL(check_roundtrip_encoding); - return > > git_config_string((const char **) &check_roundtrip_encoding, > > var, value); + return > > git_config_string(&check_roundtrip_encoding, var, value); } > > This should have lost its cast in the previous commit, no? Applying up > to patch 12 and building with DEVELOPER=1 gets a warning. Ah, good catch. Will fix. Patrick
On Mon, May 27, 2024 at 08:46:02AM +0200, Patrick Steinhardt wrote: > Indeed, I had the exact same observation. I've already got a patch > series that enables `-Wwrite-strings` and that adapts our codebase to > compile cleanly with it. I'll send that series once this one here has > landed. > > So my proposal would be to leave this patch as-is for the time being, > but revisit it once both patch series have landed. Does that work for > you? Yeah, I can certainly live with that. One of my biggest concerns was that your patch would invalidate the investigation and work I had already done, and I'd fail to follow up. But if you are putting it onto your todo list, then that is even less work for me. ;) -Peff
diff --git a/alias.c b/alias.c index 269892c356..4daafd9bda 100644 --- a/alias.c +++ b/alias.c @@ -21,9 +21,11 @@ static int config_alias_cb(const char *key, const char *value, return 0; if (data->alias) { - if (!strcasecmp(p, data->alias)) + if (!strcasecmp(p, data->alias)) { + FREE_AND_NULL(data->v); return git_config_string(&data->v, key, value); + } } else if (data->list) { string_list_append(data->list, p); } diff --git a/config.c b/config.c index a025cfafe0..496cd1a61e 100644 --- a/config.c +++ b/config.c @@ -1414,8 +1414,10 @@ static int git_default_core_config(const char *var, const char *value, return 0; } - if (!strcmp(var, "core.attributesfile")) + if (!strcmp(var, "core.attributesfile")) { + FREE_AND_NULL(git_attributes_file); return git_config_pathname(&git_attributes_file, var, value); + } if (!strcmp(var, "core.hookspath")) { if (ctx->kvi && ctx->kvi->scope == CONFIG_SCOPE_LOCAL && @@ -1428,6 +1430,7 @@ static int git_default_core_config(const char *var, const char *value, "again with " "`GIT_CLONE_PROTECTION_ACTIVE=false`"), value); + FREE_AND_NULL(git_hooks_path); return git_config_pathname(&git_hooks_path, var, value); } @@ -1566,7 +1569,7 @@ static int git_default_core_config(const char *var, const char *value, if (!strcmp(var, "core.checkroundtripencoding")) { FREE_AND_NULL(check_roundtrip_encoding); - return git_config_string((const char **) &check_roundtrip_encoding, var, value); + return git_config_string(&check_roundtrip_encoding, var, value); } if (!strcmp(var, "core.notesref")) { @@ -1576,8 +1579,10 @@ static int git_default_core_config(const char *var, const char *value, return 0; } - if (!strcmp(var, "core.editor")) + if (!strcmp(var, "core.editor")) { + FREE_AND_NULL(editor_program); return git_config_string(&editor_program, var, value); + } if (!strcmp(var, "core.commentchar") || !strcmp(var, "core.commentstring")) { @@ -1595,11 +1600,13 @@ static int git_default_core_config(const char *var, const char *value, return 0; } - if (!strcmp(var, "core.askpass")) + if (!strcmp(var, "core.askpass")) { + FREE_AND_NULL(askpass_program); return git_config_string(&askpass_program, var, value); + } if (!strcmp(var, "core.excludesfile")) { - free(excludes_file); + FREE_AND_NULL(excludes_file); return git_config_pathname(&excludes_file, var, value); } @@ -1702,11 +1709,15 @@ static int git_default_sparse_config(const char *var, const char *value) static int git_default_i18n_config(const char *var, const char *value) { - if (!strcmp(var, "i18n.commitencoding")) + if (!strcmp(var, "i18n.commitencoding")) { + FREE_AND_NULL(git_commit_encoding); return git_config_string(&git_commit_encoding, var, value); + } - if (!strcmp(var, "i18n.logoutputencoding")) + if (!strcmp(var, "i18n.logoutputencoding")) { + FREE_AND_NULL(git_log_output_encoding); return git_config_string(&git_log_output_encoding, var, value); + } /* Add other config variables here and to Documentation/config.txt. */ return 0; @@ -1779,10 +1790,15 @@ static int git_default_push_config(const char *var, const char *value) static int git_default_mailmap_config(const char *var, const char *value) { - if (!strcmp(var, "mailmap.file")) + if (!strcmp(var, "mailmap.file")) { + FREE_AND_NULL(git_mailmap_file); return git_config_pathname(&git_mailmap_file, var, value); - if (!strcmp(var, "mailmap.blob")) + } + + if (!strcmp(var, "mailmap.blob")) { + FREE_AND_NULL(git_mailmap_blob); return git_config_string(&git_mailmap_blob, var, value); + } /* Add other config variables here and to Documentation/config.txt. */ return 0; @@ -1790,8 +1806,10 @@ static int git_default_mailmap_config(const char *var, const char *value) static int git_default_attr_config(const char *var, const char *value) { - if (!strcmp(var, "attr.tree")) + if (!strcmp(var, "attr.tree")) { + FREE_AND_NULL(git_attr_tree); return git_config_string(&git_attr_tree, var, value); + } /* * Add other attribute related config variables here and to diff --git a/t/t1306-xdg-files.sh b/t/t1306-xdg-files.sh index 40d3c42618..53e5b290b9 100755 --- a/t/t1306-xdg-files.sh +++ b/t/t1306-xdg-files.sh @@ -7,6 +7,7 @@ test_description='Compatibility with $XDG_CONFIG_HOME/git/ files' +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh test_expect_success 'read config: xdg file exists and ~/.gitconfig doesn'\''t' ' diff --git a/t/t1350-config-hooks-path.sh b/t/t1350-config-hooks-path.sh index f6dc83e2aa..5c47f9ecc3 100755 --- a/t/t1350-config-hooks-path.sh +++ b/t/t1350-config-hooks-path.sh @@ -2,6 +2,7 @@ test_description='Test the core.hooksPath configuration variable' +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh test_expect_success 'set up a pre-commit hook in core.hooksPath' ' diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh index fcc40d6fe1..22452ff84c 100755 --- a/t/t3415-rebase-autosquash.sh +++ b/t/t3415-rebase-autosquash.sh @@ -5,6 +5,7 @@ test_description='auto squash' GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh . "$TEST_DIRECTORY"/lib-rebase.sh diff --git a/t/t4041-diff-submodule-option.sh b/t/t4041-diff-submodule-option.sh index 0c1502d4b0..8fc40e75eb 100755 --- a/t/t4041-diff-submodule-option.sh +++ b/t/t4041-diff-submodule-option.sh @@ -12,6 +12,7 @@ This test tries to verify the sanity of the --submodule option of git diff. GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh # Tested non-UTF-8 encoding diff --git a/t/t4060-diff-submodule-option-diff-format.sh b/t/t4060-diff-submodule-option-diff-format.sh index 97c6424cd5..8ce67442d9 100755 --- a/t/t4060-diff-submodule-option-diff-format.sh +++ b/t/t4060-diff-submodule-option-diff-format.sh @@ -10,6 +10,7 @@ test_description='Support for diff format verbose submodule difference in git di This test tries to verify the sanity of --submodule=diff option of git diff. ' +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh # Tested non-UTF-8 encoding diff --git a/t/t4210-log-i18n.sh b/t/t4210-log-i18n.sh index 75216f19ce..7120030b5c 100755 --- a/t/t4210-log-i18n.sh +++ b/t/t4210-log-i18n.sh @@ -1,6 +1,8 @@ #!/bin/sh test_description='test log with i18n features' + +TEST_PASSES_SANITIZE_LEAK=true . ./lib-gettext.sh # two forms of é diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh index 573eb97a0f..f1623b1c06 100755 --- a/t/t6006-rev-list-format.sh +++ b/t/t6006-rev-list-format.sh @@ -8,6 +8,7 @@ test_description='git rev-list --pretty=format test' GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh . "$TEST_DIRECTORY"/lib-terminal.sh diff --git a/t/t7005-editor.sh b/t/t7005-editor.sh index 5fcf281dfb..b9822294fe 100755 --- a/t/t7005-editor.sh +++ b/t/t7005-editor.sh @@ -2,6 +2,7 @@ test_description='GIT_EDITOR, core.editor, and stuff' +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh unset EDITOR VISUAL GIT_EDITOR diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh index 62d9f846ce..2add26d768 100755 --- a/t/t7102-reset.sh +++ b/t/t7102-reset.sh @@ -10,6 +10,7 @@ Documented tests for git reset' GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh commit_msg () { diff --git a/t/t9129-git-svn-i18n-commitencoding.sh b/t/t9129-git-svn-i18n-commitencoding.sh index 185248a4cd..01e1e8a8f7 100755 --- a/t/t9129-git-svn-i18n-commitencoding.sh +++ b/t/t9129-git-svn-i18n-commitencoding.sh @@ -4,7 +4,6 @@ test_description='git svn honors i18n.commitEncoding in config' -TEST_FAILS_SANITIZE_LEAK=true . ./lib-git-svn.sh compare_git_head_with () { diff --git a/t/t9139-git-svn-non-utf8-commitencoding.sh b/t/t9139-git-svn-non-utf8-commitencoding.sh index b7f756b2b7..22d80b0be2 100755 --- a/t/t9139-git-svn-non-utf8-commitencoding.sh +++ b/t/t9139-git-svn-non-utf8-commitencoding.sh @@ -4,7 +4,6 @@ test_description='git svn refuses to dcommit non-UTF8 messages' -TEST_FAILS_SANITIZE_LEAK=true . ./lib-git-svn.sh # ISO-2022-JP can pass for valid UTF-8, so skipping that in this test
Now that memory ownership rules around `git_config_string()` and `git_config_pathname()` are clearer, it also got easier to spot that the returned memory needs to be free'd. Plug a subset of those cases and mark now-passing tests as leak free. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- alias.c | 4 ++- config.c | 38 ++++++++++++++------ t/t1306-xdg-files.sh | 1 + t/t1350-config-hooks-path.sh | 1 + t/t3415-rebase-autosquash.sh | 1 + t/t4041-diff-submodule-option.sh | 1 + t/t4060-diff-submodule-option-diff-format.sh | 1 + t/t4210-log-i18n.sh | 2 ++ t/t6006-rev-list-format.sh | 1 + t/t7005-editor.sh | 1 + t/t7102-reset.sh | 1 + t/t9129-git-svn-i18n-commitencoding.sh | 1 - t/t9139-git-svn-non-utf8-commitencoding.sh | 1 - 13 files changed, 41 insertions(+), 13 deletions(-)