Message ID | 20210620151204.19260-12-andrzej@ahunt.org (mailing list archive) |
---|---|
State | New, archived |
Series | Fix all leaks in tests t0002-t0099: Part 2 |
Hi Andrzej

Thanks for working on removing memory leaks from git.

On 20/06/2021 16:12, andrzej@ahunt.org wrote:
> From: Andrzej Hunt <ajrhunt@google.com>
>
> This change:
> - xstrdup()'s all strings being used for replace_opts.strategy, to

I think you mean replay_opts rather than replace_opts.

>   guarantee that replace_opts owns these strings. This is needed because
>   sequencer_remove_state() will free replace_opts.strategy, and it's
>   usually called as part of the usage of replace_opts.
> - Removes xstrdup()'s being used to populate options.strategy in
>   cmd_rebase(), which avoids leaking options.strategy, even in the
>   case where strategy is never moved/copied into replace_opts.
>
> These changes are needed because:
> - We would always create a new string for options.strategy if we either
>   get a strategy via options (OPT_STRING(...strategy...)), or via
>   GIT_TEST_MERGE_ALGORITHM.
> - But only sometimes is this string copied into replace_opts - in which
>   case it did get free()'d in sequencer_remove_state().
> - The rest of the time, the newly allocated string would remain unused,
>   causing a leak. But we can't just add a free because that can result
>   in a double-free in those cases where replace_opts was populated.
>
> An alternative approach would be to set options.strategy to NULL when
> moving the pointer to replace_opts.strategy, combined with always
> free()'ing options.strategy, but that seems like a more
> complicated and wasteful approach.

read_basic_state() contains

	if (file_exists(state_dir_path("strategy", opts))) {
		strbuf_reset(&buf);
		if (!read_oneliner(&buf, state_dir_path("strategy", opts),
				   READ_ONELINER_WARN_MISSING))
			return -1;
		free(opts->strategy);
		opts->strategy = xstrdup(buf.buf);
	}

So we do try to free opts->strategy when reading the state from disc and
we allocate a new string. I suspect that opts->strategy is actually NULL
when this function is called but I haven't checked.
Given that we are allocating a copy above I think maybe your alternative
approach of always freeing opts->strategy would be better.

Best Wishes

Phillip

> This was first seen when running t0021 with LSAN, but t2012 helped catch
> the fact that we can't just free(options.strategy) at the end of
> cmd_rebase (as that can cause a double-free).
>
> LSAN output from t0021:
>
> Direct leak of 4 byte(s) in 1 object(s) allocated from:
>     #0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
>     #1 0xa71eb8 in xstrdup wrapper.c:29:14
>     #2 0x61b1cc in cmd_rebase builtin/rebase.c:1779:22
>     #3 0x4ce83e in run_builtin git.c:475:11
>     #4 0x4ccafe in handle_builtin git.c:729:3
>     #5 0x4cb01c in run_argv git.c:818:4
>     #6 0x4cb01c in cmd_main git.c:949:19
>     #7 0x6b3fad in main common-main.c:52:11
>     #8 0x7f267b512349 in __libc_start_main (/lib64/libc.so.6+0x24349)
>
> SUMMARY: AddressSanitizer: 4 byte(s) leaked in 1 allocation(s).
>
> Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
> ---
>  builtin/rebase.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 12f093121d..9d81db0f3a 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -139,7 +139,7 @@ static struct replay_opts get_replay_opts(const struct rebase_options *opts)
>  	replay.ignore_date = opts->ignore_date;
>  	replay.gpg_sign = xstrdup_or_null(opts->gpg_sign_opt);
>  	if (opts->strategy)
> -		replay.strategy = opts->strategy;
> +		replay.strategy = xstrdup_or_null(opts->strategy);
>  	else if (!replay.strategy && replay.default_strategy) {
>  		replay.strategy = replay.default_strategy;
>  		replay.default_strategy = NULL;
> @@ -1723,7 +1723,6 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  	}
>
>  	if (options.strategy) {
> -		options.strategy = xstrdup(options.strategy);
>  		switch (options.type) {
>  		case REBASE_APPLY:
>  			die(_("--strategy requires --merge or --interactive"));
> @@ -1776,7 +1775,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  	if (options.type == REBASE_MERGE &&
>  	    !options.strategy &&
>  	    getenv("GIT_TEST_MERGE_ALGORITHM"))
> -		options.strategy = xstrdup(getenv("GIT_TEST_MERGE_ALGORITHM"));
> +		options.strategy = getenv("GIT_TEST_MERGE_ALGORITHM");
>
>  	switch (options.type) {
>  	case REBASE_MERGE:
>
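The ownership rule the patch is aiming for can be sketched in isolation: the options struct only borrows its strategy string, while the replay struct always receives its own heap copy and is the only side that frees it. This is a minimal sketch, not git's code. The struct names, `dup_or_null()`, `get_replay_opts_sketch()` and `remove_state_sketch()` are hypothetical stand-ins for `rebase_options`, `replay_opts`, `xstrdup_or_null()`, `get_replay_opts()` and the `free()` in `sequencer_remove_state()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for git's structs; only the field under discussion. */
struct rebase_options_sketch {
	const char *strategy;	/* borrowed: points into argv or the environment */
};

struct replay_opts_sketch {
	char *strategy;		/* owned: released by remove_state_sketch() */
};

/* Stand-in for xstrdup_or_null(): copy a string, passing NULL through. */
static char *dup_or_null(const char *s)
{
	char *copy;

	if (!s)
		return NULL;
	copy = malloc(strlen(s) + 1);
	if (!copy)
		abort();	/* git's xstrdup() also dies on allocation failure */
	strcpy(copy, s);
	return copy;
}

/* Hand-off: the replay side always receives its own heap copy. */
static struct replay_opts_sketch
get_replay_opts_sketch(const struct rebase_options_sketch *opts)
{
	struct replay_opts_sketch replay = { NULL };

	assert(opts != NULL);
	replay.strategy = dup_or_null(opts->strategy);
	return replay;
}

/* Stand-in for the free(opts->strategy) in sequencer_remove_state(). */
static void remove_state_sketch(struct replay_opts_sketch *replay)
{
	assert(replay != NULL);
	free(replay->strategy);
	replay->strategy = NULL;
}
```

Because the borrowed pointer is never passed to `free()`, it does not matter whether the replay struct is ever populated: there is nothing to leak on the options side and nothing to free twice on the replay side.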
On Sun, Jun 20, 2021 at 11:29 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> Hi Andrzej
>
> Thanks for working on removing memory leaks from git.
>
> On 20/06/2021 16:12, andrzej@ahunt.org wrote:
> > From: Andrzej Hunt <ajrhunt@google.com>
> >
> > This change:
> > - xstrdup()'s all strings being used for replace_opts.strategy, to
>
> I think you mean replay_opts rather than replace_opts.
>
> >   guarantee that replace_opts owns these strings. This is needed because
> >   sequencer_remove_state() will free replace_opts.strategy, and it's
> >   usually called as part of the usage of replace_opts.
> > - Removes xstrdup()'s being used to populate options.strategy in
> >   cmd_rebase(), which avoids leaking options.strategy, even in the
> >   case where strategy is never moved/copied into replace_opts.
> >
> > These changes are needed because:
> > - We would always create a new string for options.strategy if we either
> >   get a strategy via options (OPT_STRING(...strategy...)), or via
> >   GIT_TEST_MERGE_ALGORITHM.
> > - But only sometimes is this string copied into replace_opts - in which
> >   case it did get free()'d in sequencer_remove_state().
> > - The rest of the time, the newly allocated string would remain unused,
> >   causing a leak. But we can't just add a free because that can result
> >   in a double-free in those cases where replace_opts was populated.
> >
> > An alternative approach would be to set options.strategy to NULL when
> > moving the pointer to replace_opts.strategy, combined with always
> > free()'ing options.strategy, but that seems like a more
> > complicated and wasteful approach.
>
> read_basic_state() contains
>	if (file_exists(state_dir_path("strategy", opts))) {
>		strbuf_reset(&buf);
>		if (!read_oneliner(&buf, state_dir_path("strategy", opts),
>				   READ_ONELINER_WARN_MISSING))
>			return -1;
>		free(opts->strategy);
>		opts->strategy = xstrdup(buf.buf);
>	}
>
> So we do try to free opts->strategy when reading the state from disc and
> we allocate a new string. I suspect that opts->strategy is actually NULL
> when this function is called but I haven't checked. Given that we are
> allocating a copy above I think maybe your alternative approach of
> always freeing opts->strategy would be better.

Good catches. sequencer_remove_state() in sequencer.c also has a
free(opts->strategy) call.

To make things even more muddy, we have code like
	replay.strategy = replay.default_strategy;
or
	opts->strategy = opts->default_strategy;
which both will probably work really poorly with the calls to
	free(opts->default_strategy);
	free(opts->strategy);
from sequencer_remove_state(). I suspect we've got a few bugs here...
Hi Elijah

On 21/06/2021 22:39, Elijah Newren wrote:
> On Sun, Jun 20, 2021 at 11:29 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>
>> Hi Andrzej
>>
>> Thanks for working on removing memory leaks from git.
>>
>> On 20/06/2021 16:12, andrzej@ahunt.org wrote:
>>> From: Andrzej Hunt <ajrhunt@google.com>
>>>
>>> This change:
>>> - xstrdup()'s all strings being used for replace_opts.strategy, to
>>
>> I think you mean replay_opts rather than replace_opts.
>>
>>>   guarantee that replace_opts owns these strings. This is needed because
>>>   sequencer_remove_state() will free replace_opts.strategy, and it's
>>>   usually called as part of the usage of replace_opts.
>>> - Removes xstrdup()'s being used to populate options.strategy in
>>>   cmd_rebase(), which avoids leaking options.strategy, even in the
>>>   case where strategy is never moved/copied into replace_opts.
>>
>>
>>> These changes are needed because:
>>> - We would always create a new string for options.strategy if we either
>>>   get a strategy via options (OPT_STRING(...strategy...)), or via
>>>   GIT_TEST_MERGE_ALGORITHM.
>>> - But only sometimes is this string copied into replace_opts - in which
>>>   case it did get free()'d in sequencer_remove_state().
>>> - The rest of the time, the newly allocated string would remain unused,
>>>   causing a leak. But we can't just add a free because that can result
>>>   in a double-free in those cases where replace_opts was populated.
>>>
>>> An alternative approach would be to set options.strategy to NULL when
>>> moving the pointer to replace_opts.strategy, combined with always
>>> free()'ing options.strategy, but that seems like a more
>>> complicated and wasteful approach.
>>
>> read_basic_state() contains
>>	if (file_exists(state_dir_path("strategy", opts))) {
>>		strbuf_reset(&buf);
>>		if (!read_oneliner(&buf, state_dir_path("strategy", opts),
>>				   READ_ONELINER_WARN_MISSING))
>>			return -1;
>>		free(opts->strategy);
>>		opts->strategy = xstrdup(buf.buf);
>>	}
>>
>> So we do try to free opts->strategy when reading the state from disc and
>> we allocate a new string. I suspect that opts->strategy is actually NULL
>> when this function is called but I haven't checked. Given that we are
>> allocating a copy above I think maybe your alternative approach of
>> always freeing opts->strategy would be better.
>
> Good catches. sequencer_remove_state() in sequencer.c also has a
> free(opts->strategy) call.
>
> To make things even more muddy, we have code like
>	replay.strategy = replay.default_strategy;
> or
>	opts->strategy = opts->default_strategy;
> which both will probably work really poorly with the calls to
>	free(opts->default_strategy);
>	free(opts->strategy);
> from sequencer_remove_state(). I suspect we've got a few bugs here...

It's not immediately obvious but I think those are actually safe.
opts->default_strategy is allocated by sequencer_init_config() so it is
correct to free it and when we assign it in rebase.c we do

	else if (!replay.strategy && replay.default_strategy) {
		replay.strategy = replay.default_strategy;
		replay.default_strategy = NULL;
	}

so there is no double free. There is similar code in builtin/revert.c
which I think is where your other example came from.
I think there is a leak in builtin/revert.c though

	if (!opts->strategy && opts->default_strategy) {
		opts->strategy = opts->default_strategy;
		opts->default_strategy = NULL;
	}

	/* do some other stuff */

	/* These option values will be free()d */
	opts->gpg_sign = xstrdup_or_null(opts->gpg_sign);
	opts->strategy = xstrdup_or_null(opts->strategy);

So we copy the default strategy, leaking the original copy from
sequencer_init_options() if --strategy isn't given on the command line.
I think it would be simple to fix this by making the copy earlier.

	if (!opts->strategy && opts->default_strategy) {
		opts->strategy = opts->default_strategy;
		opts->default_strategy = NULL;
	} else if (opts->strategy) {
		/* This option will be free()d in sequencer_remove_state() */
		opts->strategy = xstrdup(opts->strategy);
	}

I'm going offline for a week or so in a couple of days but I'll have a
look at making a proper patch when I get back.

Best Wishes

Phillip
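The move-or-copy fix proposed above can be tried out on its own. The following is a minimal sketch under stated assumptions, not the eventual patch: `seq_opts_sketch`, `dup_string()`, `settle_strategy_sketch()` and `release_sketch()` are hypothetical names, with `release_sketch()` standing in for the two `free()` calls in `sequencer_remove_state()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two replay_opts fields being discussed. */
struct seq_opts_sketch {
	char *strategy;		/* on entry: borrowed from the command line, or NULL */
	char *default_strategy;	/* heap-allocated by config parsing, or NULL */
};

/* Stand-in for xstrdup(): copy a string or abort on allocation failure. */
static char *dup_string(const char *s)
{
	char *copy = malloc(strlen(s) + 1);

	if (!copy)
		abort();
	strcpy(copy, s);
	return copy;
}

/* After this call, opts->strategy is either NULL or an owned heap string. */
static void settle_strategy_sketch(struct seq_opts_sketch *opts)
{
	assert(opts != NULL);
	if (!opts->strategy && opts->default_strategy) {
		/* Move: reuse the heap string, leave no second owner behind. */
		opts->strategy = opts->default_strategy;
		opts->default_strategy = NULL;
	} else if (opts->strategy) {
		/* Copy: the command-line string is borrowed, so take our own. */
		opts->strategy = dup_string(opts->strategy);
	}
}

/* Stand-in for the frees in sequencer_remove_state(). */
static void release_sketch(struct seq_opts_sketch *opts)
{
	assert(opts != NULL);
	free(opts->default_strategy);
	free(opts->strategy);
	opts->default_strategy = NULL;
	opts->strategy = NULL;
}
```

In the move branch exactly one pointer refers to the heap string afterwards, so freeing both fields is safe; in the copy branch the default string stays owned by `default_strategy` and is released separately. No path duplicates a string that is already heap-allocated, which is the leak described in the message.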
On 22/06/2021 11:02, Phillip Wood wrote:
> Hi Elijah
>
> On 21/06/2021 22:39, Elijah Newren wrote:
>> On Sun, Jun 20, 2021 at 11:29 AM Phillip Wood
>> <phillip.wood123@gmail.com> wrote:
>>>
>>> Hi Andrzej
>>>
>>> Thanks for working on removing memory leaks from git.
>>>
>>> On 20/06/2021 16:12, andrzej@ahunt.org wrote:
>>>> From: Andrzej Hunt <ajrhunt@google.com>
>>>>
>>>> This change:
>>>> - xstrdup()'s all strings being used for replace_opts.strategy, to
>>>
>>> I think you mean replay_opts rather than replace_opts.
>>>
>>>>   guarantee that replace_opts owns these strings. This is needed because
>>>>   sequencer_remove_state() will free replace_opts.strategy, and it's
>>>>   usually called as part of the usage of replace_opts.
>>>> - Removes xstrdup()'s being used to populate options.strategy in
>>>>   cmd_rebase(), which avoids leaking options.strategy, even in the
>>>>   case where strategy is never moved/copied into replace_opts.
>>>
>>>
>>>> These changes are needed because:
>>>> - We would always create a new string for options.strategy if we either
>>>>   get a strategy via options (OPT_STRING(...strategy...)), or via
>>>>   GIT_TEST_MERGE_ALGORITHM.
>>>> - But only sometimes is this string copied into replace_opts - in which
>>>>   case it did get free()'d in sequencer_remove_state().
>>>> - The rest of the time, the newly allocated string would remain unused,
>>>>   causing a leak. But we can't just add a free because that can result
>>>>   in a double-free in those cases where replace_opts was populated.
>>>>
>>>> An alternative approach would be to set options.strategy to NULL when
>>>> moving the pointer to replace_opts.strategy, combined with always
>>>> free()'ing options.strategy, but that seems like a more
>>>> complicated and wasteful approach.
>>>
>>> read_basic_state() contains
>>>	if (file_exists(state_dir_path("strategy", opts))) {
>>>		strbuf_reset(&buf);
>>>		if (!read_oneliner(&buf, state_dir_path("strategy", opts),
>>>				   READ_ONELINER_WARN_MISSING))
>>>			return -1;
>>>		free(opts->strategy);
>>>		opts->strategy = xstrdup(buf.buf);
>>>	}
>>>
>>> So we do try to free opts->strategy when reading the state from disc and
>>> we allocate a new string. I suspect that opts->strategy is actually NULL
>>> when this function is called but I haven't checked.

Thank you for noticing this. I think you're right - running an ASAN
build past the whole test suite also didn't catch any double-frees,
which mostly confirms that opts->strategy is indeed always NULL here.
But that's not a good reason for taking the risk.

>>> Given that we are
>>> allocating a copy above I think maybe your alternative approach of
>>> always freeing opts->strategy would be better.

I will go down this route for V2. Although on further thought: instead
of my original idea of moving the string to replay_opts (and NULL'ing
out rebase_options->strategy), I think it's better to create a new copy
when populating replay_opts. The move/NULL approach I suggested in V1
happens to work OK, but I think it's non-obvious and could break if we
ever wanted to use get_replay_opts() more than once - creating separate
copies reduces the number of surprises.

>>
>> Good catches. sequencer_remove_state() in sequencer.c also has a
>> free(opts->strategy) call.
>>
>> To make things even more muddy, we have code like
>>	replay.strategy = replay.default_strategy;
>> or
>>	opts->strategy = opts->default_strategy;
>> which both will probably work really poorly with the calls to
>>	free(opts->default_strategy);
>>	free(opts->strategy);
>> from sequencer_remove_state(). I suspect we've got a few bugs here...
>
> It's not immediately obvious but I think those are actually safe.
> opts->default_strategy is allocated by sequencer_init_config() so it is
> correct to free it and when we assign it in rebase.c we do
>
>	else if (!replay.strategy && replay.default_strategy) {
>		replay.strategy = replay.default_strategy;
>		replay.default_strategy = NULL;
>	}
>
> so there is no double free.

As mentioned above, ASAN isn't catching any double-frees here (but I
guess that depends on whether or not you trust the test suite to be
reasonably testing all permutations). But it's still good to take note
of sequencer_remove_state() free'ing opts->strategy, because I almost
did manage to add a double free when I added a free(options.strategy)
to cmd_rebase without also xstrdup'ing strategy in get_replay_opts().

> There is similar code in builtin/revert.c
> which I think is where your other example came from. I think there is a
> leak in builtin/revert.c though
>
>	if (!opts->strategy && opts->default_strategy) {
>		opts->strategy = opts->default_strategy;
>		opts->default_strategy = NULL;
>	}
>
>	/* do some other stuff */
>
>	/* These option values will be free()d */
>	opts->gpg_sign = xstrdup_or_null(opts->gpg_sign);
>	opts->strategy = xstrdup_or_null(opts->strategy);
>
> So we copy the default strategy, leaking the original copy from
> sequencer_init_options() if --strategy isn't given on the command line.
> I think it would be simple to fix this by making the copy earlier.
>
>	if (!opts->strategy && opts->default_strategy) {
>		opts->strategy = opts->default_strategy;
>		opts->default_strategy = NULL;
>	} else if (opts->strategy) {
>		/* This option will be free()d in sequencer_remove_state() */
>		opts->strategy = xstrdup(opts->strategy);
>	}
>

Nice find. I'm noticing a lot of interesting leaks in git's options
handling, and those leaks also tend to be the trickiest ones to fix (as
my blunder in the original version of this patch demonstrates :) ).

ATB,

Andrzej
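The concern about calling get_replay_opts() more than once can be illustrated by putting the two hand-off styles side by side. This is only a sketch under assumptions: the struct names and the helpers `take_by_move()` and `take_by_copy()` are hypothetical, and the move variant takes a non-const pointer because it has to clear the source field.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; in both structs the string is heap-allocated. */
struct rebase_side_sketch {
	char *strategy;
};

struct replay_side_sketch {
	char *strategy;
};

/* Stand-in for xstrdup_or_null(): copy a string, passing NULL through. */
static char *dup_or_null(const char *s)
{
	char *copy;

	if (!s)
		return NULL;
	copy = malloc(strlen(s) + 1);
	if (!copy)
		abort();
	strcpy(copy, s);
	return copy;
}

/* Move/NULL variant: hands over the pointer and clears the source. */
static struct replay_side_sketch take_by_move(struct rebase_side_sketch *opts)
{
	struct replay_side_sketch replay = { NULL };

	assert(opts != NULL);
	replay.strategy = opts->strategy;
	opts->strategy = NULL;
	return replay;
}

/* Copy variant: the source is left untouched, every caller gets a copy. */
static struct replay_side_sketch
take_by_copy(const struct rebase_side_sketch *opts)
{
	struct replay_side_sketch replay = { NULL };

	assert(opts != NULL);
	replay.strategy = dup_or_null(opts->strategy);
	return replay;
}
```

With `take_by_move()` a second call silently yields a NULL strategy, because the first call consumed the string. With `take_by_copy()` each call returns an independent copy, at the cost of one extra allocation per call and of the source side having to free its own string.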
Hi Andrzej

On 25/07/2021 14:03, Andrzej Hunt wrote:
> [...]
>>>> Given that we are
>>>> allocating a copy above I think maybe your alternative approach of
>>>> always freeing opts->strategy would be better.
>
> I will go down this route for V2. Although on further thought: instead
> of my original idea of moving the string to replay_opts (and NULL'ing
> out rebase_options->strategy), I think it's better to create a new copy
> when populating replay_opts. The move/NULL approach I suggested in V1
> happens to work OK, but I think it's non-obvious and could break if we
> ever wanted to use get_replay_opts() more than once - creating separate
> copies reduces the number of surprises.

Copying the string sounds like a good approach. I've looked at the V2
patch and it looks fine to me.

Thanks

Phillip
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 12f093121d..9d81db0f3a 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -139,7 +139,7 @@ static struct replay_opts get_replay_opts(const struct rebase_options *opts)
 	replay.ignore_date = opts->ignore_date;
 	replay.gpg_sign = xstrdup_or_null(opts->gpg_sign_opt);
 	if (opts->strategy)
-		replay.strategy = opts->strategy;
+		replay.strategy = xstrdup_or_null(opts->strategy);
 	else if (!replay.strategy && replay.default_strategy) {
 		replay.strategy = replay.default_strategy;
 		replay.default_strategy = NULL;
@@ -1723,7 +1723,6 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 	}
 
 	if (options.strategy) {
-		options.strategy = xstrdup(options.strategy);
 		switch (options.type) {
 		case REBASE_APPLY:
 			die(_("--strategy requires --merge or --interactive"));
@@ -1776,7 +1775,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 	if (options.type == REBASE_MERGE &&
 	    !options.strategy &&
 	    getenv("GIT_TEST_MERGE_ALGORITHM"))
-		options.strategy = xstrdup(getenv("GIT_TEST_MERGE_ALGORITHM"));
+		options.strategy = getenv("GIT_TEST_MERGE_ALGORITHM");
 
 	switch (options.type) {
 	case REBASE_MERGE: