diff mbox series

[3/5] mv: move src_dir cleanup to end of cmd_mv()

Message ID 20240530064422.GC1949704@coredump.intra.peff.net (mailing list archive)
State Accepted
Commit cc65e085e413c7b837e46169ff543c6c8bfcae1f
Headers show
Series add-ons for ps/leakfixes | expand

Commit Message

Jeff King May 30, 2024, 6:44 a.m. UTC
Commit b6f51e3db9 (mv: cleanup empty WORKING_DIRECTORY, 2022-08-09)
added an auxiliary array where we store directory arguments that we see
while processing the incoming arguments. After actually moving things,
we then use that array to remove now-empty directories, and then
immediately free the array.

But if the actual move queues any errors in only_match_skip_worktree,
that can cause us to jump straight to the "out" label to clean up,
skipping the free() and leaking the array.

Let's push the free() down past the "out" label so that we always clean
up (the array is initialized to NULL, so this is always safe). We'll
hold on to the memory a little longer than necessary, but clarity is
more important than micro-optimizing here.

Note that the adjacent "a_src_dir" strbuf does not suffer the same
problem; it is only allocated during the removal step.

Signed-off-by: Jeff King <peff@peff.net>
---
Reported as "new" by Coverity, but I think that is only because of the
"goto out". Before your recent patches it was a straight "return", which
was even worse. ;)

 builtin/mv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Patrick Steinhardt May 30, 2024, 7:04 a.m. UTC | #1
On Thu, May 30, 2024 at 02:44:22AM -0400, Jeff King wrote:
> Commit b6f51e3db9 (mv: cleanup empty WORKING_DIRECTORY, 2022-08-09)
> added an auxiliary array where we store directory arguments that we see
> while processing the incoming arguments. After actually moving things,
> we then use that array to remove now-empty directories, and then
> immediately free the array.
> 
> But if the actual move queues any errors in only_match_skip_worktree,
> that can cause us to jump straight to the "out" label to clean up,
> skipping the free() and leaking the array.
> 
> Let's push the free() down past the "out" label so that we always clean
> up (the array is initialized to NULL, so this is always safe). We'll
> hold on to the memory a little longer than necessary, but clarity is
> more important than micro-optimizing here.
> 
> Note that the adjacent "a_src_dir" strbuf does not suffer the same
> problem; it is only allocated during the removal step.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> Reported as "new" by Coverity, but I think that is only because of the
> "goto out". Before your recent patches it was a straight "return", which
> was even worse. ;)

Ouf of curiosity, did you check whether this makes any additional tests
pass with SANITIZE=leak?

The fix itself looks good to me, thanks.

Patrick
Jeff King May 30, 2024, 7:21 a.m. UTC | #2
On Thu, May 30, 2024 at 09:04:28AM +0200, Patrick Steinhardt wrote:

> > But if the actual move queues any errors in only_match_skip_worktree,
> > that can cause us to jump straight to the "out" label to clean up,
> > skipping the free() and leaking the array.
> > 
> > Let's push the free() down past the "out" label so that we always clean
> > up (the array is initialized to NULL, so this is always safe). We'll
> > hold on to the memory a little longer than necessary, but clarity is
> > more important than micro-optimizing here.
> [...]
> 
> Ouf of curiosity, did you check whether this makes any additional tests
> pass with SANITIZE=leak?

No, I didn't. I think you can only trigger it with a sparse index, at
which point it seemed like diminishing returns to try to reproduce.

But running in "check" mode is not too hard...

...time passes...

Looks like no. The obvious candidate would be t7002-mv-sparse-checkout,
but it looks like the sparse-checkout code has minor leaks itself.

-Peff
Patrick Steinhardt May 30, 2024, 7:24 a.m. UTC | #3
On Thu, May 30, 2024 at 03:21:59AM -0400, Jeff King wrote:
> On Thu, May 30, 2024 at 09:04:28AM +0200, Patrick Steinhardt wrote:
> 
> > > But if the actual move queues any errors in only_match_skip_worktree,
> > > that can cause us to jump straight to the "out" label to clean up,
> > > skipping the free() and leaking the array.
> > > 
> > > Let's push the free() down past the "out" label so that we always clean
> > > up (the array is initialized to NULL, so this is always safe). We'll
> > > hold on to the memory a little longer than necessary, but clarity is
> > > more important than micro-optimizing here.
> > [...]
> > 
> > Ouf of curiosity, did you check whether this makes any additional tests
> > pass with SANITIZE=leak?
> 
> No, I didn't. I think you can only trigger it with a sparse index, at
> which point it seemed like diminishing returns to try to reproduce.
> 
> But running in "check" mode is not too hard...
> 
> ...time passes...
> 
> Looks like no. The obvious candidate would be t7002-mv-sparse-checkout,
> but it looks like the sparse-checkout code has minor leaks itself.

Okay, thanks for double checking! I was mostly asking because I plan to
send another leak fixes series to the mailing list later this week.

Patrick
Jeff King May 30, 2024, 8:15 a.m. UTC | #4
On Thu, May 30, 2024 at 09:24:34AM +0200, Patrick Steinhardt wrote:

> > Looks like no. The obvious candidate would be t7002-mv-sparse-checkout,
> > but it looks like the sparse-checkout code has minor leaks itself.
> 
> Okay, thanks for double checking! I was mostly asking because I plan to
> send another leak fixes series to the mailing list later this week.

OK, good news. t7002 _does_ trigger the leak fixed in my patch. You just
can't tell because of all of the sparse-checkout leaks. ;)

The (messy) patch below gets it to a leak-free state when applied on
top. Do you want me to do another mini-series with it, or do you want to
just roll it into what you're doing (I won't be surprised if you've
already found some of these).

-Peff

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 0f52e25249..1ed9dfa886 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -311,6 +311,8 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 		fprintf(fp, "%s/\n", pattern);
 		free(pattern);
 	}
+
+	string_list_clear(&sl, 0);
 }
 
 static int write_patterns_and_update(struct pattern_list *pl)
@@ -471,6 +473,7 @@ static int sparse_checkout_init(int argc, const char **argv, const char *prefix)
 	/* If we already have a sparse-checkout file, use it. */
 	if (res >= 0) {
 		free(sparse_filename);
+		clear_pattern_list(&pl);
 		return update_working_directory(NULL);
 	}
 
@@ -486,6 +489,7 @@ static int sparse_checkout_init(int argc, const char **argv, const char *prefix)
 			die(_("failed to open '%s'"), sparse_filename);
 
 		free(sparse_filename);
+		clear_pattern_list(&pl);
 		fprintf(fp, "/*\n!/*/\n");
 		fclose(fp);
 		return 0;
@@ -525,6 +529,10 @@ static void insert_recursive_pattern(struct pattern_list *pl, struct strbuf *pat
 
 		if (!hashmap_get_entry(&pl->parent_hashmap, e, ent, NULL))
 			hashmap_add(&pl->parent_hashmap, &e->ent);
+		else {
+			free(e->pattern);
+			free(e);
+		}
 	}
 }
 
@@ -891,7 +899,6 @@ static int sparse_checkout_disable(int argc, const char **argv,
 		OPT_END(),
 	};
 	struct pattern_list pl;
-	struct strbuf match_all = STRBUF_INIT;
 
 	/*
 	 * We do not exit early if !core_apply_sparse_checkout; due to the
@@ -917,8 +924,7 @@ static int sparse_checkout_disable(int argc, const char **argv,
 	pl.use_cone_patterns = 0;
 	core_apply_sparse_checkout = 1;
 
-	strbuf_addstr(&match_all, "/*");
-	add_pattern(strbuf_detach(&match_all, NULL), empty_base, 0, &pl, 0);
+	add_pattern("/*", empty_base, 0, &pl, 0);
 
 	prepare_repo_settings(the_repository);
 	the_repository->settings.sparse_index = 0;
diff --git a/dir.c b/dir.c
index 2d83f3311a..5769c4e693 100644
--- a/dir.c
+++ b/dir.c
@@ -799,6 +799,8 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 
 	if (given->patternlen > 2 &&
 	    !strcmp(given->pattern + given->patternlen - 2, "/*")) {
+		struct pattern_entry *old;
+
 		if (!(given->flags & PATTERN_FLAG_NEGATIVE)) {
 			/* Not a cone pattern. */
 			warning(_("unrecognized pattern: '%s'"), given->pattern);
@@ -824,7 +826,11 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 		}
 
 		hashmap_add(&pl->parent_hashmap, &translated->ent);
-		hashmap_remove(&pl->recursive_hashmap, &translated->ent, &data);
+		old = hashmap_remove_entry(&pl->recursive_hashmap, translated, ent, &data);
+		if (old) {
+			free(old->pattern);
+			free(old);
+		}
 		free(data);
 		return;
 	}
@@ -855,6 +861,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 
 clear_hashmaps:
 	warning(_("disabling cone pattern matching"));
+	/* should free ent->pattern too; refactor clear_pattern_list? */
 	hashmap_clear_and_free(&pl->parent_hashmap, struct pattern_entry, ent);
 	hashmap_clear_and_free(&pl->recursive_hashmap, struct pattern_entry, ent);
 	pl->use_cone_patterns = 0;
@@ -950,13 +957,20 @@ static int read_skip_worktree_file_from_index(struct index_state *istate,
  */
 void clear_pattern_list(struct pattern_list *pl)
 {
+	struct hashmap_iter iter;
+	struct pattern_entry *entry;
 	int i;
 
 	for (i = 0; i < pl->nr; i++)
 		free(pl->patterns[i]);
 	free(pl->patterns);
 	free(pl->filebuf);
+
+	hashmap_for_each_entry(&pl->recursive_hashmap, &iter, entry, ent)
+		free(entry->pattern);
 	hashmap_clear_and_free(&pl->recursive_hashmap, struct pattern_entry, ent);
+	hashmap_for_each_entry(&pl->parent_hashmap, &iter, entry, ent)
+		free(entry->pattern);
 	hashmap_clear_and_free(&pl->parent_hashmap, struct pattern_entry, ent);
 
 	memset(pl, 0, sizeof(*pl));
Patrick Steinhardt May 30, 2024, 8:19 a.m. UTC | #5
On Thu, May 30, 2024 at 04:15:12AM -0400, Jeff King wrote:
> On Thu, May 30, 2024 at 09:24:34AM +0200, Patrick Steinhardt wrote:
> 
> > > Looks like no. The obvious candidate would be t7002-mv-sparse-checkout,
> > > but it looks like the sparse-checkout code has minor leaks itself.
> > 
> > Okay, thanks for double checking! I was mostly asking because I plan to
> > send another leak fixes series to the mailing list later this week.
> 
> OK, good news. t7002 _does_ trigger the leak fixed in my patch. You just
> can't tell because of all of the sparse-checkout leaks. ;)
> 
> The (messy) patch below gets it to a leak-free state when applied on
> top. Do you want me to do another mini-series with it, or do you want to
> just roll it into what you're doing (I won't be surprised if you've
> already found some of these).

Well, with all the leaks that we have in our codebase it's not all that
likely that we actually work on the same ones :) For the record, my
patch series is 29 patches long by now and makes another 76 test suites
pass. But none of those fixes touch any of the code you touched.

So given that my patch series is already way too long, and given that
there shouldn't be any conflicts, I wouldn't mind if this was spun out
into a separate series.

Patrick
Jeff King May 30, 2024, 8:28 a.m. UTC | #6
On Thu, May 30, 2024 at 10:19:41AM +0200, Patrick Steinhardt wrote:

> > The (messy) patch below gets it to a leak-free state when applied on
> > top. Do you want me to do another mini-series with it, or do you want to
> > just roll it into what you're doing (I won't be surprised if you've
> > already found some of these).
> 
> Well, with all the leaks that we have in our codebase it's not all that
> likely that we actually work on the same ones :) For the record, my
> patch series is 29 patches long by now and makes another 76 test suites
> pass. But none of those fixes touch any of the code you touched.
> 
> So given that my patch series is already way too long, and given that
> there shouldn't be any conflicts, I wouldn't mind if this was spun out
> into a separate series.

Sounds good. I just wanted to make sure we didn't end up duplicating or
conflicting.

-Peff
diff mbox series

Patch

diff --git a/builtin/mv.c b/builtin/mv.c
index 81ca910de6..852b4e92c1 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -556,7 +556,6 @@  int cmd_mv(int argc, const char **argv, const char *prefix)
 	}
 
 	strbuf_release(&a_src_dir);
-	free(src_dir);
 
 	if (dirty_paths.nr)
 		advise_on_moving_dirty_path(&dirty_paths);
@@ -571,6 +570,7 @@  int cmd_mv(int argc, const char **argv, const char *prefix)
 	ret = 0;
 
 out:
+	free(src_dir);
 	free(dst_w_slash);
 	string_list_clear(&src_for_dst, 0);
 	string_list_clear(&dirty_paths, 0);