[0/13] leak fixes for sparse-checkout code

Message ID	20240531112433.GA428583@coredump.intra.peff.net (mailing list archive)
Headers	show Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8275915575A for <git@vger.kernel.org>; Fri, 31 May 2024 11:24:35 +0000 (UTC) Date: Fri, 31 May 2024 07:24:33 -0400 From: Jeff King <peff@peff.net> To: git@vger.kernel.org Cc: Patrick Steinhardt <ps@pks.im> Subject: [PATCH 0/13] leak fixes for sparse-checkout code Message-ID: <20240531112433.GA428583@coredump.intra.peff.net> Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline
Series	leak fixes for sparse-checkout code \| expand [0/13] leak fixes for sparse-checkout code [01/13] sparse-checkout: free string list in write_cone_to_file() [02/13] sparse-checkout: pass string literals directly to add_pattern() [03/13] dir.c: free strings in sparse cone pattern hashmaps [04/13] sparse-checkout: clear patterns when init() sees existing sparse file [05/13] dir.c: free removed sparse-pattern hashmap entries [06/13] dir.c: always copy input to add_pattern() [07/13] sparse-checkout: reuse --stdin buffer when reading patterns [08/13] sparse-checkout: always free "line" strbuf after reading input [09/13] sparse-checkout: refactor temporary sparse_checkout_patterns [10/13] sparse-checkout: free sparse_filename after use [11/13] sparse-checkout: free pattern list in sparse_checkout_list() [12/13] sparse-checkout: free string list after displaying [13/13] sparse-checkout: free duplicate hashmap entries

Message ID

20240531112433.GA428583@coredump.intra.peff.net (mailing list archive)

Headers

Date: Fri, 31 May 2024 07:24:33 -0400
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>
Subject: [PATCH 0/13] leak fixes for sparse-checkout code
Message-ID: <20240531112433.GA428583@coredump.intra.peff.net>
Precedence: bulk
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

Series

leak fixes for sparse-checkout code | expand

Message

Jeff King May 31, 2024, 11:24 a.m. UTC

So Patrick nerd-sniped me by asking if my earlier leakfix for git-mv was
triggered by the test suite. It was, in t7002, but that wasn't enough to
make the script leak-free. So I figured, how hard could it be to go all
the way?

Well. It only took a few patches (1-5), but in the process I stumbled on
a rather tricky interface oddity of add_pattern(), which caused some
other leaks. The interface is fixed in patch 6, and the matching leak
goes away in patch 7. Of course, I wanted to make sure it was tested, so
after poking around I found that t1091 triggered it.

But as you might guess, that didn't make t1091 leak-free. And I couldn't
bear leaving it on a cliffhanger like that, so patches 8-13 fix the rest
of the issues triggered by that script.

And along the way we managed to make t1090 and t3602 leak-free, too
(actually in patch 2, but I didn't notice until the whole thing was
done).

These should apply on top of jk/leakfixes, since the leak-freeness of
t7002 depends on the fix there.

  [01/13]: sparse-checkout: free string list in write_cone_to_file()
  [02/13]: sparse-checkout: pass string literals directly to add_pattern()
  [03/13]: dir.c: free strings in sparse cone pattern hashmaps
  [04/13]: sparse-checkout: clear patterns when init() sees existing sparse file
  [05/13]: dir.c: free removed sparse-pattern hashmap entries
  [06/13]: dir.c: always copy input to add_pattern()
  [07/13]: sparse-checkout: reuse --stdin buffer when reading patterns
  [08/13]: sparse-checkout: always free "line" strbuf after reading input
  [09/13]: sparse-checkout: refactor temporary sparse_checkout_patterns
  [10/13]: sparse-checkout: free sparse_filename after use
  [11/13]: sparse-checkout: free pattern list in sparse_checkout_list()
  [12/13]: sparse-checkout: free string list after displaying
  [13/13]: sparse-checkout: free duplicate hashmap entries

 builtin/sparse-checkout.c          | 49 +++++++++++++++++++-----------
 dir.c                              | 42 ++++++++++++++++---------
 dir.h                              |  3 +-
 t/t1090-sparse-checkout-scope.sh   |  1 +
 t/t1091-sparse-checkout-builtin.sh |  1 +
 t/t3602-rm-sparse-checkout.sh      |  1 +
 t/t7002-mv-sparse-checkout.sh      |  1 +
 7 files changed, 65 insertions(+), 33 deletions(-)

-Peff

Comments

Jeff King May 31, 2024, 11:49 a.m. UTC | #1

On Fri, May 31, 2024 at 07:24:33AM -0400, Jeff King wrote:

> But as you might guess, that didn't make t1091 leak-free. And I couldn't
> bear leaving it on a cliffhanger like that, so patches 8-13 fix the rest
> of the issues triggered by that script.
> 
> And along the way we managed to make t1090 and t3602 leak-free, too
> (actually in patch 2, but I didn't notice until the whole thing was
> done).

Oh, btw, there's one interesting workflow I found. It's nice to see if
your incremental work is making things better (and to make sure that the
fixes are being exercised somewhere in the test suite).  But the
granularity of "is this script leak-free" is too coarse to see the
incremental steps.  Likewise even for individual test failures, as you
can have many leaks in a single program.

So I ended up doing this a lot:

  script=t1091-sparse-checkout-builtin.sh
  make SANITIZE=leak &&
  (
	cd t &&
	rm -rf test-results &&
	LSAN_OPTIONS=abort_on_error=0:exitcode=0 \
	GIT_TEST_SANITIZE_LEAK_LOG=true \
	./$script
  )
  for i in Indirect Direct; do
	echo "$i: $(grep "^$i leak" t/test-results/${script%.sh}.leak/* | wc -l)"
  done

It keeps running instead of aborting on leaks (otherwise your counts
may go up as "failing" programs are fixed and we run more code). And
instead just logs it all and counts up the log entries.

I wonder if it would be useful to have something like that baked into
test-lib.

-Peff

Junio C Hamano May 31, 2024, 3:01 p.m. UTC | #2

Jeff King <peff@peff.net> writes:

> 	LSAN_OPTIONS=abort_on_error=0:exitcode=0 \
> 	GIT_TEST_SANITIZE_LEAK_LOG=true \
> 	./$script
>   )
>   for i in Indirect Direct; do
> 	echo "$i: $(grep "^$i leak" t/test-results/${script%.sh}.leak/* | wc -l)"
>   done

Neat trick.

> It keeps running instead of aborting on leaks (otherwise your counts
> may go up as "failing" programs are fixed and we run more code). And
> instead just logs it all and counts up the log entries.

Indeed.  It is a very nice idea.

> I wonder if it would be useful to have something like that baked into
> test-lib.
>
> -Peff