t4204 is not sanitizer clean at all

Message ID	xmqqtuf86t7z.fsf_-_@gitster.g (mailing list archive)
State	Accepted
Commit	dc2c5db807e3922941a44a8f8c6172e9c9f947a9
Headers	show Return-Path: <git-owner@kernel.org> From: Junio C Hamano <gitster@pobox.com> To: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason <avarab@gmail.com> Cc: git@vger.kernel.org Subject: [PATCH] t4204 is not sanitizer clean at all References: <20211213220327.16042-1-jerry@skydio.com> <20211213220327.16042-2-jerry@skydio.com> <xmqqee6dz5s9.fsf@gitster.g> Date: Thu, 16 Dec 2021 15:11:12 -0800 In-Reply-To: <xmqqee6dz5s9.fsf@gitster.g> (Junio C. Hamano's message of "Wed, 15 Dec 2021 17:40:06 -0800") Message-ID: <xmqqtuf86t7z.fsf_-_@gitster.g> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain Precedence: bulk
Series	t4204 is not sanitizer clean at all \| expand t4204 is not sanitizer clean at all

Message ID

xmqqtuf86t7z.fsf_-_@gitster.g (mailing list archive)

State

Accepted

Commit

dc2c5db807e3922941a44a8f8c6172e9c9f947a9

Headers

From: Junio C Hamano <gitster@pobox.com>
To: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason <avarab@gmail.com>
Cc: git@vger.kernel.org
Subject: [PATCH] t4204 is not sanitizer clean at all
References: <20211213220327.16042-1-jerry@skydio.com>
        <20211213220327.16042-2-jerry@skydio.com> <xmqqee6dz5s9.fsf@gitster.g>
Date: Thu, 16 Dec 2021 15:11:12 -0800
In-Reply-To: <xmqqee6dz5s9.fsf@gitster.g> (Junio C. Hamano's message of "Wed,
        15 Dec 2021 17:40:06 -0800")
Message-ID: <xmqqtuf86t7z.fsf_-_@gitster.g>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Precedence: bulk

Series

t4204 is not sanitizer clean at all | expand

Commit Message

Junio C Hamano Dec. 16, 2021, 11:11 p.m. UTC

Earlier we marked that this patch-id test is leak-sanitizer clean,
but if we read the test script carefully, it is merely because we
have too many invocations of commands in the "git log" family on the
upstream side of the pipe, hiding breakages from them.

Split the pipeline so that breakages from these commands can be
caught (not limited to aborts due to leak-sanitizer) and unmark
the script as not passing the test with leak-sanitizer in effect.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 A quick grep tells me that tests 3302, 3303, 3305, 4020 and 6236
 all use "git log" and still are marked as passing tests with
 leak-sanitizer in effect.  I've taken a deep look at none of them,
 but I suspect they share the same kind of breakage.

 t/t4204-patch-id.sh | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

Comments

Ævar Arnfjörð Bjarmason Dec. 17, 2021, 4:39 a.m. UTC | #1

On Thu, Dec 16 2021, Junio C Hamano wrote:

> Earlier we marked that this patch-id test is leak-sanitizer clean,
> but if we read the test script carefully, it is merely because we
> have too many invocations of commands in the "git log" family on the
> upstream side of the pipe, hiding breakages from them.
>
> Split the pipeline so that breakages from these commands can be
> caught (not limited to aborts due to leak-sanitizer) and unmark
> the script as not passing the test with leak-sanitizer in effect.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>
>  A quick grep tells me that tests 3302, 3303, 3305, 4020 and 6236
>  all use "git log" and still are marked as passing tests with
>  leak-sanitizer in effect.  I've taken a deep look at none of them,
>  but I suspect they share the same kind of breakage.

This change looks good to me.

FWIW this is not a mistake on my part, but something I'm perfectly aware
of. I don't consider it to be "brekage".

We have plenty of place in the test suite where we hide exit codes on
the LHS of a pipe, or where we call a function that doesn't &&-chain its
git invocations.

In those cases we can and usually will "succeed" under LSAN, because it
allows the program to emit its full output, and will abort() at the very
end.

I have an unsubmitted logging mode (using LSAN_OPTIONS=log_path=<path>)
where I log every one of these to test-results/*, there's a lot more of
these.

But in the meantime I think the best way forward is to gradually mark
the tests that pass with LSAN as passing, to ensure that we at least
don't have regressions in the meantime. Before this we'd at least check
the "git checkout" etc. for leaks.

If I made fixing all broken &&-chains or git on the LHS of a pipe a
prerequisite for marking as passing under under LSAN I'd end up with
something that's approximately the size of [1] and more (i.e. Eric's
upcoming patches to do that).

I don't see why we'd consider perfect the enemy of the good in these
cases. Yes we won't catch the successful exit of every single git
invocation, but our tests aren't doing that now, LSAN or not. But until
that's fixed we'll at least catch some, which helps our overall memory
leak regression coverage.

More importantly it makes it a lot easier to reason about future memory
leak patches, as we'll be able to get to a 1=1 mapping of tests that
pass, and those that are marked being known to pass. I'm using that
locally to fake-fail those that start passing unexpectedly that aren't
on the list, which then helps to inform the addition of "this test now
passes with no leaks".

1. https://lore.kernel.org/git/20211213063059.19424-1-sunshine@sunshineco.com/

Junio C Hamano Dec. 17, 2021, 8:48 p.m. UTC | #2

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> This change looks good to me.
>
> FWIW this is not a mistake on my part, but something I'm perfectly aware
> of. I don't consider it to be "brekage".
>
> We have plenty of place in the test suite where we hide exit codes on
> the LHS of a pipe, or where we call a function that doesn't &&-chain its
> git invocations.
>
> In those cases we can and usually will "succeed" under LSAN, because it
> allows the program to emit its full output, and will abort() at the very
> end.

But pipes do not hide ONLY deaths by sanitizer.  And by relying on
the presence of pipe hiding deaths of git tools to mark the script
sanitizer-clean, the TEST_PASSES_SANITIZE_LEAK=true line adds an
unnecessary road-block for those who are cleaning up the "git whose
crash are hidden by being on the left hand side of the pipe"
pattern.

I do not know what to call it if not "breakage".

Ævar Arnfjörð Bjarmason Dec. 17, 2021, 10:23 p.m. UTC | #3

On Fri, Dec 17 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> This change looks good to me.
>>
>> FWIW this is not a mistake on my part, but something I'm perfectly aware
>> of. I don't consider it to be "brekage".
>>
>> We have plenty of place in the test suite where we hide exit codes on
>> the LHS of a pipe, or where we call a function that doesn't &&-chain its
>> git invocations.
>>
>> In those cases we can and usually will "succeed" under LSAN, because it
>> allows the program to emit its full output, and will abort() at the very
>> end.
>
> But pipes do not hide ONLY deaths by sanitizer.  And by relying on
> the presence of pipe hiding deaths of git tools to mark the script
> sanitizer-clean, the TEST_PASSES_SANITIZE_LEAK=true line adds an
> unnecessary road-block for those who are cleaning up the "git whose
> crash are hidden by being on the left hand side of the pipe"
> pattern.
>
> I do not know what to call it if not "breakage".

Yes it's broken as far as the test is concerned. I meant as far as
"GIT_TEST_PASSING_SANITIZE_LEAK" goes I consider it somewhere between
"meh" and "don't care yet".

I.e. these are pretty irrelevant for finding leaks, as we've got a huge
deluge of them elsewhere. At some point we might have a last few stray
memory leaks in git hidden by such patterns, but we're very far away
from that.

Sometimes fixing those is trivial as in 3247919a758 (commit-graph tests:
fix error-hiding graph_git_two_modes() helper, 2021-10-15), and
sometimes we'll find that the test was broken all along in some other
subtle way, as in the a046aa38ca9 (commit-graph tests: fix another
graph_git_two_modes() helper, 2021-10-15) follow-up.

But as to the "roadblock" I don't mind the
TEST_PASSES_SANITIZE_LEAK=true being removed from the script at the
slightest sign of trouble. Nobody should have to shift gears and chase
down some memory leak in "git log" just because they needed it for their
test setup.

And I'd very much prefer that to UNLEAK() just to avoid that
TEST_PASSES_SANITIZE_LEAK=true removal, because it makes fixing the leak
itself harder as far as what topic to target, re-adding
TEST_PASSES_SANITIZE_LEAK=true once it's fixed etc. goes.

diff --git i/t/t4204-patch-id.sh w/t/t4204-patch-id.sh
index e78d8097f3..80f4a65b28 100755
--- i/t/t4204-patch-id.sh
+++ w/t/t4204-patch-id.sh
@@ -5,7 +5,6 @@  test_description='git patch-id'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
-TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'setup' '
@@ -28,7 +27,8 @@  test_expect_success 'setup' '
 '
 
 test_expect_success 'patch-id output is well-formed' '
-	git log -p -1 | git patch-id >output &&
+	git log -p -1 >log.output &&
+	git patch-id <log.output >output &&
 	grep "^$OID_REGEX $(git rev-parse HEAD)$" output
 '
 
@@ -36,8 +36,8 @@  test_expect_success 'patch-id output is well-formed' '
 calc_patch_id () {
 	patch_name="$1"
 	shift
-	git patch-id "$@" |
-	sed "s/ .*//" >patch-id_"$patch_name" &&
+	git patch-id "$@" >patch-id.output &&
+	sed "s/ .*//" patch-id.output >patch-id_"$patch_name" &&
 	test_line_count -gt 0 patch-id_"$patch_name"
 }
 
@@ -46,7 +46,8 @@  get_top_diff () {
 }
 
 get_patch_id () {
-	get_top_diff "$1" | calc_patch_id "$@"
+	get_top_diff "$1" >top-diff.output &&
+	calc_patch_id <top-diff.output "$@"
 }
 
 test_expect_success 'patch-id detects equality' '
@@ -64,16 +65,18 @@  test_expect_success 'patch-id detects inequality' '
 test_expect_success 'patch-id supports git-format-patch output' '
 	get_patch_id main &&
 	git checkout same &&
-	git format-patch -1 --stdout | calc_patch_id same &&
+	git format-patch -1 --stdout >format-patch.output &&
+	calc_patch_id same <format-patch.output &&
 	test_cmp patch-id_main patch-id_same &&
-	set $(git format-patch -1 --stdout | git patch-id) &&
+	set $(git patch-id <format-patch.output) &&
 	test "$2" = $(git rev-parse HEAD)
 '
 
 test_expect_success 'whitespace is irrelevant in footer' '
 	get_patch_id main &&
 	git checkout same &&
-	git format-patch -1 --stdout | sed "s/ \$//" | calc_patch_id same &&
+	git format-patch -1 --stdout >format-patch.output &&
+	sed "s/ \$//" format-patch.output | calc_patch_id same &&
 	test_cmp patch-id_main patch-id_same
 '
 
@@ -92,10 +95,11 @@  test_patch_id_file_order () {
 	shift
 	name="order-${1}-$relevant"
 	shift
-	get_top_diff "main" | calc_patch_id "$name" "$@" &&
+	get_top_diff "main" >top-diff.output &&
+	calc_patch_id <top-diff.output "$name" "$@" &&
 	git checkout same &&
-	git format-patch -1 --stdout -O foo-then-bar |
-		calc_patch_id "ordered-$name" "$@" &&
+	git format-patch -1 --stdout -O foo-then-bar >format-patch.output &&
+	calc_patch_id <format-patch.output "ordered-$name" "$@" &&
 	cmp_patch_id $relevant "$name" "ordered-$name"
 
 }
@@ -143,7 +147,8 @@  test_expect_success '--stable overrides patchid.stable = false' '
 test_expect_success 'patch-id supports git-format-patch MIME output' '
 	get_patch_id main &&
 	git checkout same &&
-	git format-patch -1 --attach --stdout | calc_patch_id same &&
+	git format-patch -1 --attach --stdout >format-patch.output &&
+	calc_patch_id <format-patch.output same &&
 	test_cmp patch-id_main patch-id_same
 '

t4204 is not sanitizer clean at all

Commit Message

Comments

Patch