Message ID | 20231003202504.GA7697@coredump.intra.peff.net (mailing list archive) |
---|---|
Headers | show |
Series | some commit-graph leak fixes | expand |
Jeff, Please accept my apologies for slightly hijacking your posting, but I see you have been fixing some leaks, and so presumably you are familiar with building git with "SANITIZE=leak". I have fixed some leaks in my SHA1+SHA256 patchset recently and while tracking them down I found that simply enabling SANITIZE=leak caused "make test" on git v2.42 without patches to give different failures from test run to test run. Well actually I wound up with the following command line: GIT_TEST_PASSING_SANITIZE_LEAK=true GIT_TEST_SANITIZE_LEAK_LOG=true SANITIZE=leak DEVELOPER=1 make test I had removed "-j32" to make things more reproducible. I observed this unreliability with SANITIZE=leak when building git on an fully updated version of debian 12. My big question is: Do other people see random test failures when SANITIZE=leak is enabled? Is it just me? Thanks, Eric
On Tue, Oct 03, 2023 at 08:33:26PM -0500, Eric W. Biederman wrote: > My big question is: > > Do other people see random test failures when SANITIZE=leak is enabled? > > Is it just me? Yes, I've seen this. You mentioned that you were testing with v2.42, which lacks 370ef7e40d (test-lib: ignore uninteresting LSan output, 2023-08-28). Try using the current version of 'master', or just cherry-picking that commit onto v2.42. A few other tips to avoid confusing results (though they at least do not vary from run to run): - use the LEAK_LOG option, since you otherwise miss some cases (it looks like you already are from what you posted above) - gcc and clang sometimes produce different results. Right now I get no leak from gcc on t9004, but clang reports one (I think clang is right here) - turn off compiler optimizations; we've had cases where code reordering/removal creates false positives. Oh, hmm, I forgot we do this by default since d3775de074 (Makefile: force -O0 when compiling with SANITIZE=leak, 2022-10-18), so your v2.42 should be covered. -Peff
Jeff King <peff@peff.net> writes: > On Tue, Oct 03, 2023 at 08:33:26PM -0500, Eric W. Biederman wrote: > >> My big question is: >> >> Do other people see random test failures when SANITIZE=leak is enabled? >> >> Is it just me? > > Yes, I've seen this. You mentioned that you were testing with v2.42, > which lacks 370ef7e40d (test-lib: ignore uninteresting LSan output, > 2023-08-28). Try using the current version of 'master', or just > cherry-picking that commit onto v2.42. > > A few other tips to avoid confusing results (though they at least do not > vary from run to run): > > - use the LEAK_LOG option, since you otherwise miss some cases (it > looks like you already are from what you posted above) > > - gcc and clang sometimes produce different results. Right now I get > no leak from gcc on t9004, but clang reports one (I think clang is > right here) > > - turn off compiler optimizations; we've had cases where code > reordering/removal creates false positives. Oh, hmm, I forgot we do > this by default since d3775de074 (Makefile: force -O0 when compiling > with SANITIZE=leak, 2022-10-18), so your v2.42 should be covered. I just tried master, aka commit d0e8084c65cb ("The fourteenth batch"). What I see on a random failure looks like: > make -C t/ all > make[1]: Entering directory '/home/user/projects/git/git/t' > rm -f -r 'test-results' > GIT_TEST_EXT_CHAIN_LINT=0 && export GIT_TEST_EXT_CHAIN_LINT && make aggregate-results-and-cleanup > make[2]: Entering directory '/home/user/projects/git/git/t' > *** t0000-basic.sh *** > Segmentation fault > error: test_bool_env requires bool values both for $GIT_TEST_PASSING_SANITIZE_LEAK and for the default fallback Which doesn't sound like anything you have described so I am guessing it is something with my environment I need to track down. Eric
On Wed, Oct 04, 2023 at 09:19:40AM -0500, Eric W. Biederman wrote: > What I see on a random failure looks like: > > > make -C t/ all > > make[1]: Entering directory '/home/user/projects/git/git/t' > > rm -f -r 'test-results' > > GIT_TEST_EXT_CHAIN_LINT=0 && export GIT_TEST_EXT_CHAIN_LINT && make aggregate-results-and-cleanup > > make[2]: Entering directory '/home/user/projects/git/git/t' > > *** t0000-basic.sh *** > > Segmentation fault > > error: test_bool_env requires bool values both for $GIT_TEST_PASSING_SANITIZE_LEAK and for the default fallback > > Which doesn't sound like anything you have described so I am guessing it > is something with my environment I need to track down. No, that seems different entirely. You'll have to figure out which program is segfaulting and why (if you can see it in a script besides t0000 you're probably better off, as that one is a maze of tests-within-tests, since it is testing the test-harness itself). Although the "error" you see maybe implies that it is failing early on in test-lib.sh, when we are calling "test-tool env-helper". If that is segfaulting there is probably something very wrong with your build. -Peff
Jeff King <peff@peff.net> writes: > On Wed, Oct 04, 2023 at 09:19:40AM -0500, Eric W. Biederman wrote: > >> What I see on a random failure looks like: >> >> > make -C t/ all >> > make[1]: Entering directory '/home/user/projects/git/git/t' >> > rm -f -r 'test-results' >> > GIT_TEST_EXT_CHAIN_LINT=0 && export GIT_TEST_EXT_CHAIN_LINT && make aggregate-results-and-cleanup >> > make[2]: Entering directory '/home/user/projects/git/git/t' >> > *** t0000-basic.sh *** >> > Segmentation fault >> > error: test_bool_env requires bool values both for $GIT_TEST_PASSING_SANITIZE_LEAK and for the default fallback >> >> Which doesn't sound like anything you have described so I am guessing it >> is something with my environment I need to track down. > > No, that seems different entirely. You'll have to figure out which > program is segfaulting and why (if you can see it in a script besides > t0000 you're probably better off, as that one is a maze of > tests-within-tests, since it is testing the test-harness itself). > > Although the "error" you see maybe implies that it is failing early on > in test-lib.sh, when we are calling "test-tool env-helper". If that is > segfaulting there is probably something very wrong with your build. Just to document what I am seeing it appears to be some odd interaction with address space randomization. If I run my make as: "setarch --addr-no-randomize make test" I don't see coredumps any more. Now to dig a deeper and see if I can figure out what about address space randomization is making things break. Eric
On Tue, Oct 03, 2023 at 04:25:04PM -0400, Jeff King wrote: > I noticed while working on the jk/commit-graph-verify-fix topic that > free_commit_graph() leaks any slices of a commit-graph-chain except for > the first. I naively hoped that fixing that would make t5324 leak-free, > but it turns out there were a number of other leaks, so I fixed those, > too. A couple of them were in the merge code, which in turn means a > bunch of new test scripts are now leak-free. > > Even though I saw the problem on that other topic, there's no dependency > here; this series can be applied directly to master (or possibly even > maint, though I didn't try). Thanks for carefully finding and explaining these various leaks. The series is a definite improvement, and after reviewing closely I couldn't find anything worth changing. LGTM! Thanks, Taylor
Taylor Blau <me@ttaylorr.com> writes: > On Tue, Oct 03, 2023 at 04:25:04PM -0400, Jeff King wrote: >> I noticed while working on the jk/commit-graph-verify-fix topic that >> free_commit_graph() leaks any slices of a commit-graph-chain except for >> the first. I naively hoped that fixing that would make t5324 leak-free, >> but it turns out there were a number of other leaks, so I fixed those, >> too. A couple of them were in the merge code, which in turn means a >> bunch of new test scripts are now leak-free. >> >> Even though I saw the problem on that other topic, there's no dependency >> here; this series can be applied directly to master (or possibly even >> maint, though I didn't try). > > Thanks for carefully finding and explaining these various leaks. The > series is a definite improvement, and after reviewing closely I couldn't > find anything worth changing. LGTM! Thanks, both. Let's merge it down to 'next'.