From patchwork Mon Dec 30 04:24:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13923039 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 988F9171C9 for ; Mon, 30 Dec 2024 04:24:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735532645; cv=none; b=i+cSecMBkhJSB+vF4leKxxfqTF9B+Dqe7lRXhLqdcj/bitbJNhxbD7y3WDQq4w1S2IphaeADRzYhiPiU/32NTF4WbhLJVickdqp0fMwSycGYA34cXdiMHQOgvB7iQdpUntQyQkokE0OpyUBk3S4WXMLTR5w2qi30PQhhDY74a+4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735532645; c=relaxed/simple; bh=GiO5OvRdz0p7pbNPcPLRUv3+HxC4hT+NN0Be+YdOR4U=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=kLhvW4LeDZVCLNwqTo3Oqu83vyL5oUir3bfGPsoKAxU338Opcd9hVvklNa3cu8ik6PcKQHxUiLBelXBrjo0yS1sasbzIer2SW6pxtTxoH7KB2W195pF0szHLsBFZAdPxtl/6l3WzEi3Klb7NQxb8hOr/FknErsUHSDwgSjhHfSk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=KTZa7i2G; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="KTZa7i2G" Received: (qmail 14718 invoked by uid 109); 30 Dec 2024 04:24:03 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=GiO5OvRdz0p7pbNPcPLRUv3+HxC4hT+NN0Be+YdOR4U=; b=KTZa7i2Gc7oX3aX83x9q1k/GWuqldip8uZuae1SMv2pQOMWGmjFOxTJIqbxUEuu3wQG67YlWF9u5/A8pZ6Jcn0UcwKTSPFV0T76ILaOhle6+sKRIQ6q0Hg4ifoBdvxA5WBtSJNQaFQIO8Y8k2FxCHPDKKX35Crj1l5auN0P/uNEW46gA3T0faVNjTm88K4eiQdVEsB9QVQo1yahyf3AS6amHDDFxlW7ffSoy8pRgxV8H59RlwtwSBf35y7Do9vN7U6DHv+eQEi+UpkM0AAGscmlZvyZuKdmCI+gkLLXSmerS+CwCCMVw+ILJsQowffgvqjWE4SZ1l1KyL6csUgFPwg== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Mon, 30 Dec 2024 04:24:03 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14249 invoked by uid 111); 30 Dec 2024 04:24:02 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sun, 29 Dec 2024 23:24:02 -0500 Authentication-Results: peff.net; auth=none Date: Sun, 29 Dec 2024 23:24:01 -0500 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?UmVuw6k=?= Scharfe , Patrick Steinhardt , Junio C Hamano Subject: [PATCH 1/5] test-lib: use individual lsan dir for --stress runs Message-ID: <20241230042401.GA113400@coredump.intra.peff.net> References: <20241230042325.GA112439@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20241230042325.GA112439@coredump.intra.peff.net> When storing output in test-results/, we usually give each numbered run in a --stress set its own output file. But we don't do that for storing LSan logs, so something like: ./t0003-attributes.sh --stress will have many scripts simultaneously creating, writing to, and deleting the test-results/t0003-attributes.leak directory. This can cause logs from one run to be attributed to another, spurious failures when creation and deletion race, and so on. This has always been broken, but nobody noticed because it's rare to do a --stress run with LSan (since the point is for the code to run quickly many times in order to hit races). But if you're trying to find a race in the leak sanitizing code, it makes sense to use these together. We can fix it by using $TEST_RESULTS_BASE, which already incorporates the stress job suffix. Signed-off-by: Jeff King --- t/test-lib.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/test-lib.sh b/t/test-lib.sh index 1a67adb207..96f2dfb69d 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -331,7 +331,7 @@ TEST_RESULTS_BASE="$TEST_RESULTS_DIR/$TEST_NAME$TEST_STRESS_JOB_SFX" TEST_RESULTS_SAN_FILE_PFX=trace TEST_RESULTS_SAN_DIR_SFX=leak TEST_RESULTS_SAN_FILE= -TEST_RESULTS_SAN_DIR="$TEST_RESULTS_DIR/$TEST_NAME.$TEST_RESULTS_SAN_DIR_SFX" +TEST_RESULTS_SAN_DIR="$TEST_RESULTS_BASE.$TEST_RESULTS_SAN_DIR_SFX" TRASH_DIRECTORY="trash directory.$TEST_NAME$TEST_STRESS_JOB_SFX" test -n "$root" && TRASH_DIRECTORY="$root/$TRASH_DIRECTORY" case "$TRASH_DIRECTORY" in From patchwork Mon Dec 30 04:26:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13923040 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2B3E1171C9 for ; Mon, 30 Dec 2024 04:26:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735532774; cv=none; b=H0XZEXB/Y1uktaAB1lnsiZOpTiohYVpIAGRsjzzqiK4aowbYjwRw2DBc8HsT+asIhUVWRZtXkMvn5H0lyhuOJO0TlYDnGqaamkIH8eGMWYLXnBy3wWE5U08MtpWjMV9ppPoUWjx9xZm5cPNn0bZWXe4A7WYQvWHj6f9y1sqUhz8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735532774; c=relaxed/simple; bh=IwJvgDnImmtHi0wqNFu5Dqjt22bM1CWpHxVWhhejdIk=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Y5XNoeIf6PGztG+iFIAo9JG1H43IoOkZ/WpiSgvjFJHGzU6Si5DS31CGiwxVeEemNRNv6u9Awc6z9Fssrs3MsTVitsI856q/bAaJZEZfRzAC3ouVqnS2Mapk+3nD9F6qQfKLrSDglHshZ6Nqc9TnhfC96S9UozaY8SPAMnzA8GA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=XlUOBqQb; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="XlUOBqQb" Received: (qmail 14732 invoked by uid 109); 30 Dec 2024 04:26:11 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=IwJvgDnImmtHi0wqNFu5Dqjt22bM1CWpHxVWhhejdIk=; b=XlUOBqQbnrJldg2FwxcYWh1J2mX9aoLeSp/m644uuOC5VukLmBT82kyMhkQst9tJDoBXiFD9qaOFCzc7WCKxXO2v4KDib9skT5JY9pC2s03OHZCuFAbweM/Bto1lNOmLxcGb57U7YrDyH6K0xs5yhs9jw2cswkwVMIMrmoyDYlDPNVcN3zPWQzerPXg3g3h1P1LvomFrlvvEQ2DICD5JTC5ouHOK4HFFYLnSLKAJAL3h8gtqfq21khaK5m3Y/LUv9HShQGG+7UfKT2hxgvfYgK+76QtnRyBGH188wL8dEa+YZAAVNxiKXOiXccKIvlXfGgawPK8urhVJHufgNEJxVw== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Mon, 30 Dec 2024 04:26:11 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14285 invoked by uid 111); 30 Dec 2024 04:26:10 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sun, 29 Dec 2024 23:26:10 -0500 Authentication-Results: peff.net; auth=none Date: Sun, 29 Dec 2024 23:26:10 -0500 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?UmVuw6k=?= Scharfe , Patrick Steinhardt , Junio C Hamano Subject: [PATCH 2/5] Revert "index-pack: spawn threads atomically" Message-ID: <20241230042610.GB113400@coredump.intra.peff.net> References: <20241230042325.GA112439@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20241230042325.GA112439@coredump.intra.peff.net> This reverts commit 993d38a0669a8056d496797516e743e26b6b8b54. That commit was trying to solve a race between LSan setting up the threads stack and another thread calling exit(), by making sure that all pthread_create() calls have finished before doing any work that might trigger the exit(). But that isn't sufficient. The setup code actually runs in the individual threads themselves, not in the spawning thread's call to pthread_create(). So while it may have improved the race a bit, you can still trigger it pretty quickly with: make SANITIZE=leak cd t ./t5309-pack-delta-cycles.sh --stress Let's back out that failed attempt so we can try again. Signed-off-by: Jeff King --- I'm still puzzled why I ever thought 93d38a066 solved this, because the stress test above is exactly what I was using back then to test it. Possibly LSan changed in some way, or possibly I just got lucky in my stress run, but the more likely reason is that I simply bungled the testing (e.g., if you accidentally use a non-LSan build, then it naturally appears to work). builtin/index-pack.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index d773809c4c..0b62b2589f 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1336,15 +1336,13 @@ static void resolve_deltas(struct pack_idx_option *opts) base_cache_limit = opts->delta_base_cache_limit * nr_threads; if (nr_threads > 1 || getenv("GIT_FORCE_THREADS")) { init_thread(); - work_lock(); for (i = 0; i < nr_threads; i++) { int ret = pthread_create(&thread_data[i].thread, NULL, threaded_second_pass, thread_data + i); if (ret) die(_("unable to create thread: %s"), strerror(ret)); } - work_unlock(); for (i = 0; i < nr_threads; i++) pthread_join(thread_data[i].thread, NULL); cleanup_thread(); From patchwork Mon Dec 30 04:28:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13923041 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E406E171C9 for ; Mon, 30 Dec 2024 04:28:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735532913; cv=none; b=kX8J5/wQRMAyohuhJfqC2KSRB1DM6kgkKOtgXkBnBVa+6UKdp4ahG3uqyRphJEvyZfduyptQmTSi5WgJ/rEKRWe3SOxeWhEGmMGIlIyyQrniZV1zEKeNURr4HZSzp4opRKFchmxj9COBkFQvdEjhRBWhRw/PV1KGc7pkiV4K7YE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735532913; c=relaxed/simple; bh=X49/AKLQUv13TMj9wHX+Q1lFS9GKp/AyYZ5ubUGeD/g=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=WIYP2V6N+tMXQj+CFUn4oQZnlFZr8ZmtIgZu95CZIShHq3A0sDPPt9buI/mqdLa6aA8q0icZ8nBt4wmayzn64ebVFM3fVQCwwxzJhHbLYFQ0V7ZkGxZDnK2tffylnV4wj/FentVOBLAJ+07HsQ8HyoThMlgv1KcVBUXmvgkaP1I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=U84/ZdpJ; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="U84/ZdpJ" Received: (qmail 14748 invoked by uid 109); 30 Dec 2024 04:28:31 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=X49/AKLQUv13TMj9wHX+Q1lFS9GKp/AyYZ5ubUGeD/g=; b=U84/ZdpJtBzZVZ6XFPw0uWENG1uN6jUxqL//wl+NJTRxS9PdxbXpC4ueFS5w8uJH/A+tQbxIt01qsbl0VbBZIakBwnm+HolAs5os4J4GDbpMTQQ2iNkJTRO8RNOpnfQSoIeZf1cKjHx2urdjZHQ7FwNZHGG0rrsiWVg99NujbQ33bCKcp4DbcgowwrFpdz6goWAwe7pOrxwnWRY5TM7h3Q9zJq2uorIoecudBSHRq6pGxs5Wg9nsau9o3fL1mkaDZXJBKa3idF/QnlRxET7MM/m66sMyIJOLq7i9biNx03axW2GZgl6d+cH2n6B7r+253Vb9js/mjDNU6Hy104Evfw== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Mon, 30 Dec 2024 04:28:31 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14295 invoked by uid 111); 30 Dec 2024 04:28:30 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sun, 29 Dec 2024 23:28:30 -0500 Authentication-Results: peff.net; auth=none Date: Sun, 29 Dec 2024 23:28:30 -0500 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?UmVuw6k=?= Scharfe , Patrick Steinhardt , Junio C Hamano Subject: [PATCH 3/5] thread-utils: introduce optional barrier type Message-ID: <20241230042830.GC113400@coredump.intra.peff.net> References: <20241230042325.GA112439@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20241230042325.GA112439@coredump.intra.peff.net> One thread primitive we don't yet support is a barrier: it waits for all threads to reach a synchronization point before letting any of them continue. This would be useful for avoiding the LSan race we see in index-pack (and other places) by having all threads complete their initialization before any of them start to do real work. POSIX introduced a pthread_barrier_t in 2004, which does what we want. But if we want to rely on it: 1. Our Windows pthread emulation would need a new set of wrapper functions. There's a Synchronization Barrier primitive there, which was introduced in Windows 8 (which is old enough for us to depend on). 2. macOS (and possibly other systems) has pthreads but not pthread_barrier_t. So there we'd have to implement our own barrier based on the mutex and cond primitives. Those are do-able, but since we only care about avoiding races in our LSan builds, there's an easier way: make it a noop on systems without a native pthread barrier. This patch introduces a "maybe_thread_barrier" API. The clunky name (rather than just using pthread_barrier directly) should hopefully clue people in that on some systems it will do nothing. It's wired to a Makefile knob which has to be triggered manually, and we enable it for the linux-leaks CI jobs (since we know we'll have it there). There are some other possible options: - we could turn it on all the time for Linux systems based on uname. But we really only care about it for LSan builds, and there is no need to add extra code to regular builds. - we could turn it on only for LSan builds. But that would break builds on non-Linux platforms (like macOS) that otherwise should support sanitizers. - we could trigger only on the combination of Linux and LSan together. This isn't too hard to do, but the uname check isn't completely accurate. It is really about what your libc supports, and non-glibc systems might not have it (though at least musl seems to). So we'd risk breaking builds on those systems, which would need to add a new knob. Though the upside would be that running local "make SANITIZE=leak test" would be protected automatically. And of course none of this protects LSan runs from races on systems without pthread barriers. It's probably OK in practice to protect only our CI jobs, though. The race is rare-ish and most leak-checking happens through CI. Signed-off-by: Jeff King --- Makefile | 7 +++++++ ci/lib.sh | 1 + thread-utils.h | 17 +++++++++++++++++ 3 files changed, 25 insertions(+) diff --git a/Makefile b/Makefile index 97e8385b66..2c6dad8a75 100644 --- a/Makefile +++ b/Makefile @@ -141,6 +141,10 @@ include shared.mak # # Define NO_PTHREADS if you do not have or do not want to use Pthreads. # +# Define THREAD_BARRIER_PTHREAD if your system has pthread_barrier_t. Barrier +# support is optional and is only helpful when building with SANITIZE=leak, as +# it is used to eliminate some races in the leak-checker. +# # Define NO_PREAD if you have a problem with pread() system call (e.g. # cygwin1.dll before v1.5.22). # @@ -2079,6 +2083,9 @@ ifdef NO_PTHREADS else BASIC_CFLAGS += $(PTHREAD_CFLAGS) EXTLIBS += $(PTHREAD_LIBS) + ifdef THREAD_BARRIER_PTHREAD + BASIC_CFLAGS += -DTHREAD_BARRIER_PTHREAD + endif endif ifdef HAVE_PATHS_H diff --git a/ci/lib.sh b/ci/lib.sh index 8885ee3c3f..6a1267fbcb 100755 --- a/ci/lib.sh +++ b/ci/lib.sh @@ -385,6 +385,7 @@ linux-musl) ;; linux-leaks|linux-reftable-leaks) export SANITIZE=leak + export THREAD_BARRIER_PTHREAD=1 ;; linux-asan-ubsan) export SANITIZE=address,undefined diff --git a/thread-utils.h b/thread-utils.h index 4961487ed9..3df5be9916 100644 --- a/thread-utils.h +++ b/thread-utils.h @@ -53,5 +53,22 @@ int dummy_pthread_init(void *); int online_cpus(void); int init_recursive_mutex(pthread_mutex_t*); +#ifdef THREAD_BARRIER_PTHREAD +#define maybe_thread_barrier_t pthread_barrier_t +#define maybe_thread_barrier_init pthread_barrier_init +#define maybe_thread_barrier_wait pthread_barrier_wait +#define maybe_thread_barrier_destroy pthread_barrier_destroy +#else +#define maybe_thread_barrier_t int +static inline int maybe_thread_barrier_init(maybe_thread_barrier_t *b UNUSED, + void *attr UNUSED, + unsigned nr UNUSED) +{ + errno = ENOSYS; + return -1; +} +#define maybe_thread_barrier_wait(barrier) +#define maybe_thread_barrier_destroy(barrier) +#endif #endif /* THREAD_COMPAT_H */ From patchwork Mon Dec 30 04:29:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13923042 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0BC34171C9 for ; Mon, 30 Dec 2024 04:29:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735532981; cv=none; b=k53lFlTla0p7ozAPMSWzanuKnV1tAQHWo6n0JS6DQrlYwX3/VGCsZniJRwQl+Z4k5sIlJTNVX1vRLRfoybUXG0CkB8hX1gNKq4gGlvMvjD25eZS/0tUBnzyeEBnkcduz1U/bi177P7SgBEUYWobtj6PDlYScL9dxHHJfN3L5Uq4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735532981; c=relaxed/simple; bh=MYdf5MuZU5+Y3IuEXL9t8ozpNE20MSs1q/2G6VsYyx0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=m9W49Zpfppnh0OCpFHKySq75HsglhDsu+59RTOeMFlH/mS5jgzK5SjqLoHGjn52FxRLso0YIOLia5p6Kf0CuV4aOOjRjc0wmb2G5NLv9uiPen04aChQEYgRquczgpy2op0ce5XFfEsTy6wfHFHdAHRZLh62m50VK7g0HJ6JScVE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=X7FTmD4J; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="X7FTmD4J" Received: (qmail 14761 invoked by uid 109); 30 Dec 2024 04:29:39 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=MYdf5MuZU5+Y3IuEXL9t8ozpNE20MSs1q/2G6VsYyx0=; b=X7FTmD4JMANjMUKCpWJbmSDSLr2WGfcvVgjLM42orlvudjFgQMA37fj+WNDDhesAoHHZ8fpxT5KTGIZ5yg0vCAV6B2g8xusG2Wo+OQGe8tZYNxytze3euEA3c55TWxpv+xNX5wE8y2uhvFGNlkyGo+TNkRlZNtD8d1Blj8DnZAvV8fNcAu0lKH+MnERefqbJaYFvndhUkFxoIkqNn2uf0gU5XbOZuG7RHQj5vTAwwestJlJ0qg/p0a14IjPY3X/61OGtdXCV9LvXwjGe3klGjBL2c4j7/Y2h5XOILZDuMEjMGgZFc30/aMJweivJNg04UwASJaPzJLJ9Zahqe3QGDw== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Mon, 30 Dec 2024 04:29:39 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14313 invoked by uid 111); 30 Dec 2024 04:29:38 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sun, 29 Dec 2024 23:29:38 -0500 Authentication-Results: peff.net; auth=none Date: Sun, 29 Dec 2024 23:29:38 -0500 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?UmVuw6k=?= Scharfe , Patrick Steinhardt , Junio C Hamano Subject: [PATCH 4/5] index-pack: work around LSan threading race with barrier Message-ID: <20241230042938.GD113400@coredump.intra.peff.net> References: <20241230042325.GA112439@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20241230042325.GA112439@coredump.intra.peff.net> We sometimes get false positives from our linux-leaks CI job because of a race in LSan itself. The problem is that one thread is still initializing its stack in LSan's code (and allocating memory to do so) while anothe thread calls die(), taking down the whole process and triggering a leak check. The problem is described in more detail in 993d38a066 (index-pack: spawn threads atomically, 2024-01-05), which tried to fix it by pausing worker threads until all calls to pthread_create() had completed. But that's not enough to fix the problem, because the LSan setup code runs in the threads themselves. So even though pthread_create() has returned, we have no idea if all threads actually finished their setup before letting any of them do real work. We can fix that by using a barrier inside the threads themselves, waiting for all of them to hit the start of their main function before any of them proceed. You can test for the race by running: make SANITIZE=leak THREAD_BARRIER_PTHREAD=YesOnLinux cd t ./t5309-pack-delta-cycles.sh --stress which fails quickly before this patch, and should run indefinitely without it. Signed-off-by: Jeff King --- builtin/index-pack.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 0b62b2589f..27b120f26c 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -185,6 +185,8 @@ static pthread_mutex_t deepest_delta_mutex; static pthread_key_t key; +static maybe_thread_barrier_t start_barrier; + static inline void lock_mutex(pthread_mutex_t *mutex) { if (threads_active) @@ -209,6 +211,7 @@ static void init_thread(void) if (show_stat) pthread_mutex_init(&deepest_delta_mutex, NULL); pthread_key_create(&key, NULL); + maybe_thread_barrier_init(&start_barrier, NULL, nr_threads); CALLOC_ARRAY(thread_data, nr_threads); for (i = 0; i < nr_threads; i++) { thread_data[i].pack_fd = xopen(curr_pack, O_RDONLY); @@ -231,6 +234,7 @@ static void cleanup_thread(void) for (i = 0; i < nr_threads; i++) close(thread_data[i].pack_fd); pthread_key_delete(key); + maybe_thread_barrier_destroy(&start_barrier); free(thread_data); } @@ -1100,6 +1104,8 @@ static int compare_ref_delta_entry(const void *a, const void *b) static void *threaded_second_pass(void *data) { + if (threads_active) + maybe_thread_barrier_wait(&start_barrier); if (data) set_thread_data(data); for (;;) { From patchwork Mon Dec 30 04:30:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13923043 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B3064EDE for ; Mon, 30 Dec 2024 04:30:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735533030; cv=none; b=HZd/66WUDw8eWhIxISa8hjjekfMSgwLZ/K3P5+VhvPgp0xrm3mkXxfwxQiUz00OSP4qYsOb8ZN1z7OVxXbommZmRO2xpM1+y/nmmulcy5piZxVTadk1HvLOfRoObB/7C/25MmFq3GulfTS8Bcj9ZoR2RFThaMj3XW/0GKJp2ZZ8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735533030; c=relaxed/simple; bh=kJKgLvx0lks3/5IOmTFtz8TOWEdLNxWhRVZ+K69uG2o=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=s+Z8MGW2t1k2Qww9rZ47hdQ6rsYPBqw0aHHQHGD8cLgBgiusEu1h4f6WOXb98Llw4IWNs0n7wf5cCuqCKmVhxzL4rN41FNPLXed27O1qvacmfrRF2WaNf6QD4rtadAjBA/FhOyouZKM8jb6737JyVxycEHOZUo7YXDVmlZYweWA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=f+VQpoaa; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="f+VQpoaa" Received: (qmail 14774 invoked by uid 109); 30 Dec 2024 04:30:28 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=kJKgLvx0lks3/5IOmTFtz8TOWEdLNxWhRVZ+K69uG2o=; b=f+VQpoaaQSFjsUkAEqKedaAXuuypXaCk+MXAm/eoQ/9Ry9M3SK3vPqaglZV/aLj4sNidTIXNeFhWMxFtN80guiwNl0kiB2vwiHYTDO9CY4grmqbudNu5M3dOxuWbCppQ0kT3jBCG81wYRyT61EQqAjamA9qLHEFZ/VM0ZRD5X960kn/plV1mjKk5pSqoETy0rcohTqG1n8S9dFH57UtTjcKkeqXSqGPxVPRBxvcLxWQ2btOpP9+iemSaOJQVHniclwOPrgb5kY7b3Ap8roaEVCy5fs52cIuwhbLssNptqUoqKuJGat7LAI50NTvaaJp/bMo4q+poMiLdqKOf9voigw== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Mon, 30 Dec 2024 04:30:28 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14346 invoked by uid 111); 30 Dec 2024 04:30:27 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sun, 29 Dec 2024 23:30:27 -0500 Authentication-Results: peff.net; auth=none Date: Sun, 29 Dec 2024 23:30:26 -0500 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?UmVuw6k=?= Scharfe , Patrick Steinhardt , Junio C Hamano Subject: [PATCH 5/5] grep: work around LSan threading race with barrier Message-ID: <20241230043026.GE113400@coredump.intra.peff.net> References: <20241230042325.GA112439@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20241230042325.GA112439@coredump.intra.peff.net> There's a race with LSan when spawning threads and one of the threads calls die(). We worked around one such problem with index-pack in the previous commit, but it exists in git-grep, too. You can see it with: make SANITIZE=leak THREAD_BARRIER_PTHREAD=YesOnLinux cd t ./t0003-attributes.sh --stress which fails pretty quickly with: ==git==4096424==ERROR: LeakSanitizer: detected memory leaks Direct leak of 32 byte(s) in 1 object(s) allocated from: #0 0x7f906de14556 in realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98 #1 0x7f906dc9d2c1 in __pthread_getattr_np nptl/pthread_getattr_np.c:180 #2 0x7f906de2500d in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150 #3 0x7f906de25187 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:614 #4 0x7f906de17d18 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:53 #5 0x7f906de143a9 in ThreadStartFunc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:431 #6 0x7f906dc9bf51 in start_thread nptl/pthread_create.c:447 #7 0x7f906dd1a677 in __clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78 As with the previous commit, we can fix this by inserting a barrier that makes sure all threads have finished their setup before continuing. But there's one twist in this case: the thread which calls die() is not one of the worker threads, but the main thread itself! So we need the main thread to wait in the barrier, too, until all threads have gotten to it. And thus we initialize the barrier for num_threads+1, to account for all of the worker threads plus the main one. If we then test as above, t0003 should run indefinitely. Signed-off-by: Jeff King --- builtin/grep.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/builtin/grep.c b/builtin/grep.c index d00ee76f24..61b2c27490 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -101,6 +101,9 @@ static pthread_cond_t cond_write; /* Signalled when we are finished with everything. */ static pthread_cond_t cond_result; +/* Synchronize the start of all threads */ +static maybe_thread_barrier_t start_barrier; + static int skip_first_line; static void add_work(struct grep_opt *opt, struct grep_source *gs) @@ -198,6 +201,8 @@ static void *run(void *arg) int hit = 0; struct grep_opt *opt = arg; + maybe_thread_barrier_wait(&start_barrier); + while (1) { struct work_item *w = get_work(); if (!w) @@ -229,6 +234,7 @@ static void start_threads(struct grep_opt *opt) pthread_cond_init(&cond_add, NULL); pthread_cond_init(&cond_write, NULL); pthread_cond_init(&cond_result, NULL); + maybe_thread_barrier_init(&start_barrier, NULL, num_threads + 1); grep_use_locks = 1; enable_obj_read_lock(); @@ -248,6 +254,7 @@ static void start_threads(struct grep_opt *opt) die(_("grep: failed to create thread: %s"), strerror(err)); } + maybe_thread_barrier_wait(&start_barrier); } static int wait_all(void) @@ -284,6 +291,7 @@ static int wait_all(void) pthread_cond_destroy(&cond_add); pthread_cond_destroy(&cond_write); pthread_cond_destroy(&cond_result); + maybe_thread_barrier_destroy(&start_barrier); grep_use_locks = 0; disable_obj_read_lock();