builtin/fetch: avoid aborting closed reference transaction

Message ID	20250321004437.505461-1-jltobler@gmail.com (mailing list archive)
State	New
Headers	show Received: from mail-oa1-f48.google.com (mail-oa1-f48.google.com [209.85.160.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 03448C13B for <git@vger.kernel.org>; Fri, 21 Mar 2025 00:57:47 +0000 (UTC) From: Justin Tobler <jltobler@gmail.com> To: git@vger.kernel.org Cc: ps@pks.im, likui@oschina.cn, Justin Tobler <jltobler@gmail.com> Subject: [PATCH] builtin/fetch: avoid aborting closed reference transaction Date: Thu, 20 Mar 2025 19:44:37 -0500 Message-ID: <20250321004437.505461-1-jltobler@gmail.com> In-Reply-To: <g4baz2kt25ysb6wcesoqxhvw2ooxkmqio3dj6b44h6gt5l6z3r@rocsjlys5nqs> References: <g4baz2kt25ysb6wcesoqxhvw2ooxkmqio3dj6b44h6gt5l6z3r@rocsjlys5nqs> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	builtin/fetch: avoid aborting closed reference transaction \| expand builtin/fetch: avoid aborting closed reference transaction

Message ID

20250321004437.505461-1-jltobler@gmail.com (mailing list archive)

State

New

Headers

From: Justin Tobler <jltobler@gmail.com>
To: git@vger.kernel.org
Cc: ps@pks.im,
	likui@oschina.cn,
	Justin Tobler <jltobler@gmail.com>
Subject: [PATCH] builtin/fetch: avoid aborting closed reference transaction
Date: Thu, 20 Mar 2025 19:44:37 -0500
Message-ID: <20250321004437.505461-1-jltobler@gmail.com>
In-Reply-To: <g4baz2kt25ysb6wcesoqxhvw2ooxkmqio3dj6b44h6gt5l6z3r@rocsjlys5nqs>
References: <g4baz2kt25ysb6wcesoqxhvw2ooxkmqio3dj6b44h6gt5l6z3r@rocsjlys5nqs>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

builtin/fetch: avoid aborting closed reference transaction | expand

Commit Message

Justin Tobler March 21, 2025, 12:44 a.m. UTC

As part of the reference transaction commit phase, the transaction is
set to a closed state regardless of whether it was successful of not.
Attempting to abort a closed transaction via `ref_transaction_abort()`
results in a `BUG()`.

In c92abe71df (builtin/fetch: fix leaking transaction with `--atomic`,
2024-08-22), logic to free a transaction after the commit phase is moved
to the centralized exit path. In cases where the transaction commit
failed, this results in a closed transaction being aborted and signaling
a bug.

Free the transaction and set it to NULL when the commit fails. This
allows the exit path to correctly handle the error without attempting to
abort the transaction.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---

Greetings,

It has been reported[1] that executing git-fetch(1) using transactions
can result in an unexpected bug message appearing in some scenarios
where it previously did not. This issue can be reproduced with the
following commands:

        git init foo &&
        git -C foo commit --allow-empty -m 1 &&
        git clone --mirror foo bar &&
        git -C foo commit --allow-empty -m 2 &&
        touch bar/refs/heads/master.lock &&
        git -C bar fetch --atomic origin master

In this example, the reference updates for the fetch are done in a
transaction. Since one of the references is already locked, the
transaction is expected to fail, but an unexpected "BUG: refs.c:2435:
abort called on a closed reference transaction" message also gets
printed.

This issue bisects to c92abe71df (builtin/fetch: fix leaking transaction
with `--atomic`, 2024-08-22) which changes how transaction cleanup is
handled to address a memory leak. This change starts relying on using
`ref_transaction_abort()` to free the transaction when an error occurs
during `ref_transaction_commit()`. This is problematic because
`ref_transaction_commit()` always closes the transaction and
`ref_transaction_abort()` cannot be invoked on a closed transaction.
This explains why we start to see the `BUG()` message.

This patch takes a simple approach to fix the issue by explicitly
freeing the transaction and setting it to NULL if the commit phase
fails.

I've included a test that expects the reference transaction to fail when
a reference is already locked. If the `BUG()` was still being printed
the process would be aborted resulting in the test failing. This test
unfortunately relies on the files backend to create the lock file. I
would prefer to be reference backend agnostic, but am unsure of an easy
way to exercise the issue.

-Justin

[1]: <e8789a03-41ea-42e2-9f2d-5124b849277a.likui@oschina.cn>

---
 builtin/fetch.c  |  9 ++++++++-
 t/t5510-fetch.sh | 13 +++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)


base-commit: 683c54c999c301c2cd6f715c411407c413b1d84e

Comments

Patrick Steinhardt March 24, 2025, 10:40 a.m. UTC | #1

On Thu, Mar 20, 2025 at 07:44:37PM -0500, Justin Tobler wrote:
> As part of the reference transaction commit phase, the transaction is
> set to a closed state regardless of whether it was successful of not.
> Attempting to abort a closed transaction via `ref_transaction_abort()`
> results in a `BUG()`.

Yeah, this is one of the more intricate parts of ref transactions, and
it has been biting me several times in the past. It feels somewhat
similar in spirit to how the `ref_iterator` used to automatically free
itself once it has reached its end, which led to the same class of bugs
due to the interface being way too intricate.

So I wonderer whether we should refactor this interface in the same way:
instead of automatically freeing the transaction on commit/abort, we'd
never do so and require the caller to always free it themselves. This
would make it way easier to use because we can now unconditionally free
the transaction everywhere.

That wouldn't help with the fixed bug though, which is that we call
abort after a failed commit even though the transaction was already
aborted.

> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 95fd0018b9..f2555731f4 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1867,8 +1867,15 @@ static int do_fetch(struct transport *transport,
>  			goto cleanup;
>  
>  		retcode = ref_transaction_commit(transaction, &err);
> -		if (retcode)
> +		if (retcode) {
> +			/*
> +			 * Explicitly handle transaction cleanup to avoid
> +			 * aborting an already closed transaction.
> +			 */
> +			ref_transaction_free(transaction);
> +			transaction = NULL;
>  			goto cleanup;
> +		}
>  	}
>  
>  	commit_fetch_head(&fetch_head);

Okay, makes sense.

Thanks for the fix!

Patrick

Justin Tobler March 24, 2025, 3:10 p.m. UTC | #2

On 25/03/24 11:40AM, Patrick Steinhardt wrote:
> On Thu, Mar 20, 2025 at 07:44:37PM -0500, Justin Tobler wrote:
> > As part of the reference transaction commit phase, the transaction is
> > set to a closed state regardless of whether it was successful of not.
> > Attempting to abort a closed transaction via `ref_transaction_abort()`
> > results in a `BUG()`.
> 
> Yeah, this is one of the more intricate parts of ref transactions, and
> it has been biting me several times in the past. It feels somewhat
> similar in spirit to how the `ref_iterator` used to automatically free
> itself once it has reached its end, which led to the same class of bugs
> due to the interface being way too intricate.
> 
> So I wonderer whether we should refactor this interface in the same way:
> instead of automatically freeing the transaction on commit/abort, we'd
> never do so and require the caller to always free it themselves. This
> would make it way easier to use because we can now unconditionally free
> the transaction everywhere.

I was also considering this. The interface here feels rather awkward
since aborted transactions free themselves automatically while committed
ones do not. It would be easier to reason about if the caller was always
reponsible for freeing the transaction.

> That wouldn't help with the fixed bug though, which is that we call
> abort after a failed commit even though the transaction was already
> aborted.

I wonder if it would make sense to stop closing the transaction on a
failed commit and require the caller to abort it. This would allow error
handling to unconditionally abort the transaction during cleanup.

I wouldn't mind sending a followup series to refactor these interfaces
if that is something we would be interested in.

-Justin

Patrick Steinhardt March 24, 2025, 3:25 p.m. UTC | #3

On Mon, Mar 24, 2025 at 10:10:44AM -0500, Justin Tobler wrote:
> On 25/03/24 11:40AM, Patrick Steinhardt wrote:
> > That wouldn't help with the fixed bug though, which is that we call
> > abort after a failed commit even though the transaction was already
> > aborted.
> 
> I wonder if it would make sense to stop closing the transaction on a
> failed commit and require the caller to abort it. This would allow error
> handling to unconditionally abort the transaction during cleanup.

I think it might still feel somewhat awkward because now every caller
would have to both abort and free the transaction when the commit fails.
An alternative could be to make abortion idempotent, where aborting an
already aborted transaction is fine. But I'm not too sure whether that
would significantly improve things.

I don't really have a gut feeling here what the best route to go would
be.

Patrick

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 95fd0018b9..f2555731f4 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1867,8 +1867,15 @@  static int do_fetch(struct transport *transport,
 			goto cleanup;
 
 		retcode = ref_transaction_commit(transaction, &err);
-		if (retcode)
+		if (retcode) {
+			/*
+			 * Explicitly handle transaction cleanup to avoid
+			 * aborting an already closed transaction.
+			 */
+			ref_transaction_free(transaction);
+			transaction = NULL;
 			goto cleanup;
+		}
 	}
 
 	commit_fetch_head(&fetch_head);
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 5f350facf5..690755d2a8 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -537,6 +537,19 @@  test_expect_success 'fetch --atomic --append appends to FETCH_HEAD' '
 	test_cmp expected atomic/.git/FETCH_HEAD
 '
 
+test_expect_success REFFILES 'fetch --atomic fails transaction if reference locked' '
+	test_when_finished "rm -rf upstream repo" &&
+
+	git init upstream &&
+	git -C upstream commit --allow-empty -m 1 &&
+	git -C upstream switch -c foobar &&
+	git clone --mirror upstream repo &&
+	git -C upstream commit --allow-empty -m 2 &&
+	touch repo/refs/heads/foobar.lock &&
+
+	test_must_fail git -C repo fetch --atomic origin
+'
+
 test_expect_success '--refmap="" ignores configured refspec' '
 	cd "$TRASH_DIRECTORY" &&
 	git clone "$D" remote-refs &&

builtin/fetch: avoid aborting closed reference transaction

Commit Message

Comments

Patch