Message ID | 20250321004437.505461-1-jltobler@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | builtin/fetch: avoid aborting closed reference transaction | expand |
On Thu, Mar 20, 2025 at 07:44:37PM -0500, Justin Tobler wrote: > As part of the reference transaction commit phase, the transaction is > set to a closed state regardless of whether it was successful of not. > Attempting to abort a closed transaction via `ref_transaction_abort()` > results in a `BUG()`. Yeah, this is one of the more intricate parts of ref transactions, and it has been biting me several times in the past. It feels somewhat similar in spirit to how the `ref_iterator` used to automatically free itself once it has reached its end, which led to the same class of bugs due to the interface being way too intricate. So I wonderer whether we should refactor this interface in the same way: instead of automatically freeing the transaction on commit/abort, we'd never do so and require the caller to always free it themselves. This would make it way easier to use because we can now unconditionally free the transaction everywhere. That wouldn't help with the fixed bug though, which is that we call abort after a failed commit even though the transaction was already aborted. > diff --git a/builtin/fetch.c b/builtin/fetch.c > index 95fd0018b9..f2555731f4 100644 > --- a/builtin/fetch.c > +++ b/builtin/fetch.c > @@ -1867,8 +1867,15 @@ static int do_fetch(struct transport *transport, > goto cleanup; > > retcode = ref_transaction_commit(transaction, &err); > - if (retcode) > + if (retcode) { > + /* > + * Explicitly handle transaction cleanup to avoid > + * aborting an already closed transaction. > + */ > + ref_transaction_free(transaction); > + transaction = NULL; > goto cleanup; > + } > } > > commit_fetch_head(&fetch_head); Okay, makes sense. Thanks for the fix! Patrick
On 25/03/24 11:40AM, Patrick Steinhardt wrote: > On Thu, Mar 20, 2025 at 07:44:37PM -0500, Justin Tobler wrote: > > As part of the reference transaction commit phase, the transaction is > > set to a closed state regardless of whether it was successful of not. > > Attempting to abort a closed transaction via `ref_transaction_abort()` > > results in a `BUG()`. > > Yeah, this is one of the more intricate parts of ref transactions, and > it has been biting me several times in the past. It feels somewhat > similar in spirit to how the `ref_iterator` used to automatically free > itself once it has reached its end, which led to the same class of bugs > due to the interface being way too intricate. > > So I wonderer whether we should refactor this interface in the same way: > instead of automatically freeing the transaction on commit/abort, we'd > never do so and require the caller to always free it themselves. This > would make it way easier to use because we can now unconditionally free > the transaction everywhere. I was also considering this. The interface here feels rather awkward since aborted transactions free themselves automatically while committed ones do not. It would be easier to reason about if the caller was always reponsible for freeing the transaction. > That wouldn't help with the fixed bug though, which is that we call > abort after a failed commit even though the transaction was already > aborted. I wonder if it would make sense to stop closing the transaction on a failed commit and require the caller to abort it. This would allow error handling to unconditionally abort the transaction during cleanup. I wouldn't mind sending a followup series to refactor these interfaces if that is something we would be interested in. -Justin
On Mon, Mar 24, 2025 at 10:10:44AM -0500, Justin Tobler wrote: > On 25/03/24 11:40AM, Patrick Steinhardt wrote: > > That wouldn't help with the fixed bug though, which is that we call > > abort after a failed commit even though the transaction was already > > aborted. > > I wonder if it would make sense to stop closing the transaction on a > failed commit and require the caller to abort it. This would allow error > handling to unconditionally abort the transaction during cleanup. I think it might still feel somewhat awkward because now every caller would have to both abort and free the transaction when the commit fails. An alternative could be to make abortion idempotent, where aborting an already aborted transaction is fine. But I'm not too sure whether that would significantly improve things. I don't really have a gut feeling here what the best route to go would be. Patrick
diff --git a/builtin/fetch.c b/builtin/fetch.c index 95fd0018b9..f2555731f4 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -1867,8 +1867,15 @@ static int do_fetch(struct transport *transport, goto cleanup; retcode = ref_transaction_commit(transaction, &err); - if (retcode) + if (retcode) { + /* + * Explicitly handle transaction cleanup to avoid + * aborting an already closed transaction. + */ + ref_transaction_free(transaction); + transaction = NULL; goto cleanup; + } } commit_fetch_head(&fetch_head); diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh index 5f350facf5..690755d2a8 100755 --- a/t/t5510-fetch.sh +++ b/t/t5510-fetch.sh @@ -537,6 +537,19 @@ test_expect_success 'fetch --atomic --append appends to FETCH_HEAD' ' test_cmp expected atomic/.git/FETCH_HEAD ' +test_expect_success REFFILES 'fetch --atomic fails transaction if reference locked' ' + test_when_finished "rm -rf upstream repo" && + + git init upstream && + git -C upstream commit --allow-empty -m 1 && + git -C upstream switch -c foobar && + git clone --mirror upstream repo && + git -C upstream commit --allow-empty -m 2 && + touch repo/refs/heads/foobar.lock && + + test_must_fail git -C repo fetch --atomic origin +' + test_expect_success '--refmap="" ignores configured refspec' ' cd "$TRASH_DIRECTORY" && git clone "$D" remote-refs &&
As part of the reference transaction commit phase, the transaction is set to a closed state regardless of whether it was successful of not. Attempting to abort a closed transaction via `ref_transaction_abort()` results in a `BUG()`. In c92abe71df (builtin/fetch: fix leaking transaction with `--atomic`, 2024-08-22), logic to free a transaction after the commit phase is moved to the centralized exit path. In cases where the transaction commit failed, this results in a closed transaction being aborted and signaling a bug. Free the transaction and set it to NULL when the commit fails. This allows the exit path to correctly handle the error without attempting to abort the transaction. Signed-off-by: Justin Tobler <jltobler@gmail.com> --- Greetings, It has been reported[1] that executing git-fetch(1) using transactions can result in an unexpected bug message appearing in some scenarios where it previously did not. This issue can be reproduced with the following commands: git init foo && git -C foo commit --allow-empty -m 1 && git clone --mirror foo bar && git -C foo commit --allow-empty -m 2 && touch bar/refs/heads/master.lock && git -C bar fetch --atomic origin master In this example, the reference updates for the fetch are done in a transaction. Since one of the references is already locked, the transaction is expected to fail, but an unexpected "BUG: refs.c:2435: abort called on a closed reference transaction" message also gets printed. This issue bisects to c92abe71df (builtin/fetch: fix leaking transaction with `--atomic`, 2024-08-22) which changes how transaction cleanup is handled to address a memory leak. This change starts relying on using `ref_transaction_abort()` to free the transaction when an error occurs during `ref_transaction_commit()`. This is problematic because `ref_transaction_commit()` always closes the transaction and `ref_transaction_abort()` cannot be invoked on a closed transaction. This explains why we start to see the `BUG()` message. This patch takes a simple approach to fix the issue by explicitly freeing the transaction and setting it to NULL if the commit phase fails. I've included a test that expects the reference transaction to fail when a reference is already locked. If the `BUG()` was still being printed the process would be aborted resulting in the test failing. This test unfortunately relies on the files backend to create the lock file. I would prefer to be reference backend agnostic, but am unsure of an easy way to exercise the issue. -Justin [1]: <e8789a03-41ea-42e2-9f2d-5124b849277a.likui@oschina.cn> --- builtin/fetch.c | 9 ++++++++- t/t5510-fetch.sh | 13 +++++++++++++ 2 files changed, 21 insertions(+), 1 deletion(-) base-commit: 683c54c999c301c2cd6f715c411407c413b1d84e