diff mbox series

[v2,6/6] refs: skip hooks when deleting uncovered packed refs

Message ID 0fbf68dbf434986aa971961e20598675b2194d51.1641556319.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series refs: excessive hook execution with packed refs | expand

Commit Message

Patrick Steinhardt Jan. 7, 2022, 11:55 a.m. UTC
When deleting refs from the loose-files refs backend, then we need to be
careful to also delete the same ref from the packed refs backend, if it
exists. If we don't, then deleting the loose ref would "uncover" the
packed ref. We thus always have to queue up deletions of refs for both
the loose and the packed refs backend. This is done in two separate
transactions, where the end result is that the reference-transaction
hook is executed twice for the deleted refs.

This behaviour is quite misleading: it's exposing implementation details
of how the files backend works to the user, in contrast to the logical
updates that we'd really want to expose via the hook. Worse yet, whether
the hook gets executed once or twice depends on how well-packed the
repository is: if the ref only exists as a loose ref, then we execute it
once, otherwise if it is also packed then we execute it twice.

Fix this behaviour and don't execute the reference-transaction hook at
all when refs in the packed-refs backend if it's driven by the files
backend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs/files-backend.c             | 9 ++++++---
 t/t1416-ref-transaction-hooks.sh | 7 +------
 2 files changed, 7 insertions(+), 9 deletions(-)

Comments

Junio C Hamano Jan. 8, 2022, 2:01 a.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> When deleting refs from the loose-files refs backend, then we need to be
> careful to also delete the same ref from the packed refs backend, if it
> exists. If we don't, then deleting the loose ref would "uncover" the
> packed ref. We thus always have to queue up deletions of refs for both
> the loose and the packed refs backend. This is done in two separate
> transactions, where the end result is that the reference-transaction
> hook is executed twice for the deleted refs.
>
> This behaviour is quite misleading: it's exposing implementation details
> of how the files backend works to the user, in contrast to the logical
> updates that we'd really want to expose via the hook. Worse yet, whether
> the hook gets executed once or twice depends on how well-packed the
> repository is: if the ref only exists as a loose ref, then we execute it
> once, otherwise if it is also packed then we execute it twice.

If the ref only exists as a packed ref, what happens? ...

> Fix this behaviour and don't execute the reference-transaction hook at
> all when refs in the packed-refs backend if it's driven by the files
> backend.

... We try to remove from the loose backend, which would say "nah,
it did not exist in my store".  I am not sure if it should execute
the delete hook in such a case for the ref.  But if it does not, not
running the hook in the ref transaction for packed backend driven by
the loose backend would mean nobody notifies the deletion of the
ref, no?

To me, it seems that the only case this strategy would work
correctly is when the files backend calls deletion hook for a
request to delete nonexistent ref, which by itself sounds like a
problem.

> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  refs/files-backend.c             | 9 ++++++---
>  t/t1416-ref-transaction-hooks.sh | 7 +------
>  2 files changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 3c0f3027fe..9a20cb8fa8 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -1261,7 +1261,8 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg,
>  	if (packed_refs_lock(refs->packed_ref_store, 0, &err))
>  		goto error;
>  
> -	transaction = ref_store_transaction_begin(refs->packed_ref_store, 0, &err);
> +	transaction = ref_store_transaction_begin(refs->packed_ref_store,
> +						  REF_TRANSACTION_SKIP_HOOK, &err);
>  	if (!transaction)
>  		goto error;
>  
> @@ -2775,7 +2776,8 @@ static int files_transaction_prepare(struct ref_store *ref_store,
>  			 */
>  			if (!packed_transaction) {
>  				packed_transaction = ref_store_transaction_begin(
> -						refs->packed_ref_store, 0, err);
> +						refs->packed_ref_store,
> +						REF_TRANSACTION_SKIP_HOOK, err);
>  				if (!packed_transaction) {
>  					ret = TRANSACTION_GENERIC_ERROR;
>  					goto cleanup;
> @@ -3046,7 +3048,8 @@ static int files_initial_transaction_commit(struct ref_store *ref_store,
>  				 &affected_refnames))
>  		BUG("initial ref transaction called with existing refs");
>  
> -	packed_transaction = ref_store_transaction_begin(refs->packed_ref_store, 0, err);
> +	packed_transaction = ref_store_transaction_begin(refs->packed_ref_store,
> +							 REF_TRANSACTION_SKIP_HOOK, err);
>  	if (!packed_transaction) {
>  		ret = TRANSACTION_GENERIC_ERROR;
>  		goto cleanup;
> diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
> index f9d3d5213f..4e1e84a91f 100755
> --- a/t/t1416-ref-transaction-hooks.sh
> +++ b/t/t1416-ref-transaction-hooks.sh
> @@ -175,16 +175,11 @@ test_expect_success 'deleting packed ref calls hook once' '
>  	git update-ref -d refs/heads/to-be-deleted $POST_OID &&
>  
>  	# We only expect a single hook invocation, which is the logical
> -	# deletion. But currently, we see two interleaving transactions, once
> -	# for deleting the loose refs and once for deleting the packed ref.
> +	# deletion.
>  	cat >expect <<-EOF &&
> -		prepared
> -		$ZERO_OID $ZERO_OID refs/heads/to-be-deleted
>  		prepared
>  		$POST_OID $ZERO_OID refs/heads/to-be-deleted
>  		committed
> -		$ZERO_OID $ZERO_OID refs/heads/to-be-deleted
> -		committed
>  		$POST_OID $ZERO_OID refs/heads/to-be-deleted
>  	EOF
Patrick Steinhardt Jan. 10, 2022, 1:18 p.m. UTC | #2
On Fri, Jan 07, 2022 at 06:01:04PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > When deleting refs from the loose-files refs backend, then we need to be
> > careful to also delete the same ref from the packed refs backend, if it
> > exists. If we don't, then deleting the loose ref would "uncover" the
> > packed ref. We thus always have to queue up deletions of refs for both
> > the loose and the packed refs backend. This is done in two separate
> > transactions, where the end result is that the reference-transaction
> > hook is executed twice for the deleted refs.
> >
> > This behaviour is quite misleading: it's exposing implementation details
> > of how the files backend works to the user, in contrast to the logical
> > updates that we'd really want to expose via the hook. Worse yet, whether
> > the hook gets executed once or twice depends on how well-packed the
> > repository is: if the ref only exists as a loose ref, then we execute it
> > once, otherwise if it is also packed then we execute it twice.
> 
> If the ref only exists as a packed ref, what happens? ...
> 
> > Fix this behaviour and don't execute the reference-transaction hook at
> > all when refs in the packed-refs backend if it's driven by the files
> > backend.
> 
> ... We try to remove from the loose backend, which would say "nah,
> it did not exist in my store".  I am not sure if it should execute
> the delete hook in such a case for the ref.  But if it does not, not
> running the hook in the ref transaction for packed backend driven by
> the loose backend would mean nobody notifies the deletion of the
> ref, no?

This is shown in the test I've added, "deleting packed ref calls hook
once". We create a new reference and pack it such that it doesn't exist
as loose ref anymore, but only as a packed one. Updating that ref
would've caused us to execute the hook twice before, once via the
packed-backend and once via the loose-backend. With my fix we only
execute it once via the loose-backend, even if it doesn't currently know
it. This works because the loose-backend has to create a lock for the
nonexistent reference such that no concurrent call touches the same ref.

> To me, it seems that the only case this strategy would work
> correctly is when the files backend calls deletion hook for a
> request to delete nonexistent ref, which by itself sounds like a
> problem.

It does so only if the ref exists in either the loose or packed backend
though. If trying to update a ref which exists in neither of those, then
the reference transaction would fail with an "unable to resolve
reference" error in `lock_raw_ref()`.

So this should behave as expected: deleting a packed ref calls the hook
once, deleting a nonexistent ref fails and doesn't call the hook at all.

Patrick

> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  refs/files-backend.c             | 9 ++++++---
> >  t/t1416-ref-transaction-hooks.sh | 7 +------
> >  2 files changed, 7 insertions(+), 9 deletions(-)
> >
> > diff --git a/refs/files-backend.c b/refs/files-backend.c
> > index 3c0f3027fe..9a20cb8fa8 100644
> > --- a/refs/files-backend.c
> > +++ b/refs/files-backend.c
> > @@ -1261,7 +1261,8 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg,
> >  	if (packed_refs_lock(refs->packed_ref_store, 0, &err))
> >  		goto error;
> >  
> > -	transaction = ref_store_transaction_begin(refs->packed_ref_store, 0, &err);
> > +	transaction = ref_store_transaction_begin(refs->packed_ref_store,
> > +						  REF_TRANSACTION_SKIP_HOOK, &err);
> >  	if (!transaction)
> >  		goto error;
> >  
> > @@ -2775,7 +2776,8 @@ static int files_transaction_prepare(struct ref_store *ref_store,
> >  			 */
> >  			if (!packed_transaction) {
> >  				packed_transaction = ref_store_transaction_begin(
> > -						refs->packed_ref_store, 0, err);
> > +						refs->packed_ref_store,
> > +						REF_TRANSACTION_SKIP_HOOK, err);
> >  				if (!packed_transaction) {
> >  					ret = TRANSACTION_GENERIC_ERROR;
> >  					goto cleanup;
> > @@ -3046,7 +3048,8 @@ static int files_initial_transaction_commit(struct ref_store *ref_store,
> >  				 &affected_refnames))
> >  		BUG("initial ref transaction called with existing refs");
> >  
> > -	packed_transaction = ref_store_transaction_begin(refs->packed_ref_store, 0, err);
> > +	packed_transaction = ref_store_transaction_begin(refs->packed_ref_store,
> > +							 REF_TRANSACTION_SKIP_HOOK, err);
> >  	if (!packed_transaction) {
> >  		ret = TRANSACTION_GENERIC_ERROR;
> >  		goto cleanup;
> > diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
> > index f9d3d5213f..4e1e84a91f 100755
> > --- a/t/t1416-ref-transaction-hooks.sh
> > +++ b/t/t1416-ref-transaction-hooks.sh
> > @@ -175,16 +175,11 @@ test_expect_success 'deleting packed ref calls hook once' '
> >  	git update-ref -d refs/heads/to-be-deleted $POST_OID &&
> >  
> >  	# We only expect a single hook invocation, which is the logical
> > -	# deletion. But currently, we see two interleaving transactions, once
> > -	# for deleting the loose refs and once for deleting the packed ref.
> > +	# deletion.
> >  	cat >expect <<-EOF &&
> > -		prepared
> > -		$ZERO_OID $ZERO_OID refs/heads/to-be-deleted
> >  		prepared
> >  		$POST_OID $ZERO_OID refs/heads/to-be-deleted
> >  		committed
> > -		$ZERO_OID $ZERO_OID refs/heads/to-be-deleted
> > -		committed
> >  		$POST_OID $ZERO_OID refs/heads/to-be-deleted
> >  	EOF
Junio C Hamano Jan. 10, 2022, 6 p.m. UTC | #3
Patrick Steinhardt <ps@pks.im> writes:

> It does so only if the ref exists in either the loose or packed backend
> though. If trying to update a ref which exists in neither of those, then
> the reference transaction would fail with an "unable to resolve
> reference" error in `lock_raw_ref()`.
>
> So this should behave as expected: deleting a packed ref calls the hook
> once, deleting a nonexistent ref fails and doesn't call the hook at all.

OK.  Your explanation deserves to be in the log message for those
who ask questions next time, not lost in the archive in a message
just answering my question and explaining it only for me.  After
all, that is why I ask questions in my reviews.

Thanks.
diff mbox series

Patch

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 3c0f3027fe..9a20cb8fa8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1261,7 +1261,8 @@  static int files_delete_refs(struct ref_store *ref_store, const char *msg,
 	if (packed_refs_lock(refs->packed_ref_store, 0, &err))
 		goto error;
 
-	transaction = ref_store_transaction_begin(refs->packed_ref_store, 0, &err);
+	transaction = ref_store_transaction_begin(refs->packed_ref_store,
+						  REF_TRANSACTION_SKIP_HOOK, &err);
 	if (!transaction)
 		goto error;
 
@@ -2775,7 +2776,8 @@  static int files_transaction_prepare(struct ref_store *ref_store,
 			 */
 			if (!packed_transaction) {
 				packed_transaction = ref_store_transaction_begin(
-						refs->packed_ref_store, 0, err);
+						refs->packed_ref_store,
+						REF_TRANSACTION_SKIP_HOOK, err);
 				if (!packed_transaction) {
 					ret = TRANSACTION_GENERIC_ERROR;
 					goto cleanup;
@@ -3046,7 +3048,8 @@  static int files_initial_transaction_commit(struct ref_store *ref_store,
 				 &affected_refnames))
 		BUG("initial ref transaction called with existing refs");
 
-	packed_transaction = ref_store_transaction_begin(refs->packed_ref_store, 0, err);
+	packed_transaction = ref_store_transaction_begin(refs->packed_ref_store,
+							 REF_TRANSACTION_SKIP_HOOK, err);
 	if (!packed_transaction) {
 		ret = TRANSACTION_GENERIC_ERROR;
 		goto cleanup;
diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
index f9d3d5213f..4e1e84a91f 100755
--- a/t/t1416-ref-transaction-hooks.sh
+++ b/t/t1416-ref-transaction-hooks.sh
@@ -175,16 +175,11 @@  test_expect_success 'deleting packed ref calls hook once' '
 	git update-ref -d refs/heads/to-be-deleted $POST_OID &&
 
 	# We only expect a single hook invocation, which is the logical
-	# deletion. But currently, we see two interleaving transactions, once
-	# for deleting the loose refs and once for deleting the packed ref.
+	# deletion.
 	cat >expect <<-EOF &&
-		prepared
-		$ZERO_OID $ZERO_OID refs/heads/to-be-deleted
 		prepared
 		$POST_OID $ZERO_OID refs/heads/to-be-deleted
 		committed
-		$ZERO_OID $ZERO_OID refs/heads/to-be-deleted
-		committed
 		$POST_OID $ZERO_OID refs/heads/to-be-deleted
 	EOF