Message ID | 0fbf68dbf434986aa971961e20598675b2194d51.1641556319.git.ps@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | refs: excessive hook execution with packed refs | expand |
Patrick Steinhardt <ps@pks.im> writes: > When deleting refs from the loose-files refs backend, then we need to be > careful to also delete the same ref from the packed refs backend, if it > exists. If we don't, then deleting the loose ref would "uncover" the > packed ref. We thus always have to queue up deletions of refs for both > the loose and the packed refs backend. This is done in two separate > transactions, where the end result is that the reference-transaction > hook is executed twice for the deleted refs. > > This behaviour is quite misleading: it's exposing implementation details > of how the files backend works to the user, in contrast to the logical > updates that we'd really want to expose via the hook. Worse yet, whether > the hook gets executed once or twice depends on how well-packed the > repository is: if the ref only exists as a loose ref, then we execute it > once, otherwise if it is also packed then we execute it twice. If the ref only exists as a packed ref, what happens? ... > Fix this behaviour and don't execute the reference-transaction hook at > all when refs in the packed-refs backend if it's driven by the files > backend. ... We try to remove from the loose backend, which would say "nah, it did not exist in my store". I am not sure if it should execute the delete hook in such a case for the ref. But if it does not, not running the hook in the ref transaction for packed backend driven by the loose backend would mean nobody notifies the deletion of the ref, no? To me, it seems that the only case this strategy would work correctly is when the files backend calls deletion hook for a request to delete nonexistent ref, which by itself sounds like a problem. > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > refs/files-backend.c | 9 ++++++--- > t/t1416-ref-transaction-hooks.sh | 7 +------ > 2 files changed, 7 insertions(+), 9 deletions(-) > > diff --git a/refs/files-backend.c b/refs/files-backend.c > index 3c0f3027fe..9a20cb8fa8 100644 > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > @@ -1261,7 +1261,8 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg, > if (packed_refs_lock(refs->packed_ref_store, 0, &err)) > goto error; > > - transaction = ref_store_transaction_begin(refs->packed_ref_store, 0, &err); > + transaction = ref_store_transaction_begin(refs->packed_ref_store, > + REF_TRANSACTION_SKIP_HOOK, &err); > if (!transaction) > goto error; > > @@ -2775,7 +2776,8 @@ static int files_transaction_prepare(struct ref_store *ref_store, > */ > if (!packed_transaction) { > packed_transaction = ref_store_transaction_begin( > - refs->packed_ref_store, 0, err); > + refs->packed_ref_store, > + REF_TRANSACTION_SKIP_HOOK, err); > if (!packed_transaction) { > ret = TRANSACTION_GENERIC_ERROR; > goto cleanup; > @@ -3046,7 +3048,8 @@ static int files_initial_transaction_commit(struct ref_store *ref_store, > &affected_refnames)) > BUG("initial ref transaction called with existing refs"); > > - packed_transaction = ref_store_transaction_begin(refs->packed_ref_store, 0, err); > + packed_transaction = ref_store_transaction_begin(refs->packed_ref_store, > + REF_TRANSACTION_SKIP_HOOK, err); > if (!packed_transaction) { > ret = TRANSACTION_GENERIC_ERROR; > goto cleanup; > diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh > index f9d3d5213f..4e1e84a91f 100755 > --- a/t/t1416-ref-transaction-hooks.sh > +++ b/t/t1416-ref-transaction-hooks.sh > @@ -175,16 +175,11 @@ test_expect_success 'deleting packed ref calls hook once' ' > git update-ref -d refs/heads/to-be-deleted $POST_OID && > > # We only expect a single hook invocation, which is the logical > - # deletion. But currently, we see two interleaving transactions, once > - # for deleting the loose refs and once for deleting the packed ref. > + # deletion. > cat >expect <<-EOF && > - prepared > - $ZERO_OID $ZERO_OID refs/heads/to-be-deleted > prepared > $POST_OID $ZERO_OID refs/heads/to-be-deleted > committed > - $ZERO_OID $ZERO_OID refs/heads/to-be-deleted > - committed > $POST_OID $ZERO_OID refs/heads/to-be-deleted > EOF
On Fri, Jan 07, 2022 at 06:01:04PM -0800, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > When deleting refs from the loose-files refs backend, then we need to be > > careful to also delete the same ref from the packed refs backend, if it > > exists. If we don't, then deleting the loose ref would "uncover" the > > packed ref. We thus always have to queue up deletions of refs for both > > the loose and the packed refs backend. This is done in two separate > > transactions, where the end result is that the reference-transaction > > hook is executed twice for the deleted refs. > > > > This behaviour is quite misleading: it's exposing implementation details > > of how the files backend works to the user, in contrast to the logical > > updates that we'd really want to expose via the hook. Worse yet, whether > > the hook gets executed once or twice depends on how well-packed the > > repository is: if the ref only exists as a loose ref, then we execute it > > once, otherwise if it is also packed then we execute it twice. > > If the ref only exists as a packed ref, what happens? ... > > > Fix this behaviour and don't execute the reference-transaction hook at > > all when refs in the packed-refs backend if it's driven by the files > > backend. > > ... We try to remove from the loose backend, which would say "nah, > it did not exist in my store". I am not sure if it should execute > the delete hook in such a case for the ref. But if it does not, not > running the hook in the ref transaction for packed backend driven by > the loose backend would mean nobody notifies the deletion of the > ref, no? This is shown in the test I've added, "deleting packed ref calls hook once". We create a new reference and pack it such that it doesn't exist as loose ref anymore, but only as a packed one. Updating that ref would've caused us to execute the hook twice before, once via the packed-backend and once via the loose-backend. With my fix we only execute it once via the loose-backend, even if it doesn't currently know it. This works because the loose-backend has to create a lock for the nonexistent reference such that no concurrent call touches the same ref. > To me, it seems that the only case this strategy would work > correctly is when the files backend calls deletion hook for a > request to delete nonexistent ref, which by itself sounds like a > problem. It does so only if the ref exists in either the loose or packed backend though. If trying to update a ref which exists in neither of those, then the reference transaction would fail with an "unable to resolve reference" error in `lock_raw_ref()`. So this should behave as expected: deleting a packed ref calls the hook once, deleting a nonexistent ref fails and doesn't call the hook at all. Patrick > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > > --- > > refs/files-backend.c | 9 ++++++--- > > t/t1416-ref-transaction-hooks.sh | 7 +------ > > 2 files changed, 7 insertions(+), 9 deletions(-) > > > > diff --git a/refs/files-backend.c b/refs/files-backend.c > > index 3c0f3027fe..9a20cb8fa8 100644 > > --- a/refs/files-backend.c > > +++ b/refs/files-backend.c > > @@ -1261,7 +1261,8 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg, > > if (packed_refs_lock(refs->packed_ref_store, 0, &err)) > > goto error; > > > > - transaction = ref_store_transaction_begin(refs->packed_ref_store, 0, &err); > > + transaction = ref_store_transaction_begin(refs->packed_ref_store, > > + REF_TRANSACTION_SKIP_HOOK, &err); > > if (!transaction) > > goto error; > > > > @@ -2775,7 +2776,8 @@ static int files_transaction_prepare(struct ref_store *ref_store, > > */ > > if (!packed_transaction) { > > packed_transaction = ref_store_transaction_begin( > > - refs->packed_ref_store, 0, err); > > + refs->packed_ref_store, > > + REF_TRANSACTION_SKIP_HOOK, err); > > if (!packed_transaction) { > > ret = TRANSACTION_GENERIC_ERROR; > > goto cleanup; > > @@ -3046,7 +3048,8 @@ static int files_initial_transaction_commit(struct ref_store *ref_store, > > &affected_refnames)) > > BUG("initial ref transaction called with existing refs"); > > > > - packed_transaction = ref_store_transaction_begin(refs->packed_ref_store, 0, err); > > + packed_transaction = ref_store_transaction_begin(refs->packed_ref_store, > > + REF_TRANSACTION_SKIP_HOOK, err); > > if (!packed_transaction) { > > ret = TRANSACTION_GENERIC_ERROR; > > goto cleanup; > > diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh > > index f9d3d5213f..4e1e84a91f 100755 > > --- a/t/t1416-ref-transaction-hooks.sh > > +++ b/t/t1416-ref-transaction-hooks.sh > > @@ -175,16 +175,11 @@ test_expect_success 'deleting packed ref calls hook once' ' > > git update-ref -d refs/heads/to-be-deleted $POST_OID && > > > > # We only expect a single hook invocation, which is the logical > > - # deletion. But currently, we see two interleaving transactions, once > > - # for deleting the loose refs and once for deleting the packed ref. > > + # deletion. > > cat >expect <<-EOF && > > - prepared > > - $ZERO_OID $ZERO_OID refs/heads/to-be-deleted > > prepared > > $POST_OID $ZERO_OID refs/heads/to-be-deleted > > committed > > - $ZERO_OID $ZERO_OID refs/heads/to-be-deleted > > - committed > > $POST_OID $ZERO_OID refs/heads/to-be-deleted > > EOF
Patrick Steinhardt <ps@pks.im> writes: > It does so only if the ref exists in either the loose or packed backend > though. If trying to update a ref which exists in neither of those, then > the reference transaction would fail with an "unable to resolve > reference" error in `lock_raw_ref()`. > > So this should behave as expected: deleting a packed ref calls the hook > once, deleting a nonexistent ref fails and doesn't call the hook at all. OK. Your explanation deserves to be in the log message for those who ask questions next time, not lost in the archive in a message just answering my question and explaining it only for me. After all, that is why I ask questions in my reviews. Thanks.
diff --git a/refs/files-backend.c b/refs/files-backend.c index 3c0f3027fe..9a20cb8fa8 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1261,7 +1261,8 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg, if (packed_refs_lock(refs->packed_ref_store, 0, &err)) goto error; - transaction = ref_store_transaction_begin(refs->packed_ref_store, 0, &err); + transaction = ref_store_transaction_begin(refs->packed_ref_store, + REF_TRANSACTION_SKIP_HOOK, &err); if (!transaction) goto error; @@ -2775,7 +2776,8 @@ static int files_transaction_prepare(struct ref_store *ref_store, */ if (!packed_transaction) { packed_transaction = ref_store_transaction_begin( - refs->packed_ref_store, 0, err); + refs->packed_ref_store, + REF_TRANSACTION_SKIP_HOOK, err); if (!packed_transaction) { ret = TRANSACTION_GENERIC_ERROR; goto cleanup; @@ -3046,7 +3048,8 @@ static int files_initial_transaction_commit(struct ref_store *ref_store, &affected_refnames)) BUG("initial ref transaction called with existing refs"); - packed_transaction = ref_store_transaction_begin(refs->packed_ref_store, 0, err); + packed_transaction = ref_store_transaction_begin(refs->packed_ref_store, + REF_TRANSACTION_SKIP_HOOK, err); if (!packed_transaction) { ret = TRANSACTION_GENERIC_ERROR; goto cleanup; diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh index f9d3d5213f..4e1e84a91f 100755 --- a/t/t1416-ref-transaction-hooks.sh +++ b/t/t1416-ref-transaction-hooks.sh @@ -175,16 +175,11 @@ test_expect_success 'deleting packed ref calls hook once' ' git update-ref -d refs/heads/to-be-deleted $POST_OID && # We only expect a single hook invocation, which is the logical - # deletion. But currently, we see two interleaving transactions, once - # for deleting the loose refs and once for deleting the packed ref. + # deletion. cat >expect <<-EOF && - prepared - $ZERO_OID $ZERO_OID refs/heads/to-be-deleted prepared $POST_OID $ZERO_OID refs/heads/to-be-deleted committed - $ZERO_OID $ZERO_OID refs/heads/to-be-deleted - committed $POST_OID $ZERO_OID refs/heads/to-be-deleted EOF
When deleting refs from the loose-files refs backend, then we need to be careful to also delete the same ref from the packed refs backend, if it exists. If we don't, then deleting the loose ref would "uncover" the packed ref. We thus always have to queue up deletions of refs for both the loose and the packed refs backend. This is done in two separate transactions, where the end result is that the reference-transaction hook is executed twice for the deleted refs. This behaviour is quite misleading: it's exposing implementation details of how the files backend works to the user, in contrast to the logical updates that we'd really want to expose via the hook. Worse yet, whether the hook gets executed once or twice depends on how well-packed the repository is: if the ref only exists as a loose ref, then we execute it once, otherwise if it is also packed then we execute it twice. Fix this behaviour and don't execute the reference-transaction hook at all when refs in the packed-refs backend if it's driven by the files backend. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- refs/files-backend.c | 9 ++++++--- t/t1416-ref-transaction-hooks.sh | 7 +------ 2 files changed, 7 insertions(+), 9 deletions(-)