mbox series

[v2,0/7] fetch: improve atomicity of `--atomic` flag

Message ID cover.1645102965.git.ps@pks.im (mailing list archive)
Headers show
Series fetch: improve atomicity of `--atomic` flag | expand

Message

Patrick Steinhardt Feb. 17, 2022, 1:04 p.m. UTC
Hi,

in c7b190dabd (fetch: implement support for atomic reference updates,
2021-01-12), I have added a new `--atomic` flag to git-fetch(1) which is
supposed to make it an all-or-nothing operation: either all refs are
updated, or none is. I have recently discovered though that there were
two oversights: neither pruning of refs via `--prune` nor the tag
backfill mechanism are currently covered by this flag, which breaks the
promise.

This is the second version of my patch series which fixes these
problems. Due to a semantic conflict with
ps/avoid-unnecessary-hook-invocation-with-packed-refs and a textual
conflict with c9e04d905e (fetch --prune: exit with error if pruning
fails, 2022-01-31) it now applies on top of the former branch merged
with 45fe28c951 (The fourth batch, 2022-02-16).

Changes compared to v1:

    - Commit messages have been improved.

    - A code comment has been fixed to not talk about the past anymore,
      but instead state what the section of code is supposed to do.

    - I have introduced a new patch 5/7 which provides an interface to
      access queued updates in reference transactions without requiring
      access to "refs/refs-internal.h".

    - Removed unnecessary shebangs in tests when using `write_script`.

    - Rebased the series to fix a conflict with c9e04d905e, mentioned
      above.

Thanks for all the feedback!

Patrick

Patrick Steinhardt (7):
  fetch: increase test coverage of fetches
  fetch: backfill tags before setting upstream
  fetch: control lifecycle of FETCH_HEAD in a single place
  fetch: report errors when backfilling tags fails
  refs: add interface to iterate over queued transactional updates
  fetch: make `--atomic` flag cover backfilling of tags
  fetch: make `--atomic` flag cover pruning of refs

 builtin/fetch.c      | 192 +++++++++++++++++++++++++++++--------------
 refs.c               |  16 ++++
 refs.h               |  14 ++++
 t/t5503-tagfollow.sh |  74 +++++++++++++++++
 t/t5510-fetch.sh     |  29 +++++++
 5 files changed, 263 insertions(+), 62 deletions(-)

Range-diff against v1:
1:  e133c14f56 ! 1:  b4ca3f1f3b fetch: increase test coverage of fetches
    @@ Metadata
      ## Commit message ##
         fetch: increase test coverage of fetches
     
    -    The `--atomic` flag is missing test coverage for pruning of deleted
    -    references and backfilling of tags, and of course both aren't covered
    -    correctly by this flag. Furthermore, we don't have tests demonstrating
    -    error cases for backfilling tags.
    +    When using git-fetch(1) with the `--atomic` flag the expectation is that
    +    either all of the references are updated, or alternatively none are in
    +    case the fetch fails. While we already have tests for this, we do not
    +    have any tests which exercise atomicity either when pruning deleted refs
    +    or when backfilling tags. This gap in test coverage hides that we indeed
    +    don't handle atomicity correctly for both of these cases.
     
    -    Add tests to cover those testing gaps.
    +    Add test cases which cover these testing gaps to demonstrate the broken
    +    behaviour. Note that tests are not marked as `test_expect_failure`: this
    +    is done to explicitly demonstrate the current known-wrong behaviour, and
    +    they will be fixed up as soon as we fix the underlying bugs.
    +
    +    While at it this commit also adds another test case which demonstrates
    +    that backfilling of tags does not return an error code in case the
    +    backfill fails. This bug will also be fixed by a subsequent commit.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    @@ t/t5503-tagfollow.sh: test_expect_success 'new clone fetch main and tags' '
     +	#
     +	# To trigger failure we simply abort when backfilling a tag.
     +	write_script clone3/.git/hooks/reference-transaction <<-\EOF &&
    -+		#!/bin/sh
    -+
     +		while read oldrev newrev reference
     +		do
     +			if test "$reference" = refs/tags/tag1
    @@ t/t5503-tagfollow.sh: test_expect_success 'new clone fetch main and tags' '
     +	test_must_fail git -C clone3 fetch --atomic .. $B:refs/heads/something &&
     +
     +	# Creation of the tag has failed, so ideally refs/heads/something
    -+	# should not exist. The fact that it does is demonstrates that there is
    -+	# missing coverage in the `--atomic` flag.
    ++	# should not exist. The fact that it does demonstrates that there is
    ++	# a bug in the `--atomic` flag.
     +	test $B = "$(git -C clone3 rev-parse --verify refs/heads/something)"
     +'
     +
    @@ t/t5503-tagfollow.sh: test_expect_success 'new clone fetch main and tags' '
     +	git init clone5 &&
     +
     +	write_script clone5/.git/hooks/reference-transaction <<-EOF &&
    -+		#!/bin/sh
    -+
     +		while read oldrev newrev reference
     +		do
     +			if test "\$reference" = refs/tags/tag1
2:  64c94e7a28 ! 2:  b0a067bbc1 fetch: backfill tags before setting upstream
    @@ Commit message
             4. We backfill tags pointing into the history we have just fetched.
     
         It is quite confusing that we fetch objects and update references in
    -    both (2) and (4), which is further stressed by the point that we require
    -    a `skip` label which jumps from (3) to (4) in case we fail to update the
    +    both (2) and (4), which is further stressed by the point that we use a
    +    `skip` goto label to jump from (3) to (4) in case we fail to update the
         gitconfig as expected.
     
         Reorder the code to first update all local references, and only after we
    @@ builtin/fetch.c: static void backfill_tags(struct transport *transport, struct r
      	int retcode = 0;
      	const struct ref *remote_refs;
     @@ builtin/fetch.c: static int do_fetch(struct transport *transport,
    - 		}
    + 			retcode = 1;
      	}
      	if (fetch_and_consume_refs(transport, ref_map, worktrees)) {
     -		free_refs(ref_map);
    @@ builtin/fetch.c: static int do_fetch(struct transport *transport,
      		goto cleanup;
      	}
      
    -+	/* if neither --no-tags nor --tags was specified, do automated tag
    -+	 * following ... */
    ++	/*
    ++	 * If neither --no-tags nor --tags was specified, do automated tag
    ++	 * following.
    ++	 */
     +	if (tags == TAGS_DEFAULT && autotags) {
     +		struct ref *tags_ref_map = NULL, **tail = &tags_ref_map;
     +
3:  4059d5034b ! 3:  0b9d04622d fetch: control lifecycle of FETCH_HEAD in a single place
    @@ Commit message
         which is essentially duplicate work.
     
         Improve the lifecycle of updating FETCH_HEAD by opening and committing
    -    it once in `do_fetch()`, where we pass the structure down to code which
    -    wants to append to it.
    +    it once in `do_fetch()`, where we pass the structure down to the code
    +    which wants to append to it.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    @@ builtin/fetch.c: static int do_fetch(struct transport *transport,
      		transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1");
      	if (prune) {
     @@ builtin/fetch.c: static int do_fetch(struct transport *transport,
    - 				   transport->url);
    - 		}
    + 		if (retcode != 0)
    + 			retcode = 1;
      	}
     -	if (fetch_and_consume_refs(transport, ref_map, worktrees)) {
     +
4:  54fdee845b ! 4:  bc1e396ae0 fetch: report errors when backfilling tags fails
    @@ builtin/fetch.c: static int do_fetch(struct transport *transport,
     -			backfill_tags(transport, tags_ref_map, &fetch_head, worktrees);
     +		if (tags_ref_map) {
     +			/*
    -+			 * If backfilling tags succeeds we used to not return
    -+			 * an error code to the user at all. Instead, we
    -+			 * silently swallowed that error and updated the local
    -+			 * state of the repository. We now notify the user of
    -+			 * any such errors, but we continue to make sure that
    -+			 * FETCH_HEAD and the upstream branch are configured as
    -+			 * expected.
    ++			 * If backfilling of tags fails then we want to tell
    ++			 * the user so, but we have to continue regardless to
    ++			 * populate upstream information of the references we
    ++			 * have already fetched above.
     +			 */
     +			if (backfill_tags(transport, tags_ref_map, &fetch_head, worktrees))
     +				retcode = 1;
-:  ---------- > 5:  fac2e3555a refs: add interface to iterate over queued transactional updates
5:  55dbe19a1a ! 6:  331ee40e57 fetch: make `--atomic` flag cover backfilling of tags
    @@ Commit message
         Note that we also have to start passing the transaction into
         `find_non_local_tags()`: this function is responsible for finding all
         tags which we need to backfill. Right now, it will happily return tags
    -    which we have already been updated with our local references. But when
    -    we use a single transaction for both local references and backfilling
    -    then it may happen that we try to queue the same reference update twice
    -    to the transaction, which consequentially triggers a bug. We thus have
    -    to skip over any tags which have already been queued. Unfortunately,
    -    this requires us to reach into internals of the reference transaction to
    -    access queued updates, but there is no non-internal interface right now
    -    which would allow us to access this information.
    +    which have already been updated with our local references. But when we
    +    use a single transaction for both local references and backfilling then
    +    it may happen that we try to queue the same reference update twice to
    +    the transaction, which consequently triggers a bug. We thus have to skip
    +    over any tags which have already been queued.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## builtin/fetch.c ##
    -@@
    - #include "cache.h"
    - #include "config.h"
    - #include "repository.h"
    --#include "refs.h"
    -+#include "refs/refs-internal.h"
    - #include "refspec.h"
    - #include "object-store.h"
    - #include "oidset.h"
     @@ builtin/fetch.c: static void clear_item(struct refname_hash_entry *item)
    + 	item->ignore = 1;
      }
      
    ++
    ++static void add_already_queued_tags(const char *refname,
    ++				    const struct object_id *old_oid,
    ++				    const struct object_id *new_oid,
    ++				    void *cb_data)
    ++{
    ++	struct hashmap *queued_tags = cb_data;
    ++	if (starts_with(refname, "refs/tags/") && new_oid)
    ++		(void) refname_hash_add(queued_tags, refname, new_oid);
    ++}
    ++
      static void find_non_local_tags(const struct ref *refs,
     +				struct ref_transaction *transaction,
      				struct ref **head,
      				struct ref ***tail)
      {
     @@ builtin/fetch.c: static void find_non_local_tags(const struct ref *refs,
    - 	const struct ref *ref;
    - 	struct refname_hash_entry *item = NULL;
    - 	const int quick_flags = OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT;
    -+	int i;
    - 
    - 	refname_hash_init(&existing_refs);
    - 	refname_hash_init(&remote_refs);
      	create_fetch_oidset(head, &fetch_oids);
      
      	for_each_ref(add_one_refname, &existing_refs);
    @@ builtin/fetch.c: static void find_non_local_tags(const struct ref *refs,
     +	 * If we already have a transaction, then we need to filter out all
     +	 * tags which have already been queued up.
     +	 */
    -+	for (i = 0; transaction && i < transaction->nr; i++) {
    -+		if (!starts_with(transaction->updates[i]->refname, "refs/tags/") ||
    -+		    !(transaction->updates[i]->flags & REF_HAVE_NEW))
    -+			continue;
    -+		(void) refname_hash_add(&existing_refs,
    -+					transaction->updates[i]->refname,
    -+					&transaction->updates[i]->new_oid);
    -+	}
    -+
    ++	if (transaction)
    ++		ref_transaction_for_each_queued_update(transaction,
    ++						       add_already_queued_tags,
    ++						       &existing_refs);
     +
      	for (ref = refs; ref; ref = ref->next) {
      		if (!starts_with(ref->name, "refs/tags/"))
    @@ builtin/fetch.c: static int do_fetch(struct transport *transport,
      		transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1");
      	if (prune) {
     @@ builtin/fetch.c: static int do_fetch(struct transport *transport,
    - 		}
    + 			retcode = 1;
      	}
      
     -	if (fetch_and_consume_refs(transport, ref_map, &fetch_head, worktrees)) {
    @@ builtin/fetch.c: static int do_fetch(struct transport *transport,
     +		find_non_local_tags(remote_refs, transaction, &tags_ref_map, &tail);
      		if (tags_ref_map) {
      			/*
    - 			 * If backfilling tags succeeds we used to not return
    -@@ builtin/fetch.c: static int do_fetch(struct transport *transport,
    - 			 * state of the repository. We now notify the user of
    - 			 * any such errors, but we continue to make sure that
    - 			 * FETCH_HEAD and the upstream branch are configured as
    --			 * expected.
    -+			 * expected. The exception though is when `--atomic`
    -+			 * is passed: in that case we'll abort the transaction
    -+			 * and don't commit anything.
    + 			 * If backfilling of tags fails then we want to tell
    + 			 * the user so, but we have to continue regardless to
    + 			 * populate upstream information of the references we
    +-			 * have already fetched above.
    ++			 * have already fetched above. The exception though is
    ++			 * when `--atomic` is passed: in that case we'll abort
    ++			 * the transaction and don't commit anything.
      			 */
     -			if (backfill_tags(transport, tags_ref_map, &fetch_head, worktrees))
     +			if (backfill_tags(transport, transaction, tags_ref_map,
    @@ t/t5503-tagfollow.sh: test_expect_success 'atomic fetch with failing backfill' '
      	test_must_fail git -C clone3 fetch --atomic .. $B:refs/heads/something &&
     -
     -	# Creation of the tag has failed, so ideally refs/heads/something
    --	# should not exist. The fact that it does is demonstrates that there is
    --	# missing coverage in the `--atomic` flag.
    +-	# should not exist. The fact that it does demonstrates that there is
    +-	# a bug in the `--atomic` flag.
     -	test $B = "$(git -C clone3 rev-parse --verify refs/heads/something)"
     +	test_must_fail git -C clone3 rev-parse --verify refs/heads/something &&
     +	test_must_fail git -C clone3 rev-parse --verify refs/tags/tag2
6:  682f16117b ! 7:  2ad16530e5 fetch: make `--atomic` flag cover pruning of refs
    @@ builtin/fetch.c: static int do_fetch(struct transport *transport,
      		 * don't care whether --tags was specified.
      		 */
      		if (rs->nr) {
    --			prune_refs(rs, ref_map, transport->url);
    -+			prune_refs(rs, transaction, ref_map, transport->url);
    +-			retcode = prune_refs(rs, ref_map, transport->url);
    ++			retcode = prune_refs(rs, transaction, ref_map, transport->url);
      		} else {
    - 			prune_refs(&transport->remote->fetch,
    --				   ref_map,
    -+				   transaction, ref_map,
    - 				   transport->url);
    + 			retcode = prune_refs(&transport->remote->fetch,
    +-					     ref_map,
    ++					     transaction, ref_map,
    + 					     transport->url);
      		}
    - 	}
    + 		if (retcode != 0)
     
      ## t/t5510-fetch.sh ##
     @@ t/t5510-fetch.sh: test_expect_success 'fetch --atomic --prune executes a single reference transact
    + 	head_oid=$(git rev-parse HEAD) &&
    + 
    + 	# Fetching with the `--atomic` flag should update all references in a
    +-	# single transaction. It is currently missing coverage of pruned
    +-	# references though, and as a result those may be committed to disk
    +-	# even if updating references fails later.
    ++	# single transaction.
      	cat >expected <<-EOF &&
      		prepared
      		$ZERO_OID $ZERO_OID refs/remotes/origin/scheduled-for-deletion

Comments

Christian Couder Feb. 17, 2022, 3:50 p.m. UTC | #1
On Thu, Feb 17, 2022 at 2:04 PM Patrick Steinhardt <ps@pks.im> wrote:

> This is the second version of my patch series which fixes these
> problems.

I took another look and found only a very minor improvement in one of
the tests in patch 1/7.

Thanks!
Junio C Hamano Feb. 17, 2022, 10:30 p.m. UTC | #2
Christian Couder <christian.couder@gmail.com> writes:

> On Thu, Feb 17, 2022 at 2:04 PM Patrick Steinhardt <ps@pks.im> wrote:
>
>> This is the second version of my patch series which fixes these
>> problems.
>
> I took another look and found only a very minor improvement in one of
> the tests in patch 1/7.

Yup, the whole series was a pleasant read.

Thanks, both.  Will queue.