mbox series

[v9,0/3] push: add "--[no-]force-if-includes"

Message ID 20201001082118.19441-1-shrinidhi.kaushik@gmail.com (mailing list archive)
Headers show
Series push: add "--[no-]force-if-includes" | expand

Message

Srinidhi Kaushik Oct. 1, 2020, 8:21 a.m. UTC
Add a new option: "--force-if-includes" to "git-push" where forced
updates are allowed only if the tip of the remote-tracking ref has
been integrated locally, by verifying if the tip of the remote-tracking
ref -- on which a local branch has based on -- is reachable from at
least one of the "reflog" entries of the branch about to be updated
by force on the remote.

This option can be used with "--force-with-lease" with setups where
the remote-tracking refs of the repository are implicitly updated in
the background to help prevent unintended remote overwrites.

If a local branch is based on a remote ref for a rewrite, and if that
remote-tracking ref is updated by a push from another repository after
it has been checked out locally, force updating that branch to remote
with "--force-with-lease[=<refname>[:<expect>]]" without specifying
the "<expect>" value, can cause the update that happened in-between
the checkout and forced push to be lost.

Changes since v8:
  - Disable "commit-graph" when "in_merge_bases_many()" is called
    for this check, because it returns different results depending
    on whether "commit-graph" is enabled [1].

  - Rename the commit list name, remove redundant comments and fix
    some typos.

Srinidhi Kaushik (3):
  push: add reflog check for "--force-if-includes"
  push: parse and set flag for "--force-if-includes"
  t, doc: update tests, reference for "--force-if-includes"

 Documentation/config/advice.txt |   9 +-
 Documentation/config/push.txt   |   6 +
 Documentation/git-push.txt      |  26 ++++-
 advice.c                        |   3 +
 advice.h                        |   2 +
 builtin/push.c                  |  27 +++++
 builtin/send-pack.c             |  12 ++
 remote-curl.c                   |  14 ++-
 remote.c                        | 198 ++++++++++++++++++++++++++++++--
 remote.h                        |  12 +-
 send-pack.c                     |   1 +
 t/t5533-push-cas.sh             | 137 ++++++++++++++++++++++
 transport-helper.c              |  10 ++
 transport.c                     |   8 ++
 transport.h                     |  15 ++-
 15 files changed, 462 insertions(+), 18 deletions(-)

base-commit: 306ee63a703ad67c54ba1209dc11dd9ea500dc1f
[1] https://public-inbox.org/git/xmqqtuvhn6yx.fsf@gitster.c.googlers.com

--
2.28.0

Comments

Junio C Hamano Oct. 1, 2020, 3:46 p.m. UTC | #1
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:

> Changes since v8:
>   - Disable "commit-graph" when "in_merge_bases_many()" is called
>     for this check, because it returns different results depending
>     on whether "commit-graph" is enabled [1].

Is that a wise move, though?  If the "different results" is
expected, then it is a different story, but I would think it is a
bug in commit-graph codepath if it produces a result different from
what the callers expect, and disabling from the caller's end would
mean that we lose one opportunity to help commit-graph folks to go
and fix their bugs, no?

Other than that, I think the topic is in quite a good shape.  Thanks
for working on polishing it.
Junio C Hamano Oct. 1, 2020, 5:12 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:
>
>> Changes since v8:
>>   - Disable "commit-graph" when "in_merge_bases_many()" is called
>>     for this check, because it returns different results depending
>>     on whether "commit-graph" is enabled [1].
>
> Is that a wise move, though?  If the "different results" is
> expected, then it is a different story, but I would think it is a
> bug in commit-graph codepath if it produces a result different from
> what the callers expect, and disabling from the caller's end would
> mean that we lose one opportunity to help commit-graph folks to go
> and fix their bugs, no?
>
> Other than that, I think the topic is in quite a good shape.  Thanks
> for working on polishing it.

In other words, how about doing it like so.

In an ideal world, folks who know more about commit-graph than we do
will find what's broken in in_merge_bases_many() when commit-graph
is in use, before I finish lecturing against hiding a breakage under
the rug.  Let's see if another call for help helps ;-)

 remote.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git i/remote.c w/remote.c
index 98a578f5dc..361b8f1c0e 100644
--- i/remote.c
+++ w/remote.c
@@ -2408,7 +2408,20 @@ static int is_reachable_in_reflog(const char *local, const struct ref *remote)
 /* Toggle the "commit-graph" feature; return the previously set state. */
 static int toggle_commit_graph(struct repository *repo, int disable) {
 	int prev = repo->commit_graph_disabled;
-	repo->commit_graph_disabled = disable;
+	static int should_toggle = -1;
+
+	if (should_toggle < 0) {
+		/*
+		 * The in_merge_bases_many() seems to misbehave when
+		 * the commit-graph feature is in use.  Disable it for
+		 * normal users, but keep it enabled when specifically
+		 * testing the feature.
+		 */
+		should_toggle = !git_env_bool("GIT_TEST_COMMIT_GRAPH", 0);
+	}
+
+	if (should_toggle)
+		repo->commit_graph_disabled = disable;
 	return prev;
 }
Srinidhi Kaushik Oct. 1, 2020, 5:54 p.m. UTC | #3
Hello,

On 10/01/2020 10:12, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:
> >
> >> Changes since v8:
> >>   - Disable "commit-graph" when "in_merge_bases_many()" is called
> >>     for this check, because it returns different results depending
> >>     on whether "commit-graph" is enabled [1].
> >
> > Is that a wise move, though?  If the "different results" is
> > expected, then it is a different story, but I would think it is a
> > bug in commit-graph codepath if it produces a result different from
> > what the callers expect, and disabling from the caller's end would
> > mean that we lose one opportunity to help commit-graph folks to go
> > and fix their bugs, no?

I didn't want want to cause a delay with this patch. Since the new
option was seemingly working without it, I decided to disable it here
and added a "TODO" in the comments to remove "toggle_commit_graph()"
in the future. We can definitely put this on hold and wait until the
with and without "commit-graph" result disparities are clarified.

> > Other than that, I think the topic is in quite a good shape.  Thanks
> > for working on polishing it.

Thanks, I appreciate it!
 
> In other words, how about doing it like so.
> 
> In an ideal world, folks who know more about commit-graph than we do
> will find what's broken in in_merge_bases_many() when commit-graph
> is in use, before I finish lecturing against hiding a breakage under
> the rug.  Let's see if another call for help helps ;-)

:)
 
>  remote.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git i/remote.c w/remote.c
> index 98a578f5dc..361b8f1c0e 100644
> --- i/remote.c
> +++ w/remote.c
> @@ -2408,7 +2408,20 @@ static int is_reachable_in_reflog(const char *local, const struct ref *remote)
>  /* Toggle the "commit-graph" feature; return the previously set state. */
>  static int toggle_commit_graph(struct repository *repo, int disable) {
>  	int prev = repo->commit_graph_disabled;
> -	repo->commit_graph_disabled = disable;
> +	static int should_toggle = -1;
> +
> +	if (should_toggle < 0) {
> +		/*
> +		 * The in_merge_bases_many() seems to misbehave when
> +		 * the commit-graph feature is in use.  Disable it for
> +		 * normal users, but keep it enabled when specifically
> +		 * testing the feature.
> +		 */
> +		should_toggle = !git_env_bool("GIT_TEST_COMMIT_GRAPH", 0);
> +	}
> +
> +	if (should_toggle)
> +		repo->commit_graph_disabled = disable;
>  	return prev;
>  }
>  

Ah, this will keep a record of the behavior in the tests; nice,
will update in the next set.

I looked around a little bit trying to understand what was happening
here. As mentioned before [1], "repo_in_merge_bases_many()" returns
early because of this:

  for (i = 0; i < nr_reference; i++) {
	  if (repo_parse_commit(r, reference[i]))
		  return ret;

	  generation = commit_graph_generation(reference[i]);
	  if (generation < min_generation)
		  min_generation = generation;
	  fprintf(stderr,
		  "[%s]: (local): generation: %u, min_generation: %u\n",
		  oid_to_hex(&reference[i]->object.oid),
		  generation,
		  min_generation);
  }

  generation = commit_graph_generation(commit);
  fprintf(stderr, "[%s]: (remote) generation: %u, min_generation: %u\n",
	  oid_to_hex(&commit->object.oid), generation, min_generation);
  if (generation > min_generation) {
	  return ret;
  }


I made some changes locally to display the generation numbers of all
the commits that were collected during the "reflog" traversal.

In the beginning we get the minimum generation number of the list of
the commits in "reference[]" to see if the "commit" is reachable from
one of the items in "reference[]". Since in this case, the generation
number of "commit" is higher than minimum amongst the members of
"reference", does it mean it cannot be reached?

When "GIT_TEST_COMMIT_GRAPH" is turned off, all of the commits in
"reference" as well as "commit" have "GENERATION_NUMBER_INFINITY",
and the path moves forward to "paint_down_to_common()" and correctly
identifies the reachability.

I ran the test again, but this time with running "git-commit-graph"
right before pushing:

  (a) git commit-graph write --reachable, and the commit's generation
      number was "GENERATION_NUMBER_INFINITY".

  (b) git-show-ref -s | git commit-graph write --stdin-commits, and
      the commit's generation number was 5.

and since generation number of "commit" was always higher than the
minimum -- it returns that it is not reachable from any of "reference".

One of the failing tests in t5533 was modified (for debugging), and
the following shows the difference in behavior of
"repo_in_merge_bases_many()". The test checks if the remote tip is
reachable from the reflog entries of the local branch after
"git pull --rebase" is run.

  $ git log --graph --oneline # (before "pull --rebase")
   * be2465f J
   * 157d38b C
   * f9a2dac B
   * 112d1ac A

  $ git log --graph --oneline # (after "pull --rebase")
   * 7c16010 J
   * b995a30 G
   * 00b6961 F
   * 157d38b C
   * f9a2dac B
   * 112d1ac A

  $ git reflog master
  7c16010 master@{0}: pull --rebase origin master (finish): refs/heads/master onto b995a30227dd14b34b6f7728201aefac8cc12296
  be2465f master@{1}: commit: J
  157d38b master@{2}: commit: C
  f9a2dac master@{3}: commit: B
  112d1ac master@{4}: commit (initial): A

  - (a) is run.
    $ git push --force-if-includes --force-with-lease=master
    [7c16010bad84d8b53873875c2e242890920360f2]: (local):  generation: 4294967295, min_generation: 4294967295
    [be2465f6a155acb8f5eb9ee876bad730e0f656cf]: (local):  generation: 4, min_generation: 4
    [157d38b4112d457e6645c7c4e9a486e6189be435]: (local):  generation: 3, min_generation: 3
    [f9a2dac17e4f8cafaa9655d40cb86c53094da8d6]: (local):  generation: 2, min_generation: 2
    [112d1ac551b908f10b995d7e41456f4cd8f071c5]: (local):  generation: 1, min_generation: 1
    [b995a30227dd14b34b6f7728201aefac8cc12296]: (remote): generation: 4294967295, min_generation: 1
    [...] fail.

  - "git fetch --all" and (b) is run.
    $ git push --force-if-includes --force-with-lease=master
    [7c16010bad84d8b53873875c2e242890920360f2]: (local):  generation: 4294967295, min_generation: 4294967295
    [be2465f6a155acb8f5eb9ee876bad730e0f656cf]: (local):  generation: 4, min_generation: 4
    [157d38b4112d457e6645c7c4e9a486e6189be435]: (local):  generation: 3, min_generation: 3
    [f9a2dac17e4f8cafaa9655d40cb86c53094da8d6]: (local):  generation: 2, min_generation: 2
    [112d1ac551b908f10b995d7e41456f4cd8f071c5]: (local):  generation: 1, min_generation: 1
    [b995a30227dd14b34b6f7728201aefac8cc12296]: (remote): generation: 5, min_generation: 1
    [...] fail.

  - neither (a), nor (b) are run, and "core.commitGraph" is disabled.
    $ git push --force-if-includes --force-with-lease=master
    [7c16010bad84d8b53873875c2e242890920360f2]: (local):  generation: 4294967295, min_generation: 4294967295
    [be2465f6a155acb8f5eb9ee876bad730e0f656cf]: (local):  generation: 4294967295, min_generation: 4294967295
    [157d38b4112d457e6645c7c4e9a486e6189be435]: (local):  generation: 4294967295, min_generation: 4294967295
    [f9a2dac17e4f8cafaa9655d40cb86c53094da8d6]: (local):  generation: 4294967295, min_generation: 4294967295
    [112d1ac551b908f10b995d7e41456f4cd8f071c5]: (local):  generation: 4294967295, min_generation: 4294967295
    [b995a30227dd14b34b6f7728201aefac8cc12296]: (remote): generation: 4294967295, min_generation: 4294967295
    [...] pass.

It looks like G (b995a30) is reachable from J (7c16010), but
"repo_in_merge_bases_many()" disagrees when "GIT_TEST_COMMIT_GRAPH"
is enabled. I hope this information helps a little to understand
why this happens and whether it is a bug or not.

[1] https://public-inbox.org/git/20200928193400.GA88208@mail.clickyotomy.dev

Thanks.
Junio C Hamano Oct. 1, 2020, 6:32 p.m. UTC | #4
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:

> Ah, this will keep a record of the behavior in the tests; nice,
> will update in the next set.

I actually think this kind of "sweep the breakage under the rug"
belongs not in the code that uses the (suspected broken) API but
in the code that is suspected to be broken, i.e. somewhere in
commit.c that is involved in in_merge_bases_many() codepath.

So if you are going to reroll, I would rather see you drop this part
entirely, and add a similar "this code is broken so we won't run in
the real user's environment" to protect all code that call
in_merge_bases_many() from the breakage, not just your new one.

> One of the failing tests in t5533 was modified (for debugging), and
> the following shows the difference in behavior of
> "repo_in_merge_bases_many()". The test checks if the remote tip is
> reachable from the reflog entries of the local branch after
> "git pull --rebase" is run.
> ...
> It looks like G (b995a30) is reachable from J (7c16010), but
> "repo_in_merge_bases_many()" disagrees when "GIT_TEST_COMMIT_GRAPH"
> is enabled. I hope this information helps a little to understand
> why this happens and whether it is a bug or not.

Wonderful.  With this good start, perhaps we may not have to discuss
disabling the commit-graph code in your code or in commit.c for all
callers ;-)
Junio C Hamano Oct. 2, 2020, 4:50 p.m. UTC | #5
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:

> I didn't want want to cause a delay with this patch. Since the new
> option was seemingly working without it,...

It is a good example to help other new contributors to understand an
important point in how the development in common works, so let me
say this.

I did very much wanted to keep the bug exposed at least to the test
suite.  Since the broken helper were designed to be used in many
other places in the code, and we had a simple reproduction recipe in
this topic, using it as an opening to help debug and fix bugs in the
broken helper had higher priority than adding the "--force-if-includes"
feature.

We help the contributors who have been involved in the broken helper
by delaying this topic a bit and leaving the reproduction readily
available to them, so that they help us who are working on a piece
of code that wants to see the broken helper fixed.  

That way everybody benefits.

It's not like a corporate development where your interest lies in
shipping your piece regardless of the work done by other teams,
where it might serve you better by using the second best tool for
the task, to avoid the tool that ought to be best but does not work
well *and* you do not want to help the team that manages that best
tool, even if helping them may benefit the whole organization.

So, let's play well together.  Yield a bit to help others and let
others also help you.

Thanks.
Srinidhi Kaushik Oct. 2, 2020, 7:42 p.m. UTC | #6
Hi Junio,

On 10/02/2020 09:50, Junio C Hamano wrote:
> Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:
> 
> > I didn't want want to cause a delay with this patch. Since the new
> > option was seemingly working without it,...
> 
> It is a good example to help other new contributors to understand an
> important point in how the development in common works, so let me
> say this.
> 
> I did very much wanted to keep the bug exposed at least to the test
> suite.  Since the broken helper were designed to be used in many
> other places in the code, and we had a simple reproduction recipe in
> this topic, using it as an opening to help debug and fix bugs in the
> broken helper had higher priority than adding the "--force-if-includes"
> feature.
> 
> We help the contributors who have been involved in the broken helper
> by delaying this topic a bit and leaving the reproduction readily
> available to them, so that they help us who are working on a piece
> of code that wants to see the broken helper fixed.  
> 
> That way everybody benefits.
> 
> It's not like a corporate development where your interest lies in
> shipping your piece regardless of the work done by other teams,
> where it might serve you better by using the second best tool for
> the task, to avoid the tool that ought to be best but does not work
> well *and* you do not want to help the team that manages that best
> tool, even if helping them may benefit the whole organization.
> 
> So, let's play well together.  Yield a bit to help others and let
> others also help you.
> 
> Thanks.

Thank you for pointing this out. You're right; I should not have
rushed to disabling the feature because it wasn't working with my
patch instead of waiting for the issue to be investigated. This is
valuable advice, and I will keep this in mind when making future
contributions.

Thanks.