diff mbox series

[3/5] rebase: factor out merge_base calculation

Message ID 019158db9d2dbb371705ba79a96a907e4a17cdb1.1660576283.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series rebase --keep-base: imply --reapply-cherry-picks and --no-fork-point | expand

Commit Message

Phillip Wood Aug. 15, 2022, 3:11 p.m. UTC
From: Phillip Wood <phillip.wood@dunelm.org.uk>

Separate out calculating the merge base between onto and head from the
check for whether we can fast-forward or not. This means we can skip
the fast-forward checks when the rebase is forced and avoid
calculating the merge-base twice when --keep-base is given.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
Note the unnecessary braces around "if (keep_base)" are added here
reduce the churn on the next commit.
---
 builtin/rebase.c | 35 +++++++++++++++++++++++------------
 1 file changed, 23 insertions(+), 12 deletions(-)

Comments

Junio C Hamano Aug. 15, 2022, 5:22 p.m. UTC | #1
"Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> Separate out calculating the merge base between onto and head from the
> check for whether we can fast-forward or not. This means we can skip
> the fast-forward checks when the rebase is forced and avoid
> calculating the merge-base twice when --keep-base is given.
>
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
> Note the unnecessary braces around "if (keep_base)" are added here
> reduce the churn on the next commit.
> ---
>  builtin/rebase.c | 35 +++++++++++++++++++++++------------
>  1 file changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 6cf9c95f4e1..86ea731ca3a 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -871,13 +871,9 @@ static int can_fast_forward(struct commit *onto, struct commit *upstream,
>  	struct commit_list *merge_bases = NULL;
>  	int res = 0;
>  
> -	merge_bases = get_merge_bases(onto, head);
> -	if (!merge_bases || merge_bases->next) {
> -		oidcpy(merge_base, null_oid());
> +	if (is_null_oid(merge_base))
>  		goto done;
> -	}
>  
> -	oidcpy(merge_base, &merge_bases->item->object.oid);
>  	if (!oideq(merge_base, &onto->object.oid))
>  		goto done;

Looking at the change in "git show -W", it seems that this function
no longer touches merge_bases at all, other than initializing it to
NULL at the beginning and then calling free_commit_list() on it at
the end.  Shouldn't it be removed?

> @@ -902,6 +898,20 @@ done:
>  	return res && is_linear_history(onto, head);
>  }
>  
> +static void fill_merge_base(struct rebase_options *options,
> +			    struct object_id *merge_base)
> +{
> +	struct commit_list *merge_bases = NULL;
> +
> +	merge_bases = get_merge_bases(options->onto, options->orig_head);
> +	if (!merge_bases || merge_bases->next)
> +		oidcpy(merge_base, null_oid());
> +	else
> +		oidcpy(merge_base, &merge_bases->item->object.oid);
> +
> +	free_commit_list(merge_bases);
> +}
> +
>  static int parse_opt_am(const struct option *opt, const char *arg, int unset)
>  {
>  	struct rebase_options *opts = opt->value;
> @@ -1668,7 +1678,11 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  			die(_("Does not point to a valid commit '%s'"),
>  				options.onto_name);
>  	}
> -
> +	if (keep_base) {
> +		oidcpy(&merge_base, &options.onto->object.oid);
> +	} else {
> +		fill_merge_base(&options, &merge_base);
> +	}

No need for braces around single-statement block on either side.

This is not a new issue introduced by this patch per-se, but
"merge_base" is becoming less and less accurate description of what
this variable really is.  Perhaps it is a good time to rename it?

It is "the base commit to rebuild the history on top of", aka "onto
commit", isn't it?  We often use merge-base between the upstream and
our tip of the history for it, but the variable often does not even
hold the merge-base in it, not because we cannot compute a single
merge-base but because depending on the operation mode we do not
want to use merge-base for further operation using that variable.
Johannes Schindelin Aug. 16, 2022, 9:15 a.m. UTC | #2
Hi Junio,

On Mon, 15 Aug 2022, Junio C Hamano wrote:

> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Phillip Wood <phillip.wood@dunelm.org.uk>
> >
> > Separate out calculating the merge base between onto and head from the
> > check for whether we can fast-forward or not. This means we can skip
> > the fast-forward checks when the rebase is forced and avoid
> > calculating the merge-base twice when --keep-base is given.
> >
> > Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> > ---
> > Note the unnecessary braces around "if (keep_base)" are added here
> > reduce the churn on the next commit.

This note...

> > @@ -1668,7 +1678,11 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
> >  			die(_("Does not point to a valid commit '%s'"),
> >  				options.onto_name);
> >  	}
> > -
> > +	if (keep_base) {
> > +		oidcpy(&merge_base, &options.onto->object.oid);
> > +	} else {
> > +		fill_merge_base(&options, &merge_base);
> > +	}
>
> No need for braces around single-statement block on either side.

... already addresses this feedback.

Ciao,
Dscho
Phillip Wood Aug. 16, 2022, 1:50 p.m. UTC | #3
Hi Junio

On 15/08/2022 18:22, Junio C Hamano wrote:
> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> Separate out calculating the merge base between onto and head from the
>> check for whether we can fast-forward or not. This means we can skip
>> the fast-forward checks when the rebase is forced and avoid
>> calculating the merge-base twice when --keep-base is given.

I should clarify that this is referring to the merge base of onto and 
upstream.

>>
>> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>> ---
>> Note the unnecessary braces around "if (keep_base)" are added here
>> reduce the churn on the next commit.
>> ---
>>   builtin/rebase.c | 35 +++++++++++++++++++++++------------
>>   1 file changed, 23 insertions(+), 12 deletions(-)
>>
>> diff --git a/builtin/rebase.c b/builtin/rebase.c
>> index 6cf9c95f4e1..86ea731ca3a 100644
>> --- a/builtin/rebase.c
>> +++ b/builtin/rebase.c
>> @@ -871,13 +871,9 @@ static int can_fast_forward(struct commit *onto, struct commit *upstream,
>>   	struct commit_list *merge_bases = NULL;
>>   	int res = 0;
>>   
>> -	merge_bases = get_merge_bases(onto, head);
>> -	if (!merge_bases || merge_bases->next) {
>> -		oidcpy(merge_base, null_oid());
>> +	if (is_null_oid(merge_base))
>>   		goto done;
>> -	}
>>   
>> -	oidcpy(merge_base, &merge_bases->item->object.oid);
>>   	if (!oideq(merge_base, &onto->object.oid))
>>   		goto done;
> 
> Looking at the change in "git show -W", it seems that this function
> no longer touches merge_bases at all, other than initializing it to
> NULL at the beginning and then calling free_commit_list() on it at
> the end.  Shouldn't it be removed?

There is still the line

	merge_bases = get_merge_bases(upstream, head);

lower down. I should remove the call to free_commit_list() just above 
that line though as it is no longer needed.

>> @@ -902,6 +898,20 @@ done:
>>   	return res && is_linear_history(onto, head);
>>   }
>>   
>> +static void fill_merge_base(struct rebase_options *options,
>> +			    struct object_id *merge_base)
>> +{
>> +	struct commit_list *merge_bases = NULL;
>> +
>> +	merge_bases = get_merge_bases(options->onto, options->orig_head);
>> +	if (!merge_bases || merge_bases->next)
>> +		oidcpy(merge_base, null_oid());
>> +	else
>> +		oidcpy(merge_base, &merge_bases->item->object.oid);
>> +
>> +	free_commit_list(merge_bases);
>> +}
>> +
>>   static int parse_opt_am(const struct option *opt, const char *arg, int unset)
>>   {
>>   	struct rebase_options *opts = opt->value;
>> @@ -1668,7 +1678,11 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>>   			die(_("Does not point to a valid commit '%s'"),
>>   				options.onto_name);
>>   	}
>> -
>> +	if (keep_base) {
>> +		oidcpy(&merge_base, &options.onto->object.oid);

This is actually unnecessary as we do

	options.onto = lookup_commit_reference_by_name(options.onto_name);

above when calculating onto.

>> +	} else {
>> +		fill_merge_base(&options, &merge_base);
>> +	}
> 
> No need for braces around single-statement block on either side.
> 
> This is not a new issue introduced by this patch per-se, but
> "merge_base" is becoming less and less accurate description of what
> this variable really is.  Perhaps it is a good time to rename it?

Yes, merge_base is not a very descriptive name as it prompts the 
question "merge base of which commits". I think base_commit or 
branch_base would be better.

> It is "the base commit to rebuild the history on top of", aka "onto
> commit", isn't it?

I think it is the base commit of the branch i.e. we rebase all the 
commits in the range merge_base..branch onto the "onto commit".

> We often use merge-base between the upstream and
> our tip of the history for it,

I'm pretty sure it is always the merge-base of the "onto commit" and our 
tip of the history. When using --keep-base we calculate the "onto 
commit" as the merge base of upstream and our tip of the history which 
makes it look were using that for merge_base but that commit is also the 
merge-base of the "onto commit" and our tip of history.

Best Wishes

Phillip

> but the variable often does not even
> hold the merge-base in it, not because we cannot compute a single
> merge-base but because depending on the operation mode we do not
> want to use merge-base for further operation using that variable.
Junio C Hamano Aug. 16, 2022, 3 p.m. UTC | #4
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Junio,
>
> On Mon, 15 Aug 2022, Junio C Hamano wrote:
>
>> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>> > From: Phillip Wood <phillip.wood@dunelm.org.uk>
>> >
>> > Separate out calculating the merge base between onto and head from the
>> > check for whether we can fast-forward or not. This means we can skip
>> > the fast-forward checks when the rebase is forced and avoid
>> > calculating the merge-base twice when --keep-base is given.
>> >
>> > Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>> > ---
>> > Note the unnecessary braces around "if (keep_base)" are added here
>> > reduce the churn on the next commit.
>
> This note...
>
>> > @@ -1668,7 +1678,11 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>> >  			die(_("Does not point to a valid commit '%s'"),
>> >  				options.onto_name);
>> >  	}
>> > -
>> > +	if (keep_base) {
>> > +		oidcpy(&merge_base, &options.onto->object.oid);
>> > +	} else {
>> > +		fill_merge_base(&options, &merge_base);
>> > +	}
>>
>> No need for braces around single-statement block on either side.
>
> ... already addresses this feedback.

Yeah, but the point is instead of wasting readers' bandwidth with an
additional note, the series can add braces in the step where they
become necessary, i.e. the later step where there is a new statement
in the body of the "if-true" side.
Junio C Hamano Aug. 16, 2022, 3:03 p.m. UTC | #5
Phillip Wood <phillip.wood123@gmail.com> writes:

>>>   -	merge_bases = get_merge_bases(onto, head);
>>> -	if (!merge_bases || merge_bases->next) {
>>> -		oidcpy(merge_base, null_oid());
>>> +	if (is_null_oid(merge_base))
>>>   		goto done;
>>> -	}
>>>   -	oidcpy(merge_base, &merge_bases->item->object.oid);
>>>   	if (!oideq(merge_base, &onto->object.oid))
>>>   		goto done;
>> Looking at the change in "git show -W", it seems that this function
>> no longer touches merge_bases at all, other than initializing it to
>> NULL at the beginning and then calling free_commit_list() on it at
>> the end.  Shouldn't it be removed?
>
> There is still the line
>
> 	merge_bases = get_merge_bases(upstream, head);
>
> lower down. I should remove the call to free_commit_list() just above
> that line though as it is no longer needed.

Yup, that is correct.

Thanks.
Elijah Newren Aug. 18, 2022, 7:11 a.m. UTC | #6
On Mon, Aug 15, 2022 at 8:14 AM Phillip Wood via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> Separate out calculating the merge base between onto and head from the
> check for whether we can fast-forward or not. This means we can skip
> the fast-forward checks when the rebase is forced and avoid
> calculating the merge-base twice when --keep-base is given.
>
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
> Note the unnecessary braces around "if (keep_base)" are added here
> reduce the churn on the next commit.
> ---
>  builtin/rebase.c | 35 +++++++++++++++++++++++------------
>  1 file changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 6cf9c95f4e1..86ea731ca3a 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -871,13 +871,9 @@ static int can_fast_forward(struct commit *onto, struct commit *upstream,
>         struct commit_list *merge_bases = NULL;
>         int res = 0;
>
> -       merge_bases = get_merge_bases(onto, head);
> -       if (!merge_bases || merge_bases->next) {
> -               oidcpy(merge_base, null_oid());
> +       if (is_null_oid(merge_base))
>                 goto done;
> -       }
>
> -       oidcpy(merge_base, &merge_bases->item->object.oid);
>         if (!oideq(merge_base, &onto->object.oid))
>                 goto done;
>
> @@ -902,6 +898,20 @@ done:
>         return res && is_linear_history(onto, head);
>  }
>
> +static void fill_merge_base(struct rebase_options *options,
> +                           struct object_id *merge_base)
> +{
> +       struct commit_list *merge_bases = NULL;
> +
> +       merge_bases = get_merge_bases(options->onto, options->orig_head);
> +       if (!merge_bases || merge_bases->next)
> +               oidcpy(merge_base, null_oid());
> +       else
> +               oidcpy(merge_base, &merge_bases->item->object.oid);
> +
> +       free_commit_list(merge_bases);
> +}
> +
>  static int parse_opt_am(const struct option *opt, const char *arg, int unset)
>  {
>         struct rebase_options *opts = opt->value;
> @@ -1668,7 +1678,11 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>                         die(_("Does not point to a valid commit '%s'"),
>                                 options.onto_name);
>         }
> -
> +       if (keep_base) {
> +               oidcpy(&merge_base, &options.onto->object.oid);
> +       } else {
> +               fill_merge_base(&options, &merge_base);
> +       }
>         if (options.fork_point > 0)
>                 options.restrict_revision =
>                         get_fork_point(options.upstream_name, options.orig_head);
> @@ -1697,13 +1711,10 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>          * Check if we are already based on onto with linear history,
>          * in which case we could fast-forward without replacing the commits
>          * with new commits recreated by replaying their changes.
> -        *
> -        * Note that can_fast_forward() initializes merge_base, so we have to
> -        * call it before checking allow_preemptive_ff.
>          */
> -       if (can_fast_forward(options.onto, options.upstream, options.restrict_revision,
> -                   options.orig_head, &merge_base) &&
> -           allow_preemptive_ff) {
> +       if (allow_preemptive_ff &&
> +           can_fast_forward(options.onto, options.upstream, options.restrict_revision,
> +                            options.orig_head, &merge_base)) {

I didn't catch anything new in my review of this patch, but I just
really wanted to say how happy this final hunk makes me.  I hated that
can_fast_forward() had to be called first; thanks for fixing that.


>                 int flag;
>
>                 if (!(options.flags & REBASE_FORCE)) {
> --
> gitgitgadget
Jonathan Tan Aug. 24, 2022, 10:02 p.m. UTC | #7
"Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
> @@ -1668,7 +1678,11 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  			die(_("Does not point to a valid commit '%s'"),
>  				options.onto_name);
>  	}
> -
> +	if (keep_base) {
> +		oidcpy(&merge_base, &options.onto->object.oid);
> +	} else {
> +		fill_merge_base(&options, &merge_base);
> +	}
>  	if (options.fork_point > 0)
>  		options.restrict_revision =
>  			get_fork_point(options.upstream_name, options.orig_head);

This patch makes sense, except for this part: why is fill_merge_base()
only called for non-keep_base, but it seemed to be unconditionally
called before? For what it's worth, all tests pass even with this diff:

  -       if (keep_base) {
  -               oidcpy(&merge_base, &options.onto->object.oid);
  -       } else {
  -               fill_merge_base(&options, &merge_base);
  -       }
  +       fill_merge_base(&options, &merge_base);
Phillip Wood Aug. 30, 2022, 3:03 p.m. UTC | #8
Hi Jonathan

Thanks for taking a look at this series

On 24/08/2022 23:02, Jonathan Tan wrote:
> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> @@ -1668,7 +1678,11 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>>   			die(_("Does not point to a valid commit '%s'"),
>>   				options.onto_name);
>>   	}
>> -
>> +	if (keep_base) {
>> +		oidcpy(&merge_base, &options.onto->object.oid);
>> +	} else {
>> +		fill_merge_base(&options, &merge_base);
>> +	}
>>   	if (options.fork_point > 0)
>>   		options.restrict_revision =
>>   			get_fork_point(options.upstream_name, options.orig_head);
> 
> This patch makes sense, except for this part: why is fill_merge_base()
> only called for non-keep_base, but it seemed to be unconditionally
> called before? For what it's worth, all tests pass even with this diff:

It's an optimization, we have already calculated the merge base above in 
the "onto" calculation when --keep-base is given. This is what I meant 
by "avoid calculating the merge-base twice when --keep-base is given" in 
the commit message, maybe I should  try and come up with some clearer 
wording.

Best Wishes

Phillip


>    -       if (keep_base) {
>    -               oidcpy(&merge_base, &options.onto->object.oid);
>    -       } else {
>    -               fill_merge_base(&options, &merge_base);
>    -       }
>    +       fill_merge_base(&options, &merge_base);
diff mbox series

Patch

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 6cf9c95f4e1..86ea731ca3a 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -871,13 +871,9 @@  static int can_fast_forward(struct commit *onto, struct commit *upstream,
 	struct commit_list *merge_bases = NULL;
 	int res = 0;
 
-	merge_bases = get_merge_bases(onto, head);
-	if (!merge_bases || merge_bases->next) {
-		oidcpy(merge_base, null_oid());
+	if (is_null_oid(merge_base))
 		goto done;
-	}
 
-	oidcpy(merge_base, &merge_bases->item->object.oid);
 	if (!oideq(merge_base, &onto->object.oid))
 		goto done;
 
@@ -902,6 +898,20 @@  done:
 	return res && is_linear_history(onto, head);
 }
 
+static void fill_merge_base(struct rebase_options *options,
+			    struct object_id *merge_base)
+{
+	struct commit_list *merge_bases = NULL;
+
+	merge_bases = get_merge_bases(options->onto, options->orig_head);
+	if (!merge_bases || merge_bases->next)
+		oidcpy(merge_base, null_oid());
+	else
+		oidcpy(merge_base, &merge_bases->item->object.oid);
+
+	free_commit_list(merge_bases);
+}
+
 static int parse_opt_am(const struct option *opt, const char *arg, int unset)
 {
 	struct rebase_options *opts = opt->value;
@@ -1668,7 +1678,11 @@  int cmd_rebase(int argc, const char **argv, const char *prefix)
 			die(_("Does not point to a valid commit '%s'"),
 				options.onto_name);
 	}
-
+	if (keep_base) {
+		oidcpy(&merge_base, &options.onto->object.oid);
+	} else {
+		fill_merge_base(&options, &merge_base);
+	}
 	if (options.fork_point > 0)
 		options.restrict_revision =
 			get_fork_point(options.upstream_name, options.orig_head);
@@ -1697,13 +1711,10 @@  int cmd_rebase(int argc, const char **argv, const char *prefix)
 	 * Check if we are already based on onto with linear history,
 	 * in which case we could fast-forward without replacing the commits
 	 * with new commits recreated by replaying their changes.
-	 *
-	 * Note that can_fast_forward() initializes merge_base, so we have to
-	 * call it before checking allow_preemptive_ff.
 	 */
-	if (can_fast_forward(options.onto, options.upstream, options.restrict_revision,
-		    options.orig_head, &merge_base) &&
-	    allow_preemptive_ff) {
+	if (allow_preemptive_ff &&
+	    can_fast_forward(options.onto, options.upstream, options.restrict_revision,
+			     options.orig_head, &merge_base)) {
 		int flag;
 
 		if (!(options.flags & REBASE_FORCE)) {