diff mbox series

Re* [PATCH v4 0/3] advice: add "all" option to disable all hints

Message ID xmqqjzka7p2t.fsf_-_@gitster.g (mailing list archive)
State New, archived
Headers show
Series Re* [PATCH v4 0/3] advice: add "all" option to disable all hints | expand

Commit Message

Junio C Hamano May 3, 2024, 6 p.m. UTC
Dragan Simic <dsimic@manjaro.org> writes:

> Hello James,
> ...
>> Range-diff against v3:
>> 1:  55d5559586 < -:  ---------- advice: add --no-advice global option
>> -:  ---------- > 1:  ae3f45dadc doc: clean up usage documentation for
>> --no-* opts
>> -:  ---------- > 2:  1b0019026a doc: add spacing around paginate
>> options
>> -:  ---------- > 3:  31e73e6c0e advice: add --no-advice global option
>
> Just a small suggestion...  Perhaps the creation factor needs adjusting
> for the range diff to actually be produced.  To be clear, I don't mind
> that it's missing here.

I see this happen to too many series I see on the list.  There are
cases when the user knows that they are comparing an old and a new
iterations of the same series, e.g. running it from format-patch.
We probably should use a much higher creation factor than default to
run range-diff in such a context.

IOW, this shouldn't have to be done by individual users, but by the
tool.

When I do a post-am check myself, I often use --creation-factor=999
to force matching.

Perhaps something along this line may not be a bad idea.

----- >8 --------- >8 --------- >8 --------- >8 -----
Subject: [PATCH] format-patch: run range-diff with larger creation-factor

We see too often that a range-diff added to format-patch output
shows too many "unmatched" patches.  This is because the default
value for creation-factor is set to a relatively low value.

It may be justified for other uses (like you have a yet-to-be-sent
new iteration of your series, and compare it against the 'seen'
branch that has an older iteration, probably with the '--left-only'
option, to pick out only your patches while ignoring the others) of
"range-diff" command, but when the command is run as part of the
format-patch, the user _knows_ and expects that the patches in the
old and the new iterations roughly correspond to each other, so we
can and should use a much higher default.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/log.c | 2 +-
 range-diff.h  | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

Comments

Eric Sunshine May 3, 2024, 7:26 p.m. UTC | #1
On Fri, May 3, 2024 at 2:00 PM Junio C Hamano <gitster@pobox.com> wrote:
> Dragan Simic <dsimic@manjaro.org> writes:
> > Just a small suggestion...  Perhaps the creation factor needs adjusting
> > for the range diff to actually be produced. [...]
>
> I see this happen to too many series I see on the list.  There are
> cases when the user knows that they are comparing an old and a new
> iterations of the same series, e.g. running it from format-patch.
> We probably should use a much higher creation factor than default to
> run range-diff in such a context.
>
> IOW, this shouldn't have to be done by individual users, but by the
> tool.
>
> Perhaps something along this line may not be a bad idea.
>
> ----- >8 --------- >8 --------- >8 --------- >8 -----
> Subject: [PATCH] format-patch: run range-diff with larger creation-factor
>
> We see too often that a range-diff added to format-patch output
> shows too many "unmatched" patches.  This is because the default
> value for creation-factor is set to a relatively low value.
>
> It may be justified for other uses (like you have a yet-to-be-sent
> new iteration of your series, and compare it against the 'seen'
> branch that has an older iteration, probably with the '--left-only'
> option, to pick out only your patches while ignoring the others) of
> "range-diff" command, but when the command is run as part of the
> format-patch, the user _knows_ and expects that the patches in the
> old and the new iterations roughly correspond to each other, so we
> can and should use a much higher default.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Ævar had posted[1] pretty much the exact same patch a few years ago.
At the time, I had trouble understanding why `git range-diff` and `git
format-patch --range-dif` would need separate default creation
factors[2], and I still have trouble understanding why they should be
different. Aren't both commands addressing the same use-case of
comparing one version of a series against a subsequent version? In
your response[3], you seemed to agree with that observation and
suggested instead simply increasing the default creation factor for
both commands (which sounds reasonable to me).

[1]: https://lore.kernel.org/git/87y35g9l18.fsf@evledraar.gmail.com/
[2]: https://lore.kernel.org/git/CAPig+cRMiEcXVRYrgp+B3tcDreh41-a5_k0zABe+HNce0G=Cyw@mail.gmail.com/
[3]: https://lore.kernel.org/git/xmqqzhps4uyq.fsf@gitster-ct.c.googlers.com/

> ---
> diff --git c/builtin/log.c w/builtin/log.c
> index 4da7399905..7a019476c3 100644
> --- c/builtin/log.c
> +++ w/builtin/log.c
> @@ -2294,7 +2294,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>         if (creation_factor < 0)
> -               creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT;
> +               creation_factor = CREATION_FACTOR_FOR_THE_SAME_SERIES;
>         else if (!rdiff_prev)
>                 die(_("the option '%s' requires '%s'"), "--creation-factor", "--range-diff");
> diff --git c/range-diff.h w/range-diff.h
> index 04ffe217be..2f69f6a434 100644
> --- c/range-diff.h
> +++ w/range-diff.h
> @@ -6,6 +6,12 @@
>  #define RANGE_DIFF_CREATION_FACTOR_DEFAULT 60
>
> +/*
> + * A much higher value than the default, when we KNOW we are comparing
> + * the same series (e.g., used when format-patch calls range-diff).
> + */
> +#define CREATION_FACTOR_FOR_THE_SAME_SERIES 999
Junio C Hamano May 3, 2024, 7:48 p.m. UTC | #2
Eric Sunshine <sunshine@sunshineco.com> writes:

> different. Aren't both commands addressing the same use-case of
> comparing one version of a series against a subsequent version? In
> your response[3], you seemed to agree with that observation and
> suggested instead simply increasing the default creation factor for
> both commands (which sounds reasonable to me).

I think Dscho's use case was compare only single updated series of
his with seen that has tons of irrelevant other patches, to have the
command sift the patches belong to other topics away while making
comparison with the older incarnation of his topic.  I never use the
command for such a comparison, but I can imagine that in such a
context lower criteria may help reduce false matches of his new
commits with unrelated commits on 'seen'.
Junio C Hamano May 3, 2024, 8:08 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> Eric Sunshine <sunshine@sunshineco.com> writes:
>
>> different. Aren't both commands addressing the same use-case of
>> comparing one version of a series against a subsequent version? In
>> your response[3], you seemed to agree with that observation and
>> suggested instead simply increasing the default creation factor for
>> both commands (which sounds reasonable to me).
>
> I think Dscho's use case was compare only single updated series of
> his with seen that has tons of irrelevant other patches, to have the
> command sift the patches belong to other topics away while making
> comparison with the older incarnation of his topic.  I never use the
> command for such a comparison, but I can imagine that in such a
> context lower criteria may help reduce false matches of his new
> commits with unrelated commits on 'seen'.

It seems that Dscho was in agreement that format-patch's use case
should try to be more aggressive at least back then.  In the message
in the thread you pointed

 https://lore.kernel.org/git/nycvar.QRO.7.76.6.1903211209280.41@tvgsbejvaqbjf.bet/

he does not give us the exact reason why he does not think the "more
aggressive" mode is not suitable for other use cases, though.

A similar thread was raised more recently:

 https://lore.kernel.org/git/rq6919s9-qspp-rn6o-n704-r0400q10747r@tzk.qr/

Also, there was an attempt to clarify what the "factor" really
meant, but we did not get a real conclusion other than the UNIT to
measure the "factor" is called "percent" (without specifying what
100% is relative to, "percent" does not mean much to guide users
to guess what the right value would be).

 https://lore.kernel.org/git/85snn12q-po05-osqs-n1o0-n6040392q01q@tzk.qr/

So, yes, --creation-factor=<value> is messy, I think a very high
value, much higher than the hardcoded default of 60, is more
appropriate for use in format-patch, but we do not know what bad
effect it would have if we used much higher default everywhere.
Eric Sunshine May 3, 2024, 9:24 p.m. UTC | #4
On Fri, May 3, 2024 at 4:08 PM Junio C Hamano <gitster@pobox.com> wrote:
> It seems that Dscho was in agreement that format-patch's use case
> should try to be more aggressive at least back then.  In the message
> in the thread you pointed
>
>  https://lore.kernel.org/git/nycvar.QRO.7.76.6.1903211209280.41@tvgsbejvaqbjf.bet/
>
> he does not give us the exact reason why he does not think the "more
> aggressive" mode is not suitable for other use cases, though.

Having an answer to that question could be helpful.

> A similar thread was raised more recently:
>
>  https://lore.kernel.org/git/rq6919s9-qspp-rn6o-n704-r0400q10747r@tzk.qr/

I think I missed this thread.

> Also, there was an attempt to clarify what the "factor" really
> meant, but we did not get a real conclusion other than the UNIT to
> measure the "factor" is called "percent" (without specifying what
> 100% is relative to, "percent" does not mean much to guide users
> to guess what the right value would be).
>
>  https://lore.kernel.org/git/85snn12q-po05-osqs-n1o0-n6040392q01q@tzk.qr/

I recall this discussion, as well as its followup (in which I
proclaimed my continuing lack of understanding of what creation-factor
really represents):

https://lore.kernel.org/git/CAPig+cRp4N=6EktoisKAH09aVAPkPgZfHJYcB5pJFJ-CUpBHFA@mail.gmail.com/

> So, yes, --creation-factor=<value> is messy, I think a very high
> value, much higher than the hardcoded default of 60, is more
> appropriate for use in format-patch, but we do not know what bad
> effect it would have if we used much higher default everywhere.

At this point in time, I'm not particularly against giving `git
format-patch` its own default creation-factor.

My only worry (and perhaps it's very minor) is that separate default
creation-factors for `git range-diff` and `git format-patch
--range-diff` could catch the unwary off-guard when invoking `git
range-diff` to manually check the difference between an old an new
version of a series before finally invoking `git format-patch
--range-diff` to actually create the patches for sending. (I've done
this myself on a very few occasions, but perhaps not often enough to
warrant the concern(?).)
Johannes Schindelin May 5, 2024, 11:03 a.m. UTC | #5
Hi,

On Fri, 3 May 2024, Eric Sunshine wrote:

> On Fri, May 3, 2024 at 4:08 PM Junio C Hamano <gitster@pobox.com> wrote:
> > It seems that Dscho was in agreement that format-patch's use case
> > should try to be more aggressive at least back then.  In the message
> > in the thread you pointed
> >
> >  https://lore.kernel.org/git/nycvar.QRO.7.76.6.1903211209280.41@tvgsbejvaqbjf.bet/
> >
> > he does not give us the exact reason why he does not think the "more
> > aggressive" mode is not suitable for other use cases, though.
>
> Having an answer to that question could be helpful.

Sure.

The `creation factor` essentially tells Git how confident the caller is that the
two provided commit ranges contain only matching pairs of patches, no
matter how different they look.

When calling `git format-patch --range-diff`, we can be _quite_ certain.
Not necessarily 100% (in which case the creation factor would need to be a
lot higher than 100), as it does happen that contributors have to drop a
patch or two, and add a patch or two, sometimes in the same iteration.

When calling `git range-diff` in general, we can be less certain about
that. In fact, users like me often call `git range-diff` to _determine_
whether there are obvious matches (for example, to see whether commits in
GitGitGadget PRs have corresponding commits in `seen`).

That difference in certainty is the entire reason why I contend that
`range-diff` and `format-patch --range-diff` need different defaults for
the creation factor.

> > A similar thread was raised more recently:
> >
> >  https://lore.kernel.org/git/rq6919s9-qspp-rn6o-n704-r0400q10747r@tzk.qr/
>
> I think I missed this thread.

Heh. I had forgotten about it.

Ciao,
Johannes
Dragan Simic May 7, 2024, 12:01 a.m. UTC | #6
Hello Eric,

On 2024-05-03 23:24, Eric Sunshine wrote:
> On Fri, May 3, 2024 at 4:08 PM Junio C Hamano <gitster@pobox.com> 
> wrote:
>> So, yes, --creation-factor=<value> is messy, I think a very high
>> value, much higher than the hardcoded default of 60, is more
>> appropriate for use in format-patch, but we do not know what bad
>> effect it would have if we used much higher default everywhere.
> 
> At this point in time, I'm not particularly against giving `git
> format-patch` its own default creation-factor.
> 
> My only worry (and perhaps it's very minor) is that separate default
> creation-factors for `git range-diff` and `git format-patch
> --range-diff` could catch the unwary off-guard when invoking `git
> range-diff` to manually check the difference between an old an new
> version of a series before finally invoking `git format-patch
> --range-diff` to actually create the patches for sending. (I've done
> this myself on a very few occasions, but perhaps not often enough to
> warrant the concern(?).)

I agree that having two different default values for the creation
factor in "git format-patch" and "git range-diff" is a valid concern,
but I think that the resulting benefits, in form of patch submissions
that don't lack the range diff contents, outweigh this concern.
diff mbox series

Patch

diff --git c/builtin/log.c w/builtin/log.c
index 4da7399905..7a019476c3 100644
--- c/builtin/log.c
+++ w/builtin/log.c
@@ -2294,7 +2294,7 @@  int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	}
 
 	if (creation_factor < 0)
-		creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT;
+		creation_factor = CREATION_FACTOR_FOR_THE_SAME_SERIES;
 	else if (!rdiff_prev)
 		die(_("the option '%s' requires '%s'"), "--creation-factor", "--range-diff");
 
diff --git c/range-diff.h w/range-diff.h
index 04ffe217be..2f69f6a434 100644
--- c/range-diff.h
+++ w/range-diff.h
@@ -6,6 +6,12 @@ 
 
 #define RANGE_DIFF_CREATION_FACTOR_DEFAULT 60
 
+/*
+ * A much higher value than the default, when we KNOW we are comparing
+ * the same series (e.g., used when format-patch calls range-diff).
+ */
+#define CREATION_FACTOR_FOR_THE_SAME_SERIES 999
+
 struct range_diff_options {
 	int creation_factor;
 	unsigned dual_color:1;