[2/2] range-diff: optionally accept a pathspec

Message ID: 064b147451b04051a413b532cd97ae764ba68027.1661258122.git.gitgitgadget@gmail.com (mailing list archive)
State: Superseded
Series: Allow passing pathspecs to git range-diff

Commit Message

Johannes Schindelin Aug. 23, 2022, 12:35 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `git range-diff` command can be quite expensive, which is not a
surprise given that the underlying algorithm to match up pairs of
commits between the provided two commit ranges has a cubic runtime.

Therefore it makes sense to restrict the commit ranges as much as
possible, to reduce the amount of input to that O(N^3) algorithm.

In chatty repositories with wide trees, this is not necessarily
possible merely by choosing commit ranges wisely.

Let's give users another option to restrict the commit ranges: by
providing a pathspec. That helps in repositories with wide trees because
it is likely that the user has a good idea which subset of the tree they
are actually interested in.

Example:

	git range-diff upstream/main upstream/seen HEAD -- range-diff.c

This shows commits that are either in the local branch or in `seen`, but
not in `main`, skipping all commits that do not touch `range-diff.c`.

Note: Since we piggy-back the pathspecs onto the `other_arg` mechanism
that was introduced to be able to pass through the `--notes` option to
the revision machinery, we must now ensure that the `other_arg` array is
appended at the end (the revision range must come before the pathspecs,
if any).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Documentation/git-range-diff.txt |  4 ++++
 builtin/range-diff.c             | 30 +++++++++++++++++++++++++++++-
 range-diff.c                     |  2 +-
 t/t3206-range-diff.sh            | 11 +++++++++++
 4 files changed, 45 insertions(+), 2 deletions(-)
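
The ordering constraint from the note in the commit message (range first, pass-through arguments last) can be illustrated with a minimal, standalone sketch. It does not use Git's strvec API; `struct arg_list`, `arg_push()` and `build_log_args()` are hypothetical names made up for this illustration, and the option list is shortened to `log --notes`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 16

/*
 * Hypothetical stand-in for the child argv that read_patches()
 * assembles; the real code uses Git's strvec API instead.
 */
struct arg_list {
	const char *v[MAX_ARGS + 1];	/* kept NULL-terminated */
	size_t nr;
};

/* Append one argument; returns 0 on success, -1 if the list is full. */
static int arg_push(struct arg_list *args, const char *s)
{
	assert(args && s);
	if (args->nr >= MAX_ARGS)
		return -1;
	args->v[args->nr++] = s;
	args->v[args->nr] = NULL;
	return 0;
}

/*
 * Fixed options first, then the revision range, then the pass-through
 * arguments. If `other_arg` carries "--" and pathspecs, pushing it
 * before the range would make the range look like a pathspec.
 */
static int build_log_args(struct arg_list *args, const char *range,
			  const char *const *other_arg)
{
	if (arg_push(args, "log") || arg_push(args, "--notes"))
		return -1;
	if (arg_push(args, range))
		return -1;
	for (; other_arg && *other_arg; other_arg++)
		if (arg_push(args, *other_arg))
			return -1;
	return 0;
}
```

With `range` set to `upstream/main..HEAD` and `other_arg` set to `{ "--", "range-diff.c", NULL }`, the list ends up as `log --notes upstream/main..HEAD -- range-diff.c`, which is the order the revision machinery expects.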

Comments

Junio C Hamano Aug. 24, 2022, 9 p.m. UTC | #1
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> The `git range-diff` command can be quite expensive, which is not a
> surprise given that the underlying algorithm to match up pairs of
> commits between the provided two commit ranges has a cubic runtime.
>
> Therefore it makes sense to restrict the commit ranges as much as
> possible, to reduce the amount of input to that O(N^3) algorithm.
>
> In chatty repositories with wide trees, this is not necessarily
> possible merely by choosing commit ranges wisely.
>
> Let's give users another option to restrict the commit ranges: by
> providing a pathspec. That helps in repositories with wide trees because
> it is likely that the user has a good idea which subset of the tree they
> are actually interested in.
>
> Example:
>
> 	git range-diff upstream/main upstream/seen HEAD -- range-diff.c
>
> This shows commits that are either in the local branch or in `seen`, but
> not in `main`, skipping all commits that do not touch `range-diff.c`.
>
> Note: Since we piggy-back the pathspecs onto the `other_arg` mechanism
> that was introduced to be able to pass through the `--notes` option to
> the revision machinery, we must now ensure that the `other_arg` array is
> appended at the end (the revision range must come before the pathspecs,
> if any).
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---

It is excellent that this now takes a pathspec to sparsify the
history.

The implementation looks, eh, a bit dirty, with obvious repetitions
in the "..." and two-range cases.  The three-arg case sort of looks
different, but that is because the original one does not even
bother to ensure argv[0] and argv[1] are objects, while the new one
does, so they are essentially doing the same things.

> diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
> index fe350d7f405..0b393715d70 100644
> --- a/Documentation/git-range-diff.txt
> +++ b/Documentation/git-range-diff.txt
> @@ -12,6 +12,7 @@ SYNOPSIS
>  	[--no-dual-color] [--creation-factor=<factor>]
>  	[--left-only | --right-only]
>  	( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
> +	[[--] <path>...]

OK.
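
One way the repetition in the "..." handling noted in the review above could be reduced is to factor the splitting into a single helper that both the one-argument and the pathspec-carrying branch call. The following is only a sketch under that assumption: `split_symmetric_range()` is a hypothetical name, not an existing Git function, and it writes into fixed-size buffers with `snprintf()` rather than using Git's strbuf API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): turn "A...B" into the two
 * ranges "B..A" and "A..B", defaulting an empty side to HEAD, as the
 * patch does for range1 and range2.
 *
 * Returns 0 on success, -1 if `arg` contains no "..." or if either
 * result does not fit into `size` bytes.
 */
static int split_symmetric_range(const char *arg,
				 char *range1, char *range2, size_t size)
{
	const char *dots, *a = arg, *b;
	int a_len, n;

	assert(arg && range1 && range2);

	dots = strstr(arg, "...");
	if (!dots)
		return -1;

	a_len = (int)(dots - arg);
	if (!a_len) {
		a = "HEAD";
		a_len = (int)strlen(a);
	}
	b = dots + 3;
	if (!*b)
		b = "HEAD";

	n = snprintf(range1, size, "%s..%.*s", b, a_len, a);
	if (n < 0 || (size_t)n >= size)
		return -1;
	n = snprintf(range2, size, "%.*s..%s", a_len, a, b);
	if (n < 0 || (size_t)n >= size)
		return -1;
	return 0;
}
```

For the argument `topic...mode-only-change` used in the new test, this produces `mode-only-change..topic` and `topic..mode-only-change`; the caller would then only differ in whether it passes the remaining arguments on as pathspecs.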

Patch

diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
index fe350d7f405..0b393715d70 100644
--- a/Documentation/git-range-diff.txt
+++ b/Documentation/git-range-diff.txt
@@ -12,6 +12,7 @@  SYNOPSIS
 	[--no-dual-color] [--creation-factor=<factor>]
 	[--left-only | --right-only]
 	( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
+	[[--] <path>...]
 
 DESCRIPTION
 -----------
@@ -19,6 +20,9 @@  DESCRIPTION
 This command shows the differences between two versions of a patch
 series, or more generally, two commit ranges (ignoring merge commits).
 
+In the presence of `<path>` arguments, these commit ranges are limited
+accordingly.
+
 To that end, it first finds pairs of commits from both commit ranges
 that correspond with each other. Two commits are said to correspond when
 the diff between their patches (i.e. the author information, the commit
diff --git a/builtin/range-diff.c b/builtin/range-diff.c
index c8ffcd35aea..9ae95b9c950 100644
--- a/builtin/range-diff.c
+++ b/builtin/range-diff.c
@@ -40,6 +40,8 @@  int cmd_range_diff(int argc, const char **argv, const char *prefix)
 	struct option *options;
 	int res = 0;
 	struct strbuf range1 = STRBUF_INIT, range2 = STRBUF_INIT;
+	struct object_id oid;
+	const char *p;
 
 	git_config(git_diff_ui_config, NULL);
 
@@ -47,7 +49,7 @@  int cmd_range_diff(int argc, const char **argv, const char *prefix)
 
 	options = parse_options_concat(range_diff_options, diffopt.parseopts);
 	argc = parse_options(argc, argv, prefix, options,
-			     builtin_range_diff_usage, 0);
+			     builtin_range_diff_usage, PARSE_OPT_KEEP_DASHDASH);
 
 	diff_setup_done(&diffopt);
 
@@ -74,6 +76,20 @@  int cmd_range_diff(int argc, const char **argv, const char *prefix)
 			b = "HEAD";
 		strbuf_addf(&range1, "%s..%.*s", b, a_len, a);
 		strbuf_addf(&range2, "%.*s..%s", a_len, a, b);
+	} else if (argc > 1 && (p = strstr(argv[0], "..."))) {
+		const char *a = argv[0];
+		int a_len = (int)(p - a);
+
+		if (!a_len) {
+			a = "HEAD";
+			a_len = strlen(a);
+		}
+		p += 3;
+		if (!*p)
+			p = "HEAD";
+		strbuf_addf(&range1, "%s..%.*s", p, a_len, a);
+		strbuf_addf(&range2, "%.*s..%s", a_len, a, p);
+		strvec_pushv(&other_arg, argv + 1);
 	} else if (argc == 2) {
 		if (!is_range_diff_range(argv[0]))
 			die(_("not a commit range: '%s'"), argv[0]);
@@ -82,9 +98,21 @@  int cmd_range_diff(int argc, const char **argv, const char *prefix)
 		if (!is_range_diff_range(argv[1]))
 			die(_("not a commit range: '%s'"), argv[1]);
 		strbuf_addstr(&range2, argv[1]);
+	} else if (argc > 2 &&
+	    is_range_diff_range(argv[0]) && is_range_diff_range(argv[1])) {
+		strbuf_addstr(&range1, argv[0]);
+		strbuf_addstr(&range2, argv[1]);
+		strvec_pushv(&other_arg, argv + 2);
 	} else if (argc == 3) {
 		strbuf_addf(&range1, "%s..%s", argv[0], argv[1]);
 		strbuf_addf(&range2, "%s..%s", argv[0], argv[2]);
+	} else if (argc > 3 &&
+		   !get_oid_committish(argv[0], &oid) &&
+		   !get_oid_committish(argv[1], &oid) &&
+		   !get_oid_committish(argv[2], &oid)) {
+		strbuf_addf(&range1, "%s..%s", argv[0], argv[1]);
+		strbuf_addf(&range2, "%s..%s", argv[0], argv[2]);
+		strvec_pushv(&other_arg, argv + 3);
 	} else {
 		error(_("need two commit ranges"));
 		usage_with_options(builtin_range_diff_usage, options);
diff --git a/range-diff.c b/range-diff.c
index f63b3ffc200..124dd678c38 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -57,9 +57,9 @@  static int read_patches(const char *range, struct string_list *list,
 		     "--pretty=medium",
 		     "--notes",
 		     NULL);
+	strvec_push(&cp.args, range);
 	if (other_arg)
 		strvec_pushv(&cp.args, other_arg->v);
-	strvec_push(&cp.args, range);
 	cp.out = -1;
 	cp.no_stdin = 1;
 	cp.git_cmd = 1;
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index d12e4e4cc6c..f2821a69b6f 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -772,6 +772,17 @@  test_expect_success '--left-only/--right-only' '
 	test_cmp expect actual
 '
 
+test_expect_success 'ranges with pathspecs' '
+	git range-diff topic...mode-only-change -- other-file >actual &&
+	test_line_count = 2 actual &&
+	topic_oid=$(git rev-parse --short topic) &&
+	mode_change_oid=$(git rev-parse --short mode-only-change^) &&
+	file_change_oid=$(git rev-parse --short mode-only-change) &&
+	grep "$mode_change_oid" actual &&
+	! grep "$file_change_oid" actual &&
+	! grep "$topic_oid" actual
+'
+
 test_expect_success 'submodule changes are shown irrespective of diff.submodule' '
 	git init sub-repo &&
 	test_commit -C sub-repo sub-first &&