
[v4,3/4] builtin: introduce diff-pairs command

Message ID 20250228002604.3859939-4-jltobler@gmail.com (mailing list archive)
State Superseded
Series batch blob diff generation

Commit Message

Justin Tobler Feb. 28, 2025, 12:26 a.m. UTC
Through git-diff(1), a single diff can be generated from a pair of blob
revisions directly. Unfortunately, there is no mechanism to compute
batches of specific file pair diffs in a single process. Such a feature
is particularly useful on the server side, where diffing a large set of
changes all at once is not feasible due to timeout concerns.

To facilitate this, introduce git-diff-pairs(1), which acts as a backend:
it reads NUL-terminated raw diff format input from stdin and passes it
through the diff machinery to produce various forms of output, such as
patch or raw.

The raw format was originally designed as an interchange format and
represents the contents of the diff_queued_diff list, making it possible
to break the diff pipeline into separate stages. For example,
git-diff-tree(1) can be used as a frontend to compute file pairs to
queue and feed its raw output to git-diff-pairs(1) to compute patches.
With this, batches of diffs can be progressively generated without
having to recompute renames or retrieve object context. Something like
the following:

	git diff-tree -r -z -M $old $new |
	git diff-pairs -p -z

should generate the same output as `git diff-tree -p -M`. Furthermore,
each line of raw diff formatted input can also be individually fed to a
separate git-diff-pairs(1) process and still produce the same output.

Based-on-patch-by: Jeff King <peff@peff.net>
Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 .gitignore                        |   1 +
 Documentation/git-diff-pairs.adoc |  56 +++++++++
 Documentation/meson.build         |   1 +
 Makefile                          |   1 +
 builtin.h                         |   1 +
 builtin/diff-pairs.c              | 195 ++++++++++++++++++++++++++++++
 command-list.txt                  |   1 +
 git.c                             |   1 +
 meson.build                       |   1 +
 t/meson.build                     |   1 +
 t/t4070-diff-pairs.sh             |  81 +++++++++++++
 11 files changed, 340 insertions(+)
 create mode 100644 Documentation/git-diff-pairs.adoc
 create mode 100644 builtin/diff-pairs.c
 create mode 100755 t/t4070-diff-pairs.sh
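
To illustrate the record framing that the commit message relies on, here
is a minimal standalone sketch in C. It is not taken from the patch. It
assumes the `-z` raw layout that the patch's input loop reads: a metadata
field beginning with `:` and terminated by NUL, followed by one
NUL-terminated path, or two paths for rename (R) and copy (C) entries.
The helper name `count_raw_records` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): count the complete
 * records in a buffer of NUL-delimited raw diff output.
 *
 * Assumed layout per record:
 *   ":<mode> <mode> <oid> <oid> <status>[score]" NUL <path> NUL
 * with a second "<path> NUL" for rename (R) and copy (C) entries.
 * Returns -1 if the input is malformed or ends mid-record.
 */
static int count_raw_records(const char *buf, size_t len)
{
	const char *p = buf, *end = buf + len;
	int records = 0;

	assert(buf || !len);

	while (p < end) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));
		const char *sp;
		int paths;

		if (*p != ':' || !nul)
			return -1;

		/* The status letter follows the last space of the metadata. */
		sp = strrchr(p, ' ');
		if (!sp || !sp[1])
			return -1;
		paths = (sp[1] == 'R' || sp[1] == 'C') ? 2 : 1;
		p = nul + 1;

		/* Skip the path field(s) that belong to this record. */
		while (paths--) {
			nul = p < end ? memchr(p, '\0', (size_t)(end - p)) : NULL;
			if (!nul)
				return -1;
			p = nul + 1;
		}
		records++;
	}
	return records;
}
```

Because each record is self-delimiting in this layout, a caller could cut
the stream at any record boundary and hand each slice to its own
git-diff-pairs(1) process.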

Comments

Patrick Steinhardt Feb. 28, 2025, 8:29 a.m. UTC | #1
On Thu, Feb 27, 2025 at 06:26:03PM -0600, Justin Tobler wrote:
> diff --git a/builtin/diff-pairs.c b/builtin/diff-pairs.c
> new file mode 100644
> index 0000000000..5a993b7c9d
> --- /dev/null
> +++ b/builtin/diff-pairs.c
[snip]
> +int cmd_diff_pairs(int argc, const char **argv, const char *prefix,
> +		   struct repository *repo)
> +{
> +	struct strbuf path_dst = STRBUF_INIT;
> +	struct strbuf path = STRBUF_INIT;
> +	struct strbuf meta = STRBUF_INIT;
> +	struct option *parseopts;
> +	struct rev_info revs;
> +	int line_term = '\0';
> +	int ret;
> +
> +	const char * const usagestr[] = {
> +		N_("git diff-pairs -z [<diff-options>]"),
> +		NULL
> +	};

We tend to call these `builtin_*_usage`, so in your case it would be
`builtin_diff_pairs_usage`.

> +	struct option options[] = {
> +		OPT_END()
> +	};
> +
> +	repo_init_revisions(repo, &revs, prefix);
> +
> +	/*
> +	 * Diff options are usually parsed implicitly as part of
> +	 * setup_revisions(). Explicitly handle parsing to ensure options are
> +	 * printed in the usage message.
> +	 */
> +	parseopts = add_diff_options(options, &revs.diffopt);
> +	show_usage_with_options_if_asked(argc, argv, usagestr, parseopts);
> +
> +	repo_config(repo, git_diff_basic_config, NULL);
> +	revs.disable_stdin = 1;
> +	revs.abbrev = 0;
> +	revs.diff = 1;
> +
> +	argc = parse_options(argc, argv, prefix, parseopts, usagestr,
> +			     PARSE_OPT_KEEP_UNKNOWN_OPT |
> +			     PARSE_OPT_KEEP_DASHDASH |
> +			     PARSE_OPT_KEEP_ARGV0);
> 
> +	if (setup_revisions(argc, argv, &revs, NULL) > 1)
> +		usagef(_("unrecognized argument: %s"), argv[0]);

Okay, we now use `parse_options()` to parse stuff for us, and
`setup_revisions()` only really does the setup for us as we know that
all relevant diff options should've already been parsed for us. This
looks much nicer to me.

I wonder though: we keep unknown options when calling `parse_options()`
and then end up passing them to `setup_revisions()`. But are there even
any options handled by `setup_revisions()` that would make sense in our
context? And if not, shouldn't we rather make `parse_options()` die in
case it sees unknown options?

If there are, we should probably document this because it isn't obvious
to me.

> diff --git a/t/t4070-diff-pairs.sh b/t/t4070-diff-pairs.sh
> new file mode 100755
> index 0000000000..8f17e55c7d
> --- /dev/null
> +++ b/t/t4070-diff-pairs.sh
> @@ -0,0 +1,81 @@
> +#!/bin/sh
> +
> +test_description='basic diff-pairs tests'
> +. ./test-lib.sh
> +
> +# This creates a diff with added, modified, deleted, renamed, copied, and
> +# typechange entries. This includes a submodule to test submodule diff support.
> +test_expect_success 'setup' '
> +	test_config_global protocol.file.allow always &&
> +	test_create_repo sub &&

Use of `test_create_repo ()` is deprecated, as it is merely a wrapper
around git-init(1).

Patrick
Justin Tobler Feb. 28, 2025, 5:26 p.m. UTC | #2
On 25/02/28 09:29AM, Patrick Steinhardt wrote:
> On Thu, Feb 27, 2025 at 06:26:03PM -0600, Justin Tobler wrote:
> > diff --git a/builtin/diff-pairs.c b/builtin/diff-pairs.c
> > new file mode 100644
> > index 0000000000..5a993b7c9d
> > --- /dev/null
> > +++ b/builtin/diff-pairs.c
> [snip]
> > +int cmd_diff_pairs(int argc, const char **argv, const char *prefix,
> > +		   struct repository *repo)
> > +{
> > +	struct strbuf path_dst = STRBUF_INIT;
> > +	struct strbuf path = STRBUF_INIT;
> > +	struct strbuf meta = STRBUF_INIT;
> > +	struct option *parseopts;
> > +	struct rev_info revs;
> > +	int line_term = '\0';
> > +	int ret;
> > +
> > +	const char * const usagestr[] = {
> > +		N_("git diff-pairs -z [<diff-options>]"),
> > +		NULL
> > +	};
> 
> We tend to call these `builtin_*_usage`, so in your case it would be
> `builtin_diff_pairs_usage`.

Good to know, will adapt in a followup version.

> 
> > +	struct option options[] = {
> > +		OPT_END()
> > +	};
> > +
> > +	repo_init_revisions(repo, &revs, prefix);
> > +
> > +	/*
> > +	 * Diff options are usually parsed implicitly as part of
> > +	 * setup_revisions(). Explicitly handle parsing to ensure options are
> > +	 * printed in the usage message.
> > +	 */
> > +	parseopts = add_diff_options(options, &revs.diffopt);
> > +	show_usage_with_options_if_asked(argc, argv, usagestr, parseopts);
> > +
> > +	repo_config(repo, git_diff_basic_config, NULL);
> > +	revs.disable_stdin = 1;
> > +	revs.abbrev = 0;
> > +	revs.diff = 1;
> > +
> > +	argc = parse_options(argc, argv, prefix, parseopts, usagestr,
> > +			     PARSE_OPT_KEEP_UNKNOWN_OPT |
> > +			     PARSE_OPT_KEEP_DASHDASH |
> > +			     PARSE_OPT_KEEP_ARGV0);
> > 
> > +	if (setup_revisions(argc, argv, &revs, NULL) > 1)
> > +		usagef(_("unrecognized argument: %s"), argv[0]);
> 
> Okay, we now use `parse_options()` to parse stuff for us, and
> `setup_revisions()` only really does the setup for us as we know that
> all relevant diff options should've already been parsed for us. This
> looks much nicer to me.
> 
> I wonder though: we keep unknown options when calling `parse_options()`
> and then end up passing them to `setup_revisions()`. But are there even
> any options handled by `setup_revisions()` that would make sense in our
> context? And if not, shouldn't we rather make `parse_options()` die in
> case it sees unknown options?

Good catch, there should not be any options left that
`setup_revisions()` actually needs to parse, as they should all be
handled by `parse_options()`. I'll remove the
`PARSE_OPT_KEEP_UNKNOWN_OPT` flag.

> If there are, we should probably document this because it isn't obvious
> to me.
> 
> > diff --git a/t/t4070-diff-pairs.sh b/t/t4070-diff-pairs.sh
> > new file mode 100755
> > index 0000000000..8f17e55c7d
> > --- /dev/null
> > +++ b/t/t4070-diff-pairs.sh
> > @@ -0,0 +1,81 @@
> > +#!/bin/sh
> > +
> > +test_description='basic diff-pairs tests'
> > +. ./test-lib.sh
> > +
> > +# This creates a diff with added, modified, deleted, renamed, copied, and
> > +# typechange entries. This includes a submodule to test submodule diff support.
> > +test_expect_success 'setup' '
> > +	test_config_global protocol.file.allow always &&
> > +	test_create_repo sub &&
> 
> Use of `test_create_repo ()` is deprecated, as it is merely a wrapper
> around git-init(1).

Good to know! I'll swap to using git-init(1) instead.

Thanks
-Justin
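
The point agreed on above, rejecting unknown options up front instead of
forwarding them to a second parser, can be modeled in a few lines of
standalone C. This is only a toy model under stated assumptions: the
option table and both function names are hypothetical, and it neither
uses nor reproduces git's parse-options API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for an option table. Git's real parser is
 * parse_options(); it is not modeled here.
 */
static const char *const known_opts[] = { "-z", "-p", "--raw", NULL };

static int is_known(const char *arg)
{
	for (size_t i = 0; known_opts[i]; i++)
		if (!strcmp(arg, known_opts[i]))
			return 1;
	return 0;
}

/*
 * Strict mode: return the index of the first argument that looks like
 * an option but is not in the table, or 0 if every option is
 * recognized. argv[0] is the command name and is skipped.
 */
static int first_unknown_option(int argc, const char **argv)
{
	assert(argv);

	for (int i = 1; i < argc; i++) {
		if (!strcmp(argv[i], "--"))
			break; /* arguments after "--" are not options */
		if (argv[i][0] == '-' && !is_known(argv[i]))
			return i;
	}
	return 0;
}
```

A caller in this model would report `argv[first_unknown_option(...)]` and
stop, so a later stage never sees an option it was not meant to handle.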

Patch

diff --git a/.gitignore b/.gitignore
index 08a66ca508..04c444404e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -55,6 +55,7 @@ 
 /git-diff
 /git-diff-files
 /git-diff-index
+/git-diff-pairs
 /git-diff-tree
 /git-difftool
 /git-difftool--helper
diff --git a/Documentation/git-diff-pairs.adoc b/Documentation/git-diff-pairs.adoc
new file mode 100644
index 0000000000..e31f2e2fbb
--- /dev/null
+++ b/Documentation/git-diff-pairs.adoc
@@ -0,0 +1,56 @@ 
+git-diff-pairs(1)
+=================
+
+NAME
+----
+git-diff-pairs - Compare the content and mode of provided blob pairs
+
+SYNOPSIS
+--------
+[synopsis]
+git diff-pairs -z [<diff-options>]
+
+DESCRIPTION
+-----------
+Show changes for file pairs provided on stdin. Input for this command must be
+in the NUL-terminated raw output format as generated by commands such as `git
+diff-tree -z -r --raw`. By default, the outputted diffs are computed and shown
+in the patch format when stdin closes.
+
+Usage of this command enables the traditional diff pipeline to be broken up
+into separate stages where `diff-pairs` acts as the output phase. Other
+commands, such as `diff-tree`, may serve as a frontend to compute the raw
+diff format used as input.
+
+Instead of computing diffs via `git diff-tree -p -M` in one step, `diff-tree`
+can compute the file pairs and rename information without the blob diffs. This
+output can be fed to `diff-pairs` to generate the underlying blob diffs as done
+in the following example:
+
+-----------------------------
+git diff-tree -z -r -M $a $b |
+git diff-pairs -z
+-----------------------------
+
+Computing the tree diff upfront with rename information allows patch output
+from `diff-pairs` to be progressively computed over the course of potentially
+multiple invocations.
+
+Pathspecs are not currently supported by `diff-pairs`. Pathspec limiting should
+be performed by the upstream command generating the raw diffs used as input.
+
+Tree objects are not currently supported as input and are rejected.
+
+Abbreviated object IDs in the `diff-pairs` input are not supported. Outputted
+object IDs can be abbreviated using the `--abbrev` option.
+
+OPTIONS
+-------
+
+include::diff-options.adoc[]
+
+include::diff-generate-patch.adoc[]
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Documentation/meson.build b/Documentation/meson.build
index 1129ce4c85..ce990e9fe5 100644
--- a/Documentation/meson.build
+++ b/Documentation/meson.build
@@ -42,6 +42,7 @@  manpages = {
   'git-diagnose.adoc' : 1,
   'git-diff-files.adoc' : 1,
   'git-diff-index.adoc' : 1,
+  'git-diff-pairs.adoc' : 1,
   'git-difftool.adoc' : 1,
   'git-diff-tree.adoc' : 1,
   'git-diff.adoc' : 1,
diff --git a/Makefile b/Makefile
index bcf5ed3f85..56df7aed3f 100644
--- a/Makefile
+++ b/Makefile
@@ -1242,6 +1242,7 @@  BUILTIN_OBJS += builtin/describe.o
 BUILTIN_OBJS += builtin/diagnose.o
 BUILTIN_OBJS += builtin/diff-files.o
 BUILTIN_OBJS += builtin/diff-index.o
+BUILTIN_OBJS += builtin/diff-pairs.o
 BUILTIN_OBJS += builtin/diff-tree.o
 BUILTIN_OBJS += builtin/diff.o
 BUILTIN_OBJS += builtin/difftool.o
diff --git a/builtin.h b/builtin.h
index 89928ccf92..e6aad3a6a1 100644
--- a/builtin.h
+++ b/builtin.h
@@ -153,6 +153,7 @@  int cmd_diagnose(int argc, const char **argv, const char *prefix, struct reposit
 int cmd_diff_files(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_diff_index(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_diff(int argc, const char **argv, const char *prefix, struct repository *repo);
+int cmd_diff_pairs(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_diff_tree(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_difftool(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_env__helper(int argc, const char **argv, const char *prefix, struct repository *repo);
diff --git a/builtin/diff-pairs.c b/builtin/diff-pairs.c
new file mode 100644
index 0000000000..5a993b7c9d
--- /dev/null
+++ b/builtin/diff-pairs.c
@@ -0,0 +1,195 @@ 
+#include "builtin.h"
+#include "config.h"
+#include "diff.h"
+#include "diffcore.h"
+#include "gettext.h"
+#include "hash.h"
+#include "hex.h"
+#include "object.h"
+#include "parse-options.h"
+#include "revision.h"
+#include "strbuf.h"
+
+static unsigned parse_mode_or_die(const char *mode, const char **end)
+{
+	uint16_t ret;
+
+	*end = parse_mode(mode, &ret);
+	if (!*end)
+		die(_("unable to parse mode: %s"), mode);
+	return ret;
+}
+
+static void parse_oid_or_die(const char *hex, struct object_id *oid,
+			     const char **end, const struct git_hash_algo *algop)
+{
+	if (parse_oid_hex_algop(hex, oid, end, algop) || *(*end)++ != ' ')
+		die(_("unable to parse object id: %s"), hex);
+}
+
+int cmd_diff_pairs(int argc, const char **argv, const char *prefix,
+		   struct repository *repo)
+{
+	struct strbuf path_dst = STRBUF_INIT;
+	struct strbuf path = STRBUF_INIT;
+	struct strbuf meta = STRBUF_INIT;
+	struct option *parseopts;
+	struct rev_info revs;
+	int line_term = '\0';
+	int ret;
+
+	const char * const usagestr[] = {
+		N_("git diff-pairs -z [<diff-options>]"),
+		NULL
+	};
+	struct option options[] = {
+		OPT_END()
+	};
+
+	repo_init_revisions(repo, &revs, prefix);
+
+	/*
+	 * Diff options are usually parsed implicitly as part of
+	 * setup_revisions(). Explicitly handle parsing to ensure options are
+	 * printed in the usage message.
+	 */
+	parseopts = add_diff_options(options, &revs.diffopt);
+	show_usage_with_options_if_asked(argc, argv, usagestr, parseopts);
+
+	repo_config(repo, git_diff_basic_config, NULL);
+	revs.disable_stdin = 1;
+	revs.abbrev = 0;
+	revs.diff = 1;
+
+	argc = parse_options(argc, argv, prefix, parseopts, usagestr,
+			     PARSE_OPT_KEEP_UNKNOWN_OPT |
+			     PARSE_OPT_KEEP_DASHDASH |
+			     PARSE_OPT_KEEP_ARGV0);
+
+	if (setup_revisions(argc, argv, &revs, NULL) > 1)
+		usagef(_("unrecognized argument: %s"), argv[0]);
+
+	/*
+	 * With the -z option, both command input and raw output are
+	 * NUL-delimited (this mode does not affect patch output). At present
+	 * only NUL-delimited raw diff formatted input is supported.
+	 */
+	if (revs.diffopt.line_termination)
+		usage(_("working without -z is not supported"));
+
+	if (revs.prune_data.nr)
+		usage(_("pathspec arguments not supported"));
+
+	if (revs.pending.nr || revs.max_count != -1 ||
+	    revs.min_age != (timestamp_t)-1 ||
+	    revs.max_age != (timestamp_t)-1)
+		usage(_("revision arguments not allowed"));
+
+	if (!revs.diffopt.output_format)
+		revs.diffopt.output_format = DIFF_FORMAT_PATCH;
+
+	/*
+	 * If rename detection is not requested, use rename information from the
+	 * raw diff formatted input. Setting skip_resolving_statuses ensures
+	 * diffcore_std() does not mess with rename information already present
+	 * in queued filepairs.
+	 */
+	if (!revs.diffopt.detect_rename)
+		revs.diffopt.skip_resolving_statuses = 1;
+
+	while (1) {
+		struct object_id oid_a, oid_b;
+		struct diff_filepair *pair;
+		unsigned mode_a, mode_b;
+		const char *p;
+		char status;
+
+		if (strbuf_getwholeline(&meta, stdin, line_term) == EOF)
+			break;
+
+		p = meta.buf;
+		if (*p != ':')
+			die(_("invalid raw diff input"));
+		p++;
+
+		mode_a = parse_mode_or_die(p, &p);
+		mode_b = parse_mode_or_die(p, &p);
+
+		if (S_ISDIR(mode_a) || S_ISDIR(mode_b))
+			die(_("tree objects not supported"));
+
+		parse_oid_or_die(p, &oid_a, &p, repo->hash_algo);
+		parse_oid_or_die(p, &oid_b, &p, repo->hash_algo);
+
+		status = *p++;
+
+		if (strbuf_getwholeline(&path, stdin, line_term) == EOF)
+			die(_("got EOF while reading path"));
+
+		switch (status) {
+		case DIFF_STATUS_ADDED:
+			pair = diff_queue_addremove(&diff_queued_diff,
+						    &revs.diffopt, '+', mode_b,
+						    &oid_b, 1, path.buf, 0);
+			if (pair)
+				pair->status = status;
+			break;
+
+		case DIFF_STATUS_DELETED:
+			pair = diff_queue_addremove(&diff_queued_diff,
+						    &revs.diffopt, '-', mode_a,
+						    &oid_a, 1, path.buf, 0);
+			if (pair)
+				pair->status = status;
+			break;
+
+		case DIFF_STATUS_TYPE_CHANGED:
+		case DIFF_STATUS_MODIFIED:
+			pair = diff_queue_change(&diff_queued_diff, &revs.diffopt,
+						 mode_a, mode_b, &oid_a, &oid_b,
+						 1, 1, path.buf, 0, 0);
+			if (pair)
+				pair->status = status;
+			break;
+
+		case DIFF_STATUS_RENAMED:
+		case DIFF_STATUS_COPIED: {
+				struct diff_filespec *a, *b;
+				unsigned int score;
+
+				if (strbuf_getwholeline(&path_dst, stdin, line_term) == EOF)
+					die(_("got EOF while reading destination path"));
+
+				a = alloc_filespec(path.buf);
+				b = alloc_filespec(path_dst.buf);
+				fill_filespec(a, &oid_a, 1, mode_a);
+				fill_filespec(b, &oid_b, 1, mode_b);
+
+				pair = diff_queue(&diff_queued_diff, a, b);
+
+				if (strtoul_ui(p, 10, &score))
+					die(_("unable to parse rename/copy score: %s"), p);
+
+				pair->score = score * MAX_SCORE / 100;
+				pair->status = status;
+				pair->renamed_pair = 1;
+			}
+			break;
+
+		default:
+			die(_("unknown diff status: %c"), status);
+		}
+	}
+
+	diffcore_std(&revs.diffopt);
+	diff_flush(&revs.diffopt);
+	ret = diff_result_code(&revs);
+
+	strbuf_release(&path_dst);
+	strbuf_release(&path);
+	strbuf_release(&meta);
+	release_revisions(&revs);
+	FREE_AND_NULL(parseopts);
+
+	return ret;
+}
diff --git a/command-list.txt b/command-list.txt
index c537114b46..b7ade3ab9f 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -96,6 +96,7 @@  git-diagnose                            ancillaryinterrogators
 git-diff                                mainporcelain           info
 git-diff-files                          plumbinginterrogators
 git-diff-index                          plumbinginterrogators
+git-diff-pairs                          plumbinginterrogators
 git-diff-tree                           plumbinginterrogators
 git-difftool                            ancillaryinterrogators          complete
 git-fast-export                         ancillarymanipulators
diff --git a/git.c b/git.c
index 450d6aaa86..77c4359522 100644
--- a/git.c
+++ b/git.c
@@ -541,6 +541,7 @@  static struct cmd_struct commands[] = {
 	{ "diff", cmd_diff, NO_PARSEOPT },
 	{ "diff-files", cmd_diff_files, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "diff-index", cmd_diff_index, RUN_SETUP | NO_PARSEOPT },
+	{ "diff-pairs", cmd_diff_pairs, RUN_SETUP | NO_PARSEOPT },
 	{ "diff-tree", cmd_diff_tree, RUN_SETUP | NO_PARSEOPT },
 	{ "difftool", cmd_difftool, RUN_SETUP_GENTLY },
 	{ "fast-export", cmd_fast_export, RUN_SETUP },
diff --git a/meson.build b/meson.build
index bf95576f83..9e8b365d2a 100644
--- a/meson.build
+++ b/meson.build
@@ -540,6 +540,7 @@  builtin_sources = [
   'builtin/diagnose.c',
   'builtin/diff-files.c',
   'builtin/diff-index.c',
+  'builtin/diff-pairs.c',
   'builtin/diff-tree.c',
   'builtin/diff.c',
   'builtin/difftool.c',
diff --git a/t/meson.build b/t/meson.build
index 780939d49f..09c7bc2fad 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -500,6 +500,7 @@  integration_tests = [
   't4067-diff-partial-clone.sh',
   't4068-diff-symmetric-merge-base.sh',
   't4069-remerge-diff.sh',
+  't4070-diff-pairs.sh',
   't4100-apply-stat.sh',
   't4101-apply-nonl.sh',
   't4102-apply-rename.sh',
diff --git a/t/t4070-diff-pairs.sh b/t/t4070-diff-pairs.sh
new file mode 100755
index 0000000000..8f17e55c7d
--- /dev/null
+++ b/t/t4070-diff-pairs.sh
@@ -0,0 +1,81 @@ 
+#!/bin/sh
+
+test_description='basic diff-pairs tests'
+. ./test-lib.sh
+
+# This creates a diff with added, modified, deleted, renamed, copied, and
+# typechange entries. This includes a submodule to test submodule diff support.
+test_expect_success 'setup' '
+	test_config_global protocol.file.allow always &&
+	test_create_repo sub &&
+	test_commit -C sub initial &&
+
+	test_create_repo main &&
+	cd main &&
+	echo to-be-gone >deleted &&
+	echo original >modified &&
+	echo now-a-file >symlink &&
+	test_seq 200 >two-hundred &&
+	test_seq 201 500 >five-hundred &&
+	git add . &&
+	test_tick &&
+	git commit -m base &&
+	git tag base &&
+
+	git submodule add ../sub &&
+	echo now-here >added &&
+	echo new >modified &&
+	rm deleted &&
+	mkdir subdir &&
+	echo content >subdir/file &&
+	mv two-hundred renamed &&
+	test_seq 201 500 | sed s/300/modified/ >copied &&
+	rm symlink &&
+	git add -A . &&
+	test_ln_s_add dest symlink &&
+	test_tick &&
+	git commit -m new &&
+	git tag new
+'
+
+test_expect_success 'diff-pairs recreates --raw' '
+	git diff-tree -r -M -C -C -z base new >expect &&
+	git diff-pairs --raw -z >actual <expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'diff-pairs can create -p output' '
+	git diff-tree -p -M -C -C base new >expect &&
+	git diff-tree -r -M -C -C -z base new |
+	git diff-pairs -p -z >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'diff-pairs does not support normal raw diff input' '
+	git diff-tree -r base new |
+	test_must_fail git diff-pairs >out 2>err &&
+
+	echo "usage: working without -z is not supported" >expect &&
+	test_must_be_empty out &&
+	test_cmp expect err
+'
+
+test_expect_success 'diff-pairs does not support tree objects as input' '
+	git diff-tree -z base new |
+	test_must_fail git diff-pairs -z >out 2>err &&
+
+	echo "fatal: tree objects not supported" >expect &&
+	test_must_be_empty out &&
+	test_cmp expect err
+'
+
+test_expect_success 'diff-pairs does not support pathspec arguments' '
+	git diff-tree -r -z base new |
+	test_must_fail git diff-pairs -z -- new >out 2>err &&
+
+	echo "usage: pathspec arguments not supported" >expect &&
+	test_must_be_empty out &&
+	test_cmp expect err
+'
+
+test_done
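
The input loop in builtin/diff-pairs.c reads each metadata field as
`:`, two modes, two object IDs, a status letter, and for renames and
copies a similarity score. The following standalone sketch mirrors that
order of parsing without any of git's internal helpers. All names in it
are hypothetical. `SKETCH_MAX_SCORE` is an assumption meant to mirror
`MAX_SCORE` from git's diffcore.h. Unlike the patch, the sketch accepts
hex strings of any length and rejects scores above 100.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: mirrors MAX_SCORE in git's diffcore.h. */
#define SKETCH_MAX_SCORE 60000

struct raw_meta {
	unsigned mode_a, mode_b;
	char oid_a[65], oid_b[65]; /* room for 64 hex digits plus NUL */
	char status;
	unsigned score;            /* 0..SKETCH_MAX_SCORE, 0 if absent */
};

/* Parse an octal mode followed by exactly one space. */
static const char *take_mode(const char *p, unsigned *mode)
{
	char *end;
	unsigned long v;

	if (*p < '0' || *p > '7')
		return NULL;
	v = strtoul(p, &end, 8);
	if (*end != ' ')
		return NULL;
	*mode = (unsigned)v;
	return end + 1;
}

/* Copy a run of lowercase hex digits followed by exactly one space. */
static const char *take_hex(const char *p, char *dst, size_t dstlen)
{
	size_t n = strspn(p, "0123456789abcdef");

	if (!n || n >= dstlen || p[n] != ' ')
		return NULL;
	memcpy(dst, p, n);
	dst[n] = '\0';
	return p + n + 1;
}

/* Returns 0 on success, -1 if the metadata field is malformed. */
static int parse_raw_meta(const char *line, struct raw_meta *out)
{
	const char *p = line;

	assert(line && out);

	if (*p++ != ':')
		return -1;
	if (!(p = take_mode(p, &out->mode_a)) ||
	    !(p = take_mode(p, &out->mode_b)) ||
	    !(p = take_hex(p, out->oid_a, sizeof(out->oid_a))) ||
	    !(p = take_hex(p, out->oid_b, sizeof(out->oid_b))))
		return -1;

	/* Same status letters that the patch's switch statement handles. */
	if (!*p || !strchr("ADMTRC", *p))
		return -1;
	out->status = *p++;
	out->score = 0;

	if (out->status == 'R' || out->status == 'C') {
		char *end;
		unsigned long pct;

		if (*p < '0' || *p > '9')
			return -1;
		pct = strtoul(p, &end, 10);
		if (*end || pct > 100)
			return -1;
		/* Scale the percentage the same way the patch does. */
		out->score = (unsigned)(pct * SKETCH_MAX_SCORE / 100);
	}
	return 0;
}
```

The path fields that follow the metadata are read separately in the
patch, so they are left out of this sketch as well.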