diff mbox series

blame: allow --contents to work with non-HEAD commit

Message ID 20230324010457.275902-1-jacob.e.keller@intel.com (mailing list archive)
State Superseded
Series blame: allow --contents to work with non-HEAD commit

Commit Message

Jacob Keller March 24, 2023, 1:04 a.m. UTC
From: Jacob Keller <jacob.keller@gmail.com>

The --contents option can be used with git blame to blame the file as if
it had the contents from the specified file. This is akin to copying the
contents into the working tree and then running git blame. This option
has been supported since 1cfe77333f27 ("git-blame: no rev means start
from the working tree file.")

The --contents option always blames the file as if it was based on the
current HEAD commit. If you try to pass a revision while using
--contents, you get the following error:

  fatal: cannot use --contents with final commit object name

This is because the blame process generates a fake working tree commit
which always uses the HEAD object.

Fix fake_working_tree_commit to take the object ID to use for the
parent instead of always using HEAD. If both a revision and --contents
are provided, look up the object ID from the provided revision instead of
using HEAD.

This enables use of --contents with an arbitrary revision, rather than
forcing the use of the local HEAD commit. This makes the --contents
option significantly more flexible.

Reword the documentation so that it's clear that --contents can be used
with <rev>.

Add tests for the --contents option to the annotate-tests.sh test
script.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
---
I ran into this because I use --contents in a process I'm working on for
comparing differences between two forks of a project. I use blame --contents
where I point blame to the contents from the other repository. It is useful
to be able to do this at arbitrary commits. Currently I have to switch the
working tree to the commit rather than being able to target a commit by its
oid.

With this change I can more easily run this process without the need to
actually check the contents out in the working tree. It's relatively simple
to make --contents work with a revision, since I just need to generate the
fake contents starting from that revision instead of starting from HEAD.

This might make it possible for --contents to work with --reverse now as
well, but I haven't investigated that.

 Documentation/blame-options.txt |  9 ++++-----
 blame.c                         | 27 ++++++++++++++++-----------
 t/annotate-tests.sh             | 14 ++++++++++++++
 3 files changed, 34 insertions(+), 16 deletions(-)

Comments

Junio C Hamano March 24, 2023, 4:41 a.m. UTC | #1
Jacob Keller <jacob.e.keller@intel.com> writes:

> From: Jacob Keller <jacob.keller@gmail.com>
>
> The --contents option can be used with git blame to blame the file as if
> it had the contents from the specified file. This is akin to copying the
> contents into the working tree and then running git blame. This option
> has been supported since 1cfe77333f27 ("git-blame: no rev means start
> from the working tree file.")
>
> The --contents option always blames the file as if it was based on the
> current HEAD commit. If you try to pass a revision while using
> --contents, you get the following error:
>
>   fatal: cannot use --contents with final commit object name
>
> This is because the blame process generates a fake working tree commit
> which always uses the HEAD object.

"the HEAD object as its sole parent."

> Fix fake_working_tree_commit to take the object ID to use for the
> parent instead of always using HEAD. If both a revision and --contents
> are provided, look up the object ID from the provided revision instead of
> using HEAD.

An obvious enhancement.

As the original author of 1cfe7733 (git-blame: no rev means start
from the working tree file., 2007-01-30), I am not sure if the verb
"fix" is fair to describe this change, though.  If you update the
working tree file with contents that is vastly different and totally
unrelated to the version at HEAD, then with this new feature, your
"blame" can start at the working tree file, and then some commit
that is totally unrelated to HEAD, and down the history from it, and
everything should make sense, but if you smudge your working tree
files that way, it would be quite awkward to use the working tree to
advance the history that leads to HEAD.  That is the reason why I
designed the "fake commit based on off-history contents" features to
work only with HEAD.  But unlike actually messing with the contents
of the working tree files, feeding a temporary contents via the
"--contents" option has much less chance of breaking the next
commit, so I do not have any objection to this patch.

Thanks.
Jacob Keller March 24, 2023, 6 a.m. UTC | #2
On 3/23/2023 9:41 PM, Junio C Hamano wrote:
> Jacob Keller <jacob.e.keller@intel.com> writes:
> 
>> From: Jacob Keller <jacob.keller@gmail.com>
>>
>> The --contents option can be used with git blame to blame the file as if
>> it had the contents from the specified file. This is akin to copying the
>> contents into the working tree and then running git blame. This option
>> has been supported since 1cfe77333f27 ("git-blame: no rev means start
>> from the working tree file.")
>>
>> The --contents option always blames the file as if it was based on the
>> current HEAD commit. If you try to pass a revision while using
>> --contents, you get the following error:
>>
>>   fatal: cannot use --contents with final commit object name
>>
>> This is because the blame process generates a fake working tree commit
>> which always uses the HEAD object.
> 
> "the HEAD object as its sole parent."
> 

Ah, good correction.

>> Fix fake_working_tree_commit to take the object ID to use for the
>> parent instead of always using HEAD. If both a revision and --contents
>> are provided, look up the object ID from the provided revision instead of
>> using HEAD.
> 
> An obvious enhancement.
> 
> As the original author of 1cfe7733 (git-blame: no rev means start
> from the working tree file., 2007-01-30), I am not sure if the verb
> "fix" is fair to describe this change, though. 

Right, this is an enhancement, not a fix. I reworded this in v2.

> If you update the
> working tree file with contents that is vastly different and totally
> unrelated to the version at HEAD, then with this new feature, your
> "blame" can start at the working tree file, and then some commit
> that is totally unrelated to HEAD, and down the history from it, and
> everything should make sense, but if you smudge your working tree
> files that way, it would be quite awkward to use the working tree to
> advance the history that leads to HEAD.  That is the reason why I
> designed the "fake commit based on off-history contents" features to
> work only with HEAD.  But unlike actually messing with the contents
> of the working tree files, feeding a temporary contents via the
> "--contents" option has much less chance of breaking the next
> commit, so I do not have any objection to this patch.
>
> Thanks.

Right. This doesn't change the behavior when --contents is not
provided. If a revision is specified, we ignore the working tree and
just use the revision. If no revision is specified, we use HEAD but
generate the fake working commit that includes the staged changes. Using
the working tree with arbitrary commits doesn't usually make sense. If you
*do* actually want that, it's possible to do now with "--contents
path/to/working-tree-file", but you have to opt in by using --contents.

The change should only modify the behavior if --contents is provided. In
that case, we always use that file's contents and assume you know what
you're doing with respect to the contents you want to blame.
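
The four combinations described above can be summarized in a small
standalone C model. This is only a sketch, not code from blame.c: the
enum, struct, and function names (blame_parent, blame_source,
blame_start, choose_blame_start, check_blame_start_model) are
hypothetical and exist only for illustration. The one piece taken from
the patch is the condition "sb->contents_from || !sb->final" and the
choice of the revision's object ID over HEAD when a revision is given.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these names exist in git's blame.c. */
enum blame_parent { PARENT_NONE, PARENT_HEAD, PARENT_REV };
enum blame_source { SOURCE_REV, SOURCE_WORKTREE, SOURCE_CONTENTS_FILE };

struct blame_start {
	bool fake_commit;         /* is a fake commit generated at all? */
	enum blame_parent parent; /* parent of the fake commit, if any */
	enum blame_source source; /* where the annotated contents come from */
};

/*
 * Model of the starting point chosen for blame, given whether a
 * revision and/or --contents was supplied on the command line.
 */
struct blame_start choose_blame_start(bool have_rev, bool have_contents)
{
	struct blame_start s;

	/* Mirrors "if (sb->contents_from || !sb->final)" from the patch. */
	s.fake_commit = have_contents || !have_rev;

	if (!s.fake_commit) {
		/* A revision without --contents: blame that revision directly. */
		s.parent = PARENT_NONE;
		s.source = SOURCE_REV;
		return s;
	}

	/* The fake commit's parent is the revision if given, else HEAD. */
	s.parent = have_rev ? PARENT_REV : PARENT_HEAD;
	s.source = have_contents ? SOURCE_CONTENTS_FILE : SOURCE_WORKTREE;
	return s;
}

/* Self-check covering the two cases that involve --contents. */
void check_blame_start_model(void)
{
	struct blame_start s;

	s = choose_blame_start(true, true);   /* the newly allowed case */
	assert(s.fake_commit && s.parent == PARENT_REV);
	assert(s.source == SOURCE_CONTENTS_FILE);

	s = choose_blame_start(false, true);  /* unchanged behavior */
	assert(s.fake_commit && s.parent == PARENT_HEAD);
	assert(s.source == SOURCE_CONTENTS_FILE);
}
```

In this model only the (revision, --contents) combination differs from
the old behavior: before the patch that combination died with "cannot
use --contents with final commit object name", and the other three
combinations are unaffected.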
Patch

diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt
index 9a663535f443..6476dd327377 100644
--- a/Documentation/blame-options.txt
+++ b/Documentation/blame-options.txt
@@ -64,11 +64,10 @@  include::line-range-format.txt[]
 	manual page.
 
 --contents <file>::
-	When <rev> is not specified, the command annotates the
-	changes starting backwards from the working tree copy.
-	This flag makes the command pretend as if the working
-	tree copy has the contents of the named file (specify
-	`-` to make the command read from the standard input).
+	Pretend the file being annotated has the contents from the named
+	file instead of using the contents of <rev> or the working tree
+	copy. You may specify '-' to make the command read from standard
+	input for the file contents.
 
 --date <format>::
 	Specifies the format used to output dates. If --date is not
diff --git a/blame.c b/blame.c
index e45d8a3bf92a..52fca5a7f5b7 100644
--- a/blame.c
+++ b/blame.c
@@ -177,12 +177,12 @@  static void set_commit_buffer_from_strbuf(struct repository *r,
 static struct commit *fake_working_tree_commit(struct repository *r,
 					       struct diff_options *opt,
 					       const char *path,
-					       const char *contents_from)
+					       const char *contents_from,
+					       struct object_id *oid)
 {
 	struct commit *commit;
 	struct blame_origin *origin;
 	struct commit_list **parent_tail, *parent;
-	struct object_id head_oid;
 	struct strbuf buf = STRBUF_INIT;
 	const char *ident;
 	time_t now;
@@ -198,10 +198,7 @@  static struct commit *fake_working_tree_commit(struct repository *r,
 	commit->date = now;
 	parent_tail = &commit->parents;
 
-	if (!resolve_ref_unsafe("HEAD", RESOLVE_REF_READING, &head_oid, NULL))
-		die("no such ref: HEAD");
-
-	parent_tail = append_parent(r, parent_tail, &head_oid);
+	parent_tail = append_parent(r, parent_tail, oid);
 	append_merge_parents(r, parent_tail);
 	verify_working_tree_path(r, commit, path);
 
@@ -2772,22 +2769,30 @@  void setup_scoreboard(struct blame_scoreboard *sb,
 		sb->commits.compare = compare_commits_by_reverse_commit_date;
 	}
 
-	if (sb->final && sb->contents_from)
-		die(_("cannot use --contents with final commit object name"));
-
 	if (sb->reverse && sb->revs->first_parent_only)
 		sb->revs->children.name = NULL;
 
-	if (!sb->final) {
+	if (sb->contents_from || !sb->final) {
+		struct object_id head_oid, *parent_oid;
+
 		/*
 		 * "--not A B -- path" without anything positive;
 		 * do not default to HEAD, but use the working tree
 		 * or "--contents".
 		 */
+		if (sb->final) {
+			parent_oid = &sb->final->object.oid;
+		} else {
+			if (!resolve_ref_unsafe("HEAD", RESOLVE_REF_READING, &head_oid, NULL))
+				die("no such ref: HEAD");
+			parent_oid = &head_oid;
+		}
+
 		setup_work_tree();
 		sb->final = fake_working_tree_commit(sb->repo,
 						     &sb->revs->diffopt,
-						     sb->path, sb->contents_from);
+						     sb->path, sb->contents_from,
+						     parent_oid);
 		add_pending_object(sb->revs, &(sb->final->object), ":");
 	}
 
diff --git a/t/annotate-tests.sh b/t/annotate-tests.sh
index f1b9a6ce4dae..b35be20cf327 100644
--- a/t/annotate-tests.sh
+++ b/t/annotate-tests.sh
@@ -72,6 +72,16 @@  test_expect_success 'blame 1 author' '
 	check_count A 2
 '
 
+test_expect_success 'blame with --contents' '
+	check_count --contents=file A 2
+'
+
+test_expect_success 'blame with --contents changed' '
+	echo "1A quick brown fox jumps over the" >contents &&
+	echo "another lazy dog" >>contents &&
+	check_count --contents=contents A 1 "Not Committed Yet" 1
+'
+
 test_expect_success 'blame in a bare repo without starting commit' '
 	git clone --bare . bare.git &&
 	(
@@ -98,6 +108,10 @@  test_expect_success 'blame 2 authors' '
 	check_count A 2 B 2
 '
 
+test_expect_success 'blame with --contents and revision' '
+	check_count -h testTag --contents=file A 2 "Not Committed Yet" 2
+'
+
 test_expect_success 'setup B1 lines (branch1)' '
 	git checkout -b branch1 main &&
 	echo "3A slow green fox jumps into the" >>file &&