Message ID | CAP0H_AEd1jFNB_dO=HRjwEUKzFqnjntss_1wskKU6hE1VmBs+A@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | problem with parsing of patch files for patch-id | expand |
Rob Linden <rlinden@redhat.com> writes: > This patch (also attached) fixes it by only considering commit hashes > in a "From xxxxx..." line: If I am not mistaken, "git patch-id" was designed to read from git rev-list ... commit range ... | git diff-tree --stdin -p where we see 9005149a4a77e2d3409c6127bf4fd1a0893c3495 diff --git a/path b/path index ... ... patch text here ... so I would suspect that limiting the commit object names only to those that follow "From " (i.e. the format-patch output or output with the "--format=email" option) would break existing use cases big time.
Hello Junio! Thanks for the clue, You're right... I only work with the email format so I didn't think of that. My solution doesn't work then... I had a different idea first: to check if we already got an oid and only read a new one once the current diff is finished (and wasn't empty so far). The other one seemed just simpler. I will try again... Thanks & all the best, rob On Fri, Jun 21, 2024 at 7:05 PM Junio C Hamano <gitster@pobox.com> wrote: > > Rob Linden <rlinden@redhat.com> writes: > > > This patch (also attached) fixes it by only considering commit hashes > > in a "From xxxxx..." line: > > If I am not mistaken, "git patch-id" was designed to read from > > git rev-list ... commit range ... | git diff-tree --stdin -p > > where we see > > 9005149a4a77e2d3409c6127bf4fd1a0893c3495 > diff --git a/path b/path > index ... > ... patch text here ... > > so I would suspect that limiting the commit object names only to > those that follow "From " (i.e. the format-patch output or output > with the "--format=email" option) would break existing use cases big > time. >
Junio C Hamano <gitster@pobox.com> writes: > Rob Linden <rlinden@redhat.com> writes: > >> This patch (also attached) fixes it by only considering commit hashes >> in a "From xxxxx..." line: > > If I am not mistaken, "git patch-id" was designed to read from > > git rev-list ... commit range ... | git diff-tree --stdin -p > > where we see > > 9005149a4a77e2d3409c6127bf4fd1a0893c3495 > diff --git a/path b/path > index ... > ... patch text here ... > > so I would suspect that limiting the commit object names only to > those that follow "From " (i.e. the format-patch output or output > with the "--format=email" option) would break existing use cases big > time. Let's do this to make sure we have a baseline that we will not break. ------- >8 ------------- >8 ------------- >8 ------- [PATCH] t4204: patch-id supports various input format "git patch-id" was first developed to read from "git diff-tree --stdin -p" output. Later it was enhanced to read from "git diff-tree --stdin -p -v", which was the downstream of an early imitation of "git log" ("git rev-list" run in the upstream of a pipe to feed the "diff-tree"). These days, we also read from "git format-patch". Their output begins slightly differently, but the patch-id computed over them for the same commit should be the same. Ensure that we won't accidentally break this expectation. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- t/t4204-patch-id.sh | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git c/t/t4204-patch-id.sh w/t/t4204-patch-id.sh index 605faea0c7..ebce72b2ce 100755 --- c/t/t4204-patch-id.sh +++ w/t/t4204-patch-id.sh @@ -114,6 +114,29 @@ test_expect_success 'patch-id supports git-format-patch output' ' test "$2" = $(git rev-parse HEAD) ' +test_expect_success 'patch-id computes the same for various formats' ' + # This test happens to consider "git log -p -1" output + # the canonical input format, so use it as the norm. + git log -1 -p same >log-p.output && + git patch-id <log-p.output >expect && + + # format-patch begins with "From <commit object name>" + git format-patch -1 --stdout same >format-patch.output && + git patch-id <format-patch.output >actual && + test_cmp actual expect && + + # "diff-tree --stdin -p" begins with "<commit object name>" + same=$(git rev-parse same) && + echo $same | git diff-tree --stdin -p >diff-tree.output && + git patch-id <diff-tree.output >actual && + test_cmp actual expect && + + # "diff-tree --stdin -v -p" begins with "commit <commit object name>" + echo $same | git diff-tree --stdin -p -v >diff-tree-v.output && + git patch-id <diff-tree-v.output >actual && + test_cmp actual expect +' + test_expect_success 'whitespace is irrelevant in footer' ' get_patch_id main && git checkout same &&
Rob Linden <rlinden@redhat.com> writes: >> Rob Linden <rlinden@redhat.com> writes: >> >> > This patch (also attached) fixes it by only considering commit hashes >> > in a "From xxxxx..." line: >> >> If I am not mistaken, "git patch-id" was designed to read from >> >> git rev-list ... commit range ... | git diff-tree --stdin -p >> >> where we see >> >> 9005149a4a77e2d3409c6127bf4fd1a0893c3495 >> diff --git a/path b/path >> index ... >> ... patch text here ... >> >> so I would suspect that limiting the commit object names only to >> those that follow "From " (i.e. the format-patch output or output >> with the "--format=email" option) would break existing use cases big >> time. > ... > Hello Junio! > Thanks for the clue, You're right... I only work with the email format > so I didn't think of that. > My solution doesn't work then... > I had a different idea first: to check if we already got an oid and > only read a new one once > the current diff is finished (and wasn't empty so far). The other one > seemed just simpler. > I will try again... > Thanks & all the best, > rob Since then I sent a series [*] that was designed to address the issue you raised here, but unfortunately nobody seems to have paid attention and the patches are left hanging. It is part of my 'seen' branch and in the broken-out format parked on the jc/patch-id branch in the https://github.com/gitster/git/ repository, ending with the commit 3226bd87 (patch-id: tighten code to detect the patch header, 2024-06-21). If you can test (and if possible code review) them, that may help the series to move forward. Thanks. [Reference] * https://lore.kernel.org/git/20240621231826.3280338-1-gitster@pobox.com/
diff --git a/builtin/patch-id.c b/builtin/patch-id.c index 583099cacf..4b8a41bde8 100644 --- a/builtin/patch-id.c +++ b/builtin/patch-id.c @@ -86,7 +86,7 @@ static int get_one_patchid(struct object_id *next_oid, struct object_id *result, continue; } - if (!get_oid_hex(p, next_oid)) { + if (starts_with(p-5, "From ") && !get_oid_hex(p, next_oid)) { found_next = 1; break; }