Message ID | 20220918121053.880225-1-aclopte@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | sequencer: avoid dropping fixup commit that targets self via commit-ish | expand |
Thank you for the explanation! On Sun, Sep 18, 2022 at 2:11 PM Johannes Altmanninger <aclopte@gmail.com> wrote: > > Commit 68d5d03bc4 (rebase: teach --autosquash to match on sha1 in > addition to message, 2010-11-04) made --autosquash apply a commit > with subject "fixup! 012345" to the first commit in the todo list > whose OID starts with 012345. So far so good. I wasn't aware such fixups were recognized. The only way to create such fixups is manually, correct?
On Sun, Sep 18, 2022 at 05:05:29PM +0200, Erik Cervin Edin wrote: > Thank you for the explanation! > > On Sun, Sep 18, 2022 at 2:11 PM Johannes Altmanninger <aclopte@gmail.com> wrote: > > > > Commit 68d5d03bc4 (rebase: teach --autosquash to match on sha1 in > > addition to message, 2010-11-04) made --autosquash apply a commit > > with subject "fixup! 012345" to the first commit in the todo list > > whose OID starts with 012345. So far so good. > > I wasn't aware such fixups were recognized. Neither was I. I've repeatedly had issues where the message created by "commit --fixup" was ambiguous (user error I know) but I can change my tool to use OIDs for the cases when I squash them right away.. > The only way to create > such fixups is manually, correct? I think so
Johannes Altmanninger <aclopte@gmail.com> writes: > First, OID matches are given precedence over subject prefix > matches. Second, instead of prefix-matching OIDs, we use > lookup_commit_reference_by_name(). This means that if 012345 is a > branch name, we will apply the fixup commit to the tip of that branch > (if that is present in the todo list). Good finding. I think that both are results from imprecise conversion from the original done carelessly. A rewrite does not necessarily have to be bug-to-bug equivalent, and the precedence between object names vs subject prefix is something I think does not have to be kept the same as the original (in other words, a user who gives a token after "fixup" that is ambiguous between the two deserves whatever the implementation happens to give). But use of _by_name() that does not limit the input to hexadecimal _is_ a problem exactly for the reason why we have this discussion thread. The function accepts more than we want to accept, and anybody who reads the original commit 68d5d03b (rebase: teach --autosquash to match on sha1 in addition to message, 2010-11-04) and understands why we added it wouldn't have used it. Your solution looks somewhat surprising to me, as I would naively have thought to fix the use of _by_name() and limit its use only when the input token is all hexadecimal, or something. I'd need to think more to convince myself why this is the right solution. Thanks. > Demonstrate both behavior changes by adding two test cases for > "fixup! foo" where foo is a commit-ish that is not an OID-prefix. > Arguably, this feature is very weird. If no one uses it we should > consider removing it. > > Regardless, there is one bad edge case to fix. Let refspec "foo" point > to a commit with the subject "fixup! foo". Since rebase --autosquash > finds the fixup target via lookup_commit_reference_by_name(), the > fixup target is the fixup commit itself. Obviously this can't work. > We proceed with the broken invariant and drop the fixup commit > entirely. > > The self-fixup was only allowed because the fixup commit was already > added to the preliminary todo list, which it shouldn't be. Rather, > we should first compute the fixup target and only then add the fixup > commit to the todo list. Make it so, avoiding this error by design, > and add a third test for this case. > > Reported-by: Erik Cervin Edin <erik@cervined.in> > Signed-off-by: Johannes Altmanninger <aclopte@gmail.com> > --- > sequencer.c | 4 ++-- > t/t3415-rebase-autosquash.sh | 44 ++++++++++++++++++++++++++++++++++++ > 2 files changed, 46 insertions(+), 2 deletions(-) > > diff --git a/sequencer.c b/sequencer.c > index 484ca9aa50..777200a6dc 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -6287,8 +6287,6 @@ int todo_list_rearrange_squash(struct todo_list *todo_list) > return error(_("the script was already rearranged.")); > } > > - *commit_todo_item_at(&commit_todo, item->commit) = item; > - > parse_commit(item->commit); > commit_buffer = logmsg_reencode(item->commit, NULL, "UTF-8"); > find_commit_subject(commit_buffer, &subject); > @@ -6355,6 +6353,8 @@ int todo_list_rearrange_squash(struct todo_list *todo_list) > strhash(entry->subject)); > hashmap_put(&subject2item, &entry->entry); > } > + > + *commit_todo_item_at(&commit_todo, item->commit) = item; > } > > if (rearranged) { > diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh > index 78c27496d6..879e628512 100755 > --- a/t/t3415-rebase-autosquash.sh > +++ b/t/t3415-rebase-autosquash.sh > @@ -232,6 +232,50 @@ test_expect_success 'auto squash that matches longer sha1' ' > test_line_count = 1 actual > ' > > +test_expect_success 'auto squash that matches regex' ' > + git reset --hard base && > + git commit --allow-empty -m "hay needle hay" && > + git commit --allow-empty -m "fixup! :/[n]eedle" && > + GIT_SEQUENCE_EDITOR="cat >tmp" git rebase --autosquash -i HEAD^^ && > + sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p}" tmp >actual && > + cat <<-EOF >expect && > + pick HASH hay needle hay # empty > + fixup HASH fixup! :/[n]eedle # empty > + EOF > + test_cmp expect actual > +' > + > +test_expect_success 'auto squash of fixup commit that matches branch name' ' > + git reset --hard base && > + git commit --allow-empty -m "wip commit (just a prefix match so overshadowed by branch)" && > + git commit --allow-empty -m "tip of wip" && > + git branch wip && > + git commit --allow-empty -m "unrelated commit" && > + git commit --allow-empty -m "fixup! wip" && > + GIT_SEQUENCE_EDITOR="cat >tmp" git rebase --autosquash -i HEAD^^^^ && > + sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p}" tmp >actual && > + cat <<-EOF >expect && > + pick HASH wip commit (just a prefix match so overshadowed by branch) # empty > + pick HASH tip of wip # empty > + fixup HASH fixup! wip # empty > + pick HASH unrelated commit # empty > + EOF > + test_cmp expect actual > +' > + > +test_expect_success 'auto squash of fixup commit that matches branch name which points back to fixup commit' ' > + git reset --hard base && > + git commit --allow-empty -m "fixup! self-cycle" && > + git branch self-cycle && > + GIT_SEQUENCE_EDITOR="cat >tmp" git rebase --autosquash -i HEAD^^ && > + sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p}" tmp >actual && > + cat <<-EOF >expect && > + pick HASH second commit > + pick HASH fixup! self-cycle # empty > + EOF > + test_cmp expect actual > +' > + > test_auto_commit_flags () { > git reset --hard base && > echo 1 >file1 &&
Junio C Hamano <gitster@pobox.com> writes: > ... But use of _by_name() that does > not limit the input to hexadecimal _is_ a problem ... Ah, no, sorry, this was wrong. The original used "rev-parse -q --verify" without restricting the "single word" to "sha1 prefix".
Junio C Hamano <gitster@pobox.com> writes: > Your solution looks somewhat surprising to me, as I would naively > have thought to fix the use of _by_name() and limit its use only > when the input token is all hexadecimal, or something. I'd need to > think more to convince myself why this is the right solution. OK, we try to find what to amend with the current "fixupish" from the todo slab, which by definition must be something that we have already dealt with. The current "fixupish" should not be found because we haven't finished dealing with it, so delaying the addition of the current one to the todo slab is a must. There is no leaving or continuing the loop, other than an error return that aborts the whole thing, that may break the resulting todo slab in the normal case due to this change, either. The fix makes sense to me. Will queue. Thanks.
Johannes Altmanninger <aclopte@gmail.com> writes: > +test_expect_success 'auto squash of fixup commit that matches branch name which points back to fixup commit' ' > ... > + sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p}" tmp >actual && The construct is rejected by sed implementations of BSD descent, it seems. https://github.com/git/git/actions/runs/3084784922/jobs/4987337058#step:4:1844 ++ sed -ne '/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p}' tmp sed: 1: "/^[^#]/{s/[0-9a-f]\{7,\ ...": extra characters at the end of p command error: last command exited with $?=1 not ok 11 - auto squash that matches regex Here is a fix-up that can be squashed in. Thanks. t/t3415-rebase-autosquash.sh | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh index 879e628512..d65d2258c3 100755 --- a/t/t3415-rebase-autosquash.sh +++ b/t/t3415-rebase-autosquash.sh @@ -237,7 +237,7 @@ test_expect_success 'auto squash that matches regex' ' git commit --allow-empty -m "hay needle hay" && git commit --allow-empty -m "fixup! :/[n]eedle" && GIT_SEQUENCE_EDITOR="cat >tmp" git rebase --autosquash -i HEAD^^ && - sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p}" tmp >actual && + sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p;}" tmp >actual && cat <<-EOF >expect && pick HASH hay needle hay # empty fixup HASH fixup! :/[n]eedle # empty @@ -253,7 +253,7 @@ test_expect_success 'auto squash of fixup commit that matches branch name' ' git commit --allow-empty -m "unrelated commit" && git commit --allow-empty -m "fixup! wip" && GIT_SEQUENCE_EDITOR="cat >tmp" git rebase --autosquash -i HEAD^^^^ && - sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p}" tmp >actual && + sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p;}" tmp >actual && cat <<-EOF >expect && pick HASH wip commit (just a prefix match so overshadowed by branch) # empty pick HASH tip of wip # empty @@ -268,7 +268,7 @@ test_expect_success 'auto squash of fixup commit that matches branch name which git commit --allow-empty -m "fixup! self-cycle" && git branch self-cycle && GIT_SEQUENCE_EDITOR="cat >tmp" git rebase --autosquash -i HEAD^^ && - sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p}" tmp >actual && + sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p;}" tmp >actual && cat <<-EOF >expect && pick HASH second commit pick HASH fixup! self-cycle # empty
On Mon, Sep 19, 2022 at 09:07:27AM -0700, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > > > ... But use of _by_name() that does > > not limit the input to hexadecimal _is_ a problem ... > > Ah, no, sorry, this was wrong. The original used "rev-parse -q --verify" > without restricting the "single word" to "sha1 prefix". > Yeah, I've sent a v2 that removes the false verdicts from the commit message. Even if we restricted it to SHAs it would still be ambiguous.. I guess it doesn't matter in practice. Thanks
On Mon, Sep 19, 2022 at 04:09:57PM -0700, Junio C Hamano wrote: > Johannes Altmanninger <aclopte@gmail.com> writes: > > > +test_expect_success 'auto squash of fixup commit that matches branch name which points back to fixup commit' ' > > ... > > + sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p}" tmp >actual && FWIW I copied and adapted this one from the remerge-diff tests. Not sure what other tests use. Maybe the HASH replacement is worth a test helper even though it is a heuristic that can go wrong.
diff --git a/sequencer.c b/sequencer.c index 484ca9aa50..777200a6dc 100644 --- a/sequencer.c +++ b/sequencer.c @@ -6287,8 +6287,6 @@ int todo_list_rearrange_squash(struct todo_list *todo_list) return error(_("the script was already rearranged.")); } - *commit_todo_item_at(&commit_todo, item->commit) = item; - parse_commit(item->commit); commit_buffer = logmsg_reencode(item->commit, NULL, "UTF-8"); find_commit_subject(commit_buffer, &subject); @@ -6355,6 +6353,8 @@ int todo_list_rearrange_squash(struct todo_list *todo_list) strhash(entry->subject)); hashmap_put(&subject2item, &entry->entry); } + + *commit_todo_item_at(&commit_todo, item->commit) = item; } if (rearranged) { diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh index 78c27496d6..879e628512 100755 --- a/t/t3415-rebase-autosquash.sh +++ b/t/t3415-rebase-autosquash.sh @@ -232,6 +232,50 @@ test_expect_success 'auto squash that matches longer sha1' ' test_line_count = 1 actual ' +test_expect_success 'auto squash that matches regex' ' + git reset --hard base && + git commit --allow-empty -m "hay needle hay" && + git commit --allow-empty -m "fixup! :/[n]eedle" && + GIT_SEQUENCE_EDITOR="cat >tmp" git rebase --autosquash -i HEAD^^ && + sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p}" tmp >actual && + cat <<-EOF >expect && + pick HASH hay needle hay # empty + fixup HASH fixup! :/[n]eedle # empty + EOF + test_cmp expect actual +' + +test_expect_success 'auto squash of fixup commit that matches branch name' ' + git reset --hard base && + git commit --allow-empty -m "wip commit (just a prefix match so overshadowed by branch)" && + git commit --allow-empty -m "tip of wip" && + git branch wip && + git commit --allow-empty -m "unrelated commit" && + git commit --allow-empty -m "fixup! wip" && + GIT_SEQUENCE_EDITOR="cat >tmp" git rebase --autosquash -i HEAD^^^^ && + sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p}" tmp >actual && + cat <<-EOF >expect && + pick HASH wip commit (just a prefix match so overshadowed by branch) # empty + pick HASH tip of wip # empty + fixup HASH fixup! wip # empty + pick HASH unrelated commit # empty + EOF + test_cmp expect actual +' + +test_expect_success 'auto squash of fixup commit that matches branch name which points back to fixup commit' ' + git reset --hard base && + git commit --allow-empty -m "fixup! self-cycle" && + git branch self-cycle && + GIT_SEQUENCE_EDITOR="cat >tmp" git rebase --autosquash -i HEAD^^ && + sed -ne "/^[^#]/{s/[0-9a-f]\{7,\}/HASH/g;p}" tmp >actual && + cat <<-EOF >expect && + pick HASH second commit + pick HASH fixup! self-cycle # empty + EOF + test_cmp expect actual +' + test_auto_commit_flags () { git reset --hard base && echo 1 >file1 &&
Commit 68d5d03bc4 (rebase: teach --autosquash to match on sha1 in addition to message, 2010-11-04) made --autosquash apply a commit with subject "fixup! 012345" to the first commit in the todo list whose OID starts with 012345. So far so good. More recently, c44a4c650c (rebase -i: rearrange fixup/squash lines using the rebase--helper, 2017-07-14) reimplemented this logic in C and introduced two behavior changes. First, OID matches are given precedence over subject prefix matches. Second, instead of prefix-matching OIDs, we use lookup_commit_reference_by_name(). This means that if 012345 is a branch name, we will apply the fixup commit to the tip of that branch (if that is present in the todo list). Both behavior changes might be motivated by performance concerns (since the commit message mentions performance). Looking through the todo list to find a commit that matches the given prefix can be more expensive than looking up an OID. The runtime of the former is of O(n*m) where n is the size of the todo list and m is the length of a commit subject. However, if this is really a problem, we could easily make it O(m) by constructing a trie (prefix tree). Demonstrate both behavior changes by adding two test cases for "fixup! foo" where foo is a commit-ish that is not an OID-prefix. Arguably, this feature is very weird. If no one uses it we should consider removing it. Regardless, there is one bad edge case to fix. Let refspec "foo" point to a commit with the subject "fixup! foo". Since rebase --autosquash finds the fixup target via lookup_commit_reference_by_name(), the fixup target is the fixup commit itself. Obviously this can't work. We proceed with the broken invariant and drop the fixup commit entirely. The self-fixup was only allowed because the fixup commit was already added to the preliminary todo list, which it shouldn't be. Rather, we should first compute the fixup target and only then add the fixup commit to the todo list. Make it so, avoiding this error by design, and add a third test for this case. Reported-by: Erik Cervin Edin <erik@cervined.in> Signed-off-by: Johannes Altmanninger <aclopte@gmail.com> --- sequencer.c | 4 ++-- t/t3415-rebase-autosquash.sh | 44 ++++++++++++++++++++++++++++++++++++ 2 files changed, 46 insertions(+), 2 deletions(-)