Message ID | d0b577825124ac684ab304d3a1395f3d2d0708e8.1662333027.git.matheus.bernardino@usp.br (mailing list archive) |
---|---|
State | Superseded |
Series | format-patch: warn if commit msg contains a patch delimiter |
On Sun, Sep 04 2022, Matheus Tavares wrote:

> When applying a patch, `git am` looks for special delimiter strings
> (such as "---") to know where the message ends and the actual diff
> starts. If one of these strings appears in the commit message itself,
> `am` might get confused and fail to apply the patch properly. This has
> already caused inconveniences in the past [1][2]. To help avoid such
> problems, let's make `git format-patch` warn on commit messages
> containing one of the said strings.
>
> [1]: https://lore.kernel.org/git/20210113085846-mutt-send-email-mst@kernel.org/
> [2]: https://lore.kernel.org/git/16297305.cDA1TJNmNo@earendil/

I followed this topic with one eye, and have run into this myself in the
past. I'm not against this warning, but I wonder if we can't fix
"am/apply" to just be smarter. The cases I've seen are all ones where:

 * We have a copy/pasted git diff, but we could disambiguate based on
   (at least) the "---" line being a telltale for the "real" patch, and
   the "X file changed..." diffstat.

 * We have a not-quite-git-looking patch diff in the commit message
   (which we'd normally detect and apply), as in your [2].

Couldn't we just be a bit smarter about applying these, and do a
look-ahead and find what the user meant.

In any case, having such a warning won't "settle" this issue, as we're
able to deal with this non-ambiguity in commit objects/the push/fetch
protocol. It's just "format-patch/am" as a "wire protocol" that has this
issue.

But anyway, that's the state of the world now, so warning() about it is
fair, even if we had a fix for the "apply" part we might want to warn
for a while to note that it's an issue on older gits.

> +		if (pp->check_in_body_patch_breaks) {
> +			strbuf_reset(&linebuf);
> +			strbuf_add(&linebuf, line, linelen);
> +			if (patchbreak(&linebuf) || is_scissors_line(linebuf.buf)) {
> +				strbuf_strip_suffix(&linebuf, "\n");

Hrm, it's a (small) shame that the patchbreak() function takes a "struct
strbuf" rather than a char */size_t in this case (seemingly for no good
reason, as it's "const"?).

Because of that you need to make a copy here, instead of just finding
the "\n" and using the %.*s format, anyway, small potatoes.

> +				warning("commit message has a patch delimiter: '%s'",
> +					linebuf.buf);

Missing _()?

> +test_expect_success 'warn if commit message contains patch delimiter' '
> +	>delim &&
> +	git add delim &&
> +	GIT_EDITOR="printf \"title\n\n---\" >" git commit &&

Maybe I'm missing something, but isn't this GIT_EDITOR/printf just
another way of saying something like:

	cat >msg <<-\EOF &&
	title

	---
	EOF
	git commit -F msg &&
	...

Untested, so maybe not..
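To illustrate the "find the newline and print with a precision" idea from the review above, here is a minimal sketch. It assumes a hypothetical helper, format_delimiter_warning(), which is not part of git, and it writes into a caller-supplied buffer with snprintf() instead of calling warning(). It only covers the printing half: with the posted patch, patchbreak() would still need a strbuf.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: format the warning for a line
 * given as pointer + length, without copying the line first.
 * The line is cut at its first newline (if any), and "%.*s" prints at
 * most that many bytes, so no NUL terminator is needed at that spot.
 */
static int format_delimiter_warning(char *out, size_t outsz,
				    const char *line, size_t linelen)
{
	const char *nl = memchr(line, '\n', linelen);
	size_t n = nl ? (size_t)(nl - line) : linelen;

	return snprintf(out, outsz,
			"commit message has a patch delimiter: '%.*s'",
			(int)n, line);
}
```

Note that the precision form is "%.*s"; a plain "%*s" would set a minimum field width rather than limit how many bytes are read.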
On 05.09.22 at 10:01, Ævar Arnfjörð Bjarmason wrote:
>
> On Sun, Sep 04 2022, Matheus Tavares wrote:
>
>> When applying a patch, `git am` looks for special delimiter strings
>> (such as "---") to know where the message ends and the actual diff
>> starts. If one of these strings appears in the commit message itself,
>> `am` might get confused and fail to apply the patch properly. This has
>> already caused inconveniences in the past [1][2]. To help avoid such
>> problems, let's make `git format-patch` warn on commit messages
>> containing one of the said strings.
>>
>> [1]: https://lore.kernel.org/git/20210113085846-mutt-send-email-mst@kernel.org/
>> [2]: https://lore.kernel.org/git/16297305.cDA1TJNmNo@earendil/
>
> I followed this topic with one eye, and have run into this myself in the
> past. I'm not against this warning, but I wonder if we can't fix
> "am/apply" to just be smarter. The cases I've seen are all ones where:
>
>  * We have a copy/pasted git diff, but we could disambiguate based on
>    (at least) the "---" line being a telltale for the "real" patch, and
>    the "X file changed..." diffstat.
>  * We have a not-quite-git-looking patch diff in the commit message
>    (which we'd normally detect and apply), as in your [2].
>
> Couldn't we just be a bit smarter about applying these, and do a
> look-ahead and find what the user meant.

Whatever we use to separate message from diff can be included in that
message by an unsuspecting user and "---" can be part of a diff.

An earlier discussion yielded an idea, but no implementation:
https://lore.kernel.org/git/20200204010524-mutt-send-email-mst@kernel.org/

> In any case, having such a warning won't "settle" this issue, as we're
> able to deal with this non-ambiguity in commit objects/the push/fetch
> protocol. It's just "format-patch/am" as a "wire protocol" that has this
> issue.
>
> But anyway, that's the state of the world now, so warning() about it is
> fair, even if we had a fix for the "apply" part we might want to warn
> for a while to note that it's an issue on older gits.
>
>> +		if (pp->check_in_body_patch_breaks) {
>> +			strbuf_reset(&linebuf);
>> +			strbuf_add(&linebuf, line, linelen);
>> +			if (patchbreak(&linebuf) || is_scissors_line(linebuf.buf)) {
>> +				strbuf_strip_suffix(&linebuf, "\n");
>
> Hrm, it's a (small) shame that the patchbreak() function takes a "struct
> strbuf" rather than a char */size_t in this case (seemingly for no good
> reason, as it's "const"?).

A strbuf is NUL-terminated, a length-limited string (char */size_t)
doesn't have to be. That means the current implementation can use
functions like starts_with(), but a faithful version that promises to
stay within a given length cannot. So the reason is probably
convenience.

With skip_prefix_mem() it wouldn't be that bad, though:

---
 mailinfo.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index 9621ba62a3..ae2e70e363 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -646,32 +646,30 @@ static void decode_transfer_encoding(struct mailinfo *mi, struct strbuf *line)
 	free(ret);
 }
 
-static inline int patchbreak(const struct strbuf *line)
+static int patchbreak(const char *buf, size_t len)
 {
-	size_t i;
-
 	/* Beginning of a "diff -" header? */
-	if (starts_with(line->buf, "diff -"))
+	if (skip_prefix_mem(buf, len, "diff -", &buf, &len))
 		return 1;
 
 	/* CVS "Index: " line? */
-	if (starts_with(line->buf, "Index: "))
+	if (skip_prefix_mem(buf, len, "Index: ", &buf, &len))
 		return 1;
 
 	/*
 	 * "--- <filename>" starts patches without headers
 	 * "---<sp>*" is a manual separator
 	 */
-	if (line->len < 4)
+	if (len < 4)
 		return 0;
 
-	if (starts_with(line->buf, "---")) {
+	if (skip_prefix_mem(buf, len, "---", &buf, &len)) {
 		/* space followed by a filename? */
-		if (line->buf[3] == ' ' && !isspace(line->buf[4]))
+		if (len > 1 && buf[0] == ' ' && !isspace(buf[1]))
 			return 1;
 		/* Just whitespace? */
-		for (i = 3; i < line->len; i++) {
-			unsigned char c = line->buf[i];
+		for (; len; buf++, len--) {
+			unsigned char c = buf[0];
 			if (c == '\n')
 				return 1;
 			if (!isspace(c))
@@ -682,14 +680,14 @@ static inline int patchbreak(const struct strbuf *line)
 	return 0;
 }
 
-static int is_scissors_line(const char *line)
+static int is_scissors_line(const char *line, size_t len)
 {
 	const char *c;
 	int scissors = 0, gap = 0;
 	const char *first_nonblank = NULL, *last_nonblank = NULL;
 	int visible, perforation = 0, in_perforation = 0;
 
-	for (c = line; *c; c++) {
+	for (c = line; len; c++, len--) {
 		if (isspace(*c)) {
 			if (in_perforation) {
 				perforation++;
@@ -705,12 +703,14 @@ static int is_scissors_line(const char *line)
 			perforation++;
 			continue;
 		}
-		if (starts_with(c, ">8") || starts_with(c, "8<") ||
-		    starts_with(c, ">%") || starts_with(c, "%<")) {
+		if (skip_prefix_mem(c, len, ">8", &c, &len) ||
+		    skip_prefix_mem(c, len, "8<", &c, &len) ||
+		    skip_prefix_mem(c, len, ">%", &c, &len) ||
+		    skip_prefix_mem(c, len, "%<", &c, &len)) {
 			in_perforation = 1;
 			perforation += 2;
 			scissors += 2;
-			c++;
+			c--, len++;
 			continue;
 		}
 		in_perforation = 0;
@@ -747,7 +747,8 @@ static int check_inbody_header(struct mailinfo *mi, const struct strbuf *line)
 {
 	if (mi->inbody_header_accum.len &&
 	    (line->buf[0] == ' ' || line->buf[0] == '\t')) {
-		if (mi->use_scissors && is_scissors_line(line->buf)) {
+		if (mi->use_scissors &&
+		    is_scissors_line(line->buf, line->len)) {
 			/*
 			 * This is a scissors line; do not consider this line
 			 * as a header continuation line.
@@ -808,7 +809,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 	if (convert_to_utf8(mi, line, mi->charset.buf))
 		return 0; /* mi->input_error already set */
 
-	if (mi->use_scissors && is_scissors_line(line->buf)) {
+	if (mi->use_scissors && is_scissors_line(line->buf, line->len)) {
 		int i;
 
 		strbuf_setlen(&mi->log_message, 0);
@@ -826,7 +827,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 		return 0;
 	}
 
-	if (patchbreak(line)) {
+	if (patchbreak(line->buf, line->len)) {
 		if (mi->message_id)
 			strbuf_addf(&mi->log_message,
 				    "Message-Id: %s\n", mi->message_id);
-- 
2.37.2
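The point about NUL termination can be shown in isolation. The sketch below is a standalone stand-in modeled on how skip_prefix_mem() is used in the diff above; the names sketch_skip_prefix_mem() and starts_diff_header() are hypothetical and exist only for this example. It compares at most len bytes of the buffer, so it is safe on data that has no terminating NUL, which is exactly where starts_with() could read past the end.

```c
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for a length-limited prefix check (hypothetical name).
 * Returns 1 and stores the remainder in *out / *outlen if buf[0..len)
 * starts with the NUL-terminated prefix; returns 0 and leaves the
 * outputs untouched otherwise. Never reads beyond buf + len.
 */
static int sketch_skip_prefix_mem(const char *buf, size_t len,
				  const char *prefix,
				  const char **out, size_t *outlen)
{
	size_t prefix_len = strlen(prefix);

	if (prefix_len <= len && !memcmp(buf, prefix, prefix_len)) {
		*out = buf + prefix_len;
		*outlen = len - prefix_len;
		return 1;
	}
	return 0;
}

/* Hypothetical caller: is this length-limited line a "diff -" header? */
static int starts_diff_header(const char *buf, size_t len)
{
	return sketch_skip_prefix_mem(buf, len, "diff -", &buf, &len);
}
```

Because the remainder is handed back through the out parameters, a caller can keep scanning from just past the prefix, which is what the "---" branch in the diff relies on.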
diff --git a/builtin/log.c b/builtin/log.c
index 56e2d95e86..edc84abaef 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1973,6 +1973,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	rev.diffopt.flags.recursive = 1;
 	rev.diffopt.no_free = 1;
 	rev.subject_prefix = fmt_patch_subject_prefix;
+	rev.check_in_body_patch_breaks = 1;
 	memset(&s_r_opt, 0, sizeof(s_r_opt));
 	s_r_opt.def = "HEAD";
 	s_r_opt.revarg_opt = REVARG_COMMITTISH;
diff --git a/log-tree.c b/log-tree.c
index 3e8c70ddcf..25ed5452b1 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -766,6 +766,7 @@ void show_log(struct rev_info *opt)
 	ctx.after_subject = extra_headers;
 	ctx.preserve_subject = opt->preserve_subject;
 	ctx.encode_email_headers = opt->encode_email_headers;
+	ctx.check_in_body_patch_breaks = opt->check_in_body_patch_breaks;
 	ctx.reflog_info = opt->reflog_info;
 	ctx.fmt = opt->commit_format;
 	ctx.mailmap = opt->mailmap;
diff --git a/mailinfo.c b/mailinfo.c
index 9621ba62a3..9945ea6267 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -646,7 +646,7 @@ static void decode_transfer_encoding(struct mailinfo *mi, struct strbuf *line)
 	free(ret);
 }
 
-static inline int patchbreak(const struct strbuf *line)
+int patchbreak(const struct strbuf *line)
 {
 	size_t i;
 
@@ -682,7 +682,7 @@ static inline int patchbreak(const struct strbuf *line)
 	return 0;
 }
 
-static int is_scissors_line(const char *line)
+int is_scissors_line(const char *line)
 {
 	const char *c;
 	int scissors = 0, gap = 0;
diff --git a/mailinfo.h b/mailinfo.h
index f2ffd0349e..8d4dda5deb 100644
--- a/mailinfo.h
+++ b/mailinfo.h
@@ -53,4 +53,7 @@ void setup_mailinfo(struct mailinfo *);
 int mailinfo(struct mailinfo *, const char *msg, const char *patch);
 void clear_mailinfo(struct mailinfo *);
 
+int patchbreak(const struct strbuf *line);
+int is_scissors_line(const char *line);
+
 #endif /* MAILINFO_H */
diff --git a/pretty.c b/pretty.c
index 6d819103fb..9f999029f5 100644
--- a/pretty.c
+++ b/pretty.c
@@ -5,6 +5,7 @@
 #include "diff.h"
 #include "revision.h"
 #include "string-list.h"
+#include "mailinfo.h"
 #include "mailmap.h"
 #include "log-tree.h"
 #include "notes.h"
@@ -2097,7 +2098,8 @@ void pp_remainder(struct pretty_print_context *pp,
 {
 	struct grep_opt *opt = pp->rev ?
 		&pp->rev->grep_filter : NULL;
-	int first = 1;
+	int first = 1, found_delimiter = 0;
+	struct strbuf linebuf = STRBUF_INIT;
 
 	for (;;) {
 		const char *line = *msg_p;
@@ -2107,6 +2109,17 @@ void pp_remainder(struct pretty_print_context *pp,
 		if (!linelen)
 			break;
 
+		if (pp->check_in_body_patch_breaks) {
+			strbuf_reset(&linebuf);
+			strbuf_add(&linebuf, line, linelen);
+			if (patchbreak(&linebuf) || is_scissors_line(linebuf.buf)) {
+				strbuf_strip_suffix(&linebuf, "\n");
+				warning("commit message has a patch delimiter: '%s'",
+					linebuf.buf);
+				found_delimiter = 1;
+			}
+		}
+
 		if (is_blank_line(line, &linelen)) {
 			if (first)
 				continue;
@@ -2133,6 +2146,12 @@ void pp_remainder(struct pretty_print_context *pp,
 		}
 		strbuf_addch(sb, '\n');
 	}
+
+	if (found_delimiter)
+		warning("git am might fail to apply this patch. "
+			"Consider indenting the offending lines.");
+
+	strbuf_release(&linebuf);
 }
 
 void pretty_print_commit(struct pretty_print_context *pp,
diff --git a/pretty.h b/pretty.h
index f34e24c53a..12df2f4a39 100644
--- a/pretty.h
+++ b/pretty.h
@@ -49,7 +49,8 @@ struct pretty_print_context {
 	struct string_list *mailmap;
 	int color;
 	struct ident_split *from_ident;
-	unsigned encode_email_headers:1;
+	unsigned encode_email_headers:1,
+		 check_in_body_patch_breaks:1;
 	struct pretty_print_describe_status *describe_status;
 
 	/*
diff --git a/revision.h b/revision.h
index 61a9b1316b..f384ab716f 100644
--- a/revision.h
+++ b/revision.h
@@ -230,7 +230,8 @@ struct rev_info {
 			date_mode_explicit:1,
 			preserve_subject:1,
 			encode_email_headers:1,
-			include_header:1;
+			include_header:1,
+			check_in_body_patch_breaks:1;
 	unsigned int	disable_stdin:1;
 	/* --show-linear-break */
 	unsigned int	track_linear:1,
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index fbec8ad2ef..4868ea2b91 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -2329,4 +2329,20 @@ test_expect_success 'interdiff: solo-patch' '
 	test_cmp expect actual
 '
 
+test_expect_success 'warn if commit message contains patch delimiter' '
+	>delim &&
+	git add delim &&
+	GIT_EDITOR="printf \"title\n\n---\" >" git commit &&
+	git format-patch -1 2>stderr &&
+	grep "warning: commit message has a patch delimiter" stderr
+'
+
+test_expect_success 'warn if commit message contains scissors' '
+	>scissors &&
+	git add scissors &&
+	GIT_EDITOR="printf \"title\n\n-- >8 --\" >" git commit &&
+	git format-patch -1 2>stderr &&
+	grep "warning: commit message has a patch delimiter" stderr
+'
+
 test_done
When applying a patch, `git am` looks for special delimiter strings
(such as "---") to know where the message ends and the actual diff
starts. If one of these strings appears in the commit message itself,
`am` might get confused and fail to apply the patch properly. This has
already caused inconveniences in the past [1][2]. To help avoid such
problems, let's make `git format-patch` warn on commit messages
containing one of the said strings.

[1]: https://lore.kernel.org/git/20210113085846-mutt-send-email-mst@kernel.org/
[2]: https://lore.kernel.org/git/16297305.cDA1TJNmNo@earendil/

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/log.c           |  1 +
 log-tree.c              |  1 +
 mailinfo.c              |  4 ++--
 mailinfo.h              |  3 +++
 pretty.c                | 21 ++++++++++++++++++++-
 pretty.h                |  3 ++-
 revision.h              |  3 ++-
 t/t4014-format-patch.sh | 16 ++++++++++++++++
 8 files changed, 47 insertions(+), 5 deletions(-)
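The kind of check described in the commit message can be sketched outside of git as follows. This is a simplified, standalone approximation, not the patch's implementation: the function names are hypothetical, scissors lines ("-- >8 --") are not handled, and, unlike the patchbreak() code discussed in this thread, a bare "---" counts even without a trailing newline, because lines are split on "\n" before being tested.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Does line[0..len) start with the NUL-terminated prefix? */
static int has_prefix(const char *line, size_t len, const char *prefix)
{
	size_t plen = strlen(prefix);

	return plen <= len && !memcmp(line, prefix, plen);
}

/*
 * Simplified delimiter rules (hypothetical helper): a "diff -" header,
 * a CVS "Index: " line, "--- <filename>", or "---" followed only by
 * whitespace. The line is passed without its trailing newline.
 */
static int looks_like_patch_break(const char *line, size_t len)
{
	size_t i;

	if (has_prefix(line, len, "diff -") || has_prefix(line, len, "Index: "))
		return 1;
	if (!has_prefix(line, len, "---"))
		return 0;
	if (len > 4 && line[3] == ' ' && !isspace((unsigned char)line[4]))
		return 1;
	for (i = 3; i < len; i++)
		if (!isspace((unsigned char)line[i]))
			return 0;
	return 1;
}

/* Count the lines of msg[0..len) that look like patch delimiters. */
static size_t count_patch_delimiters(const char *msg, size_t len)
{
	size_t count = 0;

	while (len) {
		const char *nl = memchr(msg, '\n', len);
		size_t linelen = nl ? (size_t)(nl - msg) : len;

		if (looks_like_patch_break(msg, linelen))
			count++;
		if (!nl)
			break;
		len -= linelen + 1;
		msg = nl + 1;
	}
	return count;
}
```

An indented "  ---" line is not counted, since every rule is anchored at the start of the line. That matches the advice in the patch's second warning to indent the offending lines.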