From patchwork Mon Apr 1 21:53:33 2019
X-Patchwork-Id: 10880705
From: Andrei Rybak
To: git@vger.kernel.org
Cc: SZEDER Gábor, Junio C Hamano, Jeff King, Duy Nguyen
Subject: [PATCH v2 1/2] mailinfo: use starts_with() for clarity
Date: Mon, 1 Apr 2019 23:53:33 +0200
Message-Id: <20190401215334.18678-1-rybak.a.v@gmail.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190331220104.31628-1-rybak.a.v@gmail.com>
References: <20190331220104.31628-1-rybak.a.v@gmail.com>

Existing checks using memcmp(3) never read past the end of the line,
because all substrings we are interested in are two characters long, and
the outer loop guarantees we have at least one character.  So at most we
will look at the NUL.

However, this is too subtle and may lead to bugs in code which copies
this behavior without realizing the substring length requirement.  So
use starts_with() instead, which will stop at NUL regardless of the
length of the prefix.  Remove an extra pair of parentheses while we are
here.

Helped-by: Jeff King
Signed-off-by: Andrei Rybak
---

On Mon, Apr 01, 2019 at 06:11:57 -0400, Jeff King wrote:

> I wonder if it's worth re-writing it like:

Turned Peff's suggestion into a patch.

 mailinfo.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index b395adbdf2..f4aaa89788 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -693,8 +693,8 @@ static int is_scissors_line(const char *line)
 			perforation++;
 			continue;
 		}
-		if ((!memcmp(c, ">8", 2) || !memcmp(c, "8<", 2) ||
-		     !memcmp(c, ">%", 2) || !memcmp(c, "%<", 2))) {
+		if (starts_with(c, ">8") || starts_with(c, "8<") ||
+		    starts_with(c, ">%") || starts_with(c, "%<")) {
 			in_perforation = 1;
 			perforation += 2;
 			scissors += 2;

From patchwork Mon Apr 1 21:53:34 2019
X-Patchwork-Id: 10880707
From: Andrei Rybak
To: git@vger.kernel.org
Cc: SZEDER Gábor, Junio C Hamano, Jeff King, Duy Nguyen
Subject: [PATCH v2 2/2] mailinfo: support Unicode scissors
Date: Mon, 1 Apr 2019 23:53:34 +0200
Message-Id: <20190401215334.18678-2-rybak.a.v@gmail.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190401215334.18678-1-rybak.a.v@gmail.com>
References: <20190331220104.31628-1-rybak.a.v@gmail.com>
 <20190401215334.18678-1-rybak.a.v@gmail.com>

Thank you all for the review.  Below is the second version of the
original patch, addressing comments by Gábor and Peff.

While preparing v2 I found out that U+2702 was already suggested on the
list eight months before cutting at perforation lines was implemented:

  https://public-inbox.org/git/200901181656.37813.markus.heidelberg@web.de/T/#m3856d2e5c5f3e1900210b74bf2be8851b92d2271

---- >8 ----
Subject: [PATCH v2 2/2] mailinfo: support Unicode scissors
Date: Mon, 1 Apr 2019 00:00:00 +0000

'git am --scissors' allows cutting a patch from an email at a scissors
line.  Such a line should contain perforation, i.e. hyphens, and a
scissors symbol.
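To make the rule above concrete, here is a minimal sketch in C of
"perforation plus a scissors symbol".  It is not git's implementation:
has_prefix() and looks_like_scissors_line() are hypothetical names made
up for this illustration, and the real is_scissors_line() additionally
keeps counters such as 'perforation' and 'scissors' (visible in the
diffs of this series) that this sketch ignores.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical local helper standing in for git's starts_with().
 * strncmp() stops at the first NUL, so this never reads past the
 * end of 's', whatever the length of 'prefix'.
 */
static bool has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/*
 * Simplified check (an assumption for illustration only): a line
 * qualifies if it contains at least one hyphen and at least one of
 * the ASCII scissors symbols.
 */
static bool looks_like_scissors_line(const char *line)
{
	bool saw_hyphen = false, saw_scissors = false;
	const char *c;

	for (c = line; *c; c++) {
		if (*c == '-') {
			saw_hyphen = true;
		} else if (has_prefix(c, ">8") || has_prefix(c, "8<") ||
			   has_prefix(c, ">%") || has_prefix(c, "%<")) {
			saw_scissors = true;
			c++; /* step over the second byte of the symbol */
		}
	}
	return saw_hyphen && saw_scissors;
}
```

With this sketch, a line such as "-- >8 --" qualifies, while a line of
only hyphens, or a bare ">8" with no perforation, does not.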
Only the ASCII-art scissors '8<', '>8', '%<' and '>%' are recognized by
the 'git am --scissors' command at the moment.

The Unicode character 'BLACK SCISSORS' (U+2702) has been part of Unicode
since version 1.0.0 [1].  It later also became part of the Emoji 1.0
character set, published in 2015 [2].  With its adoption as an emoji,
the availability of this character on keyboards has increased.

Support the UTF-8 encoding of '✂' in the function is_scissors_line(), so
that 'git am --scissors' can cut at Unicode perforation lines in emails.
Note that '✂' is three bytes long in UTF-8 and is spelled out in the
source using hexadecimal escape sequences.

1. https://www.unicode.org/versions/Unicode1.0.0/CodeCharts1.pdf
   https://www.unicode.org/Public/reconstructed/1.0.0/UnicodeData.txt
2. https://unicode.org/Public/emoji/1.0/emoji-data.txt

Signed-off-by: Andrei Rybak
---
 mailinfo.c    |  7 +++++++
 t/t4150-am.sh | 26 +++++++++++++++++++++++++-
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/mailinfo.c b/mailinfo.c
index f4aaa89788..804b07cd8a 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -701,6 +701,13 @@ static int is_scissors_line(const char *line)
 			c++;
 			continue;
 		}
+		if (starts_with(c, "\xE2\x9C\x82" /* U+2702 ✂ in UTF-8 */)) {
+			in_perforation = 1;
+			perforation += 3;
+			scissors += 3;
+			c += 2;
+			continue;
+		}
 		in_perforation = 0;
 	}
 
diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index 3f7f750cc8..3ea8e8a2cf 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -77,12 +77,20 @@ test_expect_success 'setup: messages' '
 
 	printf "Subject: " >subject-prefix &&
 
-	cat - subject-prefix msg-without-scissors-line >msg-with-scissors-line <<-\EOF
+	cat - subject-prefix msg-without-scissors-line >msg-with-scissors-line <<-\EOF &&
 	This line should not be included in the commit message with --scissors enabled.
 
 	 - - >8 - - remove everything above this line - - >8 - -
 
 	EOF
+
+	cat - subject-prefix msg-without-scissors-line >msg-with-unicode-scissors <<-\EOF
+	Lines above unicode scissors line should not be included in the commit
+	message with --scissors enabled.
+
+	 - - - ✂ - - - ✂ - - -
+
+	EOF
 '
 
 test_expect_success setup '
@@ -161,6 +169,12 @@ test_expect_success setup '
 	git format-patch --stdout expected-for-no-scissors^ >patch-with-scissors-line.eml &&
 	git reset --hard HEAD^ &&
 
+	echo file >file &&
+	git add file &&
+	git commit -F msg-with-unicode-scissors &&
+	git format-patch --stdout HEAD^ >patch-with-unicode-scissors.eml &&
+	git reset --hard HEAD^ &&
+
 	sed -n -e "3,\$p" msg >file &&
 	git add file &&
 	test_tick &&
@@ -421,6 +435,16 @@ test_expect_success 'am --scissors cuts the message at the scissors line' '
 	test_cmp_rev expected-for-scissors HEAD
 '
 
+test_expect_success 'am --scissors cuts the message at the unicode scissors line' '
+	rm -fr .git/rebase-apply &&
+	git reset --hard &&
+	git checkout second &&
+	git am --scissors patch-with-unicode-scissors.eml &&
+	test_path_is_missing .git/rebase-apply &&
+	git diff --exit-code expected-for-scissors &&
+	test_cmp_rev expected-for-scissors HEAD
+'
+
 test_expect_success 'am --no-scissors overrides mailinfo.scissors' '
 	rm -fr .git/rebase-apply &&
 	git reset --hard &&