From patchwork Sat Jul 20 16:01:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phillip Wood <phillip.wood@dunelm.org.uk>
X-Patchwork-Id: 13737842
Received: from mail-lj1-f176.google.com (mail-lj1-f176.google.com
 [209.85.208.176])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 22DFD1E502
	for <git@vger.kernel.org>; Sat, 20 Jul 2024 16:02:06 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=209.85.208.176
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1721491328; cv=none;
 b=a5PDXoC7yOnmbfsMWThuPtC+OfKAh8QrIyzjCIEuAoKbvGBIJy0xqg5TRhBKanP5a4tvyiLQ6qmTF8omFD5G557g9AVTOHhCH7IJL747WdVsw38f3IAGiP2YhNPWxpRsnhTE46wnaMzFIsFs6/3Su1hi6vyfRDV6b+8BGykdLc0=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1721491328; c=relaxed/simple;
	bh=3Zbb5H+IUJVXRB7jU1K2M78y/HH5BfOY4pasOivDnjo=;
	h=Message-Id:In-Reply-To:References:From:Date:Subject:Content-Type:
	 MIME-Version:To:Cc;
 b=MtF4g2/L41GZa3Xj5SLxbktQvP/knP52WDvq2k2tPKHzXtQ2jDtbX3MQV253SwNE2Qj3bcgIb48kVqi3e+HUPdYOpUqOdPPTI5IZy3fTUX6olYlzcDTTctjKUs+Pfdh3uePjJt14nRZhizy/v5J4DREvJ0aF6LAkuZX3eHA4ZGs=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=gmail.com;
 spf=pass smtp.mailfrom=gmail.com;
 dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com
 header.b=me8f82Zq; arc=none smtp.client-ip=209.85.208.176
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com
 header.b="me8f82Zq"
Received: by mail-lj1-f176.google.com with SMTP id
 38308e7fff4ca-2ebe40673d8so36535901fa.3
        for <git@vger.kernel.org>; Sat, 20 Jul 2024 09:02:06 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20230601; t=1721491325; x=1722096125;
 darn=vger.kernel.org;
        h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from
         :references:in-reply-to:message-id:from:to:cc:subject:date
         :message-id:reply-to;
        bh=iTpePuwPS09EDc+EhZPO7bdloD+hfzdeJPMlvWdxeyU=;
        b=me8f82Zqn8YK3R7ZnPFVh8smmHKEghRDu8e5kVw7xXNmhTk0+KIp+05R5kfrMnDCL7
         O6+NtTdpjZJ3H3DfJKEGNWEeywTLLNLmHMVaz1wXObCi/e2HZtHFhHIjaVcCzetvArdb
         28Ni08LhOC2RR004h25l5rpBSjVmuKBc3Hp+IM5URFR5CwSD6FtKMfZYvCRMAONajZ65
         tqgLAc+T/KTNSMJVBOys2aXVxITx5fMJjgHqAwlael6/WE+cpkSrs+eI3Knm1spfi6bC
         LiILmVjm67tERqiU5cqxHRjmfCQY4Nme+UYwfktsvG7e23quiu3RX2TGgiKygwr9g21h
         +C9g==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1721491325; x=1722096125;
        h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from
         :references:in-reply-to:message-id:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=iTpePuwPS09EDc+EhZPO7bdloD+hfzdeJPMlvWdxeyU=;
        b=XMMfDYBRmbFFL7zyIdev669eQGLMsSMeGY0aO4pTxv5SkaLUi9qT3eC68f6BcmrTDz
         4sho2/2eHc76DCRI/QK+W8cpxqi8g//dqPS7wnMfkluEm5EGS1kb5v1KIgZfjPwgqL2Q
         Ox0lCE7A6+nMpnXZb2+p2z+22S02o5Ab1CuM+ZqnKHsh+aJ3YANKsBBUhECBhMROSNEx
         YUfXYFQMj9vN/e0vNJCJmU98C0NkcbDnRLP46W+WhOGSwNzwJRUreNitzCblk90KecxW
         k/G/FcW/OpGrp40Yjy63XLpZry+URkraNoewYWZK/7bRP2udHuSvDkJiOctTQe2pYVAi
         0WdA==
X-Gm-Message-State: AOJu0Yzz04d52llt2tsGvhOTuXWNE1H9ErUWL+P3a+BP5WFZ7IhfjJen
	jfoJ7A2W9iynh/odhhQ1OUT88uKMhuWKpPFp/WmyTSFd8xPGQXhz4aJnNQ==
X-Google-Smtp-Source: 
 AGHT+IHA7jds5/144Z5jRvHeVM6jeNz9A/BPqykgjwJD0lO7v7PKPkwfohYUmMVTq7owj67jWSWizQ==
X-Received: by 2002:a2e:8ec9:0:b0:2ef:2658:98f2 with SMTP id
 38308e7fff4ca-2ef265899ebmr4787801fa.33.1721491324614;
        Sat, 20 Jul 2024 09:02:04 -0700 (PDT)
Received: from [127.0.0.1] ([13.74.141.28])
        by smtp.gmail.com with ESMTPSA id
 5b1f17b1804b1-427d2a3b763sm89204965e9.10.2024.07.20.09.02.03
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Sat, 20 Jul 2024 09:02:03 -0700 (PDT)
Message-Id: 
 <34d8fd44a97efd5a36003823f7db853291a2543c.1721491320.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.1763.v2.git.1721491320.gitgitgadget@gmail.com>
References: <pull.1763.git.1721312619822.gitgitgadget@gmail.com>
	<pull.1763.v2.git.1721491320.gitgitgadget@gmail.com>
Date: Sat, 20 Jul 2024 16:01:59 +0000
Subject: [PATCH v2 1/2] add-patch: handle splitting hunks with
 diff.suppressBlankEmpty
Fcc: Sent
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
List-Id: <git.vger.kernel.org>
List-Subscribe: <mailto:git+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:git+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>,
    Junio C Hamano <gitster@pobox.com>,
    Phillip Wood <phillip.wood123@gmail.com>,
    Phillip Wood <phillip.wood@dunelm.org.uk>,
    Phillip Wood <phillip.wood@dunelm.org.uk>
From: Phillip Wood <phillip.wood@dunelm.org.uk>

From: Phillip Wood <phillip.wood@dunelm.org.uk>

When "add -p" parses diffs, it looks for context lines starting with a
single space. But when diff.suppressBlankEmpty is in effect, an empty
context line will omit the space, giving us a true empty line. This
confuses the parser, which is unable to split based on such a line.

It's tempting to say that we should just make sure that we generate a
diff without that option.  However, although we do not use parse_diff()
to parse hunks that the user has manually edited, we do allow the user
to split such hunks. As POSIX calls the decision of whether to print the
space here "implementation-defined", we need to handle edited hunks where
empty context lines omit the space.

So let's handle both cases, a context line that starts with a space and
one that is totally empty, by normalizing the first character to a
space when we parse them. Normalizing the first character, rather than
changing the code to check for a space or newline, will hopefully
future-proof against introducing similar bugs if the code is changed.

Reported-by: Ilya Tumaykin <itumaykin@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
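To illustrate the effect on split counting, here is a minimal standalone
sketch (not part of the patch). normalize_marker() is copied from the diff
below; count_splittable() and its "normalize" flag are hypothetical names
that only approximate the marker tracking in parse_diff() and
complete_file(), applied to a NUL-terminated hunk body.

```c
#include <string.h>

/* Same logic as normalize_marker() in this patch. */
static int normalize_marker(const char *p)
{
	return p[0] == '\n' || (p[0] == '\r' && p[1] == '\n') ? ' ' : p[0];
}

/*
 * Hypothetical helper: count the pieces a hunk body (the lines after
 * the "@@" header) could be split into. A piece ends whenever a '-'
 * or '+' line is followed by a context line, or when the hunk ends
 * on a '-' or '+' line.
 */
static int count_splittable(const char *body, int normalize)
{
	const char *p = body;
	int marker = '@';	/* state just after the hunk header */
	int splittable_into = 0;

	while (*p) {
		const char *eol = strchr(p, '\n');
		int ch = normalize ? normalize_marker(p) : p[0];

		if ((marker == '-' || marker == '+') && ch == ' ')
			splittable_into++;
		if (ch != '\\')	/* skip "\ No newline at end of file" */
			marker = ch;
		p = eol ? eol + 1 : p + strlen(p);
	}
	if (marker == '-' || marker == '+')
		splittable_into++;
	return splittable_into;
}
```

For the hunk body produced by the new test with diff.suppressBlankEmpty
set ("-a\n+A\n b\n\n c\n-d\n+D\n\n e\n-f\n+F\n\n"), this sketch counts 3
pieces with normalization and only 1 without it, because two of the three
changes are followed directly by a truly empty context line.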
 add-patch.c                | 19 +++++++++++++------
 t/t3701-add-interactive.sh | 19 +++++++++++++++++++
 2 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/add-patch.c b/add-patch.c
index d8ea05ff108..8feb719483f 100644
--- a/add-patch.c
+++ b/add-patch.c
@@ -400,6 +400,12 @@ static void complete_file(char marker, struct hunk *hunk)
 		hunk->splittable_into++;
 }
 
+/* Empty context lines may omit the leading ' ' */
+static int normalize_marker(const char *p)
+{
+	return p[0] == '\n' || (p[0] == '\r' && p[1] == '\n') ? ' ' : p[0];
+}
+
 static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
 {
 	struct strvec args = STRVEC_INIT;
@@ -485,6 +491,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
 	while (p != pend) {
 		char *eol = memchr(p, '\n', pend - p);
 		const char *deleted = NULL, *mode_change = NULL;
+		char ch = normalize_marker(p);
 
 		if (!eol)
 			eol = pend;
@@ -532,7 +539,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
 			 * Start counting into how many hunks this one can be
 			 * split
 			 */
-			marker = *p;
+			marker = ch;
 		} else if (hunk == &file_diff->head &&
 			   starts_with(p, "new file")) {
 			file_diff->added = 1;
@@ -586,10 +593,10 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
 			    (int)(eol - (plain->buf + file_diff->head.start)),
 			    plain->buf + file_diff->head.start);
 
-		if ((marker == '-' || marker == '+') && *p == ' ')
+		if ((marker == '-' || marker == '+') && ch == ' ')
 			hunk->splittable_into++;
-		if (marker && *p != '\\')
-			marker = *p;
+		if (marker && ch != '\\')
+			marker = ch;
 
 		p = eol == pend ? pend : eol + 1;
 		hunk->end = p - plain->buf;
@@ -813,7 +820,7 @@ static int merge_hunks(struct add_p_state *s, struct file_diff *file_diff,
 					    (int)(hunk->end - hunk->start),
 					    plain + hunk->start);
 
-				if (plain[overlap_end] != ' ')
+				if (normalize_marker(&plain[overlap_end]) != ' ')
 					return error(_("expected context line "
 						       "#%d in\n%.*s"),
 						     (int)(j + 1),
@@ -953,7 +960,7 @@ static int split_hunk(struct add_p_state *s, struct file_diff *file_diff,
 	context_line_count = 0;
 
 	while (splittable_into > 1) {
-		ch = s->plain.buf[current];
+		ch = normalize_marker(&s->plain.buf[current]);
 
 		if (!ch)
 			BUG("buffer overrun while splitting hunks");
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 5d78868ac16..9a48933cecf 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -1164,4 +1164,23 @@ test_expect_success 'reset -p with unmerged files' '
 	test_must_be_empty staged
 '
 
+test_expect_success 'hunk splitting works with diff.suppressBlankEmpty' '
+	test_config diff.suppressBlankEmpty true &&
+	write_script fake-editor.sh <<-\EOF &&
+	tr F G <"$1" >"$1.tmp" &&
+	mv "$1.tmp" "$1"
+	EOF
+
+	test_write_lines a b "" c d  "" e f "" >file &&
+	git add file &&
+	test_write_lines A b "" c D  "" e F "" >file &&
+	(
+		test_set_editor "$(pwd)/fake-editor.sh" &&
+		test_write_lines s n y e q | git add -p file
+	) &&
+	git cat-file blob :file >actual &&
+	test_write_lines a b "" c D "" e G "" >expect &&
+	test_cmp expect actual
+'
+
 test_done

From patchwork Sat Jul 20 16:02:00 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phillip Wood <phillip.wood@dunelm.org.uk>
X-Patchwork-Id: 13737843
Received: from mail-wm1-f45.google.com (mail-wm1-f45.google.com
 [209.85.128.45])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9128113C689
	for <git@vger.kernel.org>; Sat, 20 Jul 2024 16:02:08 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=209.85.128.45
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1721491330; cv=none;
 b=RXxW1S2wBWvtd/B62liwG7odleGdEKg2NjNx03IwnDZmGAf3l+80Y9GshV20DNTiaa5Ou8dqXxTlkhQEIe9bEQuly2Y5uVxAAzXC1jXCvFxZmvAkKoOqQ34zQfRjz5avm1j3PmcYWj0ACVbJvcZxDMy0NnpqJnvjdFq4H38Hj7U=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1721491330; c=relaxed/simple;
	bh=XvRrJsaf/kyLaVhr6z6IwjVCr/3sRBwaL2qDGS4uGs8=;
	h=Message-Id:In-Reply-To:References:From:Date:Subject:Content-Type:
	 MIME-Version:To:Cc;
 b=COrW6GsSDOOT2AULfxBY0Ps2FwYKf3K45qSbEMjUFIk1q7+EOn2UihQfzNZ2Gkoe66tlSwseDWGxBhHzK3IlFsM3/0vP4vDxY1MeaNiawmGWPiXJDPBdOquW8W6IFrOUpVwfOWl4FeKLk8DAgSbYLaSvGNzTJ0nvS0vsYvMzVBA=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=gmail.com;
 spf=pass smtp.mailfrom=gmail.com;
 dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com
 header.b=Tqayhzbb; arc=none smtp.client-ip=209.85.128.45
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com
 header.b="Tqayhzbb"
Received: by mail-wm1-f45.google.com with SMTP id
 5b1f17b1804b1-4272738eb9eso20713315e9.3
        for <git@vger.kernel.org>; Sat, 20 Jul 2024 09:02:08 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20230601; t=1721491326; x=1722096126;
 darn=vger.kernel.org;
        h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from
         :references:in-reply-to:message-id:from:to:cc:subject:date
         :message-id:reply-to;
        bh=Xn15d2w0u5wwj7SVEeyQFSPTSbLNpYi5g7QcyZRsBcs=;
        b=TqayhzbbEf6Gpi+6oqhGTLzWlSyxqpB5fMyREvik0GCIMtuHtlLCsRaGrc8VXWRy/L
         9OpIgKfI0/zIWYYNmNbanLsKLgygzB22WRHcCWegsjS6l8Ay0frQLOSDBMktRsuu3SZj
         j67wX90fHMNJH38Rw8JV62oIOSYWjO3JNatEbnlDMnTeyICynOHqBzGi9ruTBohS74bM
         2+6Z6ld9J6Zvd4UY3GQom5Oq1iUtVACKkh8J0oW5IqtWrh+1cVXix3f7ynyb6c0zeutE
         bE4wUaHfce4rbs4s9tnESl/4AC1FlWEXDFKPZ/KCB2hzHd0ZnFogQZxer7Cm2ydlBkIu
         5jWw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1721491326; x=1722096126;
        h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from
         :references:in-reply-to:message-id:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=Xn15d2w0u5wwj7SVEeyQFSPTSbLNpYi5g7QcyZRsBcs=;
        b=CN56q93tTxQR25U1fWUP9X97qk19CZQhyuud0A1RaB6VQORtBzMb62nQlykFzSSYCS
         1EEBEVQlC2ZO7CBH6ByKNmJ+yW5cAoGmtppm7RPus/SrYAElSdgehwjf5/QCO7Pvhnh4
         91RfMBVgUtEYCTx0Yp4dwIGwXWGsLm/dg4QXuqVOktuh8X7Mo2lNsv7m/kX9jDoKGaVD
         rDmKeFAZI6+yoHQxyCl7GSnSV5PyxOpJIMpycmt/oU7AalVaZFEYgHLZ/U3utrSID6kK
         GjRMoxRY8oSEu8Eud3zy0dukLHsmSjQQ5tgK6nPjyYWFiWppQZp7TrUModVFWW9f2ZM7
         4OdA==
X-Gm-Message-State: AOJu0YyoZbhBiaei8IiSdjXEnyfJS3oSJ0Nwnr0UQ0qLCKCFxMXEAqD4
	45iHnqHeFKSf7C44WXKjqii5wtnGX/VmVKhIXCU+vL1t5FMc54kL8EPfbQ==
X-Google-Smtp-Source: 
 AGHT+IHQvNKTkw8/NyhJTvZ3cHXlV04P9y8tCVZju36hFKBkZJiZn3zrQaWoGBwlxD/O+DOGHWnKcg==
X-Received: by 2002:a05:600c:3542:b0:426:59fe:ac2e with SMTP id
 5b1f17b1804b1-427dc559b85mr13115195e9.29.1721491326238;
        Sat, 20 Jul 2024 09:02:06 -0700 (PDT)
Received: from [127.0.0.1] ([13.74.141.28])
        by smtp.gmail.com with ESMTPSA id
 5b1f17b1804b1-427d2a8e420sm89004135e9.35.2024.07.20.09.02.04
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Sat, 20 Jul 2024 09:02:05 -0700 (PDT)
Message-Id: 
 <7bdcd2df01246932e452417815368f6c56d83e8d.1721491320.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.1763.v2.git.1721491320.gitgitgadget@gmail.com>
References: <pull.1763.git.1721312619822.gitgitgadget@gmail.com>
	<pull.1763.v2.git.1721491320.gitgitgadget@gmail.com>
Date: Sat, 20 Jul 2024 16:02:00 +0000
Subject: [PATCH v2 2/2] add-patch: use normalize_marker() when recounting
 edited hunk
Fcc: Sent
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
List-Id: <git.vger.kernel.org>
List-Subscribe: <mailto:git+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:git+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>,
    Junio C Hamano <gitster@pobox.com>,
    Phillip Wood <phillip.wood123@gmail.com>,
    Phillip Wood <phillip.wood@dunelm.org.uk>,
    Phillip Wood <phillip.wood@dunelm.org.uk>
From: Phillip Wood <phillip.wood@dunelm.org.uk>

From: Phillip Wood <phillip.wood@dunelm.org.uk>

After the user has edited a hunk, the number of lines in the pre- and
post-images is recounted so that the hunk header can be updated before
passing the hunk to "git apply". The recounting code correctly handles
empty context lines where the leading ' ' is omitted by treating '\n'
and '\r' as context lines.

Update this code to use normalize_marker() so that the handling of empty
context lines is consistent with the rest of the hunk parsing
code. There is a small change in behavior as normalize_marker() only
treats "\r\n" as an empty context line rather than any line starting
with '\r'. This should not matter in practice, as Macs have used Unix
line endings since Mac OS X was released in 2001. If it transpires
that someone is still using an earlier version of Mac OS where lines end
with '\r', then we will need to change the handling of '\r' in
normalize_marker() anyway.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
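The behavior change described above can be sketched in isolation (not
part of the patch). normalize_marker() is copied from patch 1/2;
old_is_context(), struct counts and recount() are hypothetical names for
this illustration, and silently ignoring unrecognized lines is an
assumption of the sketch rather than a statement about add-patch.c.

```c
#include <string.h>

/* Same logic as normalize_marker() from patch 1/2. */
static int normalize_marker(const char *p)
{
	return p[0] == '\n' || (p[0] == '\r' && p[1] == '\n') ? ' ' : p[0];
}

/* Pre-patch rule: any line starting with ' ', '\r' or '\n' is context. */
static int old_is_context(const char *p)
{
	return p[0] == ' ' || p[0] == '\r' || p[0] == '\n';
}

struct counts {
	int old_count;
	int new_count;
};

/* Hypothetical helper: recount a NUL-terminated hunk body. */
static struct counts recount(const char *body)
{
	struct counts c = { 0, 0 };
	const char *p = body;

	while (*p) {
		const char *eol = strchr(p, '\n');

		switch (normalize_marker(p)) {
		case '-':
			c.old_count++;
			break;
		case '+':
			c.new_count++;
			break;
		case ' ':
			c.old_count++;
			c.new_count++;
			break;
		default:
			/* assumption of this sketch: ignore other lines */
			break;
		}
		p = eol ? eol + 1 : p + strlen(p);
	}
	return c;
}
```

A line such as "\rfoo\n" is context under old_is_context() but is not
counted by recount(), whereas "\n" and "\r\n" are context under both
rules.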
 add-patch.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/add-patch.c b/add-patch.c
index 8feb719483f..2a72ad63d14 100644
--- a/add-patch.c
+++ b/add-patch.c
@@ -1178,14 +1178,14 @@ static ssize_t recount_edited_hunk(struct add_p_state *s, struct hunk *hunk,
 
 	header->old_count = header->new_count = 0;
 	for (i = hunk->start; i < hunk->end; ) {
-		switch (s->plain.buf[i]) {
+		switch (normalize_marker(&s->plain.buf[i])) {
 		case '-':
 			header->old_count++;
 			break;
 		case '+':
 			header->new_count++;
 			break;
-		case ' ': case '\r': case '\n':
+		case ' ':
 			header->old_count++;
 			header->new_count++;
 			break;