From patchwork Wed Oct 27 12:04:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12587119 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB5EDC433EF for ; Wed, 27 Oct 2021 12:04:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CCA9D6109D for ; Wed, 27 Oct 2021 12:04:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237510AbhJ0MGx (ORCPT ); Wed, 27 Oct 2021 08:06:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38508 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235674AbhJ0MGw (ORCPT ); Wed, 27 Oct 2021 08:06:52 -0400 Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D374C061745 for ; Wed, 27 Oct 2021 05:04:27 -0700 (PDT) Received: by mail-wr1-x434.google.com with SMTP id d13so3711194wrf.11 for ; Wed, 27 Oct 2021 05:04:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=uEWYiN3zMagJGskhmrwR0avZIR39YtJZYgnhZg5J5eg=; b=AClcxJxgzfc5CDFOjzpEBWEc/6Q/jxUPGK35QsK8kiNvzkx2hIHDbx7/t94v5BgxKZ hUqoxvTyWmQ0gjwHmj+RVJ9wwg/bIKXdNzvlAGQp/Nh6Q109jJsUMgCZtJfZ63je63Cy 6QKsBIHQkhCNGr/+v10nrDnmibm+SOIbJiIRge3X2qHjNFAZolhxsyZPQXoZdxVHw3Hc E0/4CWYK1JTGsCg0Bp36uVp9Vv1Wbh+52jYQfHJPSVwPzwOXm2rafqphHR2zr5T0DXQh lc883oK+3ahXM/+f/2SGjmipa/i8vrhnp1Gme71CRtkotyohMbJLG5NPxXHXrAYSnwZx mh3w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; 
h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=uEWYiN3zMagJGskhmrwR0avZIR39YtJZYgnhZg5J5eg=; b=eMtppiL7PDgUWev/SZnfalMsXt8eRXZnHzVJwqLd6A8j+4y09CVsRGe7TL5EMUXczm nE/HPsbM/pFATZif2Lx6vQSrEalCj7+CbhquB2z5rTLj8m37GaW8XbguoT5CGOuh8Tad lkySPMs1qMmVtCqwHwGPnbbkzxpn+d2qcRjwtUvfGCyo6anqCYJ4/OqVOvH105PuMtaB GsDKc2/Dx4I778ZwwoAa90fhmfC0QLw8XVYJwCh3Fhyyr6qCj6okQCHvewi+hfq4xZ1c dJrTu5mJ1l9dkQV2jf2QaGkEnK1X/YhvonAwwAuwjTU9ipqk6sKybxMoZ2/ocSbRCx4W N0hg== X-Gm-Message-State: AOAM533ocRjhAOrJ8k7WFuTPM5D+fU4a5v27o1I5Ls3gWawFH8MLZuat vob+Fr8gHUiYUziw8yOx69K4It41JYk= X-Google-Smtp-Source: ABdhPJwneWwwWiE9Fhegzp6X6/PZa5wsRPiCwvVvUlVb2ScXHN86BsvgVAltT8jdrb2NdNxAh/QP9w== X-Received: by 2002:adf:8bd0:: with SMTP id w16mr38735368wra.32.1635336265605; Wed, 27 Oct 2021 05:04:25 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id p7sm21499522wrm.61.2021.10.27.05.04.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 27 Oct 2021 05:04:25 -0700 (PDT) Message-Id: <8fc8914a37b3c343cd92bb0255088f7b000ff7f7.1635336262.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 27 Oct 2021 12:04:08 +0000 Subject: [PATCH v3 01/15] diff --color-moved: add perf tests Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Elijah Newren , Phillip Wood , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Phillip Wood From: Phillip Wood Add some tests so we can monitor changes to the performance of the move detection code. The tests record the performance of a single large diff and a sequence of smaller diffs. 
Signed-off-by: Phillip Wood --- t/perf/p4002-diff-color-moved.sh | 45 ++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) create mode 100755 t/perf/p4002-diff-color-moved.sh diff --git a/t/perf/p4002-diff-color-moved.sh b/t/perf/p4002-diff-color-moved.sh new file mode 100755 index 00000000000..ad56bcb71e4 --- /dev/null +++ b/t/perf/p4002-diff-color-moved.sh @@ -0,0 +1,45 @@ +#!/bin/sh + +test_description='Tests diff --color-moved performance' +. ./perf-lib.sh + +test_perf_default_repo + +if ! git rev-parse --verify v2.29.0^{commit} >/dev/null +then + skip_all='skipping because tag v2.29.0 was not found' + test_done +fi + +GIT_PAGER_IN_USE=1 +test_export GIT_PAGER_IN_USE + +test_perf 'diff --no-color-moved --no-color-moved-ws large change' ' + git diff --no-color-moved --no-color-moved-ws v2.28.0 v2.29.0 +' + +test_perf 'diff --color-moved --no-color-moved-ws large change' ' + git diff --color-moved=zebra --no-color-moved-ws v2.28.0 v2.29.0 +' + +test_perf 'diff --color-moved-ws=allow-indentation-change large change' ' + git diff --color-moved=zebra --color-moved-ws=allow-indentation-change \ + v2.28.0 v2.29.0 +' + +test_perf 'log --no-color-moved --no-color-moved-ws' ' + git log --no-color-moved --no-color-moved-ws --no-merges --patch \ + -n1000 v2.29.0 +' + +test_perf 'log --color-moved --no-color-moved-ws' ' + git log --color-moved=zebra --no-color-moved-ws --no-merges --patch \ + -n1000 v2.29.0 +' + +test_perf 'log --color-moved-ws=allow-indentation-change' ' + git log --color-moved=zebra --color-moved-ws=allow-indentation-change \ + --no-merges --patch -n1000 v2.29.0 +' + +test_done From patchwork Wed Oct 27 12:04:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12587123 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org 
[198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DFC8BC433EF for ; Wed, 27 Oct 2021 12:04:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C63C3610A3 for ; Wed, 27 Oct 2021 12:04:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240342AbhJ0MGz (ORCPT ); Wed, 27 Oct 2021 08:06:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38512 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235674AbhJ0MGx (ORCPT ); Wed, 27 Oct 2021 08:06:53 -0400 Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B1C5AC061570 for ; Wed, 27 Oct 2021 05:04:27 -0700 (PDT) Received: by mail-wr1-x434.google.com with SMTP id u18so3756680wrg.5 for ; Wed, 27 Oct 2021 05:04:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=8PooKE+ZmACiLnvvFNaC9GMC6c2B/Iz1nlWdS4LuFUY=; b=P7/qFjWLBWw+azLlfV0+bMdc6CM1Za0mtrMMic3ce/m3VB4FpFMyAAk7X/DYmqddVa bW46ny/fmo/Vj7ExQkNFRKsBvFx4nWI5fvJWC8+177rEkwANvCA9+FKEx898+uj+uZYa HX/7Zjo3VsBv/mLQ+vTwnQh6HNeKiNpzEe2zDiPEv9xHzqgb0fqOWCpodls6D1xtBTw0 vlWPaSveKILJkuji+Jixvk8VqN49iiE4+6EHChQIsHFWpGIXiKn5Rckgz3r9M3pObH/q S3wWaJIThJFnYYgmIAeH6pt65wByhfiD2/eOw7r/tP1UoLoUQIGTgLHAdcSgsua3Zvo9 TEnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=8PooKE+ZmACiLnvvFNaC9GMC6c2B/Iz1nlWdS4LuFUY=; b=FzEij9WvHfA0wHrl91PzgoMRpqCV44TJag6LpgA3c7Ekg1llRmi0/djHUc+ewJBHsq Swzwg/UJ93FBw8FDFpRcMjNKKLlsaPCB+mm1qVSTFDJw/Hq3023it8Zu4O39phPR0FDE 
Mm0o7L9zckGd7Jxz78FqPCS5i8nzQT9yF5EEeSY+NtkZ7xGzvBVe/jiOQ82Jhja2XOHv Xz6zOGhX7tc2eh2/INLt/AqYWqObCLUuZd2jMx+drQUBkdEFazBKYYgPF8gvEqFVmV9n 3dDGwDNMhr1gRZu1G8Nblp6PXanKwARz6HZmsA/jHCXtPMyCQzuB/IOyYUHYp9XKFNgL AQXA== X-Gm-Message-State: AOAM530b1BHlDDZHu4ZFyK6s53ik6sP7e36HUfz2UdbZeaY6NkjZvhbT VIulrWSPjOye5pvyoA/r3glibP4KpsQ= X-Google-Smtp-Source: ABdhPJyZST+oH5gK6v20gM6OqLIlQ55mZ1CEMMCy0pyV++G+da4xSP1WgGB2VvRWhP7kq2VygbpL2g== X-Received: by 2002:a5d:5082:: with SMTP id a2mr14447730wrt.311.1635336266329; Wed, 27 Oct 2021 05:04:26 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c7sm18465517wrp.51.2021.10.27.05.04.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 27 Oct 2021 05:04:25 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Wed, 27 Oct 2021 12:04:09 +0000 Subject: [PATCH v3 02/15] diff --color-moved: clear all flags on blocks that are too short Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Elijah Newren , Phillip Wood , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Phillip Wood From: Phillip Wood If a block of potentially moved lines is not long enough then the DIFF_SYMBOL_MOVED_LINE flag is cleared on the matching lines so they are not marked as moved. To avoid problems when we start rewinding after an unsuccessful match in a couple of commits time make sure all the move related flags are cleared, not just DIFF_SYMBOL_MOVED_LINE. Signed-off-by: Phillip Wood --- diff.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/diff.c b/diff.c index 52c791574b7..bd8e4ec9757 100644 --- a/diff.c +++ b/diff.c @@ -1114,6 +1114,8 @@ static int shrink_potential_moved_blocks(struct moved_block *pmb, * NEEDSWORK: This uses the same heuristic as blame_entry_score() in blame.c. * Think of a way to unify them. 
*/ +#define DIFF_SYMBOL_MOVED_LINE_ZEBRA_MASK \ + (DIFF_SYMBOL_MOVED_LINE | DIFF_SYMBOL_MOVED_LINE_ALT) static int adjust_last_block(struct diff_options *o, int n, int block_length) { int i, alnum_count = 0; @@ -1130,7 +1132,7 @@ static int adjust_last_block(struct diff_options *o, int n, int block_length) } } for (i = 1; i < block_length + 1; i++) - o->emitted_symbols->buf[n - i].flags &= ~DIFF_SYMBOL_MOVED_LINE; + o->emitted_symbols->buf[n - i].flags &= ~DIFF_SYMBOL_MOVED_LINE_ZEBRA_MASK; return 0; } @@ -1237,8 +1239,6 @@ static void mark_color_as_moved(struct diff_options *o, free(pmb); } -#define DIFF_SYMBOL_MOVED_LINE_ZEBRA_MASK \ - (DIFF_SYMBOL_MOVED_LINE | DIFF_SYMBOL_MOVED_LINE_ALT) static void dim_moved_lines(struct diff_options *o) { int n; From patchwork Wed Oct 27 12:04:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12587125 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1F22FC4332F for ; Wed, 27 Oct 2021 12:04:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0933660E8B for ; Wed, 27 Oct 2021 12:04:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240309AbhJ0MG5 (ORCPT ); Wed, 27 Oct 2021 08:06:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38518 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240269AbhJ0MGy (ORCPT ); Wed, 27 Oct 2021 08:06:54 -0400 Received: from mail-wm1-x32b.google.com (mail-wm1-x32b.google.com [IPv6:2a00:1450:4864:20::32b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AAA84C061745 for ; Wed, 27 Oct 2021 05:04:28 -0700 (PDT) Received: by mail-wm1-x32b.google.com with 
SMTP id v127so2367157wme.5 for ; Wed, 27 Oct 2021 05:04:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=I1OjuyEdvS7pDHXy+edm6Y3GA0g7nFkP4MZfeMkkBBA=; b=mp8FdCghF8/KlhQlcCCk7ee0prHiDk7D+jy+6FUDbB2FPfQaTmLLlwK5oERFiRxMOF LG/4sJ0ReRn5/ybXUITqAL3IW7AY0Cr3M7PUsicp7UnLrPWRYtkebsOzq6QdxH805+hr jNbCgKG+kPBKeaK6k2YRp6/s+wKU4jqNs7Y3iNJq6NFDS6GpRGsKeDWLraDC9cnHWXWH v++kz1w+lboLyJzmV9eBhflRy1eM7tQ29MIa474y1Inba/iWs1Q+DuHIbkrG1XQ0AY8s IQLfuZCjKlonHRxPSHoum6evxin53ucwGHGdsMxrB9H4h67sby+6vXG35G6nPAG7fmgX 206w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=I1OjuyEdvS7pDHXy+edm6Y3GA0g7nFkP4MZfeMkkBBA=; b=OqxloMp+L7DEnpdpw3ZPdPag1Iyq+SbKvMWapFCWFLdVKR8aSR3SCFUnPQ1DMDVkbP WmckYh5RSLI8SUI36UP8ABBoGOaWms1BNtbd2Cjep9sRilrX593wNrHYQ7OfhLiIhi1t lMKMoZrTT0xuqNJKnyEJSdXW9BV6ZDvUyhcgpbbTiNLsAQjbRPksu/8mQXmC6MY347zg tiDcd0VP5FSVWw5Vein48dDUoOGDDQYvy2Y/P3fe9IdJf7N7OyA8Fcdw5I7R7T1dQSHT IF3Nh8rpUO1kHOkzkSjDbUHBxtOSpMGkYz6iww+I7OH5a1MxdB2eIsYjGASLhQXK5uN6 k+VA== X-Gm-Message-State: AOAM5317fR+1HajJkrnk4CFNFk5CSKaZpv2cjoJGMOgRK/5gumY0YGwC XkZE6WBJUIHdct2RcIQTKMVVC8SYD+0= X-Google-Smtp-Source: ABdhPJz2CeWovAK2TBD02K9H9qnugNmyAuVmCHiyB3mwp1T7HuUgrntT7dG+CJs2+8FTXdsfoPnPIg== X-Received: by 2002:a1c:e906:: with SMTP id q6mr5322774wmc.126.1635336266996; Wed, 27 Oct 2021 05:04:26 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 3sm3415325wms.5.2021.10.27.05.04.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 27 Oct 2021 05:04:26 -0700 (PDT) Message-Id: <658aec2670c78f9753a5acccab20d3a1741403e6.1635336262.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 27 Oct 2021 12:04:10 +0000 Subject: [PATCH v3 03/15] diff 
--color-moved: factor out function Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Elijah Newren , Phillip Wood , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Phillip Wood From: Phillip Wood This code is quite heavily indented and having it in its own function simplifies an upcoming change. Signed-off-by: Phillip Wood --- diff.c | 51 ++++++++++++++++++++++++++++++++++----------------- 1 file changed, 34 insertions(+), 17 deletions(-) diff --git a/diff.c b/diff.c index bd8e4ec9757..09af94e018c 100644 --- a/diff.c +++ b/diff.c @@ -1098,6 +1098,38 @@ static int shrink_potential_moved_blocks(struct moved_block *pmb, return rp + 1; } +static void fill_potential_moved_blocks(struct diff_options *o, + struct hashmap *hm, + struct moved_entry *match, + struct emitted_diff_symbol *l, + struct moved_block **pmb_p, + int *pmb_alloc_p, int *pmb_nr_p) + +{ + struct moved_block *pmb = *pmb_p; + int pmb_alloc = *pmb_alloc_p, pmb_nr = *pmb_nr_p; + + /* + * The current line is the start of a new block. + * Setup the set of potential blocks. + */ + hashmap_for_each_entry_from(hm, match, ent) { + ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc); + if (o->color_moved_ws_handling & + COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) { + if (compute_ws_delta(l, match->es, &(pmb[pmb_nr]).wsd)) + pmb[pmb_nr++].match = match; + } else { + pmb[pmb_nr].wsd = 0; + pmb[pmb_nr++].match = match; + } + } + + *pmb_p = pmb; + *pmb_alloc_p = pmb_alloc; + *pmb_nr_p = pmb_nr; +} + /* * If o->color_moved is COLOR_MOVED_PLAIN, this function does nothing. * @@ -1198,23 +1230,8 @@ static void mark_color_as_moved(struct diff_options *o, pmb_nr = shrink_potential_moved_blocks(pmb, pmb_nr); if (pmb_nr == 0) { - /* - * The current line is the start of a new block. - * Setup the set of potential blocks. 
- */ - hashmap_for_each_entry_from(hm, match, ent) { - ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc); - if (o->color_moved_ws_handling & - COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) { - if (compute_ws_delta(l, match->es, - &pmb[pmb_nr].wsd)) - pmb[pmb_nr++].match = match; - } else { - pmb[pmb_nr].wsd = 0; - pmb[pmb_nr++].match = match; - } - } - + fill_potential_moved_blocks( + o, hm, match, l, &pmb, &pmb_alloc, &pmb_nr); if (adjust_last_block(o, n, block_length) && pmb_nr && last_symbol != l->s) flipped_block = (flipped_block + 1) % 2; From patchwork Wed Oct 27 12:04:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12587127 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40152C433EF for ; Wed, 27 Oct 2021 12:04:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2900A60F70 for ; Wed, 27 Oct 2021 12:04:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241737AbhJ0MG6 (ORCPT ); Wed, 27 Oct 2021 08:06:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38520 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240286AbhJ0MGy (ORCPT ); Wed, 27 Oct 2021 08:06:54 -0400 Received: from mail-wr1-x42b.google.com (mail-wr1-x42b.google.com [IPv6:2a00:1450:4864:20::42b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F13DDC061767 for ; Wed, 27 Oct 2021 05:04:28 -0700 (PDT) Received: by mail-wr1-x42b.google.com with SMTP id z14so3749392wrg.6 for ; Wed, 27 Oct 2021 05:04:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=tfO/DLvWqdwM405jxdcc6/bUU0BJV/ij1aINtW2Mnxk=; b=onwXlTJOTrOdNIFYoasGJD7phiEC97SpCePhIoaQ9mrJmA+41wu/IoKHPhQiugrkvN VX3kq46inYZeZGtUM9F8m5XOOvhhUl2MwPR67TY2dwu1Eo2lrb7h4lyIJF+XLkLyZ+Ka 4UVmnxMZ2C8ew8skce3ZMdhUkoy62VPWcvmtYQS2LcrAGpevtaVzaYIIAur4SUwkRqB6 8gfqkbbi1J0VLuaRvBa0lgbiqB59sQ6+LO2SdiZwK0CxyDsf4FS+BE94sdCSIg4MZMWz yc+0wiE+SUKvdUsODJ3Md8aOq+UR1NR9dxdb2QFYOKHHgAtbGMW0rLWkajH72KL4tjGi yj1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=tfO/DLvWqdwM405jxdcc6/bUU0BJV/ij1aINtW2Mnxk=; b=7LWt5mBl9XfbDlwkUQYXW9SLUwFIeYxdkjRBX34/I/LUDhIctD3/f2OoY3EPj8hso/ G6ddlQ64Rq/kUxp7lglarISyda8RAK0rSgokxrp9/pLXCUDlYpbJT7yiYCnKY4yeUShB BrwiWVhM7r0db51wxogOxGrmmBnw1i8G3AwhdMDe0ON6w38hFtUDRuhEwH8NZuRnPJaJ 8O01NWQ4aEA0Mn9d7z1svd4W5xis9r1LuYRWvEqbbi/ZnEow+DOSayMIdm/PTtAVYgZm ScKMbxw3iA7HW1m5aDx+B9qv3zpVuz5xwYENac/mXGiQUdJMrvlvsbFG+Oi0TnaYAILO 4Hng== X-Gm-Message-State: AOAM531hi2iejpnDHA/ZjhL0pSK/SbRTRPybMf/lsjNL6OyUJ6SjXMLs PNPlEWvgdINq4FW8ZsSUdvqTchE9HAc= X-Google-Smtp-Source: ABdhPJyRLFYrhDJMruuQLuqRqLvtAMVywT2WqXG1CsSWuD/qHCSnVC/pevQwaC09E4/UusBrQZQNIg== X-Received: by 2002:a05:6000:151:: with SMTP id r17mr12122331wrx.19.1635336267643; Wed, 27 Oct 2021 05:04:27 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id q14sm12151846wrv.55.2021.10.27.05.04.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 27 Oct 2021 05:04:27 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Wed, 27 Oct 2021 12:04:11 +0000 Subject: [PATCH v3 04/15] diff --color-moved: rewind when discarding pmb Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Elijah Newren , Phillip Wood , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: 
git@vger.kernel.org From: Phillip Wood From: Phillip Wood

diff --color-moved colors the two sides of the diff separately. It walks through the diff and tries to find matches on the other side of the diff for the current line. When it finds one or more matches it starts a "potential moved block" (pmb) and marks the current line as moved. Then as it walks through the diff it only looks for matches for the current line in the lines following those in the pmb. When none of the lines in the pmb match it checks how long the match is and if it is too short it unmarks the lines as matched and goes back to finding all the lines that match the current line. As the process of finding matching lines restarts from the end of the block that was too short, it is possible to miss the start of a matching block on one side but not the other. In the test added here "-two" would not be colored as moved but "+two" would be.

Fix this by rewinding the current line when we reach the end of a block that is too short. This is quadratic in the length of the discarded block. While the discarded blocks are quite short, on a large diff this still has a significant impact on the performance of --color-moved-ws=allow-indentation-change. The following commits optimize the performance of the --color-moved machinery, which mitigates the performance impact of this commit. After the optimization this commit has a negligible impact on performance.

Test                                                                  HEAD^             HEAD
------------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.38 (0.33+0.05)  0.39 (0.34+0.04) +2.6%
4002.2: diff --color-moved --no-color-moved-ws large change           0.80 (0.76+0.03)  0.86 (0.82+0.04) +7.5%
4002.3: diff --color-moved-ws=allow-indentation-change large change   14.22(14.17+0.04) 19.01(18.93+0.05) +33.7%
4002.4: log --no-color-moved --no-color-moved-ws                      1.16 (1.06+0.09)  1.16 (1.07+0.07) +0.0%
4002.5: log --color-moved --no-color-moved-ws                         1.31 (1.22+0.09)  1.32 (1.22+0.09) +0.8%
4002.6: log --color-moved-ws=allow-indentation-change                 1.71 (1.61+0.09)  1.72 (1.63+0.08) +0.6%

Signed-off-by: Phillip Wood --- diff.c | 28 ++++++++++++++++++----- t/t4015-diff-whitespace.sh | 46 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 69 insertions(+), 5 deletions(-) diff --git a/diff.c b/diff.c index 09af94e018c..1e1b5127d15 100644 --- a/diff.c +++ b/diff.c @@ -1205,7 +1205,15 @@ static void mark_color_as_moved(struct diff_options *o, if (!match) { int i; - adjust_last_block(o, n, block_length); + if (!adjust_last_block(o, n, block_length) && + block_length > 1) { + /* + * Rewind in case there is another match + * starting at the second line of the block + */ + match = NULL; + n -= block_length; + } for(i = 0; i < pmb_nr; i++) moved_block_clear(&pmb[i]); pmb_nr = 0; @@ -1230,10 +1238,20 @@ static void mark_color_as_moved(struct diff_options *o, pmb_nr = shrink_potential_moved_blocks(pmb, pmb_nr); if (pmb_nr == 0) { - fill_potential_moved_blocks( - o, hm, match, l, &pmb, &pmb_alloc, &pmb_nr); - if (adjust_last_block(o, n, block_length) && - pmb_nr && last_symbol != l->s) + int contiguous = adjust_last_block(o, n, block_length); + + if (!contiguous && block_length > 1) + /* + * Rewind in case there is another match + * starting at the second line of the block + */ + n -= block_length; + else + fill_potential_moved_blocks(o, hm, match, l, +
&pmb, &pmb_alloc, + &pmb_nr); + + if (contiguous && pmb_nr && last_symbol != l->s) flipped_block = (flipped_block + 1) % 2; else flipped_block = 0; diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh index 2c13b62d3c6..308dc136596 100755 --- a/t/t4015-diff-whitespace.sh +++ b/t/t4015-diff-whitespace.sh @@ -1833,6 +1833,52 @@ test_expect_success '--color-moved treats adjacent blocks as separate for MIN_AL test_cmp expected actual ' +test_expect_success '--color-moved rewinds for MIN_ALNUM_COUNT' ' + git reset --hard && + test_write_lines >file \ + A B C one two three four five six seven D E F G H I J && + git add file && + test_write_lines >file \ + one two A B C D E F G H I J two three four five six seven && + git diff --color-moved=zebra -- file && + + git diff --color-moved=zebra --color -- file >actual.raw && + grep -v "index" actual.raw | test_decode_color >actual && + cat >expected <<-\EOF && + diff --git a/file b/file + --- a/file + +++ b/file + @@ -1,13 +1,8 @@ + +one + +two + A + B + C + -one + -two + -three + -four + -five + -six + -seven + D + E + F + @@ -15,3 +10,9 @@ G + H + I + J + +two + +three + +four + +five + +six + +seven + EOF + + test_cmp expected actual +' + test_expect_success 'move detection with submodules' ' test_create_repo bananas && echo ripe >bananas/recipe && From patchwork Wed Oct 27 12:04:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12587129 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD6AAC433FE for ; Wed, 27 Oct 2021 12:04:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 96C5860F70 for ; Wed, 27 Oct 2021 12:04:35 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240304AbhJ0MG7 (ORCPT ); Wed, 27 Oct 2021 08:06:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38530 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240336AbhJ0MGz (ORCPT ); Wed, 27 Oct 2021 08:06:55 -0400 Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 789ACC061570 for ; Wed, 27 Oct 2021 05:04:30 -0700 (PDT) Received: by mail-wr1-x42e.google.com with SMTP id m22so3841072wrb.0 for ; Wed, 27 Oct 2021 05:04:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=r/21tYvfclPNJep2HNK27H7vRi4XW03mypSl62mMt2g=; b=OvLEox/rya7laRREpIHbPGovt9psGAGJbnVxxURBapKMbQXXv8+ubDStPQVDTTgtNu 5/db3BVb1gNhYjmw7Uj/WcABKefNx3Q0R4XM7ocLSckzo1xuM7Y3TIZn04lGRCcf3V+x TcqPKEOv26XCAxdTN/g5LmdqYMQWydtqlGsDKk3/xB2G8mgI3666qQ9gNI8xVXNxpLyR KqdmjjyjgWLmlXP0/VBaycqmHayj3gjmmzzvRyvTs8cVzjsMrf7LYokJ/WE9OTNIuNCV V8k08j7AIbsgjGf9fSStgtZeUGDas5lrqmcXWk9f+wkZIybuD0XpaFpVR5EyVGmZL+pS Vaug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=r/21tYvfclPNJep2HNK27H7vRi4XW03mypSl62mMt2g=; b=iZz0AnkUPy/nlPupbCKW0ffLFjW0CY0hRjNi/rakAx/dwwC3uJlC6mMACIhQT2FDSS xnefcS+56K5klv7W4cVxGOCyCSjWbZMUqEngGxQxBkT1uYYCUQ8ejfi0Ke2P6Z6FJh/T D1O9UHZ+83e+m55JDOpbP4KnkjtFm/x1atzUZzUrBuEsGyGXVvLr2qHcxK05UY9udwT8 SOUDWufLZlcf/FMQjPfi5elOVjAz6V9T2r3xH9yQzlGGiAiTYqVxAzrZSRod1AbSYP4m gr0sB1SsAilm/U9Avf9LbOIfMFsb6e00ZKLmrmngJ+0oZegsZdHdNqZNvDhzmo832btn k+5Q== X-Gm-Message-State: AOAM532dCucbuwGi0DmjQ7k1vkN7OcpXKMvBddb2Cx85d0nqid8FMZ25 X7J/7V/p3GWSF4MjscJmdnUt4X+NsE4= X-Google-Smtp-Source: 
ABdhPJwNuIeeyjRnsn0iNUbs47zjDrkun364rEyges8dLoycJSt6mcHlYc1iQaED7CIpXBqcuuqTRw== X-Received: by 2002:adf:e388:: with SMTP id e8mr4101870wrm.104.1635336269085; Wed, 27 Oct 2021 05:04:29 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id t15sm1076036wmi.24.2021.10.27.05.04.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 27 Oct 2021 05:04:27 -0700 (PDT) Message-Id: <1dde206b7b11bd04ed48336f4879370dbcd65671.1635336263.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 27 Oct 2021 12:04:12 +0000 Subject: [PATCH v3 05/15] diff --color-moved=zebra: fix alternate coloring Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Elijah Newren , Phillip Wood , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Phillip Wood From: Phillip Wood

b0a2ba4776 ("diff --color-moved=zebra: be stricter with color alternation", 2018-11-23) sought to avoid using the alternate colors unless there are two adjacent moved blocks of the same sign. Unfortunately it contains two bugs that prevented it from fixing the problem properly. Firstly `last_symbol` is reset at the start of each iteration of the loop, losing the symbol of the last line, and secondly when deciding whether to use the alternate color it should be checking if the current line is the same sign as the last line, not a different sign. The combination of the two errors means that we still use the alternate color when we should do but we also use it when we shouldn't. This is most noticeable when using --color-moved-ws=allow-indentation-change with hunks like

-this line gets indented
+    this line gets indented

where the post image is colored with newMovedAlternate rather than newMoved.
While this does not matter much, the next commit will change the coloring to be correct in this case, so let's fix the bug here to make it clear why the output is changing and add a regression test.

Signed-off-by: Phillip Wood --- diff.c | 4 +-- t/t4015-diff-whitespace.sh | 72 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 74 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index 1e1b5127d15..53f0df75329 100644 --- a/diff.c +++ b/diff.c @@ -1176,6 +1176,7 @@ static void mark_color_as_moved(struct diff_options *o, struct moved_block *pmb = NULL; /* potentially moved blocks */ int pmb_nr = 0, pmb_alloc = 0; int n, flipped_block = 0, block_length = 0; + enum diff_symbol last_symbol = 0; for (n = 0; n < o->emitted_symbols->nr; n++) { @@ -1183,7 +1184,6 @@ static void mark_color_as_moved(struct diff_options *o, struct moved_entry *key; struct moved_entry *match = NULL; struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n]; - enum diff_symbol last_symbol = 0; switch (l->s) { case DIFF_SYMBOL_PLUS: @@ -1251,7 +1251,7 @@ static void mark_color_as_moved(struct diff_options *o, &pmb, &pmb_alloc, &pmb_nr); - if (contiguous && pmb_nr && last_symbol != l->s) + if (contiguous && pmb_nr && last_symbol == l->s) flipped_block = (flipped_block + 1) % 2; else flipped_block = 0; diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh index 308dc136596..4e0fd76c6c5 100755 --- a/t/t4015-diff-whitespace.sh +++ b/t/t4015-diff-whitespace.sh @@ -1442,6 +1442,78 @@ test_expect_success 'detect permutations inside moved code -- dimmed-zebra' ' test_cmp expected actual ' +test_expect_success 'zebra alternate color is only used when necessary' ' + cat >old.txt <<-\EOF && + line 1A should be marked as oldMoved newMovedAlternate + line 1B should be marked as oldMoved newMovedAlternate + unchanged + line 2A should be marked as oldMoved newMovedAlternate + line 2B should be marked as oldMoved newMovedAlternate + line 3A should be marked as oldMovedAlternate
newMoved + line 3B should be marked as oldMovedAlternate newMoved + unchanged + line 4A should be marked as oldMoved newMovedAlternate + line 4B should be marked as oldMoved newMovedAlternate + line 5A should be marked as oldMovedAlternate newMoved + line 5B should be marked as oldMovedAlternate newMoved + line 6A should be marked as oldMoved newMoved + line 6B should be marked as oldMoved newMoved + EOF + cat >new.txt <<-\EOF && + line 1A should be marked as oldMoved newMovedAlternate + line 1B should be marked as oldMoved newMovedAlternate + unchanged + line 3A should be marked as oldMovedAlternate newMoved + line 3B should be marked as oldMovedAlternate newMoved + line 2A should be marked as oldMoved newMovedAlternate + line 2B should be marked as oldMoved newMovedAlternate + unchanged + line 6A should be marked as oldMoved newMoved + line 6B should be marked as oldMoved newMoved + line 4A should be marked as oldMoved newMovedAlternate + line 4B should be marked as oldMoved newMovedAlternate + line 5A should be marked as oldMovedAlternate newMoved + line 5B should be marked as oldMovedAlternate newMoved + EOF + test_expect_code 1 git diff --no-index --color --color-moved=zebra \ + --color-moved-ws=allow-indentation-change \ + old.txt new.txt >output && + grep -v index output | test_decode_color >actual && + cat >expected <<-\EOF && + diff --git a/old.txt b/new.txt + --- a/old.txt + +++ b/new.txt + @@ -1,14 +1,14 @@ + -line 1A should be marked as oldMoved newMovedAlternate + -line 1B should be marked as oldMoved newMovedAlternate + + line 1A should be marked as oldMoved newMovedAlternate + + line 1B should be marked as oldMoved newMovedAlternate + unchanged + -line 2A should be marked as oldMoved newMovedAlternate + -line 2B should be marked as oldMoved newMovedAlternate + -line 3A should be marked as oldMovedAlternate newMoved + -line 3B should be marked as oldMovedAlternate newMoved + + line 3A should be marked as oldMovedAlternate newMoved + + line 3B should 
be marked as oldMovedAlternate newMoved + + line 2A should be marked as oldMoved newMovedAlternate + + line 2B should be marked as oldMoved newMovedAlternate + unchanged + -line 4A should be marked as oldMoved newMovedAlternate + -line 4B should be marked as oldMoved newMovedAlternate + -line 5A should be marked as oldMovedAlternate newMoved + -line 5B should be marked as oldMovedAlternate newMoved + -line 6A should be marked as oldMoved newMoved + -line 6B should be marked as oldMoved newMoved + + line 6A should be marked as oldMoved newMoved + + line 6B should be marked as oldMoved newMoved + + line 4A should be marked as oldMoved newMovedAlternate + + line 4B should be marked as oldMoved newMovedAlternate + + line 5A should be marked as oldMovedAlternate newMoved + + line 5B should be marked as oldMovedAlternate newMoved + EOF + test_cmp expected actual +' + test_expect_success 'cmd option assumes configured colored-moved' ' test_config color.diff.oldMoved "magenta" && test_config color.diff.newMoved "cyan" && From patchwork Wed Oct 27 12:04:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12587131 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B067BC433EF for ; Wed, 27 Oct 2021 12:04:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9B609610A3 for ; Wed, 27 Oct 2021 12:04:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241762AbhJ0MHB (ORCPT ); Wed, 27 Oct 2021 08:07:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38524 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241715AbhJ0MG4 (ORCPT ); Wed, 27 Oct 2021 
Message-Id: <2717ff500d2ebf82179f89c90d690718221e9fa8.1635336263.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 27 Oct 2021 12:04:13 +0000 Subject: [PATCH v3 06/15] diff --color-moved: avoid false short line matches and bad zebra coloring Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , Ævar Arnfjörð Bjarmason , Elijah Newren , Phillip Wood , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Phillip Wood

When marking moved lines it is possible for a block of potential matched lines to extend past a change in sign when there is a sequence of added lines whose text matches the text of a sequence of deleted and added lines. Most of the time either `match` will be NULL or `pmb_advance_or_null()` will fail when the loop encounters a change of sign, but there are corner cases where `match` is non-NULL and `pmb_advance_or_null()` successfully advances the moved block despite the change in sign. One consequence of this is highlighting a short line as moved when it should not be. For example:

    -moved line  # Correctly highlighted as moved
    +short line  # Wrongly highlighted as moved
     context
    +moved line  # Correctly highlighted as moved
    +short line
     context
    -short line

The other consequence is coloring a moved addition following a moved deletion in the wrong color. In the example below the first "+moved line 3" should be highlighted as newMoved not newMovedAlternate.

    -moved line 1  # Correctly highlighted as oldMoved
    -moved line 2  # Correctly highlighted as oldMovedAlternate
    +moved line 3  # Wrongly highlighted as newMovedAlternate
     context       # Everything else is highlighted correctly
    +moved line 2
    +moved line 3
     context
    +moved line 1
    -moved line 3

These false matches are more likely when using --color-moved-ws with the exception of --color-moved-ws=allow-indentation-change which ties the sign of the current whitespace delta to the sign of the line to avoid this problem. The fix is to check that the sign of the new line being matched is the same as the sign of the line that started the block of potential matches.

Signed-off-by: Phillip Wood
--- diff.c | 17 ++++++---- t/t4015-diff-whitespace.sh | 65 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 76 insertions(+), 6 deletions(-) diff --git a/diff.c b/diff.c index 53f0df75329..efba2789354 100644 --- a/diff.c +++ b/diff.c @@ -1176,7 +1176,7 @@ static void mark_color_as_moved(struct diff_options *o, struct moved_block *pmb = NULL; /* potentially moved blocks */ int pmb_nr = 0, pmb_alloc = 0; int n, flipped_block = 0, block_length = 0; - enum diff_symbol last_symbol = 0; + enum diff_symbol moved_symbol = DIFF_SYMBOL_BINARY_DIFF_HEADER; for (n = 0; n < o->emitted_symbols->nr; n++) { @@ -1202,7 +1202,7 @@ static void mark_color_as_moved(struct diff_options *o, flipped_block = 0; } - if (!match) { + if (pmb_nr && (!match || l->s != moved_symbol)) { int i; if (!adjust_last_block(o, n, block_length) && @@ -1219,12 +1219,13 @@ static void mark_color_as_moved(struct diff_options *o, pmb_nr = 0; block_length = 0; flipped_block = 0; - last_symbol = l->s; + } + if (!match) { + moved_symbol = DIFF_SYMBOL_BINARY_DIFF_HEADER; continue; } if (o->color_moved == COLOR_MOVED_PLAIN) { - last_symbol = l->s; l->flags |= DIFF_SYMBOL_MOVED_LINE; continue; } @@ -1251,11 +1252,16 @@ static void mark_color_as_moved(struct diff_options *o, &pmb, &pmb_alloc, &pmb_nr); - if (contiguous && pmb_nr && 
last_symbol == l->s) + if (contiguous && pmb_nr && moved_symbol == l->s) flipped_block = (flipped_block + 1) % 2; else flipped_block = 0; + if (pmb_nr) + moved_symbol = l->s; + else + moved_symbol = DIFF_SYMBOL_BINARY_DIFF_HEADER; + block_length = 0; } @@ -1265,7 +1271,6 @@ static void mark_color_as_moved(struct diff_options *o, if (flipped_block && o->color_moved != COLOR_MOVED_BLOCKS) l->flags |= DIFF_SYMBOL_MOVED_LINE_ALT; } - last_symbol = l->s; } adjust_last_block(o, n, block_length); diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh index 4e0fd76c6c5..15782c879d2 100755 --- a/t/t4015-diff-whitespace.sh +++ b/t/t4015-diff-whitespace.sh @@ -1514,6 +1514,71 @@ test_expect_success 'zebra alternate color is only used when necessary' ' test_cmp expected actual ' +test_expect_success 'short lines of opposite sign do not get marked as moved' ' + cat >old.txt <<-\EOF && + this line should be marked as moved + unchanged + unchanged + unchanged + unchanged + too short + this line should be marked as oldMoved newMoved + this line should be marked as oldMovedAlternate newMoved + unchanged 1 + unchanged 2 + unchanged 3 + unchanged 4 + this line should be marked as oldMoved newMoved/newMovedAlternate + EOF + cat >new.txt <<-\EOF && + too short + unchanged + unchanged + this line should be marked as moved + too short + unchanged + unchanged + this line should be marked as oldMoved newMoved/newMovedAlternate + unchanged 1 + unchanged 2 + this line should be marked as oldMovedAlternate newMoved + this line should be marked as oldMoved newMoved/newMovedAlternate + unchanged 3 + this line should be marked as oldMoved newMoved + unchanged 4 + EOF + test_expect_code 1 git diff --no-index --color --color-moved=zebra \ + old.txt new.txt >output && cat output && + grep -v index output | test_decode_color >actual && + cat >expect <<-\EOF && + diff --git a/old.txt b/new.txt + --- a/old.txt + +++ b/new.txt + @@ -1,13 +1,15 @@ + -this line should be marked as moved + 
+too short + unchanged + unchanged + +this line should be marked as moved + +too short + unchanged + unchanged + -too short + -this line should be marked as oldMoved newMoved + -this line should be marked as oldMovedAlternate newMoved + +this line should be marked as oldMoved newMoved/newMovedAlternate + unchanged 1 + unchanged 2 + +this line should be marked as oldMovedAlternate newMoved + +this line should be marked as oldMoved newMoved/newMovedAlternate + unchanged 3 + +this line should be marked as oldMoved newMoved + unchanged 4 + -this line should be marked as oldMoved newMoved/newMovedAlternate + EOF + test_cmp expect actual +' + test_expect_success 'cmd option assumes configured colored-moved' ' test_config color.diff.oldMoved "magenta" && test_config color.diff.newMoved "cyan" && From patchwork Wed Oct 27 12:04:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12587133 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1049FC433FE for ; Wed, 27 Oct 2021 12:04:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EC4F360E8B for ; Wed, 27 Oct 2021 12:04:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240336AbhJ0MHD (ORCPT ); Wed, 27 Oct 2021 08:07:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38526 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241730AbhJ0MG5 (ORCPT ); Wed, 27 Oct 2021 08:06:57 -0400 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A242AC0613B9 for ; Wed, 27 Oct 2021 05:04:31 -0700 (PDT) 
Message-Id: In-Reply-To: References: Date: Wed, 27 Oct 2021 12:04:14 +0000 Subject: [PATCH v3 07/15] diff: simplify 
allow-indentation-change delta calculation Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , Ævar Arnfjörð Bjarmason , Elijah Newren , Phillip Wood , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Phillip Wood

Now that we reliably end a block when the sign changes, we don't need the whitespace delta calculation to rely on the sign.

Signed-off-by: Phillip Wood
--- diff.c | 13 ++----------- 1 file changed, 2 insertions(+), 11 deletions(-) diff --git a/diff.c b/diff.c index efba2789354..9aff167be27 100644 --- a/diff.c +++ b/diff.c @@ -864,23 +864,17 @@ static int compute_ws_delta(const struct emitted_diff_symbol *a, a_width = a->indent_width, b_off = b->indent_off, b_width = b->indent_width; - int delta; if (a_width == INDENT_BLANKLINE && b_width == INDENT_BLANKLINE) { *out = INDENT_BLANKLINE; return 1; } - if (a->s == DIFF_SYMBOL_PLUS) - delta = a_width - b_width; - else - delta = b_width - a_width; - if (a_len - a_off != b_len - b_off || memcmp(a->line + a_off, b->line + b_off, a_len - a_off)) return 0; - *out = delta; + *out = a_width - b_width; return 1; } @@ -924,10 +918,7 @@ static int cmp_in_block_with_wsd(const struct diff_options *o, * match those of the current block and that the text of 'l' and 'cur' * after the indentation match. 
*/ - if (cur->es->s == DIFF_SYMBOL_PLUS) - delta = a_width - c_width; - else - delta = c_width - a_width; + delta = c_width - a_width; /* * If the previous lines of this block were all blank then set its From patchwork Wed Oct 27 12:04:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12587135 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18B32C433F5 for ; Wed, 27 Oct 2021 12:04:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F20DB60E8B for ; Wed, 27 Oct 2021 12:04:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241784AbhJ0MHE (ORCPT ); Wed, 27 Oct 2021 08:07:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38560 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241742AbhJ0MHA (ORCPT ); Wed, 27 Oct 2021 08:07:00 -0400 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B9BADC061224 for ; Wed, 27 Oct 2021 05:04:34 -0700 (PDT) Received: by mail-wr1-x42c.google.com with SMTP id d3so3731871wrh.8 for ; Wed, 27 Oct 2021 05:04:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=+/NY0FSLLgoOM2WL1cJXCz+GAAkWsSqC7KSVILVwygw=; b=G6c+xU680ooYDURVbohdis89yrW0oEszmUqp8ZBfR8LPqnRwG1HtLWCVP5IPbSZELD KeMDOKQ93Xj4LTmfx4OKKKgz/kA6EYlJDNy05vh3trSeK9G2qKbkwtY1OI5kd1I5mpI5 qUAQGmZ1rJ4z0VMO0gKItdjdSmhKHEkPEpBkJ6yUtrfCa+RDP0iZ5qajmZaQr76c2rqM 
Message-Id: <324b689c915ddcb006907eec72a10257186c9e17.1635336263.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 27 Oct 2021 12:04:15 +0000 Subject: [PATCH v3 08/15] diff --color-moved-ws=allow-indentation-change: simplify and optimize Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , Ævar Arnfjörð Bjarmason , Elijah Newren , Phillip Wood , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Phillip Wood

If we already have a block of potentially moved lines then as we move down the diff we need to check if the next line of each potentially moved line 
matches the current line of the diff. The implementation of --color-moved-ws=allow-indentation-change was needlessly performing this check on all the lines in the diff that matched the current line rather than just the current line. To exacerbate the problem, finding all the other lines in the diff that match the current line involves a fuzzy lookup, so we were wasting even more time performing a second comparison to filter out the non-matching lines. Fixing this reduces the time to run

    git diff --color-moved-ws=allow-indentation-change v2.28.0 v2.29.0

by 93% compared to master and simplifies the code.

Test                                                                 HEAD^              HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change       0.38(0.35+0.03)    0.38(0.35+0.03)  +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change          0.86(0.80+0.06)    0.87(0.83+0.04)  +1.2%
4002.3: diff --color-moved-ws=allow-indentation-change large change  19.01(18.93+0.06)  0.97(0.92+0.04)  -94.9%
4002.4: log --no-color-moved --no-color-moved-ws                     1.16(1.06+0.09)    1.17(1.06+0.10)  +0.9%
4002.5: log --color-moved --no-color-moved-ws                        1.32(1.25+0.07)    1.32(1.24+0.08)  +0.0%
4002.6: log --color-moved-ws=allow-indentation-change                1.71(1.64+0.06)    1.36(1.25+0.10)  -20.5%

Test                                                                 master             HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change       0.38(0.33+0.05)    0.38(0.35+0.03)  +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change          0.80(0.75+0.04)    0.87(0.83+0.04)  +8.7%
4002.3: diff --color-moved-ws=allow-indentation-change large change  14.20(14.15+0.05)  0.97(0.92+0.04)  -93.2%
4002.4: log --no-color-moved --no-color-moved-ws                     1.15(1.05+0.09)    1.17(1.06+0.10)  +1.7%
4002.5: log --color-moved --no-color-moved-ws                        1.30(1.19+0.11)    1.32(1.24+0.08)  +1.5%
4002.6: log --color-moved-ws=allow-indentation-change                1.70(1.63+0.06)    1.36(1.25+0.10)  -20.0%

Helped-by: Jeff King
Signed-off-by: Phillip Wood
--- diff.c | 70 +++++++++++++++++----------------------------------------- 1 file changed, 20 insertions(+), 50 deletions(-) diff --git a/diff.c b/diff.c index 9aff167be27..78a486021ab 100644 --- a/diff.c +++ b/diff.c @@ -879,37 +879,21 @@ static int compute_ws_delta(const struct emitted_diff_symbol *a, return 1; } -static int cmp_in_block_with_wsd(const struct diff_options *o, - const struct moved_entry *cur, - const struct moved_entry *match, - struct moved_block *pmb, - int n) -{ - struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n]; - int al = cur->es->len, bl = match->es->len, cl = l->len; +static int cmp_in_block_with_wsd(const struct moved_entry *cur, + const struct emitted_diff_symbol *l, + struct moved_block *pmb) +{ + int al = cur->es->len, bl = l->len; const char *a = cur->es->line, - *b = match->es->line, - *c = l->line; + *b = l->line; int a_off = cur->es->indent_off, a_width = cur->es->indent_width, - c_off = l->indent_off, - c_width = l->indent_width; + b_off = l->indent_off, + b_width = l->indent_width; int delta; - /* - * We need to check if 'cur' is equal to 'match'. As those - * are from the same (+/-) side, we do not need to adjust for - * indent changes. However these were found using fuzzy - * matching so we do have to check if they are equal. Here we - * just check the lengths. We delay calling memcmp() to check - * the contents until later as if the length comparison for a - * and c fails we can avoid the call all together. - */ - if (al != bl) - return 1; - /* If 'l' and 'cur' are both blank then they match. */ - if (a_width == INDENT_BLANKLINE && c_width == INDENT_BLANKLINE) + if (a_width == INDENT_BLANKLINE && b_width == INDENT_BLANKLINE) return 0; /* @@ -918,7 +902,7 @@ static int cmp_in_block_with_wsd(const struct diff_options *o, * match those of the current block and that the text of 'l' and 'cur' * after the indentation match. 
*/ - delta = c_width - a_width; + delta = b_width - a_width; /* * If the previous lines of this block were all blank then set its @@ -927,9 +911,8 @@ static int cmp_in_block_with_wsd(const struct diff_options *o, if (pmb->wsd == INDENT_BLANKLINE) pmb->wsd = delta; - return !(delta == pmb->wsd && al - a_off == cl - c_off && - !memcmp(a, b, al) && ! - memcmp(a + a_off, c + c_off, al - a_off)); + return !(delta == pmb->wsd && al - a_off == bl - b_off && + !memcmp(a + a_off, b + b_off, al - a_off)); } static int moved_entry_cmp(const void *hashmap_cmp_fn_data, @@ -1030,36 +1013,23 @@ static void pmb_advance_or_null(struct diff_options *o, } static void pmb_advance_or_null_multi_match(struct diff_options *o, - struct moved_entry *match, - struct hashmap *hm, + struct emitted_diff_symbol *l, struct moved_block *pmb, - int pmb_nr, int n) + int pmb_nr) { int i; - char *got_match = xcalloc(1, pmb_nr); - - hashmap_for_each_entry_from(hm, match, ent) { - for (i = 0; i < pmb_nr; i++) { - struct moved_entry *prev = pmb[i].match; - struct moved_entry *cur = (prev && prev->next_line) ? - prev->next_line : NULL; - if (!cur) - continue; - if (!cmp_in_block_with_wsd(o, cur, match, &pmb[i], n)) - got_match[i] |= 1; - } - } for (i = 0; i < pmb_nr; i++) { - if (got_match[i]) { + struct moved_entry *prev = pmb[i].match; + struct moved_entry *cur = (prev && prev->next_line) ? 
+ prev->next_line : NULL; + if (cur && !cmp_in_block_with_wsd(cur, l, &pmb[i])) { /* Advance to the next line */ - pmb[i].match = pmb[i].match->next_line; + pmb[i].match = cur; } else { moved_block_clear(&pmb[i]); } } - - free(got_match); } static int shrink_potential_moved_blocks(struct moved_block *pmb, @@ -1223,7 +1193,7 @@ static void mark_color_as_moved(struct diff_options *o, if (o->color_moved_ws_handling & COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) - pmb_advance_or_null_multi_match(o, match, hm, pmb, pmb_nr, n); + pmb_advance_or_null_multi_match(o, l, pmb, pmb_nr); else pmb_advance_or_null(o, match, hm, pmb, pmb_nr); From patchwork Wed Oct 27 12:04:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12587137 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DEE09C433F5 for ; Wed, 27 Oct 2021 12:04:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C402C60E8B for ; Wed, 27 Oct 2021 12:04:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241801AbhJ0MHK (ORCPT ); Wed, 27 Oct 2021 08:07:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38526 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241759AbhJ0MHB (ORCPT ); Wed, 27 Oct 2021 08:07:01 -0400 Received: from mail-wr1-x430.google.com (mail-wr1-x430.google.com [IPv6:2a00:1450:4864:20::430]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 39E26C061226 for ; Wed, 27 Oct 2021 05:04:35 -0700 (PDT) Received: by mail-wr1-x430.google.com with SMTP id d3so3731909wrh.8 for ; Wed, 27 Oct 2021 05:04:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
Message-Id: In-Reply-To: References: Date: Wed, 27 Oct 2021 12:04:16 +0000 Subject: [PATCH v3 09/15] diff --color-moved: call comparison function directly Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , Ævar Arnfjörð Bjarmason , Elijah Newren , 
Phillip Wood , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Phillip Wood

This change will allow us to easily combine pmb_advance_or_null() and pmb_advance_or_null_multi_match() in the next commit. Calling xdiff_compare_lines() directly rather than using a function pointer from the hash map has little effect on the run time.

Test                                                                 HEAD^            HEAD
-------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change       0.38(0.35+0.03)  0.38(0.32+0.06)  +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change          0.87(0.83+0.04)  0.87(0.80+0.06)  +0.0%
4002.3: diff --color-moved-ws=allow-indentation-change large change  0.97(0.92+0.04)  0.97(0.93+0.04)  +0.0%
4002.4: log --no-color-moved --no-color-moved-ws                     1.17(1.06+0.10)  1.16(1.10+0.05)  -0.9%
4002.5: log --color-moved --no-color-moved-ws                        1.32(1.24+0.08)  1.31(1.22+0.09)  -0.8%
4002.6: log --color-moved-ws=allow-indentation-change                1.36(1.25+0.10)  1.35(1.25+0.10)  -0.7%

Signed-off-by: Phillip Wood
--- diff.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/diff.c b/diff.c index 78a486021ab..22e0edac173 100644 --- a/diff.c +++ b/diff.c @@ -994,17 +994,20 @@ static void add_lines_to_move_detection(struct diff_options *o, } static void pmb_advance_or_null(struct diff_options *o, - struct moved_entry *match, - struct hashmap *hm, + struct emitted_diff_symbol *l, struct moved_block *pmb, int pmb_nr) { int i; + unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS; + for (i = 0; i < pmb_nr; i++) { struct moved_entry *prev = pmb[i].match; struct moved_entry *cur = (prev && prev->next_line) ? 
prev->next_line : NULL; - if (cur && !hm->cmpfn(o, &cur->ent, &match->ent, NULL)) { + if (cur && xdiff_compare_lines(cur->es->line, cur->es->len, + l->line, l->len, + flags)) { pmb[i].match = cur; } else { pmb[i].match = NULL; @@ -1195,7 +1198,7 @@ static void mark_color_as_moved(struct diff_options *o, COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) pmb_advance_or_null_multi_match(o, l, pmb, pmb_nr); else - pmb_advance_or_null(o, match, hm, pmb, pmb_nr); + pmb_advance_or_null(o, l, pmb, pmb_nr); pmb_nr = shrink_potential_moved_blocks(pmb, pmb_nr); From patchwork Wed Oct 27 12:04:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12587139 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B61DC433FE for ; Wed, 27 Oct 2021 12:04:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7E30B60E8B for ; Wed, 27 Oct 2021 12:04:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241715AbhJ0MHL (ORCPT ); Wed, 27 Oct 2021 08:07:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38532 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241771AbhJ0MHC (ORCPT ); Wed, 27 Oct 2021 08:07:02 -0400 Received: from mail-wm1-x32f.google.com (mail-wm1-x32f.google.com [IPv6:2a00:1450:4864:20::32f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE548C06122F for ; Wed, 27 Oct 2021 05:04:35 -0700 (PDT) Received: by mail-wm1-x32f.google.com with SMTP id b82-20020a1c8055000000b0032ccc728d63so2172135wmd.1 for ; Wed, 27 Oct 2021 05:04:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; 
Message-Id: <8f3ea865dd33047397207c9909f114601b0a2dda.1635336263.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 27 Oct 2021 12:04:17 +0000 Subject: [PATCH v3 10/15] diff --color-moved: unify moved block growth functions Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , 
=?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Elijah Newren , Phillip Wood , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Phillip Wood From: Phillip Wood After the last two commits pmb_advance_or_null() and pmb_advance_or_null_multi_match() differ only in the comparison they perform. Lets simplify the code by combining them into a single function. Signed-off-by: Phillip Wood --- diff.c | 41 ++++++++++++----------------------------- 1 file changed, 12 insertions(+), 29 deletions(-) diff --git a/diff.c b/diff.c index 22e0edac173..51f092e724e 100644 --- a/diff.c +++ b/diff.c @@ -1002,36 +1002,23 @@ static void pmb_advance_or_null(struct diff_options *o, unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS; for (i = 0; i < pmb_nr; i++) { + int match; struct moved_entry *prev = pmb[i].match; struct moved_entry *cur = (prev && prev->next_line) ? prev->next_line : NULL; - if (cur && xdiff_compare_lines(cur->es->line, cur->es->len, - l->line, l->len, - flags)) { - pmb[i].match = cur; - } else { - pmb[i].match = NULL; - } - } -} -static void pmb_advance_or_null_multi_match(struct diff_options *o, - struct emitted_diff_symbol *l, - struct moved_block *pmb, - int pmb_nr) -{ - int i; - - for (i = 0; i < pmb_nr; i++) { - struct moved_entry *prev = pmb[i].match; - struct moved_entry *cur = (prev && prev->next_line) ? 
- prev->next_line : NULL; - if (cur && !cmp_in_block_with_wsd(cur, l, &pmb[i])) { - /* Advance to the next line */ + if (o->color_moved_ws_handling & + COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) + match = cur && + !cmp_in_block_with_wsd(cur, l, &pmb[i]); + else + match = cur && + xdiff_compare_lines(cur->es->line, cur->es->len, + l->line, l->len, flags); + if (match) pmb[i].match = cur; - } else { + else moved_block_clear(&pmb[i]); - } } } @@ -1194,11 +1181,7 @@ static void mark_color_as_moved(struct diff_options *o, continue; } - if (o->color_moved_ws_handling & - COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) - pmb_advance_or_null_multi_match(o, l, pmb, pmb_nr); - else - pmb_advance_or_null(o, l, pmb, pmb_nr); + pmb_advance_or_null(o, l, pmb, pmb_nr); pmb_nr = shrink_potential_moved_blocks(pmb, pmb_nr);

From patchwork Wed Oct 27 12:04:18 2021
X-Patchwork-Submitter: Phillip Wood
X-Patchwork-Id: 12587141
Message-Id: <078c04d4a66c51acdb3477b49813a9dc1c144f15.1635336263.git.gitgitgadget@gmail.com>
Date: Wed, 27 Oct 2021 12:04:18 +0000
Subject: [PATCH v3 11/15] diff --color-moved: shrink potential moved blocks as we go
To: git@vger.kernel.org
Cc: Phillip Wood, Ævar Arnfjörð Bjarmason, Elijah Newren
From: Phillip Wood

Rather than setting `match` to NULL and then looping over the list of
potential matched blocks for a second time to remove blocks with no
matches just filter out the blocks with no matches as we go.

Signed-off-by: Phillip Wood
---
 diff.c | 44 ++++++++------------------------------------
 1 file changed, 8 insertions(+), 36 deletions(-)

diff --git a/diff.c b/diff.c
index 51f092e724e..626fd47aa0e 100644
--- a/diff.c
+++ b/diff.c
@@ -996,12 +996,12 @@ static void add_lines_to_move_detection(struct diff_options *o, static void pmb_advance_or_null(struct diff_options *o, struct emitted_diff_symbol *l, struct moved_block *pmb, - int pmb_nr) + int *pmb_nr) { - int i; + int i, j; unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS; - for (i = 0; i < pmb_nr; i++) { + for (i = 0, j = 0; i < *pmb_nr; i++) { int match; struct moved_entry *prev = pmb[i].match; struct moved_entry *cur = (prev && prev->next_line) ?
@@ -1015,38 +1015,12 @@ static void pmb_advance_or_null(struct diff_options *o, match = cur && xdiff_compare_lines(cur->es->line, cur->es->len, l->line, l->len, flags); - if (match) - pmb[i].match = cur; - else - moved_block_clear(&pmb[i]); - } -} - -static int shrink_potential_moved_blocks(struct moved_block *pmb, - int pmb_nr) -{ - int lp, rp; - - /* Shrink the set of potential block to the remaining running */ - for (lp = 0, rp = pmb_nr - 1; lp <= rp;) { - while (lp < pmb_nr && pmb[lp].match) - lp++; - /* lp points at the first NULL now */ - - while (rp > -1 && !pmb[rp].match) - rp--; - /* rp points at the last non-NULL */ - - if (lp < pmb_nr && rp > -1 && lp < rp) { - pmb[lp] = pmb[rp]; - memset(&pmb[rp], 0, sizeof(pmb[rp])); - rp--; - lp++; + if (match) { + pmb[j] = pmb[i]; + pmb[j++].match = cur; } } - - /* Remember the number of running sets */ - return rp + 1; + *pmb_nr = j; } static void fill_potential_moved_blocks(struct diff_options *o, @@ -1181,9 +1155,7 @@ static void mark_color_as_moved(struct diff_options *o, continue; } - pmb_advance_or_null(o, l, pmb, pmb_nr); - - pmb_nr = shrink_potential_moved_blocks(pmb, pmb_nr); + pmb_advance_or_null(o, l, pmb, &pmb_nr); if (pmb_nr == 0) { int contiguous = adjust_last_block(o, n, block_length);

From patchwork Wed Oct 27 12:04:19 2021
X-Patchwork-Submitter: Phillip Wood
X-Patchwork-Id: 12587143
Message-Id: <618371471a0effd4ad2d69105a7e4495ad46c350.1635336263.git.gitgitgadget@gmail.com>
Date: Wed, 27 Oct 2021 12:04:19 +0000
Subject: [PATCH v3 12/15] diff --color-moved: stop clearing potential moved blocks
To: git@vger.kernel.org
Cc: Phillip Wood, Ævar Arnfjörð Bjarmason, Elijah Newren
From: Phillip Wood

moved_block_clear() was introduced in 74d156f4a1 ("diff
--color-moved-ws: fix double free crash", 2018-10-04) to free the
memory that was allocated when initializing a potential moved
block. However since 21536d077f ("diff --color-moved-ws: modify
allow-indentation-change", 2018-11-23) initializing a potential moved
block no longer allocates any memory. Up until the last commit we were
relying on moved_block_clear() to set the `match` pointer to NULL when
a block stopped matching, but since that commit we do not clear a
moved block that does not match so it does not make sense to clear
them elsewhere.
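
The filter-as-you-go idea behind this commit and the previous one can be sketched in isolation. This is only an illustration, not the code from diff.c: `struct block`, `has_match()` and `filter_blocks()` are hypothetical stand-ins for `struct moved_block` and the loop in pmb_advance_or_null().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct moved_block; only what the sketch needs. */
struct block {
	const char *match;	/* NULL means the block stopped matching */
	int wsd;
};

/* Example predicate: keep blocks that still have a match. */
static int has_match(const struct block *b)
{
	return b->match != NULL;
}

/*
 * Compact the array in a single pass: every block accepted by keep()
 * is copied down to the next free slot and the count is updated
 * through nr. Slots at or beyond the new count are left untouched,
 * so no second "shrink" pass and no clearing is needed.
 */
static void filter_blocks(struct block *blocks, int *nr,
			  int (*keep)(const struct block *))
{
	int i, j;

	assert(blocks && nr && keep);
	for (i = 0, j = 0; i < *nr; i++)
		if (keep(&blocks[i]))
			blocks[j++] = blocks[i];
	*nr = j;
}
```

A caller would pass the array and a pointer to its length, for example `filter_blocks(blocks, &nr, has_match)`, and afterwards only look at the first `nr` entries. In this sketch the surviving entries keep their original relative order.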
Signed-off-by: Phillip Wood
---
 diff.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/diff.c b/diff.c
index 626fd47aa0e..ffbe09937bc 100644
--- a/diff.c
+++ b/diff.c
@@ -807,11 +807,6 @@ struct moved_block { int wsd; /* The whitespace delta of this block */ }; -static void moved_block_clear(struct moved_block *b) -{ - memset(b, 0, sizeof(*b)); -} - #define INDENT_BLANKLINE INT_MIN static void fill_es_indent_data(struct emitted_diff_symbol *es) @@ -1128,8 +1123,6 @@ static void mark_color_as_moved(struct diff_options *o, } if (pmb_nr && (!match || l->s != moved_symbol)) { - int i; - if (!adjust_last_block(o, n, block_length) && block_length > 1) { /* @@ -1139,8 +1132,6 @@ static void mark_color_as_moved(struct diff_options *o, match = NULL; n -= block_length; } - for(i = 0; i < pmb_nr; i++) - moved_block_clear(&pmb[i]); pmb_nr = 0; block_length = 0; flipped_block = 0; @@ -1193,8 +1184,6 @@ static void mark_color_as_moved(struct diff_options *o, } adjust_last_block(o, n, block_length); - for(n = 0; n < pmb_nr; n++) - moved_block_clear(&pmb[n]); free(pmb); }

From patchwork Wed Oct 27 12:04:20 2021
X-Patchwork-Submitter: Phillip Wood
X-Patchwork-Id: 12587145
Message-Id: <6a8e9a2724d638cca04e3ac7d52995d9e2b0b990.1635336263.git.gitgitgadget@gmail.com>
Date: Wed, 27 Oct 2021 12:04:20 +0000
Subject: [PATCH v3 13/15] diff --color-moved-ws=allow-indentation-change: improve hash lookups
To: git@vger.kernel.org
Cc: Phillip Wood, Ævar Arnfjörð Bjarmason, Elijah Newren
From: Phillip Wood

As libxdiff does not have a whitespace flag to ignore the indentation,
the code for --color-moved-ws=allow-indentation-change uses
XDF_IGNORE_WHITESPACE and then filters out any hash lookups where
there are non-indentation changes. This filtering is inefficient as
we have to perform another string comparison.

By using the offset data that we have already computed to skip the
indentation, we can avoid using XDF_IGNORE_WHITESPACE and safely
remove the extra checks, which improves the performance by 11% and
paves the way for the elimination of string comparisons in the next
commit.

This change slightly increases the run time of other --color-moved
modes. This could be avoided by using different comparison functions
for the different modes, but after the next two commits there is no
measurable benefit in doing so.

There is a change in behavior for lines that begin with a form-feed or
vertical-tab character. Since b46054b374 ("xdiff: use
git-compat-util", 2019-04-11) xdiff does not treat '\f' or '\v' as
whitespace characters. This means that lines starting with those
characters are never considered to be blank and never match a line
that does not start with the same character.
After this patch a line matching "^[\f\v\r]*[ \t]*$" is considered to be
blank by --color-moved-ws=allow-indentation-change and lines beginning
"^[\f\v\r]*[ \t]*" can match another line if the suffixes match. This
changes the output of git show for d18f76dccf ("compat/regex: use the
regex engine from gawk for compat", 2010-08-17) as some lines in the
pre-image before a moved block that contain '\f' are now considered
moved as well as they match a blank line before the moved lines in the
post-image. This commit updates one of the tests to reflect this change.

Test                                                                  HEAD^             HEAD
--------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.38(0.33+0.05)   0.38(0.33+0.05)  +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change           0.86(0.82+0.04)   0.88(0.84+0.04)  +2.3%
4002.3: diff --color-moved-ws=allow-indentation-change large change   0.97(0.94+0.03)   0.86(0.81+0.05)  -11.3%
4002.4: log --no-color-moved --no-color-moved-ws                      1.16(1.07+0.09)   1.16(1.06+0.09)  +0.0%
4002.5: log --color-moved --no-color-moved-ws                         1.32(1.26+0.06)   1.33(1.27+0.05)  +0.8%
4002.6: log --color-moved-ws=allow-indentation-change                 1.35(1.29+0.06)   1.33(1.24+0.08)  -1.5%

Signed-off-by: Phillip Wood
---
 diff.c                     | 65 +++++++++++---------------------------
 t/t4015-diff-whitespace.sh | 22 ++++++-------
 2 files changed, 30 insertions(+), 57 deletions(-)

diff --git a/diff.c b/diff.c
index ffbe09937bc..2085c063675 100644
--- a/diff.c
+++ b/diff.c
@@ -850,28 +850,15 @@ static void fill_es_indent_data(struct emitted_diff_symbol *es) } static int compute_ws_delta(const struct emitted_diff_symbol *a, - const struct emitted_diff_symbol *b, - int *out) -{ - int a_len = a->len, - b_len = b->len, - a_off = a->indent_off, - a_width = a->indent_width, - b_off = b->indent_off, + const struct emitted_diff_symbol *b) +{ + int a_width = a->indent_width, b_width = b->indent_width; - if (a_width == INDENT_BLANKLINE &&
b_width == INDENT_BLANKLINE) { - *out = INDENT_BLANKLINE; - return 1; - } - - if (a_len - a_off != b_len - b_off || - memcmp(a->line + a_off, b->line + b_off, a_len - a_off)) - return 0; - - *out = a_width - b_width; + if (a_width == INDENT_BLANKLINE && b_width == INDENT_BLANKLINE) + return INDENT_BLANKLINE; - return 1; + return a_width - b_width; } static int cmp_in_block_with_wsd(const struct moved_entry *cur, @@ -916,26 +903,17 @@ static int moved_entry_cmp(const void *hashmap_cmp_fn_data, const void *keydata) { const struct diff_options *diffopt = hashmap_cmp_fn_data; - const struct moved_entry *a, *b; + const struct emitted_diff_symbol *a, *b; unsigned flags = diffopt->color_moved_ws_handling & XDF_WHITESPACE_FLAGS; - a = container_of(eptr, const struct moved_entry, ent); - b = container_of(entry_or_key, const struct moved_entry, ent); + a = container_of(eptr, const struct moved_entry, ent)->es; + b = container_of(entry_or_key, const struct moved_entry, ent)->es; - if (diffopt->color_moved_ws_handling & - COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) - /* - * As there is not specific white space config given, - * we'd need to check for a new block, so ignore all - * white space. 
The setup of the white space - * configuration for the next block is done else where - */ - flags |= XDF_IGNORE_WHITESPACE; - - return !xdiff_compare_lines(a->es->line, a->es->len, - b->es->line, b->es->len, - flags); + return !xdiff_compare_lines(a->line + a->indent_off, + a->len - a->indent_off, + b->line + b->indent_off, + b->len - b->indent_off, flags); } static struct moved_entry *prepare_entry(struct diff_options *o, @@ -944,7 +922,8 @@ static struct moved_entry *prepare_entry(struct diff_options *o, struct moved_entry *ret = xmalloc(sizeof(*ret)); struct emitted_diff_symbol *l = &o->emitted_symbols->buf[line_no]; unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS; - unsigned int hash = xdiff_hash_string(l->line, l->len, flags); + unsigned int hash = xdiff_hash_string(l->line + l->indent_off, + l->len - l->indent_off, flags); hashmap_entry_init(&ret->ent, hash); ret->es = l; @@ -1036,13 +1015,11 @@ static void fill_potential_moved_blocks(struct diff_options *o, hashmap_for_each_entry_from(hm, match, ent) { ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc); if (o->color_moved_ws_handling & - COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) { - if (compute_ws_delta(l, match->es, &(pmb[pmb_nr]).wsd)) - pmb[pmb_nr++].match = match; - } else { + COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) + pmb[pmb_nr].wsd = compute_ws_delta(l, match->es); + else pmb[pmb_nr].wsd = 0; - pmb[pmb_nr++].match = match; - } + pmb[pmb_nr++].match = match; } *pmb_p = pmb; @@ -6276,10 +6253,6 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o) if (o->color_moved) { struct hashmap add_lines, del_lines; - if (o->color_moved_ws_handling & - COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) - o->color_moved_ws_handling |= XDF_IGNORE_WHITESPACE; - hashmap_init(&del_lines, moved_entry_cmp, o, 0); hashmap_init(&add_lines, moved_entry_cmp, o, 0); diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh index 15782c879d2..50d0cf486be 100755 --- a/t/t4015-diff-whitespace.sh +++ 
b/t/t4015-diff-whitespace.sh @@ -2206,10 +2206,10 @@ EMPTY='' test_expect_success 'compare mixed whitespace delta across moved blocks' ' git reset --hard && - tr Q_ "\t " <<-EOF >text.txt && - ${EMPTY} - ____too short without - ${EMPTY} + tr "^|Q_" "\f\v\t " <<-EOF >text.txt && + ^__ + |____too short without + ^ ___being grouped across blank line ${EMPTY} context @@ -2228,7 +2228,7 @@ test_expect_success 'compare mixed whitespace delta across moved blocks' ' git add text.txt && git commit -m "add text.txt" && - tr Q_ "\t " <<-EOF >text.txt && + tr "^|Q_" "\f\v\t " <<-EOF >text.txt && context lines to @@ -2239,7 +2239,7 @@ test_expect_success 'compare mixed whitespace delta across moved blocks' ' ${EMPTY} QQtoo short without ${EMPTY} - Q_______being grouped across blank line + ^Q_______being grouped across blank line ${EMPTY} Q_QThese two lines have had their indentation reduced by four spaces @@ -2251,16 +2251,16 @@ test_expect_success 'compare mixed whitespace delta across moved blocks' ' -c core.whitespace=space-before-tab \ diff --color --color-moved --ws-error-highlight=all \ --color-moved-ws=allow-indentation-change >actual.raw && - grep -v "index" actual.raw | test_decode_color >actual && + grep -v "index" actual.raw | tr "\f\v" "^|" | test_decode_color >actual && cat <<-\EOF >expected && diff --git a/text.txt b/text.txt --- a/text.txt +++ b/text.txt @@ -1,16 +1,16 @@ - - - - too short without - - + -^ + -| too short without + -^ - being grouped across blank line - context @@ -2280,7 +2280,7 @@ test_expect_success 'compare mixed whitespace delta across moved blocks' ' + + too short without + - + being grouped across blank line + +^ being grouped across blank line + + These two lines have had their +indentation reduced by four spaces

From patchwork Wed Oct 27 12:04:21 2021
X-Patchwork-Submitter: Phillip Wood
X-Patchwork-Id: 12587147
Date: Wed, 27 Oct 2021 12:04:21 +0000
Subject: [PATCH v3 14/15] diff: use designated initializers for emitted_diff_symbol
To: git@vger.kernel.org
Cc: Phillip Wood, Ævar Arnfjörð Bjarmason, Elijah Newren
From: Phillip Wood

This makes it clearer which fields are being explicitly initialized and
will simplify the next commit where we add a new field to the struct.
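
To illustrate why designated initializers help here, the sketch below contrasts the two forms on a trimmed-down struct. It is a hedged illustration only: `struct symbol`, its `kind` field and the three helper functions are hypothetical and merely mirror the shape of `struct emitted_diff_symbol`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct emitted_diff_symbol. */
struct symbol {
	const char *line;
	int len;
	int flags;
	int indent_off;
	int indent_width;
	int kind;		/* plays the role of the enum field 's' */
};

/* Positional form: the zeros for the indent fields must be spelled out. */
static struct symbol make_positional(const char *line, int len, int flags,
				     int kind)
{
	struct symbol s = { line, len, flags, 0, 0, kind };
	return s;
}

/* Designated form: fields that are not named are zero-initialized. */
static struct symbol make_designated(const char *line, int len, int flags,
				     int kind)
{
	struct symbol s = {
		.line = line, .len = len, .flags = flags, .kind = kind
	};
	return s;
}

/* Sanity check: both initializers yield the same field values. */
static void check_forms_agree(const char *line, int len, int flags, int kind)
{
	struct symbol a = make_positional(line, len, flags, kind);
	struct symbol b = make_designated(line, len, flags, kind);

	assert(a.line == b.line && a.len == b.len && a.flags == b.flags);
	assert(a.indent_off == b.indent_off);
	assert(a.indent_width == b.indent_width);
	assert(a.kind == b.kind);
}
```

If a new member were inserted before `kind`, the positional initializer would need another placeholder value to keep `kind` in the right slot, while the designated one would compile unchanged and leave the new member zeroed.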
Signed-off-by: Phillip Wood
---
 diff.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 2085c063675..9ef88d7665a 100644
--- a/diff.c
+++ b/diff.c
@@ -1497,7 +1497,9 @@ static void emit_diff_symbol_from_struct(struct diff_options *o, static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s, const char *line, int len, unsigned flags) { - struct emitted_diff_symbol e = {line, len, flags, 0, 0, s}; + struct emitted_diff_symbol e = { + .line = line, .len = len, .flags = flags, .s = s + }; if (o->emitted_symbols) append_emitted_diff_symbol(o, &e);

From patchwork Wed Oct 27 12:04:22 2021
X-Patchwork-Submitter: Phillip Wood
X-Patchwork-Id: 12587149
Date: Wed, 27 Oct 2021 12:04:22 +0000
Subject: [PATCH v3 15/15] diff --color-moved: intern strings
To: git@vger.kernel.org
Cc: Phillip Wood, Ævar Arnfjörð Bjarmason, Elijah Newren
From: Phillip Wood

Taking inspiration from xdl_classify_record(), assign an id to each
addition and deletion such that lines that match for the current
--color-moved-ws mode share the same unique id. This reduces the
number of hash lookups a little (calculating the ids still involves
one hash lookup per line) but the main benefit is that when growing
blocks of potentially moved lines we can replace string comparisons
which involve chasing a pointer with a simple integer comparison. On a
large diff this commit reduces the time to run 'diff --color-moved' by
37% compared to the previous commit and 31% compared to master. For
'diff --color-moved-ws=allow-indentation-change' the reduction is 28%
compared to the previous commit and 96% compared to master. There is
little change in the performance of 'git log --patch' as the diffs are
smaller.

Test                                                                  HEAD^             HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.38(0.33+0.05)   0.38(0.33+0.05) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change           0.88(0.81+0.06)   0.55(0.50+0.04) -37.5%
4002.3: diff --color-moved-ws=allow-indentation-change large change   0.85(0.79+0.06)   0.61(0.54+0.06) -28.2%
4002.4: log --no-color-moved --no-color-moved-ws                      1.16(1.07+0.08)   1.15(1.09+0.05) -0.9%
4002.5: log --color-moved --no-color-moved-ws                         1.31(1.22+0.08)   1.29(1.19+0.09) -1.5%
4002.6: log --color-moved-ws=allow-indentation-change                 1.32(1.24+0.08)   1.31(1.18+0.13) -0.8%

Test                                                                  master              HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.38(0.33+0.05)     0.38(0.33+0.05) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change           0.80(0.75+0.04)     0.55(0.50+0.04) -31.2%
4002.3: diff --color-moved-ws=allow-indentation-change large change   14.20(14.15+0.05)   0.61(0.54+0.06) -95.7%
4002.4: log --no-color-moved --no-color-moved-ws                      1.15(1.05+0.09)     1.15(1.09+0.05) +0.0%
4002.5: log --color-moved --no-color-moved-ws                         1.30(1.19+0.11)     1.29(1.19+0.09) -0.8%
4002.6: log --color-moved-ws=allow-indentation-change                 1.70(1.63+0.06)     1.31(1.18+0.13) -22.9%

Signed-off-by: Phillip Wood
---
 diff.c | 174 +++++++++++++++++++++++++++++++--------------------------
 1 file changed, 96 insertions(+), 78 deletions(-)

diff --git a/diff.c b/diff.c
index 9ef88d7665a..c28c56c1283 100644
--- a/diff.c
+++ b/diff.c
@@ -18,6 +18,7 @@
 #include "submodule-config.h"
 #include "submodule.h"
 #include "hashmap.h"
+#include "mem-pool.h"
 #include "ll-merge.h"
 #include "string-list.h"
 #include "strvec.h"
@@ -772,6 +773,7 @@ struct emitted_diff_symbol {
 	int flags;
 	int indent_off; /* Offset to first non-whitespace character */
 	int indent_width; /* The visual width of the indentation */
+	unsigned id;
 	enum diff_symbol s;
 };
 #define EMITTED_DIFF_SYMBOL_INIT {NULL}
@@ -797,9 +799,9 @@ static void append_emitted_diff_symbol(struct diff_options *o,
 }
 
 struct moved_entry {
-	struct hashmap_entry ent;
 	const struct emitted_diff_symbol *es;
 	struct moved_entry *next_line;
+	struct moved_entry *next_match;
 };
 
 struct moved_block {
@@ -865,24 +867,24 @@ static int cmp_in_block_with_wsd(const struct moved_entry *cur,
 				 const struct emitted_diff_symbol *l,
 				 struct moved_block *pmb)
 {
-	int al = cur->es->len, bl = l->len;
-	const char *a = cur->es->line,
-		   *b = l->line;
-	int a_off = cur->es->indent_off,
-	    a_width = cur->es->indent_width,
-	    b_off = l->indent_off,
-	    b_width = l->indent_width;
+	int a_width = cur->es->indent_width, b_width = l->indent_width;
 	int delta;
 
-	/* If 'l' and 'cur' are both blank then they match. */
-	if (a_width == INDENT_BLANKLINE && b_width == INDENT_BLANKLINE)
+	/* The text of each line must match */
+	if (cur->es->id != l->id)
+		return 1;
+
+	/*
+	 * If 'l' and 'cur' are both blank then we don't need to check the
+	 * indent. We only need to check cur as we know the strings match.
+	 * */
+	if (a_width == INDENT_BLANKLINE)
 		return 0;
 
 	/*
 	 * The indent changes of the block are known and stored in pmb->wsd;
 	 * however we need to check if the indent changes of the current line
-	 * match those of the current block and that the text of 'l' and 'cur'
-	 * after the indentation match.
+	 * match those of the current block.
 	 */
 	delta = b_width - a_width;
 
@@ -893,22 +895,26 @@ static int cmp_in_block_with_wsd(const struct moved_entry *cur,
 	if (pmb->wsd == INDENT_BLANKLINE)
 		pmb->wsd = delta;
 
-	return !(delta == pmb->wsd && al - a_off == bl - b_off &&
-		 !memcmp(a + a_off, b + b_off, al - a_off));
+	return delta != pmb->wsd;
 }
 
-static int moved_entry_cmp(const void *hashmap_cmp_fn_data,
-			   const struct hashmap_entry *eptr,
-			   const struct hashmap_entry *entry_or_key,
-			   const void *keydata)
+struct interned_diff_symbol {
+	struct hashmap_entry ent;
+	struct emitted_diff_symbol *es;
+};
+
+static int interned_diff_symbol_cmp(const void *hashmap_cmp_fn_data,
+				    const struct hashmap_entry *eptr,
+				    const struct hashmap_entry *entry_or_key,
+				    const void *keydata)
 {
 	const struct diff_options *diffopt = hashmap_cmp_fn_data;
 	const struct emitted_diff_symbol *a, *b;
 	unsigned flags = diffopt->color_moved_ws_handling
 			 & XDF_WHITESPACE_FLAGS;
 
-	a = container_of(eptr, const struct moved_entry, ent)->es;
-	b = container_of(entry_or_key, const struct moved_entry, ent)->es;
+	a = container_of(eptr, const struct interned_diff_symbol, ent)->es;
+	b = container_of(entry_or_key, const struct interned_diff_symbol, ent)->es;
 
 	return !xdiff_compare_lines(a->line + a->indent_off,
 				    a->len - a->indent_off,
@@ -916,55 +922,81 @@ static int moved_entry_cmp(const void *hashmap_cmp_fn_data,
 				    b->len - b->indent_off, flags);
 }
 
-static struct moved_entry *prepare_entry(struct diff_options *o,
-					 int line_no)
+static void prepare_entry(struct diff_options *o, struct emitted_diff_symbol *l,
+			  struct interned_diff_symbol *s)
 {
-	struct moved_entry *ret = xmalloc(sizeof(*ret));
-	struct emitted_diff_symbol *l = &o->emitted_symbols->buf[line_no];
 	unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS;
 	unsigned int hash = xdiff_hash_string(l->line + l->indent_off,
 					      l->len - l->indent_off, flags);
 
-	hashmap_entry_init(&ret->ent, hash);
-	ret->es = l;
-	ret->next_line = NULL;
-
-	return ret;
+	hashmap_entry_init(&s->ent, hash);
+	s->es = l;
 }
 
-static void add_lines_to_move_detection(struct diff_options *o,
-					struct hashmap *add_lines,
-					struct hashmap *del_lines)
+struct moved_entry_list {
+	struct moved_entry *add, *del;
+};
+
+static struct moved_entry_list *add_lines_to_move_detection(struct diff_options *o,
+							    struct mem_pool *entry_mem_pool)
 {
 	struct moved_entry *prev_line = NULL;
-
+	struct mem_pool interned_pool;
+	struct hashmap interned_map;
+	struct moved_entry_list *entry_list = NULL;
+	size_t entry_list_alloc = 0;
+	unsigned id = 0;
 	int n;
+
+	hashmap_init(&interned_map, interned_diff_symbol_cmp, o, 8096);
+	mem_pool_init(&interned_pool, 1024 * 1024);
+
 	for (n = 0; n < o->emitted_symbols->nr; n++) {
-		struct hashmap *hm;
-		struct moved_entry *key;
+		struct interned_diff_symbol key;
+		struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n];
+		struct interned_diff_symbol *s;
+		struct moved_entry *entry;
 
-		switch (o->emitted_symbols->buf[n].s) {
-		case DIFF_SYMBOL_PLUS:
-			hm = add_lines;
-			break;
-		case DIFF_SYMBOL_MINUS:
-			hm = del_lines;
-			break;
-		default:
+		if (l->s != DIFF_SYMBOL_PLUS && l->s != DIFF_SYMBOL_MINUS) {
 			prev_line = NULL;
 			continue;
 		}
 
 		if (o->color_moved_ws_handling &
 		    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
-			fill_es_indent_data(&o->emitted_symbols->buf[n]);
-		key = prepare_entry(o, n);
-		if (prev_line && prev_line->es->s == o->emitted_symbols->buf[n].s)
-			prev_line->next_line = key;
+			fill_es_indent_data(l);
 
-		hashmap_add(hm, &key->ent);
-		prev_line = key;
+		prepare_entry(o, l, &key);
+		s = hashmap_get_entry(&interned_map, &key, ent, &key.ent);
+		if (s) {
+			l->id = s->es->id;
+		} else {
+			l->id = id;
+			ALLOC_GROW_BY(entry_list, id, 1, entry_list_alloc);
+			hashmap_add(&interned_map,
+				    memcpy(mem_pool_alloc(&interned_pool,
+							  sizeof(key)),
+					   &key, sizeof(key)));
+		}
+		entry = mem_pool_alloc(entry_mem_pool, sizeof(*entry));
+		entry->es = l;
+		entry->next_line = NULL;
+		if (prev_line && prev_line->es->s == l->s)
+			prev_line->next_line = entry;
+		prev_line = entry;
+		if (l->s == DIFF_SYMBOL_PLUS) {
+			entry->next_match = entry_list[l->id].add;
+			entry_list[l->id].add = entry;
+		} else {
+			entry->next_match = entry_list[l->id].del;
+			entry_list[l->id].del = entry;
+		}
 	}
+
+	hashmap_clear(&interned_map);
+	mem_pool_discard(&interned_pool, 0);
+
+	return entry_list;
 }
 
 static void pmb_advance_or_null(struct diff_options *o,
@@ -973,7 +1005,6 @@ static void pmb_advance_or_null(struct diff_options *o,
 				int *pmb_nr)
 {
 	int i, j;
-	unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS;
 
 	for (i = 0, j = 0; i < *pmb_nr; i++) {
 		int match;
@@ -986,9 +1017,8 @@ static void pmb_advance_or_null(struct diff_options *o,
 			match = cur &&
 				!cmp_in_block_with_wsd(cur, l, &pmb[i]);
 		else
-			match = cur &&
-				xdiff_compare_lines(cur->es->line, cur->es->len,
-						    l->line, l->len, flags);
+			match = cur && cur->es->id == l->id;
+
 		if (match) {
 			pmb[j] = pmb[i];
 			pmb[j++].match = cur;
@@ -998,7 +1028,6 @@ static void pmb_advance_or_null(struct diff_options *o,
 }
 
 static void fill_potential_moved_blocks(struct diff_options *o,
-					struct hashmap *hm,
 					struct moved_entry *match,
 					struct emitted_diff_symbol *l,
 					struct moved_block **pmb_p,
@@ -1012,7 +1041,7 @@ static void fill_potential_moved_blocks(struct diff_options *o,
 	 * The current line is the start of a new block.
 	 * Setup the set of potential blocks.
 	 */
-	hashmap_for_each_entry_from(hm, match, ent) {
+	for (; match; match = match->next_match) {
 		ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc);
 		if (o->color_moved_ws_handling &
 		    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
@@ -1067,8 +1096,7 @@ static int adjust_last_block(struct diff_options *o, int n, int block_length)
 
 /* Find blocks of moved code, delegate actual coloring decision to helper */
 static void mark_color_as_moved(struct diff_options *o,
-				struct hashmap *add_lines,
-				struct hashmap *del_lines)
+				struct moved_entry_list *entry_list)
 {
 	struct moved_block *pmb = NULL; /* potentially moved blocks */
 	int pmb_nr = 0, pmb_alloc = 0;
@@ -1077,23 +1105,15 @@ static void mark_color_as_moved(struct diff_options *o,
 
 
 	for (n = 0; n < o->emitted_symbols->nr; n++) {
-		struct hashmap *hm = NULL;
-		struct moved_entry *key;
 		struct moved_entry *match = NULL;
 		struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n];
 
 		switch (l->s) {
 		case DIFF_SYMBOL_PLUS:
-			hm = del_lines;
-			key = prepare_entry(o, n);
-			match = hashmap_get_entry(hm, key, ent, NULL);
-			free(key);
+			match = entry_list[l->id].del;
 			break;
 		case DIFF_SYMBOL_MINUS:
-			hm = add_lines;
-			key = prepare_entry(o, n);
-			match = hashmap_get_entry(hm, key, ent, NULL);
-			free(key);
+			match = entry_list[l->id].add;
 			break;
 		default:
 			flipped_block = 0;
@@ -1135,7 +1155,7 @@ static void mark_color_as_moved(struct diff_options *o,
 				 */
 				n -= block_length;
 			else
-				fill_potential_moved_blocks(o, hm, match, l,
+				fill_potential_moved_blocks(o, match, l,
 							    &pmb, &pmb_alloc,
 							    &pmb_nr);
 
@@ -6253,20 +6273,18 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 
 	if (o->emitted_symbols) {
 		if (o->color_moved) {
-			struct hashmap add_lines, del_lines;
-
-			hashmap_init(&del_lines, moved_entry_cmp, o, 0);
-			hashmap_init(&add_lines, moved_entry_cmp, o, 0);
+			struct mem_pool entry_pool;
+			struct moved_entry_list *entry_list;
 
-			add_lines_to_move_detection(o, &add_lines, &del_lines);
-			mark_color_as_moved(o, &add_lines, &del_lines);
+			mem_pool_init(&entry_pool, 1024 * 1024);
+			entry_list = add_lines_to_move_detection(o,
+								 &entry_pool);
+			mark_color_as_moved(o, entry_list);
 			if (o->color_moved == COLOR_MOVED_ZEBRA_DIM)
 				dim_moved_lines(o);
 
-			hashmap_clear_and_free(&add_lines, struct moved_entry,
-					       ent);
-			hashmap_clear_and_free(&del_lines, struct moved_entry,
-					       ent);
+			mem_pool_discard(&entry_pool, 0);
+			free(entry_list);
 		}
 
 		for (i = 0; i < esm.nr; i++)
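
For readers unfamiliar with interning, the idea described in the commit message of "intern strings" can be sketched outside of git as follows. This is only an illustration under stated assumptions, not the code from the patch: `struct interner`, `hash_str()`, `intern()` and `TABLE_SIZE` are hypothetical names, the table is a fixed-size open-addressing table rather than git's hashmap, and lines are compared with plain strcmp() instead of honouring the --color-moved-ws mode.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical toy interner: maps each distinct line to a small integer id. */
#define TABLE_SIZE 64 /* fixed capacity (power of two), enough for a demo */

struct interner {
	const char *keys[TABLE_SIZE]; /* borrowed pointers; NULL marks an empty slot */
	unsigned ids[TABLE_SIZE];     /* id assigned to the key in the same slot */
	unsigned next_id;             /* number of distinct lines seen so far */
};

/* Simple multiplicative string hash (an assumption, not git's xdiff hash). */
static unsigned hash_str(const char *s)
{
	unsigned h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h;
}

/*
 * Return the id for 'line', assigning a fresh id the first time the
 * line is seen. This costs one hash lookup per line; afterwards two
 * lines can be compared by comparing their ids.
 */
static unsigned intern(struct interner *in, const char *line)
{
	unsigned i = hash_str(line) & (TABLE_SIZE - 1);

	while (in->keys[i]) {
		if (!strcmp(in->keys[i], line))
			return in->ids[i];
		i = (i + 1) & (TABLE_SIZE - 1); /* linear probing */
	}

	/* Keep at least one slot empty so the probe loop always terminates. */
	assert(in->next_id < TABLE_SIZE - 1);
	in->keys[i] = line;
	in->ids[i] = in->next_id;
	return in->next_id++;
}
```

Once every added and deleted line carries such an id, checking whether a block of potentially moved lines can grow reduces to an integer comparison of two ids, which is the benefit the commit message attributes to the change.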