From patchwork Thu Apr 22 15:17:51 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 12218665
From: Matheus Tavares
To: git@vger.kernel.org
Cc: christian.couder@gmail.com, git@jeffhostetler.com
Subject: [PATCH 5/7] parallel-checkout: add tests related to path collisions
Date: Thu, 22 Apr 2021 12:17:51 -0300
X-Mailer: git-send-email 2.30.1
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

Add tests to confirm that path collisions are properly detected by
checkout workers, both to avoid race conditions and to report colliding
entries on clone.

Original-patch-by: Jeff Hostetler
Signed-off-by: Jeff Hostetler
Signed-off-by: Matheus Tavares
---
 parallel-checkout.c                     |   4 +
 t/lib-parallel-checkout.sh              |   4 +-
 t/t2081-parallel-checkout-collisions.sh | 162 ++++++++++++++++++++++++
 3 files changed, 168 insertions(+), 2 deletions(-)
 create mode 100755 t/t2081-parallel-checkout-collisions.sh

diff --git a/parallel-checkout.c b/parallel-checkout.c
index 09e8b10a35..6fb3f1e6c9 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -8,6 +8,7 @@
 #include "sigchain.h"
 #include "streaming.h"
 #include "thread-utils.h"
+#include "trace2.h"
 
 struct pc_worker {
 	struct child_process cp;
@@ -326,6 +327,7 @@ void write_pc_item(struct parallel_checkout_item *pc_item,
 	if (dir_sep && !has_dirs_only_path(path.buf, dir_sep - path.buf,
 					   state->base_dir_len)) {
 		pc_item->status = PC_ITEM_COLLIDED;
+		trace2_data_string("pcheckout", NULL, "collision/dirname", path.buf);
 		goto out;
 	}
 
@@ -341,6 +343,8 @@ void write_pc_item(struct parallel_checkout_item *pc_item,
 			 * call should have already caught these cases.
 			 */
 			pc_item->status = PC_ITEM_COLLIDED;
+			trace2_data_string("pcheckout", NULL,
+					   "collision/basename", path.buf);
 		} else {
 			error_errno("failed to open file '%s'", path.buf);
 			pc_item->status = PC_ITEM_FAILED;
diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
index 39fd36fdf6..16ee18389b 100644
--- a/t/lib-parallel-checkout.sh
+++ b/t/lib-parallel-checkout.sh
@@ -21,12 +21,12 @@ test_checkout_workers () {
 	shift &&
 
 	rm -f trace &&
-	GIT_TRACE2="$(pwd)/trace" "$@" &&
+	GIT_TRACE2="$(pwd)/trace" "$@" 2>&8 &&
 
 	workers=$(grep "child_start\[..*\] git checkout--worker" trace | wc -l) &&
 	test $workers -eq $expected_workers &&
 	rm -f trace
-}
+} 8>&2 2>&4
 
 # Verify that both the working tree and the index were created correctly
 verify_checkout () {
diff --git a/t/t2081-parallel-checkout-collisions.sh b/t/t2081-parallel-checkout-collisions.sh
new file mode 100755
index 0000000000..f6fcfc0c1e
--- /dev/null
+++ b/t/t2081-parallel-checkout-collisions.sh
@@ -0,0 +1,162 @@
+#!/bin/sh
+
+test_description="path collisions during parallel checkout
+
+Parallel checkout must detect path collisions to:
+
+1) Avoid racily writing to different paths that represent the same file on disk.
+2) Report the colliding entries on clone.
+
+The tests in this file exercise parallel checkout's collision detection code in
+both these mechanics.
+"
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
+
+TEST_ROOT="$PWD"
+
+test_expect_success CASE_INSENSITIVE_FS 'setup' '
+	empty_oid=$(git hash-object -w --stdin objs <<-EOF &&
+	100644 $empty_oid	FILE_X
+	100644 $empty_oid	FILE_x
+	100644 $empty_oid	file_X
+	100644 $empty_oid	file_x
+	EOF
+	git update-index --index-info >filter.log
+	EOF
+'
+
+test_workers_in_event_trace ()
+{
+	test $1 -eq $(grep ".event.:.child_start..*checkout--worker" $2 | wc -l)
+}
+
+test_expect_success CASE_INSENSITIVE_FS 'worker detects basename collision' '
+	GIT_TRACE2_EVENT="$(pwd)/trace" git \
+		-c checkout.workers=2 -c checkout.thresholdForParallelism=0 \
+		checkout . &&
+
+	test_workers_in_event_trace 2 trace &&
+	collisions=$(grep -i "category.:.pcheckout.,.key.:.collision/basename.,.value.:.file_x.}" trace | wc -l) &&
+	test $collisions -eq 3
+'
+
+test_expect_success CASE_INSENSITIVE_FS 'worker detects dirname collision' '
+	test_config filter.logger.smudge "\"$TEST_ROOT/logger_script\" %f" &&
+	empty_oid=$(git hash-object -w --stdin objs <<-EOF &&
+	100644 $empty_oid	A/B
+	100644 $empty_oid	A/C
+	100644 $empty_oid	a
+	100644 $attr_oid	.gitattributes
+	EOF
+	git rm -rf . &&
+	git update-index --index-info expected.log &&
+	test_cmp filter.log expected.log &&
+
+	# Check that it used the right number of workers and detected the collisions
+	test_workers_in_event_trace 2 trace &&
+	grep "category.:.pcheckout.,.key.:.collision/dirname.,.value.:.A/B.}" trace &&
+	grep "category.:.pcheckout.,.key.:.collision/dirname.,.value.:.A/C.}" trace
+'
+
+test_expect_success SYMLINKS,CASE_INSENSITIVE_FS 'do not follow symlinks colliding with leading dir' '
+	empty_oid=$(git hash-object -w --stdin objs <<-EOF &&
+	120000 $symlink_oid	D
+	100644 $empty_oid	d/x
+	100644 $empty_oid	e/y
+	EOF
+	git rm -rf . &&
+	git update-index --index-info stderr &&
+
+	grep FILE_X stderr &&
+	grep FILE_x stderr &&
+	grep file_X stderr &&
+	grep file_x stderr &&
+	grep "the following paths have collided" stderr
+'
+
+# This test ensures that the collision report code is correctly looking for
+# colliding peers in the second half of the cache_entry array. This is done by
+# defining a smudge command for the *last* array entry, which makes it
+# non-eligible for parallel-checkout. Thus, it is checked out *first*, before
+# spawning the workers.
+#
+# Note: this test doesn't work on Windows because, on this system, the
+# collision report code uses strcmp() to find the colliding pairs when
+# core.ignoreCase is false. And we need this setting for this test so that only
+# 'file_x' matches the pattern of the filter attribute. But the test works on
+# OSX, where the colliding pairs are found using inode.
+#
+test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN \
+	'collision report on clone (w/ colliding peer after the detected entry)' '
+
+	test_config_global filter.logger.smudge "\"$TEST_ROOT/logger_script\" %f" &&
+	git reset --hard basename_collision &&
+	echo "file_x filter=logger" >.gitattributes &&
+	git add .gitattributes &&
+	git commit -m "filter for file_x" &&
+
+	rm -rf clone-repo &&
+	set_checkout_config 2 0 &&
+	test_checkout_workers 2 \
+		git -c core.ignoreCase=false clone . clone-repo 2>stderr &&
+
+	grep FILE_X stderr &&
+	grep FILE_x stderr &&
+	grep file_X stderr &&
+	grep file_x stderr &&
+	grep "the following paths have collided" stderr &&
+
+	# Check that only "file_x" was filtered
+	echo file_x >expected.log &&
+	test_cmp clone-repo/filter.log expected.log
+'
+
+test_done
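
For readers who want to try the trace-matching idiom used by the tests outside the test suite, here is a minimal sketch. It assumes a hand-written, simplified trace file: the JSON lines below are hypothetical and are not real GIT_TRACE2_EVENT output, which carries more fields on each line. Only the grep patterns are taken from the patch; `grep -c` is swapped in for `grep | wc -l` to keep the sketch short.

```shell
#!/bin/sh
# Hypothetical, simplified trace lines written by hand for illustration.
# Real GIT_TRACE2_EVENT output has additional fields on every line.
trace=$(mktemp)
cat >"$trace" <<\EOF
{"event":"child_start","argv":["git","checkout--worker","--prefix="]}
{"event":"child_start","argv":["git","checkout--worker","--prefix="]}
{"event":"data","category":"pcheckout","key":"collision/basename","value":"FILE_x"}
{"event":"data","category":"pcheckout","key":"collision/basename","value":"file_X"}
{"event":"data","category":"pcheckout","key":"collision/basename","value":"file_x"}
EOF

# Same patterns as in the tests: each "." stands in for a JSON double
# quote, so the pattern needs no escaping inside a single-quoted test body.
workers=$(grep -c ".event.:.child_start..*checkout--worker" "$trace")

# -i makes FILE_x, file_X and file_x all count as collisions on "file_x".
collisions=$(grep -ic "category.:.pcheckout.,.key.:.collision/basename.,.value.:.file_x.}" "$trace")

echo "workers=$workers collisions=$collisions"
rm -f "$trace"
```

With these sample lines the script prints `workers=2 collisions=3`, which mirrors what the basename collision test expects: of four paths that differ only in case, one is written and the other three are reported as collisions.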