From patchwork Fri Aug 14 18:07:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715139 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1EECE14E3 for ; Fri, 14 Aug 2020 18:07:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F293B20791 for ; Fri, 14 Aug 2020 18:07:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Cf2/fQIG" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728575AbgHNSH0 (ORCPT ); Fri, 14 Aug 2020 14:07:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43666 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728523AbgHNSHY (ORCPT ); Fri, 14 Aug 2020 14:07:24 -0400 Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 74048C061384 for ; Fri, 14 Aug 2020 11:07:24 -0700 (PDT) Received: by mail-wm1-x344.google.com with SMTP id 3so8654748wmi.1 for ; Fri, 14 Aug 2020 11:07:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=sP/+JweEgFHkxGuRGaOk+dTd7TR0KQm37A4hXQ1StOA=; b=Cf2/fQIGRTmAMDiZsZUnCajLD4bz4qkTlW32/qFfFMK8sJ2UmiHXEbG/Q3qC2yWOmt 7A8xPP2G3+PsQHd+Pd9vydScBxhyrGquyVUTVsMKDHVpXNJAu6LVZCMYnMh1TqG4w9Xp VVpMqJVdmQKWj15a58Lif9RsR1xOvHfqQTi9JtuZbP9J77HL5Pocxp8DDAinB1OXf/Mb M3EtGcSED3NfGo0tfHQIMoKNY2P0hOo8NfuMAKo8L2rWgBOmaHgOJrGh6Y2nSy4VJCAU 28I6ltQTBzfDFJS3ienZva4ms5bDFS7UuTzGq6POLqeiILhDkxoUBVyTbbrlO2OENtuT /q9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=sP/+JweEgFHkxGuRGaOk+dTd7TR0KQm37A4hXQ1StOA=; b=JrQq34WTIjJDKxBh1MfaqfqxoZdZVJKX4NwQhnlU0kRHlz6BuuuuIJVYmIPRRyyUrU 9ZzBlmM6gxw3QhdhFB07ti5qfQorcICeqVPZF6eAzQS1G/R0JyCHW0kWd+/aAKM6NjGP 23PcBBI/U4ehHN5LC58uRT+NDniCiaRZ4ivc5sY5GyKjIOCn2NV3RAY3BQYs+uhrdm6C aAqzX8pksGf/R87GQPgMkdW2zogNwhvEf9mmMGcFccTMzFdmkbHPY1ZgQvTd/nh8sIzw zPhzI28205VacUQTNn9ZOQ/t8zZ6UZsVsSVCxckAkCvKvzo1dez4l7WYoKAPAkYbeVNr 3DRg== X-Gm-Message-State: AOAM531o+Y8zGYTrY4/HcsWVF+6So+yMjckh5cGSmvoUQD8ozq5Ey+4O NwGJBTjjy4LvOKu38ydKk4gsV8dtJOU= X-Google-Smtp-Source: ABdhPJwEsmJj3D1D4D/uFAjysjT8ASyy1ORDE/IjLE35X7h/wAka21oAqDwW5FvBRq7zH+dEqGPL3A== X-Received: by 2002:a1c:964b:: with SMTP id y72mr3584139wmd.69.1597428442890; Fri, 14 Aug 2020 11:07:22 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h5sm19210057wrc.97.2020.08.14.11.07.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 14 Aug 2020 11:07:22 -0700 (PDT) Message-Id: <242a44b63c8fc0ab7e8d8a6a913fde71444f931d.1597428440.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Fri, 14 Aug 2020 18:07:18 +0000 Subject: [PATCH 1/3] t/README: document GIT_TEST_DEFAULT_HASH Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: martin.agren@gmail.com, sandals@crustytoothpaste.net, me@ttaylorr.com, abhishekkumar8222@gmail.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Signed-off-by: Derrick Stolee --- t/README | 3 +++ 1 file changed, 3 insertions(+) diff --git a/t/README b/t/README index 70ec61cf88..ecf8c7291d 100644 --- a/t/README +++ b/t/README @@ -421,6 +421,9 @@ GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS=, when true (which is the default when running tests), errors out when an abbreviated option is used. +GIT_TEST_DEFAULT_HASH= specifies which hash algorithm to use +in the test scripts. + Naming Tests ------------ From patchwork Fri Aug 14 18:07:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715141 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 89D0D1744 for ; Fri, 14 Aug 2020 18:07:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 661F820791 for ; Fri, 14 Aug 2020 18:07:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="PWTKn/nr" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728582AbgHNSH1 (ORCPT ); Fri, 14 Aug 2020 14:07:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43668 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728571AbgHNSHZ (ORCPT ); Fri, 14 Aug 2020 14:07:25 -0400 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 76E52C061384 for ; Fri, 14 Aug 2020 11:07:25 -0700 (PDT) Received: by mail-wm1-x343.google.com with SMTP id g8so8183750wmk.3 for ; Fri, 14 Aug 2020 11:07:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=+oy4Z3XB4G7URqksSnrBDvki2JQbQosnAaapXz0JFFY=; b=PWTKn/nrlVLSl3JTMOoyIqr3JrisRk1U7HI6YJ5XqWaHqGiZTNeUNs6CI0WEhPEtPL ZgMihGMrmgdoflwSiMLRfeFFYUSyOeY4xOFMKvFLZEgEXhZYl2VopbXe0mCJcpK/neqY aaWlb95VyDy2OguWLgU63mArheTpJjPPfH0IUg+ZH2t2by/CWkbdEYvYAOjxRzoEEysa 5MWjfHsaz60hjchoDcDk5Br1jwfsHOlUGuKpyqXS4SVyJBn+P1K1tJmcV/pAkHFaYBip etg2/ppvsYsScGPPUBAAFR7i8ZKNvtOtWs935iB6Obb+AAL0d+oA7Ay/q88O9nsSFg0T 148Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=+oy4Z3XB4G7URqksSnrBDvki2JQbQosnAaapXz0JFFY=; b=G/GPwuq4KJxJVTPf1m/1Ix2VFPEaqKxvxfAFHqK1YwhJzOHnaSSH3dpga/uR3L38xy kE2c6bT0YjE8qNj4xDypk+bKoUXOSadRhz7/b4yCtWI6TBYQBCzcxjqVLY+jX6rKhRk4 LSbcOTG/agpt2t4Xd+xJv2c1Q8ofvrmvIDqVXiZsI135Vh3Ed2JMfzIshukIEuopixjH IwftoDOYOmxPF1bW87WYxn82qclc+6EbGQDr72e3Wbvr5qWGH7/zPjR+fI6GhBkmoSaI R4hdXPmnp4jXyjDKxstkyNqm1z8mvCS8JFAKOrxIQWc/1xWL5fgRizORmQ1FX1duiMh4 1anw== X-Gm-Message-State: AOAM530o94z+YSN1WBF3Lz6r7qZVerV7OowL64N6pCIAx+JFnOztUMXw xTg6OygqrAaGJWAW8mo1bNptrWYt/14= X-Google-Smtp-Source: ABdhPJzK5h9rx6/vxPA8OYyBJjrOffQ/ahg9Z3lFFcu3o8/Yz4cHSNZWq5yVBXHV1LIQfE7zfM+hlQ== X-Received: by 2002:a7b:c084:: with SMTP id r4mr3481908wmh.23.1597428443833; Fri, 14 Aug 2020 11:07:23 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 15sm15490575wmo.33.2020.08.14.11.07.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 14 Aug 2020 11:07:23 -0700 (PDT) Message-Id: <4bbfd345d16da4604dd20decda8ecb12372e4223.1597428440.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Fri, 14 Aug 2020 18:07:19 +0000 Subject: [PATCH 2/3] commit-graph: use the hash version byte Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: martin.agren@gmail.com, sandals@crustytoothpaste.net, me@ttaylorr.com, abhishekkumar8222@gmail.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The commit-graph format reserved a byte among the header of the file to store a "hash version". During the SHA-256 work, this was not modified because file formats are not necessarily intended to work across hash versions. If a repository has SHA-256 as its hash algorithm, it automatically up-shifts the lengths of object names in all necessary formats. However, since we have this byte available for adjusting the version, we can make the file formats more obviously incompatible instead of relying on other context from the repository. Update the oid_version() method in commit-graph.c to add a new value, 2, for sha-256. This automatically writes the new value in a SHA-256 repository _and_ verifies the value is correct. This is a breaking change relative to the current 'master' branch since 092b677 (Merge branch 'bc/sha-256-cvs-svn-updates', 2020-08-13) but it is not breaking relative to any released version of Git. The test impact is relatively minor: the output of 'test-tool read-graph' lists the header information, so those instances of '1' need to be replaced with a variable determined by GIT_TEST_DEFAULT_HASH. A more careful test is added that specifically creates a repository of each type then swaps the commit-graph files. The important value here is that the "git log" command succeeds while writing a message to stderr. Signed-off-by: Derrick Stolee --- .../technical/commit-graph-format.txt | 9 ++++- commit-graph.c | 6 ++- t/t4216-log-bloom.sh | 8 +++- t/t5318-commit-graph.sh | 37 ++++++++++++++++++- t/t5324-split-commit-graph.sh | 8 +++- 5 files changed, 62 insertions(+), 6 deletions(-) diff --git a/Documentation/technical/commit-graph-format.txt b/Documentation/technical/commit-graph-format.txt index 440541045d..6ddbceba15 100644 --- a/Documentation/technical/commit-graph-format.txt +++ b/Documentation/technical/commit-graph-format.txt @@ -42,8 +42,13 @@ HEADER: 1-byte version number: Currently, the only valid version is 1. - 1-byte Hash Version (1 = SHA-1) - We infer the hash length (H) from this value. + 1-byte Hash Version + We infer the hash length (H) from this value: + 1 => SHA-1 + 2 => SHA-256 + If the hash type does not match the repository's hash algorithm, the + commit-graph file should be ignored with a warning presented to the + user. 1-byte number (C) of "chunks" diff --git a/commit-graph.c b/commit-graph.c index e51c91dd5b..d03328d64c 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -179,7 +179,11 @@ static char *get_chain_filename(struct object_directory *odb) static uint8_t oid_version(void) { - return 1; + if (the_hash_algo->rawsz == GIT_SHA1_RAWSZ) + return 1; + if (the_hash_algo->rawsz == GIT_SHA256_RAWSZ) + return 2; + die(_("invalid hash version")); } static struct commit_graph *alloc_commit_graph(void) diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh index c21cc160f3..906af2799d 100755 --- a/t/t4216-log-bloom.sh +++ b/t/t4216-log-bloom.sh @@ -6,6 +6,12 @@ test_description='git log for a path with Bloom filters' GIT_TEST_COMMIT_GRAPH=0 GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0 +OID_VERSION=1 +if [ "$GIT_DEFAULT_HASH" = "sha256" ] +then + OID_VERSION=2 +fi + test_expect_success 'setup test - repo, commits, commit graph, log outputs' ' git init && mkdir A A/B A/B/C && @@ -35,7 +41,7 @@ test_expect_success 'setup test - repo, commits, commit graph, log outputs' ' graph_read_expect () { NUM_CHUNKS=5 cat >expect <<- EOF - header: 43475048 1 1 $NUM_CHUNKS 0 + header: 43475048 1 $OID_VERSION $NUM_CHUNKS 0 num_commits: $1 chunks: oid_fanout oid_lookup commit_metadata bloom_indexes bloom_data EOF diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 044cf8a3de..5b65017676 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -5,6 +5,12 @@ test_description='commit graph' GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0 +OID_VERSION=1 +if [ "$GIT_DEFAULT_HASH" = "sha256" ] +then + OID_VERSION=2 +fi + test_expect_success 'setup full repo' ' mkdir full && cd "$TRASH_DIRECTORY/full" && @@ -77,7 +83,7 @@ graph_read_expect() { NUM_CHUNKS=$((3 + $(echo "$2" | wc -w))) fi cat >expect <<- EOF - header: 43475048 1 1 $NUM_CHUNKS 0 + header: 43475048 1 $OID_VERSION $NUM_CHUNKS 0 num_commits: $1 chunks: oid_fanout oid_lookup commit_metadata$OPTIONAL EOF @@ -412,6 +418,35 @@ test_expect_success 'replace-objects invalidates commit-graph' ' ) ' +test_expect_success 'warn on improper hash version' ' + git init --object-format=sha1 sha1 && + ( + cd sha1 && + test_commit 1 && + git commit-graph write --reachable && + mv .git/objects/info/commit-graph ../cg-sha1 + ) && + git init --object-format=sha256 sha256 && + ( + cd sha256 && + test_commit 1 && + git commit-graph write --reachable && + mv .git/objects/info/commit-graph ../cg-sha256 + ) && + ( + cd sha1 && + mv ../cg-sha256 .git/objects/info/commit-graph && + git log -1 2>err && + test_i18ngrep "commit-graph hash version 2 does not match version 1" err + ) && + ( + cd sha256 && + mv ../cg-sha1 .git/objects/info/commit-graph && + git log -1 2>err && + test_i18ngrep "commit-graph hash version 1 does not match version 2" err + ) +' + # the verify tests below expect the commit-graph to contain # exactly the commits reachable from the commits/8 branch. # If the file changes the set of commits in the list, then the diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh index ea28d522b8..6f1a324f4f 100755 --- a/t/t5324-split-commit-graph.sh +++ b/t/t5324-split-commit-graph.sh @@ -6,6 +6,12 @@ test_description='split commit graph' GIT_TEST_COMMIT_GRAPH=0 GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0 +OID_VERSION=1 +if [ "$GIT_DEFAULT_HASH" = "sha256" ] +then + OID_VERSION=2 +fi + test_expect_success 'setup repo' ' git init && git config core.commitGraph true && @@ -28,7 +34,7 @@ graph_read_expect() { NUM_BASE=$2 fi cat >expect <<- EOF - header: 43475048 1 1 3 $NUM_BASE + header: 43475048 1 $OID_VERSION 3 $NUM_BASE num_commits: $1 chunks: oid_fanout oid_lookup commit_metadata EOF From patchwork Fri Aug 14 18:07:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715143 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 89EC6618 for ; Fri, 14 Aug 2020 18:07:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 637712078D for ; Fri, 14 Aug 2020 18:07:29 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="bvbYL7e+" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728587AbgHNSH2 (ORCPT ); Fri, 14 Aug 2020 14:07:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43670 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728573AbgHNSH0 (ORCPT ); Fri, 14 Aug 2020 14:07:26 -0400 Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5030DC061385 for ; Fri, 14 Aug 2020 11:07:26 -0700 (PDT) Received: by mail-wm1-x341.google.com with SMTP id c80so8190966wme.0 for ; Fri, 14 Aug 2020 11:07:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=R1lpzHoI17H9whswaed9n1pKR8pPW59VuATeaYLob/o=; b=bvbYL7e+6DWwmw2KXecDgfQNuWX7nnRJgI3Zqj1+RBwigtUKvpptWyCG33gN5R9Dpf I1TR4IACHD4nY7IwrBSy09HejZek0BXrYUcN+pVu+Cs5SMY+tbmZOUiK8U8ZSb+npgKt H7yf/vhx7qzUtgHt0zzVS/cu3gU8ziQTfQERshAkirmWmD6h0Mht67jamzpbaz+TvKzh MFjEcZmQdqFn0zanYKIdMeLvE90ayZiHJNepkz6Jch7I5QPdhFU1OKBdGU4fBOtef3bb 6b+VLXO1f5x/EEuSTDsWTllmPEmq9TbPPRYwtxdqFp831BAPObwBQuPzwxuktRuZY8rl oXYg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=R1lpzHoI17H9whswaed9n1pKR8pPW59VuATeaYLob/o=; b=imAgmn5pEvQrgUuBW9EwiXMXOrHRIPPfZXP5OyMb5hV2jgjtEro3/kSZ3CFikbVLsY t4OZXSX8jUasY7k60m6j8BDVPMOlU9TzwbhcrIHpHE64yMYjchZ/jLaO/3gmU8c7ezkE p22+ZNj5/ZXN2paphxhg4U+O0EgbxfCXA58cSWONTxIX9cB600lsTsG7YOAHsxlohTbo Jke7tw/aFOOWCWb35aZpB5hvJXYr+XP2LwdxLehPBiUqNPtolYwJXJB/8IRt9NXyiqbl WZuoBR6AuPM6t4V/3MGlQ8/bww55SEu9CktaU/NWA9RZyDBdHuI0Jg3TGGQWS6nzuYu8 kQGg== X-Gm-Message-State: AOAM531wLRWu418pGTsJfp9R+alj88KakHJ3sJBsjMgKTvVvh3TseUH9 X4EPfFB8DtSedxoxoO9sZE95XrfdiZ4= X-Google-Smtp-Source: ABdhPJxRo+ltgVYikX7ikGcQOKA0BrXSzqWKbbJztCtSJSLlQE2NRIwYNanO/xh9AemC8SL2rTKBYA== X-Received: by 2002:a1c:4944:: with SMTP id w65mr3401812wma.169.1597428444773; Fri, 14 Aug 2020 11:07:24 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id d11sm17526372wrw.77.2020.08.14.11.07.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 14 Aug 2020 11:07:24 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Fri, 14 Aug 2020 18:07:20 +0000 Subject: [PATCH 3/3] multi-pack-index: use hash version byte Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: martin.agren@gmail.com, sandals@crustytoothpaste.net, me@ttaylorr.com, abhishekkumar8222@gmail.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Similar to the commit-graph format, the multi-pack-index format has a byte in the header intended to track the hash version used to write the file. This allows one to interpret the hash length without having the context of the repository config specifying the hash length. This was not modified as part of the SHA-256 work because the hash length was automatically up-shifted due to that config. Since we have this byte available, we can make the file formats more obviously incompatible instead of relying on other context from the repository. Add a new oid_version() method in midx.c similar to the one in commit-graph.c. This is specifically made separate from that implementation to avoid artificially linking the formats. The test impact requires a few more things than the corresponding change in the commit-graph format. Specifically, 'test-tool read-midx' was not writing anything about this header value to output. Since the value available in 'struct multi_pack_index' is hash_len instead of a version value, we output "20" or "32" instead of "1" or "2". Since we want a user to not have their Git commands fail if their multi-pack-index has the incorrect hash version compared to the repository's hash version, we relax the die() to an error() in load_multi_pack_index(). This has some effect on 'git multi-pack-index verify' as we need to check that a failed parse of a file that exists is actually a verify error. For that test that checks the hash version matches, we change the corrupted byte from "2" to "3" to ensure the test fails for both hash algorithms. Signed-off-by: Derrick Stolee --- Documentation/technical/pack-format.txt | 7 +++- midx.c | 32 ++++++++++++++---- t/helper/test-read-midx.c | 8 +++-- t/t5319-multi-pack-index.sh | 43 ++++++++++++++++++++++--- 4 files changed, 77 insertions(+), 13 deletions(-) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index d3a142c652..16cf7e83aa 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -273,7 +273,12 @@ HEADER: Git only writes or recognizes version 1. 1-byte Object Id Version - Git only writes or recognizes version 1 (SHA1). + We infer the length of object IDs (OIDs) from this value: + 1 => SHA-1 + 2 => SHA-256 + If the hash type does not match the repository's hash algorithm, + the multi-pack-index file should be ignored with a warning + presented to the user. 1-byte number of "chunks" diff --git a/midx.c b/midx.c index a5fb797ede..0c165a40f5 100644 --- a/midx.c +++ b/midx.c @@ -17,7 +17,6 @@ #define MIDX_BYTE_HASH_VERSION 5 #define MIDX_BYTE_NUM_CHUNKS 6 #define MIDX_BYTE_NUM_PACKS 8 -#define MIDX_HASH_VERSION 1 #define MIDX_HEADER_SIZE 12 #define MIDX_MIN_SIZE (MIDX_HEADER_SIZE + the_hash_algo->rawsz) @@ -36,6 +35,15 @@ #define PACK_EXPIRED UINT_MAX +static uint8_t oid_version(void) +{ + if (the_hash_algo->rawsz == GIT_SHA1_RAWSZ) + return 1; + if (the_hash_algo->rawsz == GIT_SHA256_RAWSZ) + return 2; + die(_("invalid hash version")); +} + static char *get_midx_filename(const char *object_dir) { return xstrfmt("%s/pack/multi-pack-index", object_dir); @@ -90,8 +98,11 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local m->version); hash_version = m->data[MIDX_BYTE_HASH_VERSION]; - if (hash_version != MIDX_HASH_VERSION) - die(_("hash version %u does not match"), hash_version); + if (hash_version != oid_version()) { + error(_("multi-pack-index hash version %u does not match version %u"), + hash_version, oid_version()); + goto cleanup_fail; + } m->hash_len = the_hash_algo->rawsz; m->num_chunks = m->data[MIDX_BYTE_NUM_CHUNKS]; @@ -418,7 +429,7 @@ static size_t write_midx_header(struct hashfile *f, hashwrite_be32(f, MIDX_SIGNATURE); byte_values[0] = MIDX_VERSION; - byte_values[1] = MIDX_HASH_VERSION; + byte_values[1] = oid_version(); byte_values[2] = num_chunks; byte_values[3] = 0; /* unused */ hashwrite(f, byte_values, sizeof(byte_values)); @@ -1105,8 +1116,17 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag struct multi_pack_index *m = load_multi_pack_index(object_dir, 1); verify_midx_error = 0; - if (!m) - return 0; + if (!m) { + int result = 0; + struct stat sb; + char *filename = get_midx_filename(object_dir); + if (!stat(filename, &sb)) { + error(_("multi-pack-index file exists, but failed to parse")); + result = 1; + } + free(filename); + return result; + } if (flags & MIDX_PROGRESS) progress = start_progress(_("Looking for referenced packfiles"), diff --git a/t/helper/test-read-midx.c b/t/helper/test-read-midx.c index 831b586d02..2430880f78 100644 --- a/t/helper/test-read-midx.c +++ b/t/helper/test-read-midx.c @@ -7,14 +7,18 @@ static int read_midx_file(const char *object_dir) { uint32_t i; - struct multi_pack_index *m = load_multi_pack_index(object_dir, 1); + struct multi_pack_index *m; + + setup_git_directory(); + m = load_multi_pack_index(object_dir, 1); if (!m) return 1; - printf("header: %08x %d %d %d\n", + printf("header: %08x %d %d %d %d\n", m->signature, m->version, + m->hash_len, m->num_chunks, m->num_packs); diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 7dfff0f8f4..09cbca4949 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -5,6 +5,8 @@ test_description='multi-pack-indexes' objdir=.git/objects +HASH_LEN=$(test_oid rawsz) + midx_read_expect () { NUM_PACKS=$1 NUM_OBJECTS=$2 @@ -13,7 +15,7 @@ midx_read_expect () { EXTRA_CHUNKS="$5" { cat <<-EOF && - header: 4d494458 1 $NUM_CHUNKS $NUM_PACKS + header: 4d494458 1 $HASH_LEN $NUM_CHUNKS $NUM_PACKS chunks: pack-names oid-fanout oid-lookup object-offsets$EXTRA_CHUNKS num_objects: $NUM_OBJECTS packs: @@ -46,7 +48,7 @@ test_expect_success "don't write midx with no packs" ' test_path_is_missing pack/multi-pack-index ' -test_expect_success "Warn if a midx contains no oid" ' +test_expect_success SHA1 'warn if a midx contains no oid' ' cp "$TEST_DIRECTORY"/t5319/no-objects.midx $objdir/pack/multi-pack-index && test_must_fail git multi-pack-index verify && rm $objdir/pack/multi-pack-index @@ -198,6 +200,40 @@ test_expect_success 'write midx with twelve packs' ' compare_results_with_midx "twelve packs" +test_expect_success 'warn on improper hash version' ' + git init --object-format=sha1 sha1 && + ( + cd sha1 && + git config core.multiPackIndex true && + test_commit 1 && + git repack -a && + git multi-pack-index write && + mv .git/objects/pack/multi-pack-index ../mpi-sha1 + ) && + git init --object-format=sha256 sha256 && + ( + cd sha256 && + git config core.multiPackIndex true && + test_commit 1 && + git repack -a && + git multi-pack-index write && + mv .git/objects/pack/multi-pack-index ../mpi-sha256 + ) && + ( + cd sha1 && + mv ../mpi-sha256 .git/objects/pack/multi-pack-index && + git log -1 2>err && + test_i18ngrep "multi-pack-index hash version 2 does not match version 1" err + ) && + ( + cd sha256 && + mv ../mpi-sha1 .git/objects/pack/multi-pack-index && + git log -1 2>err && + test_i18ngrep "multi-pack-index hash version 1 does not match version 2" err + ) +' + + test_expect_success 'verify multi-pack-index success' ' git multi-pack-index verify --object-dir=$objdir ' @@ -243,7 +279,6 @@ test_expect_success 'verify bad signature' ' "multi-pack-index signature" ' -HASH_LEN=$(test_oid rawsz) NUM_OBJECTS=74 MIDX_BYTE_VERSION=4 MIDX_BYTE_OID_VERSION=5 @@ -272,7 +307,7 @@ test_expect_success 'verify bad version' ' ' test_expect_success 'verify bad OID version' ' - corrupt_midx_and_verify $MIDX_BYTE_OID_VERSION "\02" $objdir \ + corrupt_midx_and_verify $MIDX_BYTE_OID_VERSION "\03" $objdir \ "hash version" '