X-Patchwork-Submitter: Junio C Hamano
X-Patchwork-Id: 13668717
From: Junio C Hamano
To: git@vger.kernel.org
Cc: Patrick Steinhardt
Subject: [PATCH v5 3/5] builtin/patch-id: fix uninitialized hash function
Date: Mon, 20 May 2024 16:14:32 -0700
Message-ID: <20240520231434.1816979-4-gitster@pobox.com>
In-Reply-To: <20240520231434.1816979-1-gitster@pobox.com>
References: <20240520231434.1816979-1-gitster@pobox.com>

From: Patrick Steinhardt

In c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), we have adapted `initialize_repository()` to no longer set
up a default hash function. As this function is also used to set up
`the_repository`, the consequence is that `the_hash_algo` will now by
default be a `NULL` pointer unless the hash algorithm was configured
properly.
This is done as a mechanism to detect cases where we may be using the
wrong hash function by accident.

This change now causes git-patch-id(1) to segfault when it's run outside
of a repository. As this command can read diffs from stdin, it does not
necessarily need a repository, but then relies on `the_hash_algo` to
compute the patch ID itself.

It is somewhat dubious that git-patch-id(1) relies on `the_hash_algo` in
the first place. Quoting its manpage:

    A "patch ID" is nothing but a sum of SHA-1 of the file diffs
    associated with a patch, with line numbers ignored. As such, it’s
    "reasonably stable", but at the same time also reasonably unique,
    i.e., two patches that have the same "patch ID" are almost
    guaranteed to be the same thing.

We explicitly document patch IDs to be using SHA-1. Furthermore, patch
IDs are supposed to be stable for the most part. But even with the same
input, the patch IDs will now be different depending on the repo's
configured object hash.

Work around the issue by setting up SHA-1 when there was no startup
repository for now. This is arguably not the correct fix, but for now we
would rather focus on getting the segfault fixed.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
---
 builtin/patch-id.c      | 13 +++++++++++++
 t/t1517-outside-repo.sh |  2 +-
 t/t4204-patch-id.sh     | 34 ++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/builtin/patch-id.c b/builtin/patch-id.c
index 3894d2b970..583099cacf 100644
--- a/builtin/patch-id.c
+++ b/builtin/patch-id.c
@@ -5,6 +5,7 @@
 #include "hash.h"
 #include "hex.h"
 #include "parse-options.h"
+#include "setup.h"
 
 static void flush_current_id(int patchlen, struct object_id *id, struct object_id *result)
 {
@@ -237,6 +238,18 @@ int cmd_patch_id(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, builtin_patch_id_options,
 			     patch_id_usage, 0);
 
+	/*
+	 * We rely on `the_hash_algo` to compute patch IDs. This is dubious as
+	 * it means that the hash algorithm now depends on the object hash of
+	 * the repository, even though git-patch-id(1) clearly defines that
+	 * patch IDs always use SHA1.
+	 *
+	 * NEEDSWORK: This hack should be removed in favor of converting
+	 * the code that computes patch IDs to always use SHA1.
+	 */
+	if (!the_hash_algo)
+		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
 	generate_id_list(opts ? opts > 1 : config.stable,
 			 opts ? opts == 3 : config.verbatim);
 	return 0;
diff --git a/t/t1517-outside-repo.sh b/t/t1517-outside-repo.sh
index 389974d9fb..278ef57b3a 100755
--- a/t/t1517-outside-repo.sh
+++ b/t/t1517-outside-repo.sh
@@ -21,7 +21,7 @@ test_expect_success 'set up a non-repo directory and test file' '
 	git diff >sample.patch
 '
 
-test_expect_failure 'compute a patch-id outside repository (uses SHA-1)' '
+test_expect_success 'compute a patch-id outside repository (uses SHA-1)' '
 	nongit env GIT_DEFAULT_HASH=sha1 \
 		git patch-id <sample.patch >patch-id.expect &&
 	nongit \
diff --git a/t/t4204-patch-id.sh b/t/t4204-patch-id.sh
index a7fa94ce0a..605faea0c7 100755
--- a/t/t4204-patch-id.sh
+++ b/t/t4204-patch-id.sh
@@ -310,4 +310,38 @@ test_expect_success 'patch-id handles diffs with one line of before/after' '
 	test_config patchid.stable true &&
 	calc_patch_id diffu1stable diff <<-\EOF &&
+	diff --git a/bar b/bar
+	index bdaf90f..31051f6 100644
+	--- a/bar
+	+++ b/bar
+	@@ -2 +2,2 @@
+	 b
+	+c
+	EOF
+
+	git init --object-format=sha1 repo-sha1 &&
+	git -C repo-sha1 patch-id <diff >patch-id-sha1 &&
+	git init --object-format=sha256 repo-sha256 &&
+	git -C repo-sha256 patch-id <diff >patch-id-sha256 &&
+	test_cmp patch-id-sha1 patch-id-sha256
+'
+
+test_expect_success 'patch-id without repository' '
+	cat >diff <<-\EOF &&
+	diff --git a/bar b/bar
+	index bdaf90f..31051f6 100644
+	--- a/bar
+	+++ b/bar
+	@@ -2 +2,2 @@
+	 b
+	+c
+	EOF
+	nongit git patch-id