From patchwork Mon Jun 17 13:55:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xing Xin X-Patchwork-Id: 13700824 Received: from mail-wr1-f47.google.com (mail-wr1-f47.google.com [209.85.221.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2459B1662E6 for ; Mon, 17 Jun 2024 13:55:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718632543; cv=none; b=sQ6djAjE8DHPBkiaab300AAAZEJ/vXwpK9bAJOUT+wlQIMrVsncC5dpVmV90s3IiI0L45mVNKT9EDYh8q7A/febi+ITLa9smqqTGLpbsCFrhm3dw0+rdjKmxIUyuURU92pQDhOAkuDIdup0E/B/+URtngUGTimp4ID84MHz3btQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718632543; c=relaxed/simple; bh=nAOMoDYHEnYUuR20sSvpU0d/WiRrME+5TtbkAF0N2SI=; h=Message-Id:In-Reply-To:References:From:Date:Subject:Content-Type: MIME-Version:To:Cc; b=CU5RRnkqSvkiL4ostEmlPeupkeU9RUBr+f+cd4RHcSRjgvkwBbDf9lLSJmgrqk1Al+MEl2aV3wscLZPq2tEjnpc98NCCiCuOjQ2razuVa/XuLR9dpQ1H87oQwD8okMXp45S7QuGVo+7lGW3MgwB94ZFwZfj111OfyEPdFZ9jd7A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=ekqqVS5o; arc=none smtp.client-ip=209.85.221.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ekqqVS5o" Received: by mail-wr1-f47.google.com with SMTP id ffacd0b85a97d-35f1bc2ab37so3822042f8f.1 for ; Mon, 17 Jun 2024 06:55:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1718632539; x=1719237339; darn=vger.kernel.org; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:from:to:cc:subject:date :message-id:reply-to; bh=KlBf1ST5duZGJTppb2JfCp9+tjW2GKDNZCP8efh86c8=; b=ekqqVS5oR3w0Y26bcGmqYyJ1bQpzcASGFr/OX0z0nBEwemdz+H+OMYTWFeJFyTdM8B /7AHLZHjA1ugSTnHJ7vCLV7AbMwX2pSDy5emlXHq4msRQhsb/mbQmwUv3dN2i6pqXeIU aYAOXIDiOlShram0W48cdDjUOsCg7J3J+Ac1msL94ySmNJCGNj/6fhuNLOtXv8en3DDP dsLpOTc2LVChRhhq6MUjA+hXhHCMCtiuqYL98E/7X0Fujy/258L9fJW2emuQh1a3ina+ 7RloJ/q+Ts4xnqNRS8j9AlaxdZmqVMlhht5q3nQNpsS728Qc4CuKd6fgWt9ZyC1JZSXJ UPJw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718632539; x=1719237339; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KlBf1ST5duZGJTppb2JfCp9+tjW2GKDNZCP8efh86c8=; b=odzkFkLpNlULSsr/I0IkKfELzbwFuoZNnpSzenyvfVuoEb+xG3P/cHEOTecbWVKHSC SBEERFprbpz/pmG8HWGebOKbnCrhgXmuuT74ddxBAaOl8qUmNMzN0k+9V2UTv/EhWWvL m8rt8Ef7nXHmPvR5JlLL4CflClF3Y2MtJoDyj6nF6JZoeNlETrLDdmFjxsUldPYyUDj3 ac8ZjqprNgf98Wx1YbI2uVFLhNPmenpdRvGmEoyDJrHaKF9ltP34Z48qFsm/1H/MXyTc X4Q4yYY/r3DhPOosvySE53aGgtSue//mqVpdBeqLTE37aYAtszP54QmK+thH6XjkuAbM zKCw== X-Gm-Message-State: AOJu0Yxhz3tYgBZzPyRuDBNeIqBrn+TZH6P6uhCEpQ0/myPjVjjYWFsa KK6yUWooVelPv4dcSp/0FgONwh8K3z+R6L6Qr6BpM2WcOyUa7y5zGfuANg== X-Google-Smtp-Source: AGHT+IFHOKJQH3762twnt5P/WFdJ+gsqMPLQG+nLvCovRLhFx6dqJQ48Vp6A/zSoNfmdkxec7G0GbA== X-Received: by 2002:adf:ab12:0:b0:35f:1545:b428 with SMTP id ffacd0b85a97d-3607a76b76fmr6585107f8f.38.1718632538721; Mon, 17 Jun 2024 06:55:38 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-360751037absm11845219f8f.91.2024.06.17.06.55.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 17 Jun 2024 06:55:38 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Mon, 17 Jun 2024 13:55:33 +0000 Subject: [PATCH v7 1/3] bundle-uri: verify oid before writing refs Fcc: Sent Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 To: git@vger.kernel.org Cc: Patrick Steinhardt , Karthik Nayak , blanet , Xing Xin From: Xing Xin From: Xing Xin When using the bundle-uri mechanism with a bundle list containing multiple interrelated bundles, we encountered a bug where tips from downloaded bundles were not discovered, thus resulting in rather slow clones. This was particularly problematic when employing the "creationTokens" heuristic. To reproduce this issue, consider a repository with a single branch "main" pointing to commit "A". Firstly, create a base bundle with: git bundle create base.bundle main Then, add a new commit "B" on top of "A", and create an incremental bundle for "main": git bundle create incr.bundle A..main Now, generate a bundle list with the following content: [bundle] version = 1 mode = all heuristic = creationToken [bundle "base"] uri = base.bundle creationToken = 1 [bundle "incr"] uri = incr.bundle creationToken = 2 A fresh clone with the bundle list above should result in a reference "refs/bundles/main" pointing to "B" in the new repository. However, git would still download everything from the server, as if it had fetched nothing locally. So why the "refs/bundles/main" is not discovered? After some digging I found that: 1. Bundles in bundle list are downloaded to local files via `bundle-uri.c:download_bundle_list` or via `bundle-uri.c:fetch_bundles_by_token` for the "creationToken" heuristic. 2. Each bundle is unbundled via `bundle-uri.c:unbundle_from_file`, which is called by `bundle-uri.c:unbundle_all_bundles` or called within `bundle-uri.c:fetch_bundles_by_token` for the "creationToken" heuristic. 3. To get all prerequisites of the bundle, the bundle header is read inside `bundle-uri.c:unbundle_from_file` to by calling `bundle.c:read_bundle_header`. 4. Then it calls `bundle.c:unbundle`, which calls `bundle.c:verify_bundle` to ensure the repository contains all the prerequisites. 5. `bundle.c:verify_bundle` calls `parse_object`, which eventually invokes `packfile.c:prepare_packed_git` or `packfile.c:reprepare_packed_git`, filling `raw_object_store->packed_git` and setting `packed_git_initialized`. 6. If `bundle.c:unbundle` succeeds, it writes refs via `refs.c:refs_update_ref` with `REF_SKIP_OID_VERIFICATION` set. Here bundle refs which can target arbitrary objects are written to the repository. 7. Finally, in `fetch-pack.c:do_fetch_pack_v2`, the functions `fetch-pack.c:mark_complete_and_common_ref` and `fetch-pack.c:mark_tips` are called with `OBJECT_INFO_QUICK` set to find local tips for negotiation. The `OBJECT_INFO_QUICK` flag prevents `packfile.c:reprepare_packed_git` from being called, resulting in failures to parse OIDs that reside only in the latest bundle. In the example above, when unbunding "incr.bundle", "base.pack" is added to `packed_git` due to prerequisites verification. However, "B" cannot be found for negotiation because it exists in "incr.pack", which is not included in `packed_git`. Fix the bug by removing `REF_SKIP_OID_VERIFICATION` flag when writing bundle refs. When `refs.c:refs_update_ref` is called to write the corresponding bundle refs, it triggers `refs.c:ref_transaction_commit`. This, in turn, invokes `refs.c:ref_transaction_prepare`, which calls `transaction_prepare` of the refs storage backend. For files backend, it is `files-backend.c:files_transaction_prepare`, and for reftable backend, it is `reftable-backend.c:reftable_be_transaction_prepare`. Both functions eventually call `object.c:parse_object`, which can invoke `packfile.c:reprepare_packed_git` to refresh `packed_git`. This ensures that bundle refs point to valid objects and that all tips from bundle refs are correctly parsed during subsequent negotiations. A set of negotiation-related tests for cloning with bundle-uri has been included to demonstrate that downloaded bundles are utilized to accelerate fetching. Additionally, another test has been added to show that bundles with incorrect headers, where refs point to non-existent objects, do not result in any bundle refs being created in the repository. Reviewed-by: Karthik Nayak Reviewed-by: Patrick Steinhardt Signed-off-by: Xing Xin --- bundle-uri.c | 3 +- t/t5558-clone-bundle-uri.sh | 150 +++++++++++++++++++++++++++++++++++- 2 files changed, 147 insertions(+), 6 deletions(-) diff --git a/bundle-uri.c b/bundle-uri.c index 91b3319a5c1..65666a11d9c 100644 --- a/bundle-uri.c +++ b/bundle-uri.c @@ -400,8 +400,7 @@ static int unbundle_from_file(struct repository *r, const char *file) refs_update_ref(get_main_ref_store(the_repository), "fetched bundle", bundle_ref.buf, oid, has_old ? &old_oid : NULL, - REF_SKIP_OID_VERIFICATION, - UPDATE_REFS_MSG_ON_ERR); + 0, UPDATE_REFS_MSG_ON_ERR); } bundle_header_release(&header); diff --git a/t/t5558-clone-bundle-uri.sh b/t/t5558-clone-bundle-uri.sh index 1ca5f745e73..2dcdd238a90 100755 --- a/t/t5558-clone-bundle-uri.sh +++ b/t/t5558-clone-bundle-uri.sh @@ -3,6 +3,7 @@ test_description='test fetching bundles with --bundle-uri' . ./test-lib.sh +. "$TEST_DIRECTORY"/lib-bundle.sh test_expect_success 'fail to clone from non-existent file' ' test_when_finished rm -rf test && @@ -19,10 +20,22 @@ test_expect_success 'fail to clone from non-bundle file' ' test_expect_success 'create bundle' ' git init clone-from && - git -C clone-from checkout -b topic && - test_commit -C clone-from A && - test_commit -C clone-from B && - git -C clone-from bundle create B.bundle topic + ( + cd clone-from && + git checkout -b topic && + + test_commit A && + git bundle create A.bundle topic && + + test_commit B && + git bundle create B.bundle topic && + + # Create a bundle with reference pointing to non-existent object. + sed -e "/^$/q" -e "s/$(git rev-parse A) /$(git rev-parse B) /" \ + bad-header.bundle && + convert_bundle_to_pack \ + >bad-header.bundle + ) ' test_expect_success 'clone with path bundle' ' @@ -33,6 +46,16 @@ test_expect_success 'clone with path bundle' ' test_cmp expect actual ' +test_expect_success 'clone with bundle that has bad header' ' + # Write bundle ref fails, but clone can still proceed. + git clone --bundle-uri="clone-from/bad-header.bundle" \ + clone-from clone-bad-header 2>err && + commit_b=$(git -C clone-from rev-parse B) && + test_grep "trying to write ref '\''refs/bundles/topic'\'' with nonexistent object $commit_b" err && + git -C clone-bad-header for-each-ref --format="%(refname)" >refs && + ! grep "refs/bundles/" refs +' + test_expect_success 'clone with path bundle and non-default hash' ' test_when_finished "rm -rf clone-path-non-default-hash" && GIT_DEFAULT_HASH=sha256 git clone --bundle-uri="clone-from/B.bundle" \ @@ -259,6 +282,125 @@ test_expect_success 'clone bundle list (file, any mode, all failures)' ' ! grep "refs/bundles/" refs ' +test_expect_success 'negotiation: bundle with part of wanted commits' ' + test_when_finished "rm -f trace*.txt" && + GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \ + git clone --no-local --bundle-uri="clone-from/A.bundle" \ + clone-from nego-bundle-part && + git -C nego-bundle-part for-each-ref --format="%(refname)" >refs && + grep "refs/bundles/" refs >actual && + test_write_lines refs/bundles/topic >expect && + test_cmp expect actual && + # Ensure that refs/bundles/topic are sent as "have". + test_grep "clone> have $(git -C clone-from rev-parse A)" trace-packet.txt +' + +test_expect_success 'negotiation: bundle with all wanted commits' ' + test_when_finished "rm -f trace*.txt" && + GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \ + git clone --no-local --single-branch --branch=topic --no-tags \ + --bundle-uri="clone-from/B.bundle" \ + clone-from nego-bundle-all && + git -C nego-bundle-all for-each-ref --format="%(refname)" >refs && + grep "refs/bundles/" refs >actual && + test_write_lines refs/bundles/topic >expect && + test_cmp expect actual && + # We already have all needed commits so no "want" needed. + ! grep "clone> want " trace-packet.txt +' + +test_expect_success 'negotiation: bundle list (no heuristic)' ' + test_when_finished "rm -f trace*.txt" && + cat >bundle-list <<-EOF && + [bundle] + version = 1 + mode = all + + [bundle "bundle-1"] + uri = file://$(pwd)/clone-from/bundle-1.bundle + + [bundle "bundle-2"] + uri = file://$(pwd)/clone-from/bundle-2.bundle + EOF + + GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \ + git clone --no-local --bundle-uri="file://$(pwd)/bundle-list" \ + clone-from nego-bundle-list-no-heuristic && + + git -C nego-bundle-list-no-heuristic for-each-ref --format="%(refname)" >refs && + grep "refs/bundles/" refs >actual && + cat >expect <<-\EOF && + refs/bundles/base + refs/bundles/left + EOF + test_cmp expect actual && + test_grep "clone> have $(git -C nego-bundle-list-no-heuristic rev-parse refs/bundles/left)" trace-packet.txt +' + +test_expect_success 'negotiation: bundle list (creationToken)' ' + test_when_finished "rm -f trace*.txt" && + cat >bundle-list <<-EOF && + [bundle] + version = 1 + mode = all + heuristic = creationToken + + [bundle "bundle-1"] + uri = file://$(pwd)/clone-from/bundle-1.bundle + creationToken = 1 + + [bundle "bundle-2"] + uri = file://$(pwd)/clone-from/bundle-2.bundle + creationToken = 2 + EOF + + GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \ + git clone --no-local --bundle-uri="file://$(pwd)/bundle-list" \ + clone-from nego-bundle-list-heuristic && + + git -C nego-bundle-list-heuristic for-each-ref --format="%(refname)" >refs && + grep "refs/bundles/" refs >actual && + cat >expect <<-\EOF && + refs/bundles/base + refs/bundles/left + EOF + test_cmp expect actual && + test_grep "clone> have $(git -C nego-bundle-list-heuristic rev-parse refs/bundles/left)" trace-packet.txt +' + +test_expect_success 'negotiation: bundle list with all wanted commits' ' + test_when_finished "rm -f trace*.txt" && + cat >bundle-list <<-EOF && + [bundle] + version = 1 + mode = all + heuristic = creationToken + + [bundle "bundle-1"] + uri = file://$(pwd)/clone-from/bundle-1.bundle + creationToken = 1 + + [bundle "bundle-2"] + uri = file://$(pwd)/clone-from/bundle-2.bundle + creationToken = 2 + EOF + + GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \ + git clone --no-local --single-branch --branch=left --no-tags \ + --bundle-uri="file://$(pwd)/bundle-list" \ + clone-from nego-bundle-list-all && + + git -C nego-bundle-list-all for-each-ref --format="%(refname)" >refs && + grep "refs/bundles/" refs >actual && + cat >expect <<-\EOF && + refs/bundles/base + refs/bundles/left + EOF + test_cmp expect actual && + # We already have all needed commits so no "want" needed. + ! grep "clone> want " trace-packet.txt +' + ######################################################################### # HTTP tests begin here From patchwork Mon Jun 17 13:55:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xing Xin X-Patchwork-Id: 13700825 Received: from mail-wm1-f51.google.com (mail-wm1-f51.google.com [209.85.128.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DDFBB1662E2 for ; Mon, 17 Jun 2024 13:55:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718632544; cv=none; b=q25BAuImffEeqz+RMML/awNazA1KARvWi1p1Us9WMhCgTIJ5KeZTCRIAvd0yII2tOnsp2Ueo/uyDi/N7LasK2f0t1Wd+fy3mwVCnfjPjG/toDmBiVwPEt2veRC08X5kOynFlxTAxzXezodz+CdqhGFxKI1OBdVER/vpaWjtkWyg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718632544; c=relaxed/simple; bh=XLb5xiWA4dAF2Fx/JHqqeBya6F3OqFt/2QpQOAIOTOo=; h=Message-Id:In-Reply-To:References:From:Date:Subject:Content-Type: MIME-Version:To:Cc; b=Ov4MoK1NC8YJ+uShenzMeqAS2NJ2BgMgPcMYtZtfuwl+HTFA8SzyDsCnWa8pjdqaXDrbvp8RMh0l+oTyrif7wgUlC2NKLkhj64raBrtcwZuGrxg4oyTmkVhyAcGzpym2ip4daxdolL48X8BYwes+IA6Hq2Z5zZHwxqFiEPBwWf8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Wli6oukB; arc=none smtp.client-ip=209.85.128.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Wli6oukB" Received: by mail-wm1-f51.google.com with SMTP id 5b1f17b1804b1-421a1b834acso36374845e9.3 for ; Mon, 17 Jun 2024 06:55:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1718632541; x=1719237341; darn=vger.kernel.org; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:from:to:cc:subject:date :message-id:reply-to; bh=G968azMSMZxhGCb7WK9dg8PxA15ncjcjODXB2hpkuVM=; b=Wli6oukBA1ab6ekM37HXw9pxnt/77NyHZJvZG3EVK6wghBVfO5guIjgr68ojlGUC05 F290FLWY4N4FH697p8H38D1U7clc7M7vv0Xa88UERBVlERDzYtD5UblwXkENukf1wJAN 7J788i7m8+U34Xol4LeY2AjBDr7Zseh971ydlX3Y0mbSnAAkNM/TDRgf8e9CkDQbUu7P 6ILCwDHm89K9xiH7yIQQsJKgDBI9y7Abe0v5R87/7moRDuVB3cRahWOlDZMY9kPFIjug dDjeBNoY0Qr8em+VOfLw7QN/kcJZAx4TWiL9WKXWId41WWkBoCl6VCtt7HGe7SgP65vh cOgg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718632541; x=1719237341; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=G968azMSMZxhGCb7WK9dg8PxA15ncjcjODXB2hpkuVM=; b=JYqnC7q/q3F0HSRrET0k6lDTWP2yis1leoRS2E3sG87PkTSpYHSC/evIYHkF/CA+vC nGCeKZOt1vhZvGZCS2O14RkjaJDGWwsMYJysIREGMuv1dPA91DngCx79dHMNwqdAN3iE VkK3hSpIdjHB+/k29UBM1DgONo2wAAYqWg/+CwhFVnwHF6MItgUUteDKNsVIRvs43K/q FgUv182W07kYnZhvbHM2sYd3N1BuBynTwQnJwiQaorADFrZr21Xu2+OZIDqiS/wvQ0kL 4Z5hHVK3TgF1+dGKa6he/2ip09MV0DCkXgMmIY7BdVavFsPc5PEAP1EdYc5tG68pSuRx CPcg== X-Gm-Message-State: AOJu0YwflBy0d0TXlm5Vyy5t1k9sQWfG/b44NllQh5ICzGkqEt9fHdKK je/KFart/ycQGExPliTh+dwXVk3B+uts+uoE7cYSZGL56tik/eiAY66cRw== X-Google-Smtp-Source: AGHT+IEq4qRLz84qxVsfbe4vawXEXSG/UsnJLL4GMzTKEggpAX8SbpO9PlDiJJBWNL0yZ4yZGxQ7FA== X-Received: by 2002:a05:600c:4507:b0:423:490:174b with SMTP id 5b1f17b1804b1-4230490191fmr86524655e9.13.1718632540427; Mon, 17 Jun 2024 06:55:40 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-422f602ef0bsm157235885e9.18.2024.06.17.06.55.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 17 Jun 2024 06:55:39 -0700 (PDT) Message-Id: <3dc0d9dd22f64f79f7cddc453c43701f73f0e133.1718632536.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 17 Jun 2024 13:55:34 +0000 Subject: [PATCH v7 2/3] fetch-pack: expose fsckObjects configuration logic Fcc: Sent Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 To: git@vger.kernel.org Cc: Patrick Steinhardt , Karthik Nayak , blanet , Xing Xin From: Xing Xin From: Xing Xin Currently, we can use "transfer.fsckObjects" and the more specific "fetch.fsckObjects" to control checks for broken objects in received packs during fetches. However, these configurations were only acknowledged by `fetch-pack.c:get_pack` and did not take effect in direct bundle fetches and fetches with _bundle-uri_ enabled. This commit exposes the fetch-then-transfer configuration logic by adding a new function `fetch_pack_fsck_objects` in fetch-pack.h. This new function is used to replace the assignment for `fsck_objects` in `fetch-pack.c:get_pack`. In the next commit, it will also be used by `bundle.c:unbundle` to better fit fetching scenarios. Helped-by: Junio C Hamano Helped-by: Patrick Steinhardt Signed-off-by: Xing Xin --- fetch-pack.c | 17 +++++++++++------ fetch-pack.h | 5 +++++ 2 files changed, 16 insertions(+), 6 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index 7d2aef21add..3acff2baf09 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -954,12 +954,7 @@ static int get_pack(struct fetch_pack_args *args, strvec_push(&cmd.args, alternate_shallow_file); } - if (fetch_fsck_objects >= 0 - ? fetch_fsck_objects - : transfer_fsck_objects >= 0 - ? transfer_fsck_objects - : 0) - fsck_objects = 1; + fsck_objects = fetch_pack_fsck_objects(); if (do_keep || args->from_promisor || index_pack_args || fsck_objects) { if (pack_lockfiles || fsck_objects) @@ -2046,6 +2041,16 @@ static const struct object_id *iterate_ref_map(void *cb_data) return &ref->old_oid; } +int fetch_pack_fsck_objects(void) +{ + fetch_pack_setup(); + if (fetch_fsck_objects >= 0) + return fetch_fsck_objects; + if (transfer_fsck_objects >= 0) + return transfer_fsck_objects; + return 0; +} + struct ref *fetch_pack(struct fetch_pack_args *args, int fd[], const struct ref *ref, diff --git a/fetch-pack.h b/fetch-pack.h index 6775d265175..b5c579cdae2 100644 --- a/fetch-pack.h +++ b/fetch-pack.h @@ -101,4 +101,9 @@ void negotiate_using_fetch(const struct oid_array *negotiation_tips, */ int report_unmatched_refs(struct ref **sought, int nr_sought); +/* + * Return true if checks for broken objects in received pack are required. + */ +int fetch_pack_fsck_objects(void); + #endif From patchwork Mon Jun 17 13:55:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xing Xin X-Patchwork-Id: 13700826 Received: from mail-wm1-f42.google.com (mail-wm1-f42.google.com [209.85.128.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B31311662F3 for ; Mon, 17 Jun 2024 13:55:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718632545; cv=none; b=cXcJ+325Q6QoMcNAWfzXnfRcGJOxyyb/lIMiNayDP6R/JFASz3EhSt6pFte4YWexhKEmhrjFAVZDVAP1HTlmIP/O5kpaflUyRgw+QdmCMuaTo3nu+E/lxEdlvdOencbaugxsz0IcT8SiB+9MTxgFxIBeK6l5nKcym3/P6QPeRgw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718632545; c=relaxed/simple; bh=QswTJQqQxn+Au+aETh1E5m7aeMXa3oWBd8fq/qMZ2DA=; h=Message-Id:In-Reply-To:References:From:Date:Subject:Content-Type: MIME-Version:To:Cc; b=Wa+B/toV+ACKgKCLkEQaQvz6qp10Rpw9md38A3vpmY4mvZVXDVIUso+rHbWo1blEsi6zGuQ2QbG3d7cTxXe+DC+32qF3nSpe3ZcBXFj6DDxXHN1YCCOXdQDYtnpcAL8aSD4m7asH1osc5IG1oOqHGQDgyL+MaUV9zY8YAr7gkZQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=ZEYgxGcZ; arc=none smtp.client-ip=209.85.128.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ZEYgxGcZ" Received: by mail-wm1-f42.google.com with SMTP id 5b1f17b1804b1-4217136a74dso36397675e9.2 for ; Mon, 17 Jun 2024 06:55:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1718632541; x=1719237341; darn=vger.kernel.org; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:from:to:cc:subject:date :message-id:reply-to; bh=pmDLer7snI3CeNnwOAubBVT3oNuADZCa8XQcQ/B8zlQ=; b=ZEYgxGcZvwSEED+gkOKqumF9EZcXw7N5T5JYYtkSsbWUE1oFZH/bOIoLxcPqdn1UzD 1qsKzUoaGvMv8cpF4Nk1H1LhcAlxAux7LQcAafPXDKFXg535OFX9jKGAGOA7m6mSvyiV nVZl/oUn1tAuu0N6HPAFQbuP1bOVGIknXFcajiMCMzkjRn3leKrEBvC9YDl5ezIeK4UD NsX2Z+1RjOVM/3rh6HoyFZJUMHH2Z3TPXWHPh0fssdJYHmly7eRH/xw07NbGhLMjYhpL sAGh9HtbuSBPfqFdh4e0dwUt4QySfGFhgc7mbG+HY6lYNXC5L9qWGstfylhZZyTeO/Vl fTHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718632541; x=1719237341; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pmDLer7snI3CeNnwOAubBVT3oNuADZCa8XQcQ/B8zlQ=; b=I5OYTlqkF6ktQdtqcliDsrZVaI3vzVNaLsiMzgeutZQhQBey/M0oH9lSOlGwGzw2oW DnKOC0TJRsJExLkvVNWY9Vb43H7BA5HjztxER1AK5Pwe/6UPhnhAZGrF7L5D6bMhsFBd QKU9jdJElXgJueKWvBEWrfI24UP7PfGJHdYkADqPjObMizU9XiKBUMM+bNNN9KeVd8X5 GeVQpF/y5hCgsUXhdhSz1N3Spg3aLtkUhYe7F+ZOExNRphnnBN73cU0dK1OQfnP3fguB tOqVH0upwQSLQ4p3CvfrKeosO2agKOVsAy8yZLi8RBN89OHHhdKKQ6beY8BZ5gJzm+eB /Pig== X-Gm-Message-State: AOJu0YzBuYxZPJtppQLtTPQC4SMxCfXgMFbTullmKJl1Q6NmoQkaiO0M /E1fcwePn3x2xHCeKhs4DgQRAtvTADWgji6vSfuvxzCcA/AxI1c2yh3Mkw== X-Google-Smtp-Source: AGHT+IGulyO22jvCKerBa7FOUv7zExB/rlk6qJyMxCZDurYWQCmSEGf8PxL8YdrmvkAA60DBLiL+1A== X-Received: by 2002:a05:600c:a43:b0:421:ad:f104 with SMTP id 5b1f17b1804b1-42304827c4emr91279465e9.10.1718632541343; Mon, 17 Jun 2024 06:55:41 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-4229c60f758sm181632095e9.20.2024.06.17.06.55.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 17 Jun 2024 06:55:40 -0700 (PDT) Message-Id: <2f15099bbb9fed4919e829274b7683ace9094d15.1718632536.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 17 Jun 2024 13:55:35 +0000 Subject: [PATCH v7 3/3] unbundle: extend object verification for fetches Fcc: Sent Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 To: git@vger.kernel.org Cc: Patrick Steinhardt , Karthik Nayak , blanet , Xing Xin From: Xing Xin From: Xing Xin The existing fetch.fsckObjects and transfer.fsckObjects configurations were not fully applied to bundle-involved fetches, including direct bundle fetches and bundle-uri enabled fetches. Furthermore, there was no object verification support for unbundle. This commit extends object verification support in `bundle.c:unbundle` by adding the `VERIFY_BUNDLE_FSCK` option to `verify_bundle_flags`. When this option is enabled, we append the `--fsck-objects` flag to `git-index-pack`. The `VERIFY_BUNDLE_FSCK` option is now used by bundle-involved fetches, where we use `fetch-pack.c:fetch_pack_fsck_objects` to determine whether to enable this option for `bundle.c:unbundle`, specifically in: - `transport.c:fetch_refs_from_bundle` for direct bundle fetches. - `bundle-uri.c:unbundle_from_file` for bundle-uri enabled fetches. This addition ensures a consistent logic for object verification during fetches. Tests have been added to confirm functionality in the scenarios mentioned above. Reviewed-by: Patrick Steinhardt Signed-off-by: Xing Xin --- bundle-uri.c | 3 ++- bundle.c | 3 +++ bundle.h | 1 + t/t5558-clone-bundle-uri.sh | 33 ++++++++++++++++++++++++++++++++- t/t5607-clone-bundle.sh | 33 +++++++++++++++++++++++++++++++++ transport.c | 3 ++- 6 files changed, 73 insertions(+), 3 deletions(-) diff --git a/bundle-uri.c b/bundle-uri.c index 65666a11d9c..ed9b49fdbc1 100644 --- a/bundle-uri.c +++ b/bundle-uri.c @@ -9,6 +9,7 @@ #include "hashmap.h" #include "pkt-line.h" #include "config.h" +#include "fetch-pack.h" #include "remote.h" static struct { @@ -373,7 +374,7 @@ static int unbundle_from_file(struct repository *r, const char *file) * the prerequisite commits. */ if ((result = unbundle(r, &header, bundle_fd, NULL, - VERIFY_BUNDLE_QUIET))) + VERIFY_BUNDLE_QUIET | (fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0)))) return 1; /* diff --git a/bundle.c b/bundle.c index 95367c2d0a0..f124a2a5626 100644 --- a/bundle.c +++ b/bundle.c @@ -625,6 +625,9 @@ int unbundle(struct repository *r, struct bundle_header *header, if (header->filter.choice) strvec_push(&ip.args, "--promisor=from-bundle"); + if (flags & VERIFY_BUNDLE_FSCK) + strvec_push(&ip.args, "--fsck-objects"); + if (extra_index_pack_args) { strvec_pushv(&ip.args, extra_index_pack_args->v); strvec_clear(extra_index_pack_args); diff --git a/bundle.h b/bundle.h index 021adbdcbb3..5ccc9a061a4 100644 --- a/bundle.h +++ b/bundle.h @@ -33,6 +33,7 @@ int create_bundle(struct repository *r, const char *path, enum verify_bundle_flags { VERIFY_BUNDLE_VERBOSE = (1 << 0), VERIFY_BUNDLE_QUIET = (1 << 1), + VERIFY_BUNDLE_FSCK = (1 << 2), }; int verify_bundle(struct repository *r, struct bundle_header *header, diff --git a/t/t5558-clone-bundle-uri.sh b/t/t5558-clone-bundle-uri.sh index 2dcdd238a90..38a25d08d0a 100755 --- a/t/t5558-clone-bundle-uri.sh +++ b/t/t5558-clone-bundle-uri.sh @@ -34,7 +34,21 @@ test_expect_success 'create bundle' ' sed -e "/^$/q" -e "s/$(git rev-parse A) /$(git rev-parse B) /" \ bad-header.bundle && convert_bundle_to_pack \ - >bad-header.bundle + >bad-header.bundle && + + cat >data <<-EOF && + tree $(git rev-parse HEAD^{tree}) + parent $(git rev-parse HEAD) + author A U Thor + committer A U Thor + + commit: this is a commit with bad emails + + EOF + git hash-object --literally -t commit -w --stdin commit && + git branch bad $(cat commit) && + git bundle create bad-object.bundle bad && + git update-ref -d refs/heads/bad ) ' @@ -56,6 +70,23 @@ test_expect_success 'clone with bundle that has bad header' ' ! grep "refs/bundles/" refs ' +test_expect_success 'clone with bundle that has bad object' ' + # Unbundle succeeds if no fsckObjects confugured. + git clone --bundle-uri="clone-from/bad-object.bundle" \ + clone-from clone-bad-object-no-fsck && + git -C clone-bad-object-no-fsck for-each-ref --format="%(refname)" >refs && + grep "refs/bundles/" refs >actual && + test_write_lines refs/bundles/bad >expect && + test_cmp expect actual && + + # Unbundle fails with fsckObjects set true, but clone can still proceed. + git -c fetch.fsckObjects=true clone --bundle-uri="clone-from/bad-object.bundle" \ + clone-from clone-bad-object-fsck 2>err && + test_grep "missingEmail" err && + git -C clone-bad-object-fsck for-each-ref --format="%(refname)" >refs && + ! grep "refs/bundles/" refs +' + test_expect_success 'clone with path bundle and non-default hash' ' test_when_finished "rm -rf clone-path-non-default-hash" && GIT_DEFAULT_HASH=sha256 git clone --bundle-uri="clone-from/B.bundle" \ diff --git a/t/t5607-clone-bundle.sh b/t/t5607-clone-bundle.sh index 0d1e92d9963..5182efc0b45 100755 --- a/t/t5607-clone-bundle.sh +++ b/t/t5607-clone-bundle.sh @@ -138,6 +138,39 @@ test_expect_success 'fetch SHA-1 from bundle' ' git fetch --no-tags foo/tip.bundle "$(cat hash)" ' +test_expect_success 'clone bundle with different fsckObjects configurations' ' + test_create_repo bundle-fsck && + ( + cd bundle-fsck && + test_commit first && + cat >data <<-EOF && + tree $(git rev-parse HEAD^{tree}) + parent $(git rev-parse HEAD) + author A U Thor + committer A U Thor + + commit: this is a commit with bad emails + + EOF + git hash-object --literally -t commit -w --stdin commit && + git branch bad $(cat commit) && + git bundle create bad.bundle bad + ) && + + git clone bundle-fsck/bad.bundle bundle-no-fsck && + + git -c fetch.fsckObjects=false -c transfer.fsckObjects=true \ + clone bundle-fsck/bad.bundle bundle-fetch-no-fsck && + + test_must_fail git -c fetch.fsckObjects=true \ + clone bundle-fsck/bad.bundle bundle-fetch-fsck 2>err && + test_grep "missingEmail" err && + + test_must_fail git -c transfer.fsckObjects=true \ + clone bundle-fsck/bad.bundle bundle-transfer-fsck 2>err && + test_grep "missingEmail" err +' + test_expect_success 'git bundle uses expected default format' ' git bundle create bundle HEAD^.. && cat >expect <<-EOF && diff --git a/transport.c b/transport.c index 0ad04b77fd2..a93c4171f7b 100644 --- a/transport.c +++ b/transport.c @@ -184,7 +184,8 @@ static int fetch_refs_from_bundle(struct transport *transport, if (!data->get_refs_from_bundle_called) get_refs_from_bundle_inner(transport); ret = unbundle(the_repository, &data->header, data->fd, - &extra_index_pack_args, 0); + &extra_index_pack_args, + fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0); transport->hash_algo = data->header.hash_algo; return ret; }