From patchwork Tue Mar 11 00:21:38 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 14010876 Received: from mail-yw1-f175.google.com (mail-yw1-f175.google.com [209.85.128.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CBB2E179BC for ; Tue, 11 Mar 2025 00:21:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.175 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741652503; cv=none; b=U0VanmMYXLl3pYk6QNa4gZn7g16uUPrbV5AZFHABdjh/SNFJXM0tqjVLEyrMDXR9WABgeWo6PGaypmk05214u15fErqYHCmqohMjsN+JKw5huNQdRRArYxx5R0KXBoyDL1eR+OcZzfAhoSQj9DQLmBse+92DMwYg9ySm9W2m+Ok= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741652503; c=relaxed/simple; bh=rIBsKr88pxftvTcrVDxBAvCPoBL79q/QOzWUCS72mFI=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=KG4hVC0G+OyzZRMltO3GYVr7xHanchROQaHcb8pS39XnZTZwD8gGxEywzTy0P9T0EsgH6FMtO+9+8/WzNt0hn0jVEtt8Er6CbWLGTz2NhjhdyxvSkDl1189B7a/tADekniu7Lqra8wdYB9XU0TmNiJxEXgtPYBOkWo19zDjYhuo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=b9paoJy0; arc=none smtp.client-ip=209.85.128.175 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="b9paoJy0" Received: by mail-yw1-f175.google.com with SMTP id 00721157ae682-6f74b78df93so46249177b3.0 for ; Mon, 10 Mar 2025 17:21:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1741652499; x=1742257299; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=wHVSVCPi+hJd6oqrXiriDYr1pbJ8YTi54728iCiXQd8=; b=b9paoJy0rgxIKEgH6hI/kwF2pkA8t6jnDxPgP6HSD172eDt4W6SR3Su1GPill9DXBN 9k0ZMxBd2FxKtAvE7swQbR7eRLpA3YHmfeOej8v0KDdCDNiVvEESraRn/NrNvsf9Jkwa 6XjKoMEUSGnbLNepT45CQfhvaIrLiA3dcxB15qy506fCQyk8xY6HCZj7XyNkhJDPyynA pm2SYdAAFOU51UkdpiyusANh0vqh7pjxoLZlCIteQDejRH+Ir02rI3fr1PZbF2gkrJOM kEijBoyZDcHzH99S25k2/vCc/2phGgVHpJaTWk05wPZ3EUm/WsiWvliaxAUTUOl9eehb NL3Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741652499; x=1742257299; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=wHVSVCPi+hJd6oqrXiriDYr1pbJ8YTi54728iCiXQd8=; b=OO7HuNOfWFOQFZITw7SSIOxux5snw1sNXp/tdjWflzF/H7lPKnxgxK2j6LkRnfg0Dy pdLJlz8L3SbMOvwM+6pvk9EFWKX74NG1fAuhAplU4r3lUeepDpXAR65L0TGJUy5aHupV FhJZ55yJVnWr27MDI1WVbBebUYlPiieVVW57bbVXKpK7QQmwdN5CX4xn53mM4ztl0dSI 6JjmofVCoYo60+MfmtI97z6Mi4s3j2fyRCrh6aFf70YY3JM1ZbbdPL57Bywz1P/MC219 ZyFLJmK1EPQAODrCNuCillszfPROaXpjADzfxSR1c7r+MO3Hfv/P7nNSYsGRubcrkewG Rugw== X-Gm-Message-State: AOJu0YzAnuTgz2OBEuiRvSaS8v836zwtCYRMQMz52om+pnx0i8V9s2cb mj4aqK7A+bFKqJQc9NfatGZJZqUanYEoTDvtNFWaGBg/hpniWJoCxFEwbYY4l/MSGU0Jf/svRxv 2Quw= X-Gm-Gg: ASbGncvlcMpZOZqd/fVvflWmWjDYy78BWBM2mzZK3aib06DeJEwG1za/Fqr+7T7Incl Jr+SdodeJ/xh2ALzJnXQ4B7Y9s0cyk1Qo6jOSC++yH5BCWNxYbsWOZmP7c5XjFEMPPEbM/Yyc8n qQwKZAulLsWhDxvU/RpHtxAlW3yZF03+ZvaReihLv6kQTQjYaTnFKI/5u23Teaile1UHKQgfGF2 Ca5Z7QYhZsUEL+PKeulNDVECyate22/SUm8UY8dk0TYs/5KI2onuztFBiHHHfxsdywE4Ij7hY/n WIHjZwNqnPDAAJ8NkHS64m4PLN1IIuBI+WF6Cil0SNWHjDB0v+roiuIuYkIL001FxPBy8Z6vl0g QkJ1Q+kDzx2BAcjlV X-Google-Smtp-Source: AGHT+IF47GkjUffkCEaAwA0BZKMlBbiIzXNA6cwZBIYqWCH3YgT/tZdSQ/UuKYT/7hzd1RWbvG/GfQ== X-Received: by 2002:a05:690c:46c9:b0:6fd:41d5:de11 with SMTP id 00721157ae682-6febf3836e2mr212152687b3.23.1741652499553; Mon, 10 Mar 2025 17:21:39 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with UTF8SMTPSA id 00721157ae682-6feb2c64eb2sm23951187b3.121.2025.03.10.17.21.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 10 Mar 2025 17:21:39 -0700 (PDT) Date: Mon, 10 Mar 2025 20:21:38 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Elijah Newren , Patrick Steinhardt Subject: [PATCH v4 1/6] t/t5329-pack-objects-cruft.sh: evict 'repack'-related tests Message-ID: <390c3a6d85b287b9d141167a9cce9ce555e5509d.1741648467.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The cruft pack feature has two primary test scripts which exercise various parts of it, which are: - t5329-pack-objects-cruft.sh - t7704-repack-cruft.sh The former is designed to test low-level pack generation mechanics at the 'git pack-objects --cruft'-level, which is plumbing. The latter, on the other hand, is designed to test the user-facing behavior through 'git repack --cruft', which is porcelain (under the "ancillary manipulators" sub-section). At some point a handful of tests which should have been added to the latter script were instead written to the former. This isn't a huge deal, but rectifying it is straightforward. Move a handful of 'repack'-related tests out of t5329 and into their rightful home in t7704. Signed-off-by: Taylor Blau --- t/t5329-pack-objects-cruft.sh | 250 ---------------------------------- t/t7704-repack-cruft.sh | 250 ++++++++++++++++++++++++++++++++++ 2 files changed, 250 insertions(+), 250 deletions(-) diff --git a/t/t5329-pack-objects-cruft.sh b/t/t5329-pack-objects-cruft.sh index b71a0aef40..60dac8312d 100755 --- a/t/t5329-pack-objects-cruft.sh +++ b/t/t5329-pack-objects-cruft.sh @@ -360,43 +360,6 @@ test_expect_success 'expired objects are pruned' ' ) ' -test_expect_success 'repack --cruft generates a cruft pack' ' - git init repo && - test_when_finished "rm -fr repo" && - ( - cd repo && - - test_commit reachable && - git branch -M main && - git checkout --orphan other && - test_commit unreachable && - - git checkout main && - git branch -D other && - git tag -d unreachable && - # objects are not cruft if they are contained in the reflogs - git reflog expire --all --expire=all && - - git rev-list --objects --all --no-object-names >reachable.raw && - git cat-file --batch-all-objects --batch-check="%(objectname)" >objects && - sort reachable && - comm -13 reachable objects >unreachable && - - git repack --cruft -d && - - cruft=$(basename $(ls $packdir/pack-*.mtimes) .mtimes) && - pack=$(basename $(ls $packdir/pack-*.pack | grep -v $cruft) .pack) && - - git show-index <$packdir/$pack.idx >actual.raw && - cut -f2 -d" " actual.raw | sort >actual && - test_cmp reachable actual && - - git show-index <$packdir/$cruft.idx >actual.raw && - cut -f2 -d" " actual.raw | sort >actual && - test_cmp unreachable actual - ) -' - test_expect_success 'loose objects mtimes upsert others' ' git init repo && test_when_finished "rm -fr repo" && @@ -470,219 +433,6 @@ test_expect_success 'expiring cruft objects with git gc' ' ) ' -test_expect_success 'cruft packs are not included in geometric repack' ' - git init repo && - test_when_finished "rm -fr repo" && - ( - cd repo && - - test_commit reachable && - git repack -Ad && - git branch -M main && - - git checkout --orphan other && - test_commit cruft && - git repack -d && - - git checkout main && - git branch -D other && - git tag -d cruft && - git reflog expire --all --expire=all && - - git repack --cruft && - - find $packdir -type f | sort >before && - git repack --geometric=2 -d && - find $packdir -type f | sort >after && - - test_cmp before after - ) -' - -test_expect_success 'repack --geometric collects once-cruft objects' ' - git init repo && - test_when_finished "rm -fr repo" && - ( - cd repo && - - test_commit reachable && - git repack -Ad && - git branch -M main && - - git checkout --orphan other && - git rm -rf . && - test_commit --no-tag cruft && - cruft="$(git rev-parse HEAD)" && - - git checkout main && - git branch -D other && - git reflog expire --all --expire=all && - - # Pack the objects created in the previous step into a cruft - # pack. Intentionally leave loose copies of those objects - # around so we can pick them up in a subsequent --geometric - # reapack. - git repack --cruft && - - # Now make those objects reachable, and ensure that they are - # packed into the new pack created via a --geometric repack. - git update-ref refs/heads/other $cruft && - - # Without this object, the set of unpacked objects is exactly - # the set of objects already in the cruft pack. Tweak that set - # to ensure we do not overwrite the cruft pack entirely. - test_commit reachable2 && - - find $packdir -name "pack-*.idx" | sort >before && - git repack --geometric=2 -d && - find $packdir -name "pack-*.idx" | sort >after && - - { - git rev-list --objects --no-object-names $cruft && - git rev-list --objects --no-object-names reachable..reachable2 - } >want.raw && - sort want.raw >want && - - pack=$(comm -13 before after) && - git show-index <$pack >objects.raw && - - cut -d" " -f2 objects.raw | sort >got && - - test_cmp want got - ) -' - -test_expect_success 'cruft repack with no reachable objects' ' - git init repo && - test_when_finished "rm -fr repo" && - ( - cd repo && - - test_commit base && - git repack -ad && - - base="$(git rev-parse base)" && - - git for-each-ref --format="delete %(refname)" >in && - git update-ref --stdin in && - git hash-object -w -t blob in -} - -find_pack () { - for idx in $(ls $packdir/pack-*.idx) - do - git show-index <$idx >out && - if grep -q "$1" out - then - echo $idx - fi || return 1 - done -} - -test_expect_success 'cruft repack with --max-pack-size' ' - git init max-pack-size && - ( - cd max-pack-size && - test_commit base && - - # two cruft objects which exceed the maximum pack size - foo=$(write_blob foo 1048576) && - bar=$(write_blob bar 1048576) && - test-tool chmtime --get -1000 \ - "$objdir/$(test_oid_to_path $foo)" >foo.mtime && - test-tool chmtime --get -2000 \ - "$objdir/$(test_oid_to_path $bar)" >bar.mtime && - git repack --cruft --max-pack-size=1M && - find $packdir -name "*.mtimes" >cruft && - test_line_count = 2 cruft && - - foo_mtimes="$(basename $(find_pack $foo) .idx).mtimes" && - bar_mtimes="$(basename $(find_pack $bar) .idx).mtimes" && - test-tool pack-mtimes $foo_mtimes >foo.actual && - test-tool pack-mtimes $bar_mtimes >bar.actual && - - echo "$foo $(cat foo.mtime)" >foo.expect && - echo "$bar $(cat bar.mtime)" >bar.expect && - - test_cmp foo.expect foo.actual && - test_cmp bar.expect bar.actual && - test "$foo_mtimes" != "$bar_mtimes" - ) -' - -test_expect_success 'cruft repack with pack.packSizeLimit' ' - ( - cd max-pack-size && - # repack everything back together to remove the existing cruft - # pack (but to keep its objects) - git repack -adk && - git -c pack.packSizeLimit=1M repack --cruft && - # ensure the same post condition is met when --max-pack-size - # would otherwise be inferred from the configuration - find $packdir -name "*.mtimes" >cruft && - test_line_count = 2 cruft && - for pack in $(cat cruft) - do - test-tool pack-mtimes "$(basename $pack)" >objects && - test_line_count = 1 objects || return 1 - done - ) -' - -test_expect_success 'cruft repack respects repack.cruftWindow' ' - git init repo && - test_when_finished "rm -fr repo" && - ( - cd repo && - - test_commit base && - - GIT_TRACE2_EVENT=$(pwd)/event.trace \ - git -c pack.window=1 -c repack.cruftWindow=2 repack \ - --cruft --window=3 && - - grep "pack-objects.*--window=2.*--cruft" event.trace - ) -' - -test_expect_success 'cruft repack respects --window by default' ' - git init repo && - test_when_finished "rm -fr repo" && - ( - cd repo && - - test_commit base && - - GIT_TRACE2_EVENT=$(pwd)/event.trace \ - git -c pack.window=2 repack --cruft --window=3 && - - grep "pack-objects.*--window=3.*--cruft" event.trace - ) -' - -test_expect_success 'cruft repack respects --quiet' ' - git init repo && - test_when_finished "rm -fr repo" && - ( - cd repo && - - test_commit base && - GIT_PROGRESS_DELAY=0 git repack --cruft --quiet 2>err && - test_must_be_empty err - ) -' - test_expect_success 'cruft --local drops unreachable objects' ' git init alternate && git init repo && diff --git a/t/t7704-repack-cruft.sh b/t/t7704-repack-cruft.sh index 959e6e2648..aa5d8913ae 100755 --- a/t/t7704-repack-cruft.sh +++ b/t/t7704-repack-cruft.sh @@ -411,4 +411,254 @@ test_expect_success 'reachable packs are preferred over cruft ones' ' ) ' +test_expect_success 'repack --cruft generates a cruft pack' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit reachable && + git branch -M main && + git checkout --orphan other && + test_commit unreachable && + + git checkout main && + git branch -D other && + git tag -d unreachable && + # objects are not cruft if they are contained in the reflogs + git reflog expire --all --expire=all && + + git rev-list --objects --all --no-object-names >reachable.raw && + git cat-file --batch-all-objects --batch-check="%(objectname)" >objects && + sort reachable && + comm -13 reachable objects >unreachable && + + git repack --cruft -d && + + cruft=$(basename $(ls $packdir/pack-*.mtimes) .mtimes) && + pack=$(basename $(ls $packdir/pack-*.pack | grep -v $cruft) .pack) && + + git show-index <$packdir/$pack.idx >actual.raw && + cut -f2 -d" " actual.raw | sort >actual && + test_cmp reachable actual && + + git show-index <$packdir/$cruft.idx >actual.raw && + cut -f2 -d" " actual.raw | sort >actual && + test_cmp unreachable actual + ) +' + +test_expect_success 'cruft packs are not included in geometric repack' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit reachable && + git repack -Ad && + git branch -M main && + + git checkout --orphan other && + test_commit cruft && + git repack -d && + + git checkout main && + git branch -D other && + git tag -d cruft && + git reflog expire --all --expire=all && + + git repack --cruft && + + find $packdir -type f | sort >before && + git repack --geometric=2 -d && + find $packdir -type f | sort >after && + + test_cmp before after + ) +' + +test_expect_success 'repack --geometric collects once-cruft objects' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit reachable && + git repack -Ad && + git branch -M main && + + git checkout --orphan other && + git rm -rf . && + test_commit --no-tag cruft && + cruft="$(git rev-parse HEAD)" && + + git checkout main && + git branch -D other && + git reflog expire --all --expire=all && + + # Pack the objects created in the previous step into a cruft + # pack. Intentionally leave loose copies of those objects + # around so we can pick them up in a subsequent --geometric + # reapack. + git repack --cruft && + + # Now make those objects reachable, and ensure that they are + # packed into the new pack created via a --geometric repack. + git update-ref refs/heads/other $cruft && + + # Without this object, the set of unpacked objects is exactly + # the set of objects already in the cruft pack. Tweak that set + # to ensure we do not overwrite the cruft pack entirely. + test_commit reachable2 && + + find $packdir -name "pack-*.idx" | sort >before && + git repack --geometric=2 -d && + find $packdir -name "pack-*.idx" | sort >after && + + { + git rev-list --objects --no-object-names $cruft && + git rev-list --objects --no-object-names reachable..reachable2 + } >want.raw && + sort want.raw >want && + + pack=$(comm -13 before after) && + git show-index <$pack >objects.raw && + + cut -d" " -f2 objects.raw | sort >got && + + test_cmp want got + ) +' + +test_expect_success 'cruft repack with no reachable objects' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit base && + git repack -ad && + + base="$(git rev-parse base)" && + + git for-each-ref --format="delete %(refname)" >in && + git update-ref --stdin in && + git hash-object -w -t blob in +} + +find_pack () { + for idx in $(ls $packdir/pack-*.idx) + do + git show-index <$idx >out && + if grep -q "$1" out + then + echo $idx + fi || return 1 + done +} + +test_expect_success 'cruft repack with --max-pack-size' ' + git init max-pack-size && + ( + cd max-pack-size && + test_commit base && + + # two cruft objects which exceed the maximum pack size + foo=$(write_blob foo 1048576) && + bar=$(write_blob bar 1048576) && + test-tool chmtime --get -1000 \ + "$objdir/$(test_oid_to_path $foo)" >foo.mtime && + test-tool chmtime --get -2000 \ + "$objdir/$(test_oid_to_path $bar)" >bar.mtime && + git repack --cruft --max-pack-size=1M && + find $packdir -name "*.mtimes" >cruft && + test_line_count = 2 cruft && + + foo_mtimes="$(basename $(find_pack $foo) .idx).mtimes" && + bar_mtimes="$(basename $(find_pack $bar) .idx).mtimes" && + test-tool pack-mtimes $foo_mtimes >foo.actual && + test-tool pack-mtimes $bar_mtimes >bar.actual && + + echo "$foo $(cat foo.mtime)" >foo.expect && + echo "$bar $(cat bar.mtime)" >bar.expect && + + test_cmp foo.expect foo.actual && + test_cmp bar.expect bar.actual && + test "$foo_mtimes" != "$bar_mtimes" + ) +' + +test_expect_success 'cruft repack with pack.packSizeLimit' ' + ( + cd max-pack-size && + # repack everything back together to remove the existing cruft + # pack (but to keep its objects) + git repack -adk && + git -c pack.packSizeLimit=1M repack --cruft && + # ensure the same post condition is met when --max-pack-size + # would otherwise be inferred from the configuration + find $packdir -name "*.mtimes" >cruft && + test_line_count = 2 cruft && + for pack in $(cat cruft) + do + test-tool pack-mtimes "$(basename $pack)" >objects && + test_line_count = 1 objects || return 1 + done + ) +' + +test_expect_success 'cruft repack respects repack.cruftWindow' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit base && + + GIT_TRACE2_EVENT=$(pwd)/event.trace \ + git -c pack.window=1 -c repack.cruftWindow=2 repack \ + --cruft --window=3 && + + grep "pack-objects.*--window=2.*--cruft" event.trace + ) +' + +test_expect_success 'cruft repack respects --window by default' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit base && + + GIT_TRACE2_EVENT=$(pwd)/event.trace \ + git -c pack.window=2 repack --cruft --window=3 && + + grep "pack-objects.*--window=3.*--cruft" event.trace + ) +' + +test_expect_success 'cruft repack respects --quiet' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit base && + GIT_PROGRESS_DELAY=0 git repack --cruft --quiet 2>err && + test_must_be_empty err + ) +' + test_done From patchwork Tue Mar 11 00:21:41 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 14010877 Received: from mail-yw1-f169.google.com (mail-yw1-f169.google.com [209.85.128.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A43711DA23 for ; Tue, 11 Mar 2025 00:21:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741652505; cv=none; b=hAV6o1RDvLndm3/dUaP1cY6Cd1W2ji5RzNsuZlticLxMtnfvXHGirS6s6347o4Z60D0ryAqTTXPuvKdDYSVpGm6uHDGoDQ+jau4oDFkuWfTY+fEYBYwUH8PcPdADmsMf+woeOLYl97l6u5u9tZZUuReCLBlGYDZ/F5wXT0wVyw4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741652505; c=relaxed/simple; bh=7eVrzaXCGC8d7NliRCiUtP4SC7Xf/1yGWTxcdMqyqiY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=j0sCbUfQ2Ul2tChLtQCFddP5BN/drbCY9lOWEJ/JUfiPxadbEui+6E/DMVOGCYhsNrJtjY85lTgib2nm8kSNQmohPq6H+TCWDaLR6RxK5PWEyft5BUTJtL2x1M85J/RsdVNWdL+leU+r93URI7om4FJiYWCu0+1NyZ8OTutVGzw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=GmDvDIdt; arc=none smtp.client-ip=209.85.128.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="GmDvDIdt" Received: by mail-yw1-f169.google.com with SMTP id 00721157ae682-6fedefb1c9cso16139567b3.0 for ; Mon, 10 Mar 2025 17:21:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1741652502; x=1742257302; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=c6k0Sfy84J8eO6x3Z0kX+45fAR4zU5o2TBXCRbGhdfo=; b=GmDvDIdtLFAnnTkP650zZ8lNkF3SOxQzPx6GytWLrqpM6O91jVU8gsVGt5kI9xecsV tn5+JVl12oyNROIlU6Fsad809/0MOaTRtfqZuyuqr5PsdmJ02Q4Eu+pXMS9K1UnTIfcu YVc04uXXfoQQZCFOUQi+gMBUKVXqIXmBTghEf1ZiW073I0YQAocKHuv3PxvbuabAn1Y8 SrDUhiYxaIe5BWz5CQXOg68z9ji0F93NP5p6y3EnonwEZar9kDF8Wdb9jxphgLevoU7u KWdWc/NOTKjtSYfSqlUxt62gMJYoRY2f69yBlFPOhRcmi3u9el7LNMczGu+KAbr/ZfeV 5r4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741652502; x=1742257302; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=c6k0Sfy84J8eO6x3Z0kX+45fAR4zU5o2TBXCRbGhdfo=; b=wU4TUTIPGqHP6JpBBEnYp/723oC22m6MQ6B4PxJMAXLk/5IR5b0fT4IYN3GMWBvKxV Gflvm35u8LozXzLdmVHQRGUu569StWnWLS4LMSqgYtQxqYsPhfT4YuHJR3Cb7dWiTXSs HVbukQkIxO5p7+mzzoIv28JP1eZEwBP/oon+0IK2W3EMahvAC+oPACwKlvS8RqQVUA9v KyQMeT+Kpu8D8tjPD2XbD03Yo0fwcJktMuCYL/7yR0aGwzK+CzgFxBdEfjch+kLqRGth 6yNfyTUBrwvbZZkqRaYb3dMKBi6CC3hFOQm4xB06uR6ywoM7RhRfat/u6weusyRuC1GA 9GqA== X-Gm-Message-State: AOJu0YwcazjLo8htAZOu0l+KJLlczz8zlehKEWYX3SokH+WGOfSkY1nQ 5wDOz2zZPzUdB1fQLxdfphplN6kVF6Xrsjl+mSHOa5bFDT5NzsEwONbQdVKRGruoZVmPlUIhXAw +1kk= X-Gm-Gg: ASbGncttu4+uzK8Fh+LuZMDbpUtKl9GmPRVQ+aBVUJ0TCSq558CQ5X8/ZNyxZ334f5K JAI3KKf1fKY3blSpmRCVjOTQoZPvPll+YncK9kkt0jXwfiyatxM9HqJw+BlzptMVNag+3lq3Moo f6aV1fCtii9Sieq4KjY7ViaZsrnL6xs+OeadodaMc+NC4xiJMVTCxCkT/d+bWqDiJ8R1mhCUqWy xVWcekIurAkV6HzexsiDPlsDUFlRy5sWQmqoSGVMefmQwit6WFjuypo/sASps8cuFhx42nT5jyZ +1LypVFTDJIrqiWIrBNKixdipTSo3ALFR7B9utBhC/Gq44KDzEq6NZTtVADfswiiOdDQNkEeudD bw82TeN2hmHLi7RAQ X-Google-Smtp-Source: AGHT+IFr6gGBxY6zczqLYngON+C3lloiQM6Mn5PI9+3gJVWRClUTYsOtezD2lfuzKG3ZkQY4YoRsTw== X-Received: by 2002:a05:690c:6d09:b0:6d4:4a0c:fcf0 with SMTP id 00721157ae682-6febf319c4bmr220237187b3.20.1741652502400; Mon, 10 Mar 2025 17:21:42 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with UTF8SMTPSA id 00721157ae682-6feb2c4778esm24269547b3.114.2025.03.10.17.21.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 10 Mar 2025 17:21:42 -0700 (PDT) Date: Mon, 10 Mar 2025 20:21:41 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Elijah Newren , Patrick Steinhardt Subject: [PATCH v4 2/6] t7704-repack-cruft.sh: consolidate `write_blob()` Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The last commit moved a handful of tests from a different script into t7704, including one that relies on generating random blobs. Incidentally, the original home of this test defined its own helper "write_blob" for doing so, which is identical in function to our "generate_random_blob" (and is slightly inferior to the latter, which cleans up after itself). Rewrite the test that uses "write_blob" to no longer do so and then remove the function. Signed-off-by: Taylor Blau --- t/t7704-repack-cruft.sh | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/t/t7704-repack-cruft.sh b/t/t7704-repack-cruft.sh index aa5d8913ae..5ce2648a29 100755 --- a/t/t7704-repack-cruft.sh +++ b/t/t7704-repack-cruft.sh @@ -552,11 +552,6 @@ test_expect_success 'cruft repack with no reachable objects' ' ) ' -write_blob () { - test-tool genrandom "$@" >in && - git hash-object -w -t blob in -} - find_pack () { for idx in $(ls $packdir/pack-*.idx) do @@ -575,8 +570,8 @@ test_expect_success 'cruft repack with --max-pack-size' ' test_commit base && # two cruft objects which exceed the maximum pack size - foo=$(write_blob foo 1048576) && - bar=$(write_blob bar 1048576) && + foo=$(generate_random_blob foo 1048576) && + bar=$(generate_random_blob bar 1048576) && test-tool chmtime --get -1000 \ "$objdir/$(test_oid_to_path $foo)" >foo.mtime && test-tool chmtime --get -2000 \ From patchwork Tue Mar 11 00:21:44 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 14010878 Received: from mail-yw1-f170.google.com (mail-yw1-f170.google.com [209.85.128.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 859321F94D for ; Tue, 11 Mar 2025 00:21:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741652508; cv=none; b=LMSDoJGUlX20U9HhZlVpbAYN9Qn6BtRG/gsJOgeIChNcJVnQpvvI9C/NEJ2y05GNf7CS0sKEQAdKbQwlwHVmzNPC7USmhDbIXZQxzABR6XqsKliV8/S5hvum9ZvI4MAFJ9D+3CJ8vkLPcdJtiUx0CLKz2hOFGBUnewUBpGqU0Xs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741652508; c=relaxed/simple; bh=E2BTh9J/3fLoe61TrDmsBaNBaPsgtAw0ivPxqSHKaV0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=evf6wjgEiK72AHvMvMdrJxv1Ue+ba81dP4i4LBaEex9fa6y76zCA4+43RagbGaaIkpAUswvi/plW98J6dcYbn+54PLkSAt2zEt0gD679D/I8YoEjQ0xC1H6Xpv0kq2RD3sUUJYzk1zkDw5lcHE8WXa8yW727W6bL6BDqW2Y+QnA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=VljFiNvH; arc=none smtp.client-ip=209.85.128.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="VljFiNvH" Received: by mail-yw1-f170.google.com with SMTP id 00721157ae682-6feafc707d3so44708007b3.1 for ; Mon, 10 Mar 2025 17:21:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1741652505; x=1742257305; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=+aB6R3MdIjco38Q32KvdILS92+wJPJL0WX+D9nqzdEI=; b=VljFiNvHg5/olpo2aEvZ5CaCQdUw+RTzYKM6S0eFX9ONcabDsxwxa4c3u6XzphQv1U sYnDAKyrud8eLU+KiU+WfLpru43getVgYEJuLqQTeCE4fdeepV/qbaaBDDQP8JQHFBBm m9oopmtd802YWtjurAvkJ7qMyXebf6D8StXLaJweW3ng67ag8v46beAtOj57eZO6CHMo wXCIoMDeCi0g98ASi1ICFAlWiDgBQJ4MSskmoQn0HnYLJlTybKj06h9Y0sK63K2LWmAv dgtKF3e/nFNPQbxZVVNi2YsXZGeBU4PDY2Xh5aF9z6VlQK6+CnpF6Z4rX1Xqd2ormjqy WVfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741652505; x=1742257305; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=+aB6R3MdIjco38Q32KvdILS92+wJPJL0WX+D9nqzdEI=; b=tyobyHiUW2n+7h3oiSiYaf8gBErqHSoir3yRRu+NpZw5H5UtOOPe/7ROzlPUbIGELC 67DwsCEkMxTipP5JqoXfWSvvM9LRW8pcHAicZ0m2uoccNWTA3RXUsSRP2mYK/CM8LE0m rP8r2mrN2kUa7Z1YU2kkLwtHXhB+UUlePClOxXuva0heCF7rHUg3iSfINGoTOg3ZbTUw LO8lZdC67VZwd1KjZrtetEW1lDUQfvhnOfJWdXGJ0AF+WZqv7YHFE4eVCObP03/FffXh Yk/FC++t/RVUS0NWwFzOV965neLBNOl9FGmToYYeWX9I1FJS8ihn5/KYcGg4TYZDaIK/ Hjkg== X-Gm-Message-State: AOJu0YwAneR/daFeSNqxfgQ7KldxSkyXZVDadpHjVdZdyuKvlN1FjdBT nV/rEmAEqIQ7VYENIw65n5dZBmd2DGpUPosBTMTjezlKJyxG/iKeh8uKtywa/NJkcHHc5LCcgJU gztE= X-Gm-Gg: ASbGnctgrSbZDNy2PA7erXiTXgN30zxOO70HOCXH7j5FLRCMsDvmwj+ca7om98H4IA8 VgO2GojY3boMVyrzWLgf30KAbjwO6h8PcTHlPhShirP3nsH+nDfkWJsuo4d24uqYpZS0kvUusfp dajIZK1NpfVzNfWB+mQS4cA/L44U7laRKte30xmqFbFK2SkI8zoiKR1JUGH0YA+ZC+yWgUYXmr3 nplC3phDd0+68W54PbXjmgJeHbyQDgz15DuUogN/1SVMbMVOEIc059lxYuZvDIPBwpwdOCyHXTV hmqEoRIKjPzWaAoOVe8/77tfHfCCyXQuYwRcG6AkmNVXv5VHckdsJU7aY2pBHHSgQM8kEj1hxLH Mrp9hX4hxlGK5RLQXUipN8Kz8Dwk= X-Google-Smtp-Source: AGHT+IF8aZ/rBp79O0LMe1H+1Hj+GaOiq/Dqygk20ycQAMHAy19iv9yAj18/xh4YG5nZp89xshCUIA== X-Received: by 2002:a05:690c:6f87:b0:6fb:b8a1:d3bb with SMTP id 00721157ae682-6ff091ecd50mr30524677b3.17.1741652505272; Mon, 10 Mar 2025 17:21:45 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with UTF8SMTPSA id 00721157ae682-6feb2c64b57sm24150967b3.119.2025.03.10.17.21.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 10 Mar 2025 17:21:45 -0700 (PDT) Date: Mon, 10 Mar 2025 20:21:44 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Elijah Newren , Patrick Steinhardt Subject: [PATCH v4 3/6] t/lib-cruft.sh: extract some cruft-related helpers Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The test script in t7704-repack-cruft.sh uses a handful of helpers to generate and pack random blobs, but a forthcoming test in t5329 will want to make use of generate_random_blob(). Instead of rewriting that function, move it (and the related helpers) to its own reusable library to avoid duplicating the function. Signed-off-by: Taylor Blau --- t/lib-cruft.sh | 23 +++++++++++++++++++++++ t/t7704-repack-cruft.sh | 22 +--------------------- 2 files changed, 24 insertions(+), 21 deletions(-) create mode 100644 t/lib-cruft.sh diff --git a/t/lib-cruft.sh b/t/lib-cruft.sh new file mode 100644 index 0000000000..9fdc0b01b0 --- /dev/null +++ b/t/lib-cruft.sh @@ -0,0 +1,23 @@ +# generate_random_blob [] +generate_random_blob () { + test-tool genrandom "$@" >blob && + git hash-object -w -t blob blob && + rm blob +} + +# pack_random_blob [] +pack_random_blob () { + generate_random_blob "$@" && + git repack -d -q >/dev/null +} + +# generate_cruft_pack [] +generate_cruft_pack () { + pack_random_blob "$@" >/dev/null && + + ls $packdir/pack-*.pack | xargs -n 1 basename >in && + pack="$(git pack-objects --cruft $packdir/pack blob && - git hash-object -w -t blob blob && - rm blob -} - -pack_random_blob () { - generate_random_blob "$@" && - git repack -d -q >/dev/null -} - -generate_cruft_pack () { - pack_random_blob "$@" >/dev/null && - - ls $packdir/pack-*.pack | xargs -n 1 basename >in && - pack="$(git pack-objects --cruft $packdir/pack X-Patchwork-Id: 14010879 Received: from mail-yw1-f177.google.com (mail-yw1-f177.google.com [209.85.128.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7AA0D1C695 for ; Tue, 11 Mar 2025 00:21:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.177 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741652511; cv=none; b=u65JAGsTbr48mqyzOxq53h1gKtGiPLiWW93owaVs7fcd4DjhYNcZgive6pp+Io7yv9CDzQEFIPKrLCDJcyFOoIyMnHGB245aj+O1xXDCB6/HhlFwENsBOiuu8Az5XfjvfAyjpQiQmFnLgk1sLqqxY1MmJKitb4Mmx+PyBp3f6z4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741652511; c=relaxed/simple; bh=l6ue9Z9lM94WN/+NHToCp+1p4WjXKWnzYAWGBLBR+Zg=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=S15wUwYvihPFgInJCPtABplmDEEYSjH87duvzBcrJQrEzd9LF4MPL913DcWDJgpPP81cXFcfzEEPgt5ujvcMeLFoEWXbc4yiFo9n7xhYDHbEmgZv7ka+9zaNONTP7IsotsZoAaPDCJ+aYo0HN9BVKVE1LDu0foNh4HsusYoYRuQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=x4hgy9U0; arc=none smtp.client-ip=209.85.128.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="x4hgy9U0" Received: by mail-yw1-f177.google.com with SMTP id 00721157ae682-6ff07872097so7004097b3.3 for ; Mon, 10 Mar 2025 17:21:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1741652508; x=1742257308; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=NlPJOxxU4Y+W3u0tmgeDZw8BlZuKCBgXzhnrKQJC7Go=; b=x4hgy9U0CNcP4QBTNt0zM4F6UifeIy1+ctSzGS6jfDMgqwaUO15X53kG8RZAFvsEBp eWEOPmLauAbk2Px7w1vCLK6aNF1Bh7gIVNl4W2v8UkIJEOr9Q+ie4fBEfwbkV4IYecI2 qXFj1Plvvt4abLzJRG+Cg+tDKGH3KJttKbNkS8JjbGZ4MxxhKuShcivr3VbPdOGYarNp 4xePXjZijsxSDqa9Hf6YVhtMOwJ2gA9kGTFOWDob6+ItuMEyJsnUysVqpwz2bqrLDnbC q9PGOMixq5bPb3FcrwxHmw8sdT462hE3ccI1GxNHH4U18KmmE1LJYpQ5N7T83kagwfTX LPbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741652508; x=1742257308; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=NlPJOxxU4Y+W3u0tmgeDZw8BlZuKCBgXzhnrKQJC7Go=; b=Lv8KTBxtJ/JA2GkZnssaN3IJcDyr9BXOEChi1eIbC9rn4snfjdGoQ+sjMD6S0NE2kM q12O+wvooUR3FLh4z+Xi6hmm0nNvwtBPbFNffH2sgkZyf/RsR2aq+ZZBO9Dp5Eet5qZQ Rc3IufWr+CPnyIN6GfiKWYin3IsbBljBfrQjR/K17oX16M/suMK2lH/h5avUmzWPVggq 9W/MbWfK8bdQx9CxuN2SHGV4/1hcJQ2VHJ1KxrxF57LP378peUEkIp47OMj61nOzpjrR /Zmqq6F/IWJVeegEr0wzTYA3eS+PQrclIrF0OfAMaPQTh9Z0LLTYZz8b7qH1SaF1lWdq nbBQ== X-Gm-Message-State: AOJu0YxLg4c1PHxPZUnkIu8aWVRjEi9fr+aMDc+Pfl0mtLVAza4LO1dy b6EQrc4qUrsElnKNmJjcr+iIQJ2r4nzyfnvsrZnz6TVJDszGR7wTp3BEUem2MgWomcwOQzqvtBe o20E= X-Gm-Gg: ASbGncubIB4dqKKbkzmQ22tcuwklvJf7qiw3n8wW+E+nch1WXIT+qGHOLZBKlA9Gz2G lbFn3Ce/LtEDg63zacjAg29yEQMZcngvkHQO9xXyCwyBkShOjiPpzLbQ97IrYqFgAaMLTRQ6BIG WxImXjCCZ8uv+6mSIA8M+dx60FldcjaUY4PqZqgs/TrOg8LLhClI6SRge4HtnsfoWFKx4l1fY4n sf5ZLUcBaYYZmLimNkXamFZpXqzN2kpqQ85kZZwqIVB8uRU5VzdmhfBhNNPqcPhUjcQeVAM1DJQ xIb77+MdKkmHCECGdKfqJ1alYPMUa1I/c1fHmE2baARdHf+a6nM1MPhQkxkBiiXBrpGiCbSndbN lCdbgDjhVNDttLk8Q X-Google-Smtp-Source: AGHT+IF4aCRmDtgy3bS2iHTIvHQ0mIT3pUXTM2JjX6Z7F8QSCv6rQpdYqRbT6mSQTOiT+XToGVPYUw== X-Received: by 2002:a05:690c:6813:b0:6ef:48ac:9d21 with SMTP id 00721157ae682-6febf3c252amr220439427b3.24.1741652508185; Mon, 10 Mar 2025 17:21:48 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with UTF8SMTPSA id 00721157ae682-6feb2a5beb5sm23948827b3.34.2025.03.10.17.21.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 10 Mar 2025 17:21:47 -0700 (PDT) Date: Mon, 10 Mar 2025 20:21:46 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Elijah Newren , Patrick Steinhardt Subject: [PATCH v4 4/6] pack-objects: generate cruft packs at most one object over threshold Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When generating multiple cruft packs with 'git repack --max-cruft-size', we use 'git pack-objects --cruft --max-pack-size' (with many other elided options), filling in the '--max-pack-size' value with whatever was provided via the '--max-cruft-size' flag. This causes us to generate a pack that is smaller than the specified threshold. This poses a problem since we will never be able to generate a cruft pack that crosses the threshold. In effect, this means that we will try and repack its contents over and over again. Instead, change the meaning of '--max-pack-size' in pack-objects when combined with '--cruft'. When put together, '--max-pack-size' allows the pack to grow larger than the specified threshold, but only by one additional object. This allows cruft packs to become just a little bit larger than the threshold, allowing cruft packs to accumulate past the set threshold and avoid being repacked in the future until a pruning GC takes place. Noticed-by: Patrick Steinhardt Signed-off-by: Taylor Blau --- Documentation/config/pack.adoc | 4 ++ Documentation/git-pack-objects.adoc | 4 ++ builtin/pack-objects.c | 32 +++++++++++++- t/t5329-pack-objects-cruft.sh | 67 +++++++++++++++++++++++++++++ 4 files changed, 105 insertions(+), 2 deletions(-) diff --git a/Documentation/config/pack.adoc b/Documentation/config/pack.adoc index da527377fa..0a90931b93 100644 --- a/Documentation/config/pack.adoc +++ b/Documentation/config/pack.adoc @@ -119,6 +119,10 @@ sizes (e.g., removable media that cannot store the whole repository), you are likely better off creating a single large packfile and splitting it using a generic multi-volume archive tool (e.g., Unix `split`). + +When generating cruft packs with `git pack-objects`, this option has a +slightly different interpretation than above; see the documentation for +`--max-pack-size` option in linkgit:git-pack-objects[1]. ++ The minimum size allowed is limited to 1 MiB. The default is unlimited. Common unit suffixes of 'k', 'm', or 'g' are supported. diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc index 7f69ae4855..aee467c496 100644 --- a/Documentation/git-pack-objects.adoc +++ b/Documentation/git-pack-objects.adoc @@ -161,6 +161,10 @@ depth is 4095. `pack.packSizeLimit` is set. Note that this option may result in a larger and slower repository; see the discussion in `pack.packSizeLimit`. ++ +When used with `--cruft`, the output packfile(s) may be as large or +larger than the configured maximum size. The pack will exceed the +specified maximum by no more than one object. --honor-pack-keep:: This flag causes an object already in a local pack that diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 58a9b16126..f701b4c9ec 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -692,11 +692,39 @@ static off_t write_object(struct hashfile *f, off_t len; int usable_delta, to_reuse; + if (cruft && pack_size_limit && pack_size_limit <= write_offset) { + /* + * When writing a cruft pack with a limited size, + * perform the --max-pack-size check *before* writing + * the object. + * + * When we have not yet reached the size limit, this + * combined with the fact that we act as if there is no + * limit when writing objects via write_object() allows + * us to grow one object *past* the specified limit. + * + * This is important for generating cruft packs with a + * --max-pack-size so we can generate packs that are + * just over the threshold to avoid repacking them in + * the future. + */ + return 0; + } + if (!pack_to_stdout) crc32_begin(f); - /* apply size limit if limited packsize and not first object */ - if (!pack_size_limit || !nr_written) + /* + * Apply size limit when one is provided, with the following + * exceptions: + * + * - We are writing the first object. + * + * - We are writing a cruft pack with a size limit. The check + * above covers this case while letting the pack grow at most + * one object beyond the limit. + */ + if (!pack_size_limit || !nr_written || cruft) limit = 0; else if (pack_size_limit <= write_offset) /* diff --git a/t/t5329-pack-objects-cruft.sh b/t/t5329-pack-objects-cruft.sh index 60dac8312d..9cbc21a65d 100755 --- a/t/t5329-pack-objects-cruft.sh +++ b/t/t5329-pack-objects-cruft.sh @@ -3,6 +3,7 @@ test_description='cruft pack related pack-objects tests' . ./test-lib.sh +. "$TEST_DIRECTORY"/lib-cruft.sh objdir=.git/objects packdir=$objdir/pack @@ -695,4 +696,70 @@ test_expect_success 'additional cruft blobs via gc.recentObjectsHook' ' ) ' +test_expect_success 'cruft pack generation beyond --max-pack-size' ' + test_when_finished "rm -fr repo" && + git init repo && + ( + cd repo && + + # Disable pack compression to ensure the pack size is + # predictable. + git config pack.compression 0 && + + sz=524288 && # 0.5 MiB + foo="$(generate_random_blob foo $sz)" && + bar="$(generate_random_blob bar $sz)" && + baz="$(generate_random_blob baz $sz)" && + quux="$(generate_random_blob quux $sz)" && + + printf "%s\n" "$foo" "$bar" >A.objects && + printf "%s\n" "$baz" "$quux" >B.objects && + + A="$(git pack-objects $packdir/pack C.in && + git pack-objects --cruft --max-pack-size=$sz $packdir/pack \ + C.out && + + test_line_count = 2 C.out && + C_large="$(head -n 1 C.out)" && + C_small="$(tail -n 1 C.out)" && + + # Swap $C_large and $C_small if necessary. + if test "$(test_file_size $packdir/pack-$C_large.idx)" -lt \ + "$(test_file_size $packdir/pack-$C_small.idx)" + then + tmp="$C_large" && + C_large="$C_small" && + C_small="$tmp" + fi && + + # Ensure the large pack is no smaller than the threshold + # such that it does not get repacked in subsequent runs + # with the same --max-pack-size setting. + test $(test_file_size $packdir/pack-$C_large.pack) -ge $sz && + + { + git show-index <"$packdir/pack-$C_large.idx" && + git show-index <"$packdir/pack-$C_small.idx" + } >actual.raw && + printf "%s\n" "$foo" "$bar" "$baz" "$quux" >expect.raw && + + sort expect && + cut -d " " -f 2 actual.raw | sort >actual && + + # Ensure that all of the objects are present in the two + # cruft packs we just generated. + # + # Note that the contents of "actual" are not + # de-duplicated. This is intentional to ensure we avoid + # packing the same object twice (once in each pack). + test_cmp expect actual + ) +' + test_done From patchwork Tue Mar 11 00:21:49 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 14010880 Received: from mail-yw1-f173.google.com (mail-yw1-f173.google.com [209.85.128.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 55E33288D6 for ; Tue, 11 Mar 2025 00:21:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741652514; cv=none; b=CMlwGiC8hg8JZUZXFePhO/NValKkqGyZUVrWz2jB6Xy/zzseUiIniyrpav3wGdNADLDQaCpf21Z9MZ+s0Rlh0hmxVP4JebQiYq9eetn6R46QW+g54D22TYkw9n2MVLtnoo8VbatURm3GOSmoAon4U2G3IqwtJKe4ODZKs3enuTc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741652514; c=relaxed/simple; bh=Cvk9LBejBWXghx1DchRZZ1gkUlA17n1638CK5gzGvbY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=mbwl9v1Ud77MwHFwnCHxMGjw+kG322Uxi5j+wjUi/GywgQ/IM6NtTpU3LQK8dV/UEGAShK4QYsj1juAcrgLMmSoI1v8vh8IPC2WyimZXFvGveTWijd9HjUl2v81SNGKbSAuY7istc1q6b+RN/Y61jZHwOl2s/91WuBj7Fsk5JXA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=ysVJ8QKA; arc=none smtp.client-ip=209.85.128.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="ysVJ8QKA" Received: by mail-yw1-f173.google.com with SMTP id 00721157ae682-6f7031ea11cso49825417b3.2 for ; Mon, 10 Mar 2025 17:21:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1741652511; x=1742257311; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=M0A6TwSB8q37TO6SUmVJ+KoQphhFdebNEcHWSqPs7Mc=; b=ysVJ8QKAMknh4hK6exkUFYD8uCKjihJukIGOuDzxZNoY2FcY722ZC7o6o3fRfuGNtj +h/ddmcXGzQkpzUrQEcF9kW9Pg7pvd6j0wnRd51d4BoJ3Onea837ySUFCg7lclBK8yHj LeGmOn1C5WyOaT/BBfi1LkA3Zb6zeFRgQwpRM5JOddEsYc6zJNN+Y8TxailfStfS8WId GnX3H/57wEtNtiLTWBCXMaB+7Hi21daB/Hk0qBF5yXA6rH9KpfG2rIFI0lZXkD/aKrxP /9cKOa+RiSX4zS3tnIcf9zHIFF9rh+fq46JikcZOREGEM7p56GQNNfoKv/O38uBC6aod czJA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741652511; x=1742257311; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=M0A6TwSB8q37TO6SUmVJ+KoQphhFdebNEcHWSqPs7Mc=; b=jQ/gC77Q9Vmn3isgW5FcePxxfpNHq3lp+tdAWg2xvKgEwXxbrf6E+s5Vtn8psU5HFJ p2kzo8rKJ2Of+t/0KwHH1VzkFsiI9PkH3x4YSQGLlmeEZSwxiVuvmtFsflVCkP3GhVim ZtjjccsHCfqP1gQPyXEeD9FWH6iihFnjUxKuLX4rLaTw/1rGmFN8mkj2UWucNWL7uPh+ RE4XipTl25LrOBGUo0AiCg6fPZ80USXIqwgXRqXcKlWVy2guXPZE4SD/7Ef5sLVLJTct d2Hbxa2qAhatGzNdo9bxmdYMDjaOYb/n5kAJ4bcn0Q//8b6vD0BM4TaDoo89A27o/qsy kbeA== X-Gm-Message-State: AOJu0Yx0PXxafEHMjop3MONwxrmNbqsIj1ynRsNiRnGr6KU3Qph3kuRd DqxPzfO8JYrU/wvWdPkJu1suC3diqA11oL96TZNDOuhxupaq8GOjzFXD4YeLfhgfhRMkR5yUDIQ YX/s= X-Gm-Gg: ASbGnctsDdyymuG4vb+oQDHxeK8Nz2Nc5n5+Jxu9sBe8rkGKy+1qwoeAewJsNpZBSoy X1yy9US8KWCeDL7T60AnuAI/UTb3u1Q+4fVZLbRIoVKKpS5reKXoSbriqCB2BPWMp1wVwUXzSLv JNbmYVUm05/r7s4t1S6OtwhqflyL4f0KgAooZbGQCnzpFdEScTs+Ky6tErJy+X2KyqT6geqHbs7 V1sIBcDSoFxRjGVuXJdnlRisxASVn832NKCts4CsJNfWGea2eI3eicsoZEZRHgcd89upLvq9LNM fLyE7cbV103q8lb0/ndMyOqEtk3PWlKK0JWsiEQSv8vBd4YS4LHiLbiPKzzjT8QBfRzhr9JGVfi k/FtB/QMmC7HcxJ0v X-Google-Smtp-Source: AGHT+IEa02vtHRHop1m8KIE/BD5DdLnjgw9iW4jYGCLHv2ylQqg9ox9SOz+U4wNZluEzkBDMHBHB1A== X-Received: by 2002:a05:690c:690d:b0:6fb:33e1:2e66 with SMTP id 00721157ae682-6febf2c8ab4mr230204287b3.14.1741652511045; Mon, 10 Mar 2025 17:21:51 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with UTF8SMTPSA id 00721157ae682-6feb2a8aa70sm23708037b3.61.2025.03.10.17.21.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 10 Mar 2025 17:21:50 -0700 (PDT) Date: Mon, 10 Mar 2025 20:21:49 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Elijah Newren , Patrick Steinhardt Subject: [PATCH v4 5/6] builtin/repack.c: simplify cruft pack aggregation Message-ID: <12ddea7603e305e8a7af5c05dbe021add834f2fc.1741648467.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In 37dc6d8104 (builtin/repack.c: implement support for `--max-cruft-size`, 2023-10-02), 'git repack' built on support for multiple cruft packs in Git by instructing 'git pack-objects --cruft' how to aggregate smaller cruft packs up to the provided threshold. The implementation in 37dc6d8104 worked something like the following pseudo-code: total_size = 0; for (p in cruft packs) { if (p->pack_size + total_size < max_size) { total_size += p->pack_size; collapse(p) } else { retain(p); } } The original idea behind this approach was that smaller cruft packs would get combined together until the sum of their sizes was no larger than the given max pack size. There is a much simpler way to achieve this, however, which is to simply combine *all* cruft packs which are smaller than the threshold, regardless of what their sum is. With '--max-pack-size', 'pack-objects' will split out the resulting pack into individual pack(s) if necessary to ensure that the written pack(s) are each at most one object larger than the provided threshold. This yields a slight behavior change, which is reflected in the removed test. Previous to this change, we would aggregate smaller cruft packs first, whereas now we will opportunistically combine as many cruft packs as possible. As as result, that test is no longer relevant, and can be deleted. Signed-off-by: Taylor Blau --- builtin/repack.c | 38 ++----------------------------------- t/t7704-repack-cruft.sh | 42 ----------------------------------------- 2 files changed, 2 insertions(+), 78 deletions(-) diff --git a/builtin/repack.c b/builtin/repack.c index 75e3752353..4d83d40f39 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -1022,29 +1022,13 @@ static int write_filtered_pack(const struct pack_objects_args *args, return finish_pack_objects_cmd(&cmd, names, local); } -static int existing_cruft_pack_cmp(const void *va, const void *vb) -{ - struct packed_git *a = *(struct packed_git **)va; - struct packed_git *b = *(struct packed_git **)vb; - - if (a->pack_size < b->pack_size) - return -1; - if (a->pack_size > b->pack_size) - return 1; - return 0; -} - static void collapse_small_cruft_packs(FILE *in, size_t max_size, struct existing_packs *existing) { - struct packed_git **existing_cruft, *p; + struct packed_git *p; struct strbuf buf = STRBUF_INIT; - size_t total_size = 0; - size_t existing_cruft_nr = 0; size_t i; - ALLOC_ARRAY(existing_cruft, existing->cruft_packs.nr); - for (p = get_all_packs(the_repository); p; p = p->next) { if (!(p->is_cruft && p->pack_local)) continue; @@ -1056,24 +1040,7 @@ static void collapse_small_cruft_packs(FILE *in, size_t max_size, if (!string_list_has_string(&existing->cruft_packs, buf.buf)) continue; - if (existing_cruft_nr >= existing->cruft_packs.nr) - BUG("too many cruft packs (found %"PRIuMAX", but knew " - "of %"PRIuMAX")", - (uintmax_t)existing_cruft_nr + 1, - (uintmax_t)existing->cruft_packs.nr); - existing_cruft[existing_cruft_nr++] = p; - } - - QSORT(existing_cruft, existing_cruft_nr, existing_cruft_pack_cmp); - - for (i = 0; i < existing_cruft_nr; i++) { - size_t proposed; - - p = existing_cruft[i]; - proposed = st_add(total_size, p->pack_size); - - if (proposed <= max_size) { - total_size = proposed; + if (p->pack_size < max_size) { fprintf(in, "-%s\n", pack_basename(p)); } else { retain_cruft_pack(existing, p); @@ -1086,7 +1053,6 @@ static void collapse_small_cruft_packs(FILE *in, size_t max_size, existing->non_kept_packs.items[i].string); strbuf_release(&buf); - free(existing_cruft); } static int write_cruft_pack(const struct pack_objects_args *args, diff --git a/t/t7704-repack-cruft.sh b/t/t7704-repack-cruft.sh index 88c6ce2913..fb52bb36a2 100755 --- a/t/t7704-repack-cruft.sh +++ b/t/t7704-repack-cruft.sh @@ -174,48 +174,6 @@ test_expect_success '--max-cruft-size combines existing packs when below thresho ) ' -test_expect_success '--max-cruft-size combines smaller packs first' ' - git init max-cruft-size-consume-small && - ( - cd max-cruft-size-consume-small && - - test_commit base && - git repack -ad && - - cruft_foo="$(generate_cruft_pack foo 524288)" && # 0.5 MiB - cruft_bar="$(generate_cruft_pack bar 524288)" && # 0.5 MiB - cruft_baz="$(generate_cruft_pack baz 1048576)" && # 1.0 MiB - cruft_quux="$(generate_cruft_pack quux 1572864)" && # 1.5 MiB - - test-tool pack-mtimes "$(basename $cruft_foo)" >expect.raw && - test-tool pack-mtimes "$(basename $cruft_bar)" >>expect.raw && - sort expect.raw >expect.objects && - - # repacking with `--max-cruft-size=2M` should combine - # both 0.5 MiB packs together, instead of, say, one of - # the 0.5 MiB packs with the 1.0 MiB pack - ls $packdir/pack-*.mtimes | sort >cruft.before && - git repack -d --cruft --max-cruft-size=2M && - ls $packdir/pack-*.mtimes | sort >cruft.after && - - comm -13 cruft.before cruft.after >cruft.new && - comm -23 cruft.before cruft.after >cruft.removed && - - test_line_count = 1 cruft.new && - test_line_count = 2 cruft.removed && - - # the two smaller packs should be rolled up first - printf "%s\n" $cruft_foo $cruft_bar | sort >expect.removed && - test_cmp expect.removed cruft.removed && - - # ...and contain the set of objects rolled up - test-tool pack-mtimes "$(basename $(cat cruft.new))" >actual.raw && - sort actual.raw >actual.objects && - - test_cmp expect.objects actual.objects - ) -' - test_expect_success 'setup --max-cruft-size with freshened objects' ' git init max-cruft-size-freshen && ( From patchwork Tue Mar 11 00:21:52 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 14010881 Received: from mail-yw1-f176.google.com (mail-yw1-f176.google.com [209.85.128.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6BBB917991 for ; Tue, 11 Mar 2025 00:21:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741652517; cv=none; b=FiSgXujhdEYJv7qnr4Hm3N2C0j9oK7Cf8P6wOwWJ+nzYazMOX+IXNRXwuG+xwl5DcHJkxfU0DrId4Ei6XiFZ2zhNX5WmrSt6Ulw8PLYqcyxHS6AQUq3JxBjNiQc/NwFgjhWlko2FAikBArF+HXV4OyY+zzAHKFJJxULBlAsRklI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741652517; c=relaxed/simple; bh=9yoZ/kNU+UHcwLj6ANxSzEl6+wEb7wQ/0K9IdeM6aqA=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=DNe0l+I15qT1NsTkSHJHF7lr2bTyJpyDsZEI0952IcONmGJdocmnvtMR6xUgBsTPyZPlbW4NOCw0Pze+osu7rLZzay6waYyJQ0zAwxOAk4Osy3JHFt5Qw7R0OhLHLVRJvQQENMKEy4ZzDP5d1BjE/vGBeVHMNG+MPdXx0QRaoow= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=rZFJ9t2z; arc=none smtp.client-ip=209.85.128.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="rZFJ9t2z" Received: by mail-yw1-f176.google.com with SMTP id 00721157ae682-6fedefb1c9cso16140457b3.0 for ; Mon, 10 Mar 2025 17:21:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1741652514; x=1742257314; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=JQcHeJGaPi/O6pIz/PBbadfzT2q9ITLSMJBpzHxESd8=; b=rZFJ9t2z9j7StyMPmRwcCLD+7r/4GcbuzjliDxigRZ20VyqFWktapiymTY3MG5jXid 4OfwYd1sseKZqth7FBjodnGU69UMBzbk8GyRpLH/x5hq/x5Ly2w/gFoHApgrqVFTPrRy afp+KElD4UWqd9qpU+AFHOH0yPH2Jkc/4HSDTLqIBFu8prhiAaeuyGmjid3/ukIroVQm oKxnTeTT7pkvNRUZY9H5zbuQXUWZ2yU4aASmhRN9JrIMsV2cwXAfRxefQBOz9PErhO5C HLWF3V2jQsN2+yiA10HvTcYyCYMYZBJ2X9NCzAbxVzbm81CcvxsCULNqIqtJnI7fp08J RGpA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741652514; x=1742257314; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=JQcHeJGaPi/O6pIz/PBbadfzT2q9ITLSMJBpzHxESd8=; b=jQU8zNx35bL5PtmYm2v3j+TbmROB41QI+9yAb5HXJ+gM8TWWYM2K600X1iFrskdOYm BjyhoEumUoEo/onmWf4h6yo6AOK0o4k52r9q96L+f2tUvqwleWMNh+ZOp4hKKsTxPc+0 FMvMxiuEpAnumqGuzz4CsqvJHBlpWP4guXZ4mz+8cKR2rTA4G6Xdr58lQquR3WUtRGwM RN+uHMa8q6j9j+9fKZWcIzsxlVvhdBdw3lPK8uJJted6UeiinkB3voIPRcsWGgy1jC+W 98NMtz+NAXcZhLFfplwqIr7K8QfxmaOuvbVslhFv0rDkIfYQqSthatMYCSwv1cie5xOx 22+Q== X-Gm-Message-State: AOJu0YztYC9/QnGsMKtxe595ggDLPG7wIZIIyyx3m6rUOaHsyD2NVR3S 2OAMwXvd6Sl1PknuZTiZYvaGCq+hUn7LpOjrEGFKfq9EwN8lpT70/Jik1Ed11q3nDlP6G9vH+8T mAs8= X-Gm-Gg: ASbGncvGj+xuFfoFlUnpnuVh4oIBIu2064FLj4tKZ/b0sRd8VunE/XBJkstVFfP6Azu 4gpZw7s8qIDI7kTpY98olfquYc89beMw6N1KhnGnRNX+fjSD7ZjuJJK+vftBMB72Drdlwk72WP0 YlvdZ2RMMEwcdXhEPPYMzkFmacfNMymW6nD8as9oztHjvgli5yTdLdzj+XWweBHfMAZ03IdBeAc p1AbT4izCzcN6vkL1nSlCvS9cx4BuolacDEWx0drHqPRlheBJbpJVwgIg4JNihPj/DFcVX6J0ZZ x3MueD9q16uR+HwweI6K5cBqDPKOIAq3u9T3IuIE6IXFcmsQAOeCcdq5m/vCfLVz7YxHUDVkn/v KMSsP9Oo8RdKX22X/ X-Google-Smtp-Source: AGHT+IFpRTwGkektRO7DKipNGz654uIjOQuHQ8N6a9YozujXjdygWfy5gNddYBAeVaa4627DDV9fLQ== X-Received: by 2002:a05:690c:45c3:b0:6ef:4a57:fc98 with SMTP id 00721157ae682-6febf30026bmr224695617b3.16.1741652513956; Mon, 10 Mar 2025 17:21:53 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with UTF8SMTPSA id 00721157ae682-6feb3d3e3aesm23666847b3.115.2025.03.10.17.21.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 10 Mar 2025 17:21:53 -0700 (PDT) Date: Mon, 10 Mar 2025 20:21:52 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Elijah Newren , Patrick Steinhardt Subject: [PATCH v4 6/6] builtin/pack-objects.c: freshen objects from existing cruft packs Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Once an object is written into a cruft pack, we can only freshen it by writing a new loose or packed copy of that object with a more recent mtime. Prior to 61568efa95 (builtin/pack-objects.c: support `--max-pack-size` with `--cruft`, 2023-08-28), we typically had at most one cruft pack in a repository at any given time. So freshening unreachable objects was straightforward when already rewriting the cruft pack (and its *.mtimes file). But 61568efa95 changes things: 'pack-objects' now supports writing multiple cruft packs when invoked with `--cruft` and the `--max-pack-size` flag. Cruft packs are rewritten until they reach some size threshold, at which point they are considered "frozen", and will only be modified in a pruning GC, or if the threshold itself is adjusted. Prior to this patch, however, this process breaks down when we attempt to freshen an object packed in an earlier cruft pack, and that cruft pack is larger than the threshold and thus will survive the repack. When this is the case, it is impossible to freshen objects in cruft pack(s) when those cruft packs are larger than the threshold. This is because we would avoid writing them in the new cruft pack entirely, for a couple of reasons. 1. When enumerating packed objects via 'add_objects_in_unpacked_packs()' we pass the SKIP_IN_CORE_KEPT_PACKS, which is used to avoid looping over the packs we're going to retain (which are marked as kept in-core by 'read_cruft_objects()'). This means that we will avoid enumerating additional packed copies of objects found in any cruft packs which are larger than the given size threshold. Thus there is no opportunity to call 'create_object_entry()' whatsoever. 2. We likewise will discard the loose copy (if one exists) of any unreachable object packed in a cruft pack that is larger than the threshold. Here our call path is 'add_unreachable_loose_objects()', which uses the 'add_loose_object()' callback. That function will eventually land us in 'want_object_in_pack()' (via 'add_cruft_object_entry()'), and we'll discard the object as it appears in one of the packs which we marked as kept in-core. This means in effect that it is impossible to freshen an unreachable object once it appears in a cruft pack larger than the given threshold. Instead, we should pack an additional copy of an unreachable object we want to freshen even if it appears in a cruft pack, provided that the cruft copy has an mtime which is before the mtime of the copy we are trying to pack/freshen. This is sub-optimal in the sense that it requires keeping an additional copy of unreachable objects upon freshening, but we don't have a better alternative without the ability to make in-place modifications to existing *.mtimes files. In order to implement this, we have to adjust the behavior of 'want_found_object()'. When 'pack-objects' is told that we're *not* going to retain any cruft packs (i.e. the set of packs marked as kept in-core does not contain a cruft pack), the behavior is unchanged. But when there *is* at least one cruft pack that we're holding onto, it is no longer sufficient to reject a copy of an object found in that cruft pack for that reason alone. In this case, we only want to reject a candidate object when copies of that object either: - exists in a non-cruft pack that we are retaining, regardless of that pack's mtime, or - exists in a cruft pack with an mtime at least as recent as the copy we are debating whether or not to pack, in which case freshening would be redundant. To do this, keep track of whether or not we have any cruft packs in our in-core kept list with a new 'ignore_packed_keep_in_core_has_cruft' flag. When we end up in this new special case, we replace a call to 'has_object_kept_pack()' to 'want_cruft_object_mtime()', and only reject objects when we have a copy in an existing cruft pack with at least as recent an mtime as our candidate (in which case "freshening" would be redundant). Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 118 ++++++++++++++++++++++++++++++++++------ packfile.c | 3 +- packfile.h | 2 + t/t7704-repack-cruft.sh | 63 +++++++++++++++++++++ 4 files changed, 168 insertions(+), 18 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index f701b4c9ec..1a3e7dd3d3 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -206,6 +206,7 @@ static int have_non_local_packs; static int incremental; static int ignore_packed_keep_on_disk; static int ignore_packed_keep_in_core; +static int ignore_packed_keep_in_core_has_cruft; static int allow_ofs_delta; static struct pack_idx_option pack_idx_opts; static const char *base_name; @@ -1530,8 +1531,60 @@ static int have_duplicate_entry(const struct object_id *oid, return 1; } +static int want_cruft_object_mtime(struct repository *r, + const struct object_id *oid, + unsigned flags, uint32_t mtime) +{ + struct packed_git **cache; + + for (cache = kept_pack_cache(r, flags); *cache; cache++) { + struct packed_git *p = *cache; + off_t ofs; + uint32_t candidate_mtime; + + ofs = find_pack_entry_one(oid, p); + if (!ofs) + continue; + + /* + * We have a copy of the object 'oid' in a non-cruft + * pack. We can avoid packing an additional copy + * regardless of what the existing copy's mtime is since + * it is outside of a cruft pack. + */ + if (!p->is_cruft) + return 0; + + /* + * If we have a copy of the object 'oid' in a cruft + * pack, then either read the cruft pack's mtime for + * that object, or, if that can't be loaded, assume the + * pack's mtime itself. + */ + if (!load_pack_mtimes(p)) { + uint32_t pos; + if (offset_to_pack_pos(p, ofs, &pos) < 0) + continue; + candidate_mtime = nth_packed_mtime(p, pos); + } else { + candidate_mtime = p->mtime; + } + + /* + * We have a surviving copy of the object in a cruft + * pack whose mtime is greater than or equal to the one + * we are considering. We can thus avoid packing an + * additional copy of that object. + */ + if (mtime <= candidate_mtime) + return 0; + } + + return -1; +} + static int want_found_object(const struct object_id *oid, int exclude, - struct packed_git *p) + struct packed_git *p, uint32_t mtime) { if (exclude) return 1; @@ -1581,12 +1634,29 @@ static int want_found_object(const struct object_id *oid, int exclude, if (ignore_packed_keep_in_core) flags |= IN_CORE_KEEP_PACKS; - if (ignore_packed_keep_on_disk && p->pack_keep) - return 0; - if (ignore_packed_keep_in_core && p->pack_keep_in_core) - return 0; - if (has_object_kept_pack(p->repo, oid, flags)) - return 0; + /* + * If the object is in a pack that we want to ignore, *and* we + * don't have any cruft packs that are being retained, we can + * abort quickly. + */ + if (!ignore_packed_keep_in_core_has_cruft) { + if (ignore_packed_keep_on_disk && p->pack_keep) + return 0; + if (ignore_packed_keep_in_core && p->pack_keep_in_core) + return 0; + if (has_object_kept_pack(p->repo, oid, flags)) + return 0; + } else { + /* + * But if there is at least one cruft pack which + * is being kept, we only want to include the + * provided object if it has a strictly greater + * mtime than any existing cruft copy. + */ + if (!want_cruft_object_mtime(p->repo, oid, flags, + mtime)) + return 0; + } } /* @@ -1605,7 +1675,8 @@ static int want_object_in_pack_one(struct packed_git *p, const struct object_id *oid, int exclude, struct packed_git **found_pack, - off_t *found_offset) + off_t *found_offset, + uint32_t found_mtime) { off_t offset; @@ -1621,7 +1692,7 @@ static int want_object_in_pack_one(struct packed_git *p, *found_offset = offset; *found_pack = p; } - return want_found_object(oid, exclude, p); + return want_found_object(oid, exclude, p, found_mtime); } return -1; } @@ -1635,10 +1706,11 @@ static int want_object_in_pack_one(struct packed_git *p, * function finds if there is any pack that has the object and returns the pack * and its offset in these variables. */ -static int want_object_in_pack(const struct object_id *oid, - int exclude, - struct packed_git **found_pack, - off_t *found_offset) +static int want_object_in_pack_mtime(const struct object_id *oid, + int exclude, + struct packed_git **found_pack, + off_t *found_offset, + uint32_t found_mtime) { int want; struct list_head *pos; @@ -1653,7 +1725,8 @@ static int want_object_in_pack(const struct object_id *oid, * are present we will determine the answer right now. */ if (*found_pack) { - want = want_found_object(oid, exclude, *found_pack); + want = want_found_object(oid, exclude, *found_pack, + found_mtime); if (want != -1) return want; @@ -1664,7 +1737,7 @@ static int want_object_in_pack(const struct object_id *oid, for (m = get_multi_pack_index(the_repository); m; m = m->next) { struct pack_entry e; if (fill_midx_entry(the_repository, oid, &e, m)) { - want = want_object_in_pack_one(e.p, oid, exclude, found_pack, found_offset); + want = want_object_in_pack_one(e.p, oid, exclude, found_pack, found_offset, found_mtime); if (want != -1) return want; } @@ -1672,7 +1745,7 @@ static int want_object_in_pack(const struct object_id *oid, list_for_each(pos, get_packed_git_mru(the_repository)) { struct packed_git *p = list_entry(pos, struct packed_git, mru); - want = want_object_in_pack_one(p, oid, exclude, found_pack, found_offset); + want = want_object_in_pack_one(p, oid, exclude, found_pack, found_offset, found_mtime); if (!exclude && want > 0) list_move(&p->mru, get_packed_git_mru(the_repository)); @@ -1702,6 +1775,15 @@ static int want_object_in_pack(const struct object_id *oid, return 1; } +static inline int want_object_in_pack(const struct object_id *oid, + int exclude, + struct packed_git **found_pack, + off_t *found_offset) +{ + return want_object_in_pack_mtime(oid, exclude, found_pack, found_offset, + 0); +} + static struct object_entry *create_object_entry(const struct object_id *oid, enum object_type type, uint32_t hash, @@ -3634,7 +3716,7 @@ static void add_cruft_object_entry(const struct object_id *oid, enum object_type entry->no_try_delta = no_try_delta(name); } } else { - if (!want_object_in_pack(oid, 0, &pack, &offset)) + if (!want_object_in_pack_mtime(oid, 0, &pack, &offset, mtime)) return; if (!pack && type == OBJ_BLOB && !has_loose_object(oid)) { /* @@ -3708,6 +3790,8 @@ static void mark_pack_kept_in_core(struct string_list *packs, unsigned keep) struct packed_git *p = item->util; if (!p) die(_("could not find pack '%s'"), item->string); + if (p->is_cruft && keep) + ignore_packed_keep_in_core_has_cruft = 1; p->pack_keep_in_core = keep; } } diff --git a/packfile.c b/packfile.c index 2d80d80cb3..9d09f8bc72 100644 --- a/packfile.c +++ b/packfile.c @@ -24,6 +24,7 @@ #include "commit-graph.h" #include "pack-revindex.h" #include "promisor-remote.h" +#include "pack-mtimes.h" char *odb_pack_name(struct repository *r, struct strbuf *buf, const unsigned char *hash, const char *ext) @@ -2107,7 +2108,7 @@ static void maybe_invalidate_kept_pack_cache(struct repository *r, r->objects->kept_pack_cache.flags = 0; } -static struct packed_git **kept_pack_cache(struct repository *r, unsigned flags) +struct packed_git **kept_pack_cache(struct repository *r, unsigned flags) { maybe_invalidate_kept_pack_cache(r, flags); diff --git a/packfile.h b/packfile.h index 00ada7a938..25097213d0 100644 --- a/packfile.h +++ b/packfile.h @@ -197,6 +197,8 @@ int has_object_pack(struct repository *r, const struct object_id *oid); int has_object_kept_pack(struct repository *r, const struct object_id *oid, unsigned flags); +struct packed_git **kept_pack_cache(struct repository *r, unsigned flags); + /* * Return 1 if an object in a promisor packfile is or refers to the given * object, 0 otherwise. diff --git a/t/t7704-repack-cruft.sh b/t/t7704-repack-cruft.sh index fb52bb36a2..3082e65817 100755 --- a/t/t7704-repack-cruft.sh +++ b/t/t7704-repack-cruft.sh @@ -242,6 +242,69 @@ test_expect_success '--max-cruft-size with freshened objects (packed)' ' ) ' +test_expect_success '--max-cruft-size with freshened objects (previously cruft)' ' + git init max-cruft-size-threshold && + ( + cd max-cruft-size-threshold && + + test_commit base && + foo="$(generate_random_blob foo $((2*1024*1024)))" && + bar="$(generate_random_blob bar $((2*1024*1024)))" && + baz="$(generate_random_blob baz $((2*1024*1024)))" && + + test-tool chmtime --get -100000 \ + "$objdir/$(test_oid_to_path "$foo")" >foo.old && + test-tool chmtime --get -100000 \ + "$objdir/$(test_oid_to_path "$bar")" >bar.old && + test-tool chmtime --get -100000 \ + "$objdir/$(test_oid_to_path "$baz")" >baz.old && + + git repack --cruft -d && + + # Make an identical copy of foo stored in a pack with a more + # recent mtime. + foo="$(generate_random_blob foo $((2*1024*1024)))" && + foo_pack="$(echo "$foo" | git pack-objects $packdir/pack)" && + test-tool chmtime --get -100 \ + "$packdir/pack-$foo_pack.pack" >foo.new && + git prune-packed && + + # Make a loose copy of bar, also with a more recent mtime. + bar="$(generate_random_blob bar $((2*1024*1024)))" && + test-tool chmtime --get -100 \ + "$objdir/$(test_oid_to_path "$bar")" >bar.new && + + # Make a new cruft object $quux to ensure we do not + # generate an identical pack to the existing cruft + # pack. + quux="$(generate_random_blob quux $((1024)))" && + test-tool chmtime --get -100 \ + "$objdir/$(test_oid_to_path "$quux")" >quux.new && + + git repack --cruft --max-cruft-size=3M -d && + + for p in $packdir/pack-*.mtimes + do + test-tool pack-mtimes "$(basename "$p")" || return 1 + done >actual.raw && + sort actual.raw >actual && + + # Among the set of all cruft packs, we should see both + # mtimes for object $foo and $bar, as well as the + # single new copy of $baz. + sort >expect <<-EOF && + $foo $(cat foo.old) + $foo $(cat foo.new) + $bar $(cat bar.old) + $bar $(cat bar.new) + $baz $(cat baz.old) + $quux $(cat quux.new) + EOF + + test_cmp expect actual + ) +' + test_expect_success '--max-cruft-size with pruning' ' git init max-cruft-size-prune && (