From patchwork Wed Mar 10 19:30:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129137 Message-Id: <2fe413fdac808807e2058caeee6ce86c54a678c0.1615404664.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:44 +0000 Subject: [PATCH v2 01/20] sparse-index: design doc and format update Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From:
Derrick Stolee This begins a long effort to update the index format to allow sparse directory entries. This should result in a significant improvement to Git commands when HEAD contains millions of files, but the user has selected many fewer files to keep in their sparse-checkout definition. Currently, the index format is only updated in the presence of extensions.sparseIndex instead of increasing a file format version number. This is temporary, and index v5 is part of the plan for future work in this area. The design document details many of the reasons for embarking on this work, and also the plan for completing it safely. Signed-off-by: Derrick Stolee --- Documentation/technical/index-format.txt | 7 + Documentation/technical/sparse-index.txt | 173 +++++++++++++++++++++++ 2 files changed, 180 insertions(+) create mode 100644 Documentation/technical/sparse-index.txt diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index b633482b1bdf..387126582556 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -44,6 +44,13 @@ Git index format localization, no special casing of directory separator '/'). Entries with the same name are sorted by their stage field. + An index entry typically represents a file. However, if sparse-checkout + is enabled in cone mode (`core.sparseCheckoutCone` is enabled) and the + `extensions.sparseIndex` extension is enabled, then the index may + contain entries for directories outside of the sparse-checkout definition. + These entries have mode `0040000`, include the `SKIP_WORKTREE` bit, and + the path ends in a directory separator. 
+ 32-bit ctime seconds, the last time a file's metadata changed this is stat(2) data diff --git a/Documentation/technical/sparse-index.txt b/Documentation/technical/sparse-index.txt new file mode 100644 index 000000000000..787a2a0b3b81 --- /dev/null +++ b/Documentation/technical/sparse-index.txt @@ -0,0 +1,173 @@ +Git Sparse-Index Design Document +================================ + +The sparse-checkout feature allows users to focus a working directory on +a subset of the files at HEAD. The cone mode patterns, enabled by +`core.sparseCheckoutCone`, allow for very fast pattern matching to +discover which files at HEAD belong in the sparse-checkout cone. + +Three important scale dimensions for a Git worktree are: + +* `HEAD`: How many files are present at `HEAD`? + +* Populated: How many files are within the sparse-checkout cone? + +* Modified: How many files has the user modified in the working directory? + +We will use big-O notation -- O(X) -- to denote how expensive certain +operations are in terms of these dimensions. + +These dimensions are ordered by their magnitude: users (typically) modify +fewer files than are populated, and we can only populate files at `HEAD`. +These dimensions are also ordered by how expensive they are per item: it +is more expensive to detect a modified file than it is to write one that we +know must be populated; changing `HEAD` only really requires updating the +index. + +Problems occur if there is an extreme imbalance in these dimensions. For +example, if `HEAD` contains millions of paths but the populated set has +only tens of thousands, then commands like `git status` and `git add` can +be dominated by operations that require O(`HEAD`) operations instead of +O(Populated). Primarily, the cost is in parsing and rewriting the index, +which is filled primarily with files at `HEAD` that are marked with the +`SKIP_WORKTREE` bit. + +The sparse-index intends to take these commands that read and modify the +index from O(`HEAD`) to O(Populated).
To do this, we need to modify the +index format in a significant way: add "sparse directory" entries. + +With cone mode patterns, it is possible to detect when an entire +directory will have its contents outside of the sparse-checkout definition. +Instead of listing all of the files it contains as individual entries, a +sparse-index contains an entry with the directory name, referencing the +object ID of the tree at `HEAD` and marked with the `SKIP_WORKTREE` bit. +If we need to discover the details for paths within that directory, we +can parse trees to find that list. + +At time of writing, sparse-directory entries violate expectations about the +index format and its in-memory data structure. There are many consumers in +the codebase that expect to iterate through all of the index entries and +see only files. In addition, they expect to see all files at `HEAD`. One +way to handle this is to parse trees to replace a sparse-directory entry +with all of the files within that tree as the index is loaded. However, +parsing trees is slower than parsing the index format, so that is a slower +operation than if we left the index alone. + +The implementation plan below follows four phases to slowly integrate with +the sparse-index. The intention is to incrementally update Git commands to +interact safely with the sparse-index without significant slowdowns. This +may not always be possible, but the hope is that the primary commands that +users need in their daily work are dramatically improved. + +Phase I: Format and initial speedups +------------------------------------ + +During this phase, Git learns to enable the sparse-index and safely parse +one. Protections are put in place so that every consumer of the in-memory +data structure can operate with its current assumption of every file at +`HEAD`. + +At first, every index parse will expand the sparse-directory entries into +the full list of paths at `HEAD`. This will be slower in all cases. 
The +only noticeable change in behavior will be that the serialized index file +contains sparse-directory entries. + +To start, we use a new repository extension, `extensions.sparseIndex`, to +allow inserting sparse-directory entries into indexes with file format +versions 2, 3, and 4. This prevents Git versions that do not understand +the sparse-index from operating on one, but it also prevents other +operations that do not use the index at all. A new format, index v5, will +be introduced that includes sparse-directory entries by default. It might +also introduce other features that have been considered for improving the +index, as well. + +Next, consumers of the index will be guarded against operating on a +sparse-index by inserting calls to `ensure_full_index()` or +`expand_index_to_path()`. After these guards are in place, we can begin +leaving sparse-directory entries in the in-memory index structure. + +Even after inserting these guards, we will keep expanding sparse-indexes +for most Git commands using the `command_requires_full_index` repository +setting. This setting will be on by default and disabled one builtin at a +time until we have sufficient confidence that all of the index operations +are properly guarded. + +To complete this phase, the commands `git status` and `git add` will be +integrated with the sparse-index so that they operate with O(Populated) +performance. They will be carefully tested for operations within and +outside the sparse-checkout definition. + +Phase II: Careful integrations +------------------------------ + +This phase focuses on ensuring that all index extensions and APIs work +well with a sparse-index. This requires significant increases to our test +coverage, especially for operations that interact with the working +directory outside of the sparse-checkout definition. Some of these +behaviors may not be the desirable ones, such as some tests already +marked for failure in `t1092-sparse-checkout-compatibility.sh`.
+ +The index extensions that may require special integrations are: + +* FS Monitor +* Untracked cache + +While integrating with these features, we should look for patterns that +might lead to better APIs for interacting with the index. Coalescing +common usage patterns into an API call can reduce the number of places +where sparse-directories need to be handled carefully. + +Phase III: Important command speedups +------------------------------------- + +At this point, the patterns for testing and implementing sparse-directory +logic should be relatively stable. This phase focuses on updating some of +the most common builtins that use the index to operate as O(Populated). +Here is a potential list of commands that could be valuable to integrate +at this point: + +* `git commit` +* `git checkout` +* `git merge` +* `git rebase` + +Hopefully, commands such as `git merge` and `git rebase` can benefit +instead from merge algorithms that do not use the index as a data +structure, such as the merge-ORT strategy. As these topics mature, we +may enable the ORT strategy by default for repositories using the +sparse-index feature. + +Along with `git status` and `git add`, these commands cover the majority +of users' interactions with the working directory. In addition, we can +integrate with these commands: + +* `git grep` +* `git rm` + +These have been proposed as some whose behavior could change when in a +repo with a sparse-checkout definition. It would be good to include this +behavior automatically when using a sparse-index. Some clarity is needed +to make the behavior switch clear to the user. + +This phase is the first where parallel work might be possible without too +many conflicts between topics. + +Phase IV: The long tail +----------------------- + +This last phase is less a "phase" and more "the new normal" after all of +the previous work. + +To start, the `command_requires_full_index` option could be removed in +favor of expanding only when hitting an API guard.
+ +There are many Git commands that could use special attention to operate as +O(Populated), while some might be so rare that it is acceptable to leave +them with additional overhead when a sparse-index is present. + +Here are some commands that might be useful to update: + +* `git sparse-checkout set` +* `git am` +* `git clean` +* `git stash`
From patchwork Wed Mar 10 19:30:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129139 Message-Id: <540ab5495065805fbac5b5f782937e29fe4a4398.1615404665.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:45 +0000 Subject: [PATCH v2 02/20] t/perf: add performance test for sparse operations Fcc: Sent MIME-Version: 1.0 To:
git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee Create a test script that takes the default performance test (the Git codebase) and multiplies it by 256 using four layers of duplicated trees of width four. This results in nearly one million blob entries in the index. Then, we can clone this repository with sparse-checkout patterns that demonstrate four copies of the initial repository. Each clone will use a different index format or mode so performance can be tested across the different options. Note that the initial repo is stripped of submodules before doing the copies. This preserves the expected data shape of the sparse index, because directories containing submodules are not collapsed to a sparse directory entry. Run a few Git commands on these clones, especially those that use the index (status, add, commit). Here are the results on my Linux machine:

Test
--------------------------------------------------------------
2000.2: git status (full-index-v3)             0.37(0.30+0.09)
2000.3: git status (full-index-v4)             0.39(0.32+0.10)
2000.4: git add -A (full-index-v3)             1.42(1.06+0.20)
2000.5: git add -A (full-index-v4)             1.26(0.98+0.16)
2000.6: git add . (full-index-v3)              1.40(1.04+0.18)
2000.7: git add . (full-index-v4)              1.26(0.98+0.17)
2000.8: git commit -a -m A (full-index-v3)     1.42(1.11+0.16)
2000.9: git commit -a -m A (full-index-v4)     1.33(1.08+0.16)

It is perhaps noteworthy that there is an improvement when using index version 4. This is because the v3 index uses 108 MiB while the v4 index uses 80 MiB. Since the repeated portions of the directories are very short (f3/f1/f2, for example) this ratio is less pronounced than in similarly-sized real repositories.
Signed-off-by: Derrick Stolee --- t/perf/p2000-sparse-operations.sh | 85 +++++++++++++++++++++++++++++++ 1 file changed, 85 insertions(+) create mode 100755 t/perf/p2000-sparse-operations.sh diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh new file mode 100755 index 000000000000..2fbc81b22119 --- /dev/null +++ b/t/perf/p2000-sparse-operations.sh @@ -0,0 +1,85 @@ +#!/bin/sh + +test_description="test performance of Git operations using the index" + +. ./perf-lib.sh + +test_perf_default_repo + +SPARSE_CONE=f2/f4/f1 + +test_expect_success 'setup repo and indexes' ' + git reset --hard HEAD && + # Remove submodules from the example repo, because our + # duplication of the entire repo creates an unlikely data shape. + git config --file .gitmodules --get-regexp "submodule.*.path" >modules && + git rm -f .gitmodules && + for module in $(awk "{print \$2}" modules) + do + git rm $module || return 1 + done && + git commit -m "remove submodules" && + + echo bogus >a && + cp a b && + git add a b && + git commit -m "level 0" && + BLOB=$(git rev-parse HEAD:a) && + OLD_COMMIT=$(git rev-parse HEAD) && + OLD_TREE=$(git rev-parse HEAD^{tree}) && + + for i in $(test_seq 1 4) + do + cat >in <<-EOF && + 100755 blob $BLOB a + 040000 tree $OLD_TREE f1 + 040000 tree $OLD_TREE f2 + 040000 tree $OLD_TREE f3 + 040000 tree $OLD_TREE f4 + EOF + NEW_TREE=$(git mktree >$SPARSE_CONE/a && + $command + ) + " + done +} + +test_perf_on_all git status +test_perf_on_all git add -A +test_perf_on_all git add .
+test_perf_on_all git commit -a -m A + +test_done
From patchwork Wed Mar 10 19:30:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129167 Message-Id: <5cbedb377b37ebdc103d9d94e68b6621bcd3d3cf.1615404665.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:46 +0000 Subject: [PATCH v2 03/20] t1092: clean up script quoting Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee
This test was introduced in 19a0acc83e4 (t1092: test interesting sparse-checkout scenarios, 2021-01-23), but these issues with quoting were not noticed until starting this follow-up series. The old mechanism would drop quoting such as in test_all_match git commit -m "touch README.md" The above happened to work because README.md is a file in the repository, so 'git commit -m touch README.md' would succeed by accident. Other cases included quoting for no good reason, so clean that up now. Signed-off-by: Derrick Stolee --- t/t1092-sparse-checkout-compatibility.sh | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 8cd3e5a8d227..3725d3997e70 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -96,20 +96,20 @@ init_repos () { run_on_sparse () { ( cd sparse-checkout && - $* >../sparse-checkout-out 2>../sparse-checkout-err + "$@" >../sparse-checkout-out 2>../sparse-checkout-err ) } run_on_all () { ( cd full-checkout && - $* >../full-checkout-out 2>../full-checkout-err + "$@" >../full-checkout-out 2>../full-checkout-err ) && - run_on_sparse $* + run_on_sparse "$@" } test_all_match () { - run_on_all $* && + run_on_all "$@" && test_cmp full-checkout-out sparse-checkout-out && test_cmp full-checkout-err sparse-checkout-err } @@ -119,7 +119,7 @@ test_expect_success 'status with options' ' test_all_match git status --porcelain=v2 && test_all_match git status --porcelain=v2 -z -u && test_all_match git status --porcelain=v2 -uno && - run_on_all "touch README.md" && + run_on_all touch README.md && test_all_match git status --porcelain=v2 && test_all_match git status --porcelain=v2 -z -u && test_all_match git status --porcelain=v2 -uno && @@ -135,7 +135,7 @@ test_expect_success 'add, commit, checkout' ' write_script edit-contents <<-\EOF && echo text >>$1 EOF - run_on_all "../edit-contents README.md" && +
run_on_all ../edit-contents README.md && test_all_match git add README.md && test_all_match git status --porcelain=v2 && @@ -144,7 +144,7 @@ test_expect_success 'add, commit, checkout' ' test_all_match git checkout HEAD~1 && test_all_match git checkout - && - run_on_all "../edit-contents README.md" && + run_on_all ../edit-contents README.md && test_all_match git add -A && test_all_match git status --porcelain=v2 && @@ -153,7 +153,7 @@ test_expect_success 'add, commit, checkout' ' test_all_match git checkout HEAD~1 && test_all_match git checkout - && - run_on_all "../edit-contents deep/newfile" && + run_on_all ../edit-contents deep/newfile && test_all_match git status --porcelain=v2 -uno && test_all_match git status --porcelain=v2 && @@ -186,7 +186,7 @@ test_expect_success 'diff --staged' ' write_script edit-contents <<-\EOF && echo text >>README.md EOF - run_on_all "../edit-contents" && + run_on_all ../edit-contents && test_all_match git diff && test_all_match git diff --staged && @@ -280,7 +280,7 @@ test_expect_success 'clean' ' echo bogus >>.gitignore && run_on_all cp ../.gitignore . 
&& + test_all_match git add .gitignore && - test_all_match git commit -m ignore-bogus-files && + test_all_match git commit -m "ignore bogus files" && run_on_sparse mkdir folder1 && run_on_all touch folder1/bogus &&
From patchwork Wed Mar 10 19:30:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129165 Message-Id: <6e21f776e883cef25f63829caf338298252ebaca.1615404665.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:47 +0000 Subject: [PATCH v2 04/20] sparse-index: add guard to ensure full index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com,
pclouds@gmail.com, jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor, Derrick Stolee, Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee Upcoming changes will introduce modifications to the index format that allow sparse directories. It will be useful to have a mechanism for converting those sparse index files into full indexes by walking the tree at those sparse directories. Name this method ensure_full_index() as it will guarantee that the index is fully expanded. This method is not implemented yet, and instead we focus on the scaffolding to declare it and call it at the appropriate time. Add a 'command_requires_full_index' member to struct repo_settings. This will be an indicator that we need the index in full mode to do certain index operations. This starts as being true for every command, then we will set it to false as some commands integrate with sparse indexes. If 'command_requires_full_index' is true, then we will immediately expand a sparse index to a full one upon reading from disk. This suffices for now, but we will want to add more callers to ensure_full_index() later.
Signed-off-by: Derrick Stolee --- Makefile | 1 + repo-settings.c | 8 ++++++++ repository.c | 11 ++++++++++- repository.h | 2 ++ sparse-index.c | 8 ++++++++ sparse-index.h | 7 +++++++ 6 files changed, 36 insertions(+), 1 deletion(-) create mode 100644 sparse-index.c create mode 100644 sparse-index.h diff --git a/Makefile b/Makefile index 5a239cac20e3..3bf61699238d 100644 --- a/Makefile +++ b/Makefile @@ -980,6 +980,7 @@ LIB_OBJS += setup.o LIB_OBJS += shallow.o LIB_OBJS += sideband.o LIB_OBJS += sigchain.o +LIB_OBJS += sparse-index.o LIB_OBJS += split-index.o LIB_OBJS += stable-qsort.o LIB_OBJS += strbuf.o diff --git a/repo-settings.c b/repo-settings.c index f7fff0f5ab83..d63569e4041e 100644 --- a/repo-settings.c +++ b/repo-settings.c @@ -77,4 +77,12 @@ void prepare_repo_settings(struct repository *r) UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_KEEP); UPDATE_DEFAULT_BOOL(r->settings.fetch_negotiation_algorithm, FETCH_NEGOTIATION_DEFAULT); + + /* + * This setting guards all index reads to require a full index + * over a sparse index. After suitable guards are placed in the + * codebase around uses of the index, this setting will be + * removed. 
+ */ + r->settings.command_requires_full_index = 1; } diff --git a/repository.c b/repository.c index c98298acd017..a8acae002f71 100644 --- a/repository.c +++ b/repository.c @@ -10,6 +10,7 @@ #include "object.h" #include "lockfile.h" #include "submodule-config.h" +#include "sparse-index.h" /* The main repository */ static struct repository the_repo; @@ -261,6 +262,8 @@ void repo_clear(struct repository *repo) int repo_read_index(struct repository *repo) { + int res; + if (!repo->index) repo->index = xcalloc(1, sizeof(*repo->index)); @@ -270,7 +273,13 @@ int repo_read_index(struct repository *repo) else if (repo->index->repo != repo) BUG("repo's index should point back at itself"); - return read_index_from(repo->index, repo->index_file, repo->gitdir); + res = read_index_from(repo->index, repo->index_file, repo->gitdir); + + prepare_repo_settings(repo); + if (repo->settings.command_requires_full_index) + ensure_full_index(repo->index); + + return res; } int repo_hold_locked_index(struct repository *repo, diff --git a/repository.h b/repository.h index b385ca3c94b6..e06a23015697 100644 --- a/repository.h +++ b/repository.h @@ -41,6 +41,8 @@ struct repo_settings { enum fetch_negotiation_setting fetch_negotiation_algorithm; int core_multi_pack_index; + + unsigned command_requires_full_index:1; }; struct repository { diff --git a/sparse-index.c b/sparse-index.c new file mode 100644 index 000000000000..82183ead563b --- /dev/null +++ b/sparse-index.c @@ -0,0 +1,8 @@ +#include "cache.h" +#include "repository.h" +#include "sparse-index.h" + +void ensure_full_index(struct index_state *istate) +{ + /* intentionally left blank */ +} diff --git a/sparse-index.h b/sparse-index.h new file mode 100644 index 000000000000..09a20d036c46 --- /dev/null +++ b/sparse-index.h @@ -0,0 +1,7 @@ +#ifndef SPARSE_INDEX_H__ +#define SPARSE_INDEX_H__ + +struct index_state; +void ensure_full_index(struct index_state *istate); + +#endif From patchwork Wed Mar 10 19:30:48 2021 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129163
Message-Id: <399ddb0bad56c69ff9d9591f5e8eacf52cf50a15.1615404665.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:48 +0000 Subject: [PATCH v2 05/20] sparse-index: implement ensure_full_index() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor, Derrick Stolee, Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee We will mark an in-memory index_state as having sparse directory entries with the sparse_index bit.
These currently cannot exist, but we will add a mechanism for collapsing a full index to a sparse one in a later change. That will happen at write time, so we must first allow parsing the format before writing it. Commands or methods that require a full index in order to operate can call ensure_full_index() to expand that index in-memory. This requires parsing trees using that index's repository. Sparse directory entries have a specific 'ce_mode' value. The macro S_ISSPARSEDIR(ce->ce_mode) can check if a cache_entry 'ce' has this type. No entry in the existing index formats can have this ce_mode, so the macro checks only the mode and does not verify the other properties of a sparse-directory entry. The full set of properties is: 1. ce->ce_mode == 0040000 2. ce->ce_flags & CE_SKIP_WORKTREE is true 3. ce->name[ce->ce_namelen - 1] == '/' (ends in dir separator) 4. ce->oid references a tree object. These are only partially enforced in ensure_full_index(). A detected deviation will cause a warning at minimum or a failure in the worst case. Signed-off-by: Derrick Stolee --- cache.h | 13 ++++++- read-cache.c | 9 +++++ sparse-index.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++- 3 files changed, 115 insertions(+), 2 deletions(-) diff --git a/cache.h b/cache.h index d92814961405..1f0b42264606 100644 --- a/cache.h +++ b/cache.h @@ -204,6 +204,8 @@ struct cache_entry { #error "CE_EXTENDED_FLAGS out of range" #endif +#define S_ISSPARSEDIR(m) ((m) == S_IFDIR) + /* Forward structure decls */ struct pathspec; struct child_process; @@ -319,7 +321,14 @@ struct index_state { drop_cache_tree : 1, updated_workdir : 1, updated_skipworktree : 1, - fsmonitor_has_run_once : 1; + fsmonitor_has_run_once : 1, + + /* + * sparse_index == 1 when sparse-directory + * entries exist. Requires sparse-checkout + * in cone mode.
+ */ + sparse_index : 1; struct hashmap name_hash; struct hashmap dir_hash; struct object_id oid; @@ -722,6 +731,8 @@ int read_index_from(struct index_state *, const char *path, const char *gitdir); int is_index_unborn(struct index_state *); +void ensure_full_index(struct index_state *istate); + /* For use with `write_locked_index()`. */ #define COMMIT_LOCK (1 << 0) #define SKIP_IF_UNCHANGED (1 << 1) diff --git a/read-cache.c b/read-cache.c index 29144cf879e7..97dbf2434f30 100644 --- a/read-cache.c +++ b/read-cache.c @@ -101,6 +101,9 @@ static const char *alternate_index_output; static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce) { + if (S_ISSPARSEDIR(ce->ce_mode)) + istate->sparse_index = 1; + istate->cache[nr] = ce; add_name_hash(istate, ce); } @@ -2255,6 +2258,12 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) trace2_data_intmax("index", the_repository, "read/cache_nr", istate->cache_nr); + if (!istate->repo) + istate->repo = the_repository; + prepare_repo_settings(istate->repo); + if (istate->repo->settings.command_requires_full_index) + ensure_full_index(istate); + return istate->cache_nr; unmap: diff --git a/sparse-index.c b/sparse-index.c index 82183ead563b..316cb949b74b 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -1,8 +1,101 @@ #include "cache.h" #include "repository.h" #include "sparse-index.h" +#include "tree.h" +#include "pathspec.h" +#include "trace2.h" + +static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce) +{ + ALLOC_GROW(istate->cache, nr + 1, istate->cache_alloc); + + istate->cache[nr] = ce; + add_name_hash(istate, ce); +} + +static int add_path_to_index(const struct object_id *oid, + struct strbuf *base, const char *path, + unsigned int mode, int stage, void *context) +{ + struct index_state *istate = (struct index_state *)context; + struct cache_entry *ce; + size_t len = base->len; + + if (S_ISDIR(mode)) + return READ_TREE_RECURSIVE; + 
+ strbuf_addstr(base, path); + + ce = make_cache_entry(istate, mode, oid, base->buf, 0, 0); + ce->ce_flags |= CE_SKIP_WORKTREE; + set_index_entry(istate, istate->cache_nr++, ce); + + strbuf_setlen(base, len); + return 0; +} void ensure_full_index(struct index_state *istate) { - /* intentionally left blank */ + int i; + struct index_state *full; + + if (!istate || !istate->sparse_index) + return; + + if (!istate->repo) + istate->repo = the_repository; + + trace2_region_enter("index", "ensure_full_index", istate->repo); + + /* initialize basics of new index */ + full = xcalloc(1, sizeof(struct index_state)); + memcpy(full, istate, sizeof(struct index_state)); + + /* then change the necessary things */ + full->sparse_index = 0; + full->cache_alloc = (3 * istate->cache_alloc) / 2; + full->cache_nr = 0; + ALLOC_ARRAY(full->cache, full->cache_alloc); + + for (i = 0; i < istate->cache_nr; i++) { + struct cache_entry *ce = istate->cache[i]; + struct tree *tree; + struct pathspec ps; + + if (!S_ISSPARSEDIR(ce->ce_mode)) { + set_index_entry(full, full->cache_nr++, ce); + continue; + } + if (!(ce->ce_flags & CE_SKIP_WORKTREE)) + warning(_("index entry is a directory, but not sparse (%08x)"), + ce->ce_flags); + + /* recursively walk into ce->name */ + tree = lookup_tree(istate->repo, &ce->oid); + + memset(&ps, 0, sizeof(ps)); + ps.recursive = 1; + ps.has_wildcard = 1; + ps.max_depth = -1; + + read_tree_recursive(istate->repo, tree, + ce->name, strlen(ce->name), + 0, &ps, + add_path_to_index, full); + + /* free directory entries. full entries are re-used */ + discard_cache_entry(ce); + } + + /* Copy back into original index.
+ */ + memcpy(&istate->name_hash, &full->name_hash, sizeof(full->name_hash)); + istate->sparse_index = 0; + free(istate->cache); + istate->cache = full->cache; + istate->cache_nr = full->cache_nr; + istate->cache_alloc = full->cache_alloc; + + free(full); + + trace2_region_leave("index", "ensure_full_index", istate->repo); } From patchwork Wed Mar 10 19:30:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129169
Message-Id: In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:49 +0000 Subject: [PATCH v2 06/20] t1092: compare sparse-checkout to sparse-index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin
Ågren, Derrick Stolee, SZEDER Gábor, Derrick Stolee, Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee Add a new 'sparse-index' repo alongside the 'full-checkout' and 'sparse-checkout' repos in t1092-sparse-checkout-compatibility.sh. Also add run_on_sparse and test_sparse_match helpers. These helpers will be used when the sparse index is implemented. Add a GIT_TEST_SPARSE_INDEX environment variable to enable the sparse-index by default. It is intended to be used across the entire test suite, although it only affects cases where the sparse-checkout feature is enabled. Signed-off-by: Derrick Stolee --- t/README | 3 +++ t/t1092-sparse-checkout-compatibility.sh | 24 ++++++++++++++++++++---- 2 files changed, 23 insertions(+), 4 deletions(-) diff --git a/t/README b/t/README index 593d4a4e270c..b98bc563aab5 100644 --- a/t/README +++ b/t/README @@ -439,6 +439,9 @@ and "sha256". GIT_TEST_WRITE_REV_INDEX=<boolean>, when true enables the 'pack.writeReverseIndex' setting. +GIT_TEST_SPARSE_INDEX=<boolean>, when true enables index writes to use the +sparse-index format by default.
+ Naming Tests ------------ diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 3725d3997e70..71d6f9e4c014 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -7,6 +7,7 @@ test_description='compare full workdir to sparse workdir' test_expect_success 'setup' ' git init initial-repo && ( + GIT_TEST_SPARSE_INDEX=0 && cd initial-repo && echo a >a && echo "after deep" >e && @@ -87,23 +88,32 @@ init_repos () { cp -r initial-repo sparse-checkout && git -C sparse-checkout reset --hard && - git -C sparse-checkout sparse-checkout init --cone && + + cp -r initial-repo sparse-index && + git -C sparse-index reset --hard && # initialize sparse-checkout definitions - git -C sparse-checkout sparse-checkout set deep + git -C sparse-checkout sparse-checkout init --cone && + git -C sparse-checkout sparse-checkout set deep && + GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout init --cone && + GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep } run_on_sparse () { ( cd sparse-checkout && - "$@" >../sparse-checkout-out 2>../sparse-checkout-err + GIT_TEST_SPARSE_INDEX=0 "$@" >../sparse-checkout-out 2>../sparse-checkout-err + ) && + ( + cd sparse-index && + GIT_TEST_SPARSE_INDEX=1 "$@" >../sparse-index-out 2>../sparse-index-err ) } run_on_all () { ( cd full-checkout && - "$@" >../full-checkout-out 2>../full-checkout-err + GIT_TEST_SPARSE_INDEX=0 "$@" >../full-checkout-out 2>../full-checkout-err ) && run_on_sparse "$@" } @@ -114,6 +124,12 @@ test_all_match () { test_cmp full-checkout-err sparse-checkout-err } +test_sparse_match () { + run_on_sparse "$@" && + test_cmp sparse-checkout-out sparse-index-out && + test_cmp sparse-checkout-err sparse-index-err +} + test_expect_success 'status with options' ' init_repos && test_all_match git status --porcelain=v2 && From patchwork Wed Mar 10 19:30:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129171
Message-Id: In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:50 +0000 Subject: [PATCH v2 07/20] test-read-cache: print cache entries with --table Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor, Derrick Stolee, Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee This table is helpful for discovering data in the index to ensure it is being written correctly, especially as we build and test the sparse-index.
This table includes an output format similar to 'git ls-tree', but should not be compared to that directly. The biggest reasons are that 'git ls-tree' includes a tree entry for every subdirectory, even those that would not appear as a sparse directory in a sparse-index. Further, 'git ls-tree' does not use a trailing directory separator for its tree rows. This does not print the stat() information for the blobs. That could be added in a future change with another option. The tests that are added in the next few changes care only about the object types and IDs. To make the option parsing slightly more robust, wrap the string comparisons in a loop adapted from test-dir-iterator.c. Care must be taken with the final check for the 'cnt' variable. We continue the expectation that the numerical value is the final argument. Signed-off-by: Derrick Stolee --- t/helper/test-read-cache.c | 55 +++++++++++++++++++++++++++++++------- 1 file changed, 45 insertions(+), 10 deletions(-) diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c index 244977a29bdf..6cfd8f2de71c 100644 --- a/t/helper/test-read-cache.c +++ b/t/helper/test-read-cache.c @@ -1,36 +1,71 @@ #include "test-tool.h" #include "cache.h" #include "config.h" +#include "blob.h" +#include "commit.h" +#include "tree.h" + +static void print_cache_entry(struct cache_entry *ce) +{ + const char *type; + printf("%06o ", ce->ce_mode & 0177777); + + if (S_ISSPARSEDIR(ce->ce_mode)) + type = tree_type; + else if (S_ISGITLINK(ce->ce_mode)) + type = commit_type; + else + type = blob_type; + + printf("%s %s\t%s\n", + type, + oid_to_hex(&ce->oid), + ce->name); +} + +static void print_cache(struct index_state *istate) +{ + int i; + for (i = 0; i < istate->cache_nr; i++) + print_cache_entry(istate->cache[i]); +} int cmd__read_cache(int argc, const char **argv) { + struct repository *r = the_repository; int i, cnt = 1; const char *name = NULL; + int table = 0; - if (argc > 1 && skip_prefix(argv[1], "--print-and-refresh=", 
&name)) { - argc--; - argv++; + for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) { + if (skip_prefix(*argv, "--print-and-refresh=", &name)) + continue; + if (!strcmp(*argv, "--table")) + table = 1; } - if (argc == 2) - cnt = strtol(argv[1], NULL, 0); + if (argc == 1) + cnt = strtol(argv[0], NULL, 0); setup_git_directory(); git_config(git_default_config, NULL); + for (i = 0; i < cnt; i++) { - read_cache(); + repo_read_index(r); if (name) { int pos; - refresh_index(&the_index, REFRESH_QUIET, + refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL); - pos = index_name_pos(&the_index, name, strlen(name)); + pos = index_name_pos(r->index, name, strlen(name)); if (pos < 0) die("%s not in index", name); printf("%s is%s up to date\n", name, - ce_uptodate(the_index.cache[pos]) ? "" : " not"); + ce_uptodate(r->index->cache[pos]) ? "" : " not"); write_file(name, "%d\n", i); } - discard_cache(); + if (table) + print_cache(r->index); + discard_index(r->index); } return 0; } From patchwork Wed Mar 10 19:30:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129175
Message-Id: <243541fc5820572faa518dfc157175a72a7a9ea8.1615404665.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:51 +0000 Subject: [PATCH v2 08/20] test-tool: don't force full index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor, Derrick Stolee, Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee We will use 'test-tool read-cache --table' to check that a sparse index is written as part of init_repos. Since we will no longer always expand a sparse index into a full index, add an '--expand' parameter that adds a call to ensure_full_index() so we can compare a sparse index directly against a full index, or at least what the in-memory index looks like when expanded in this way.
Signed-off-by: Derrick Stolee --- t/helper/test-read-cache.c | 13 ++++++++++++- t/t1092-sparse-checkout-compatibility.sh | 5 +++++ 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c index 6cfd8f2de71c..b52c174acc7a 100644 --- a/t/helper/test-read-cache.c +++ b/t/helper/test-read-cache.c @@ -4,6 +4,7 @@ #include "blob.h" #include "commit.h" #include "tree.h" +#include "sparse-index.h" static void print_cache_entry(struct cache_entry *ce) { @@ -35,13 +36,19 @@ int cmd__read_cache(int argc, const char **argv) struct repository *r = the_repository; int i, cnt = 1; const char *name = NULL; - int table = 0; + int table = 0, expand = 0; + + initialize_the_repository(); + prepare_repo_settings(r); + r->settings.command_requires_full_index = 0; for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) { if (skip_prefix(*argv, "--print-and-refresh=", &name)) continue; if (!strcmp(*argv, "--table")) table = 1; + else if (!strcmp(*argv, "--expand")) + expand = 1; } if (argc == 1) @@ -51,6 +58,10 @@ int cmd__read_cache(int argc, const char **argv) for (i = 0; i < cnt; i++) { repo_read_index(r); + + if (expand) + ensure_full_index(r->index); + if (name) { int pos; diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 71d6f9e4c014..4d789fe86b9d 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -130,6 +130,11 @@ test_sparse_match () { test_cmp sparse-checkout-err sparse-index-err } +test_expect_success 'expanded in-memory index matches full index' ' + init_repos && + test_sparse_match test-tool read-cache --expand --table +' + test_expect_success 'status with options' ' init_repos && test_all_match git status --porcelain=v2 && From patchwork Wed Mar 10 19:30:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee 
X-Patchwork-Id: 12129185 Message-Id: <48f65093b3da3fdee606e6d52e81795cdfcbbd22.1615404665.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:52 +0000 Subject: [PATCH v2 09/20] unpack-trees: ensure full index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The next change will translate full indexes into sparse indexes at write time. The existing logic provides a way for every sparse index to be expanded to a full index at read time.
However, there are cases where an index is written and then continues to be used in-memory to perform further updates. unpack_trees() is frequently called after such a write. In particular, commands like 'git reset' do this double-update of the index. Ensure that we have a full index when entering unpack_trees(), but only when command_requires_full_index is true. This is always true at the moment, but we will later relax that after unpack_trees() is updated to handle sparse directory entries. Signed-off-by: Derrick Stolee --- unpack-trees.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/unpack-trees.c b/unpack-trees.c index f5f668f532d8..4dd99219073a 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1567,6 +1567,7 @@ static int verify_absent(const struct cache_entry *, */ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options *o) { + struct repository *repo = the_repository; int i, ret; static struct cache_entry *dfc; struct pattern_list pl; @@ -1578,6 +1579,12 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options trace_performance_enter(); trace2_region_enter("unpack_trees", "unpack_trees", the_repository); + prepare_repo_settings(repo); + if (repo->settings.command_requires_full_index) { + ensure_full_index(o->src_index); + ensure_full_index(o->dst_index); + } + if (!core_apply_sparse_checkout || !o->update) o->skip_sparse_checkout = 1; if (!o->skip_sparse_checkout && !o->pl) { From patchwork Wed Mar 10 19:30:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129187 Message-Id: <83aac8b7a1ec18d018205117dc2e98a5bb99d4c6.1615404665.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:53 +0000 Subject: [PATCH v2 10/20] sparse-checkout: hold pattern list in index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee As we modify the sparse-checkout definition, we perform index operations on a pattern_list that only exists in-memory. This allows easy backing out in case the index update fails. However, if the index write itself cares about the sparse-checkout pattern set, we need access to that in-memory copy. Place a pointer to a 'struct pattern_list' in the index so we can access this on-demand. This will be used in the next change which uses the sparse-checkout definition to filter out directories that are outside the sparse cone.
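The lend-then-clear pattern described above can be sketched in isolation. The types and functions here (toy_pattern_list, toy_index, toy_write_index, toy_update_working_directory) are hypothetical simplifications for illustration, not Git's real structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for 'struct pattern_list'. */
struct toy_pattern_list {
	int nr;
};

/* Hypothetical, simplified stand-in for 'struct index_state'. */
struct toy_index {
	const struct toy_pattern_list *sparse_checkout_patterns;
};

/* Stand-in for the index write: it only sees patterns via the index. */
static int toy_write_index(const struct toy_index *istate)
{
	return istate->sparse_checkout_patterns ?
	       istate->sparse_checkout_patterns->nr : -1;
}

/*
 * Lend the in-memory pattern list to the index for the duration of the
 * update, then clear the pointer so the index does not keep a reference
 * to a list that the caller is about to free.
 */
static int toy_update_working_directory(struct toy_index *istate,
					const struct toy_pattern_list *pl)
{
	int result;

	assert(istate && pl);
	istate->sparse_checkout_patterns = pl;
	result = toy_write_index(istate);
	istate->sparse_checkout_patterns = NULL;
	return result;
}
```

Clearing the pointer before returning mirrors the patch, where the caller owns the list and frees it after the update.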
Signed-off-by: Derrick Stolee --- builtin/sparse-checkout.c | 17 ++++++++++------- cache.h | 2 ++ 2 files changed, 12 insertions(+), 7 deletions(-) diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index 2306a9ad98e0..e00b82af727b 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -110,6 +110,8 @@ static int update_working_directory(struct pattern_list *pl) if (is_index_unborn(r->index)) return UPDATE_SPARSITY_SUCCESS; + r->index->sparse_checkout_patterns = pl; + memset(&o, 0, sizeof(o)); o.verbose_update = isatty(2); o.update = 1; @@ -138,6 +140,7 @@ static int update_working_directory(struct pattern_list *pl) else rollback_lock_file(&lock_file); + r->index->sparse_checkout_patterns = NULL; return result; } @@ -517,19 +520,18 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m) { int result; int changed_config = 0; - struct pattern_list pl; - memset(&pl, 0, sizeof(pl)); + struct pattern_list *pl = xcalloc(1, sizeof(*pl)); switch (m) { case ADD: if (core_sparse_checkout_cone) - add_patterns_cone_mode(argc, argv, &pl); + add_patterns_cone_mode(argc, argv, pl); else - add_patterns_literal(argc, argv, &pl); + add_patterns_literal(argc, argv, pl); break; case REPLACE: - add_patterns_from_input(&pl, argc, argv); + add_patterns_from_input(pl, argc, argv); break; } @@ -539,12 +541,13 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m) changed_config = 1; } - result = write_patterns_and_update(&pl); + result = write_patterns_and_update(pl); if (result && changed_config) set_config(MODE_NO_PATTERNS); - clear_pattern_list(&pl); + clear_pattern_list(pl); + free(pl); return result; } diff --git a/cache.h b/cache.h index 1f0b42264606..303411726e10 100644 --- a/cache.h +++ b/cache.h @@ -307,6 +307,7 @@ static inline unsigned int canon_mode(unsigned int mode) struct split_index; struct untracked_cache; struct progress; +struct pattern_list; struct index_state { struct cache_entry 
**cache; @@ -338,6 +339,7 @@ struct index_state { struct mem_pool *ce_mem_pool; struct progress *progress; struct repository *repo; + struct pattern_list *sparse_checkout_patterns; }; /* Name hashing */ From patchwork Wed Mar 10 19:30:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129181 Message-Id: In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:54 +0000 Subject: [PATCH v2 11/20] sparse-index: convert from full to sparse Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , Derrick Stolee , Derrick Stolee
Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee If we have a full index, then we can convert it to a sparse index by replacing directories outside of the sparse cone with sparse directory entries. The convert_to_sparse() method does this, when the situation is appropriate. For now, we avoid converting the index to a sparse index if: 1. the index is split. 2. the index is already sparse. 3. sparse-checkout is disabled. 4. sparse-checkout does not use cone mode. Finally, we currently limit the conversion to when the GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git config will be added in a later change. The trickiest thing about this conversion is that we might not be able to mark a directory as a sparse directory just because it is outside the sparse cone. There might be unmerged files within that directory, so we need to look for those. Also, if there is some strange reason why a file is not marked with CE_SKIP_WORKTREE, then we should give up on converting that directory. There is still hope that some of its subdirectories might be able to convert to sparse, so we keep looking deeper. The conversion process is assisted by the cache-tree extension. This is calculated from the full index if it does not already exist. We then abandon the cache-tree as it no longer applies to the newly-sparse index. Thus, this cache-tree will be recalculated in every sparse-full-sparse round-trip until we integrate the cache-tree extension with the sparse index. Some Git commands use the index after writing it. For example, 'git add' will update the index, then write it to disk, then read its entries to report information. To keep the in-memory index in a full state after writing, we re-expand it to a full one after the write. This is wasteful for commands that only write the index and do not read from it again, but that is only the case until we make those commands "sparse aware." 
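The "give up on this directory but keep looking deeper" rule can be illustrated with a toy recursion. This is a sketch under stated assumptions, not the patch's implementation: it scans path prefixes instead of using the cache-tree, it assumes the skip_worktree flag already encodes "outside the sparse cone" (the real code also consults the pattern list), and 'struct toy_entry', toy_can_collapse() and toy_convert() are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_PATH_MAX 64

/* Hypothetical, simplified stand-in for 'struct cache_entry'. */
struct toy_entry {
	const char *path;
	int stage;         /* non-zero means unmerged */
	int skip_worktree; /* models CE_SKIP_WORKTREE */
};

/* A range collapses only if every entry is merged and skip-worktree. */
static int toy_can_collapse(const struct toy_entry *e,
			    size_t start, size_t end)
{
	size_t i;

	for (i = start; i < end; i++)
		if (e[i].stage || !e[i].skip_worktree)
			return 0;
	return 1;
}

/*
 * Convert the sorted entries in [start, end), which all share the
 * directory prefix of length 'prefix_len', writing the resulting paths
 * to 'out' starting at index 'n'. Returns the number of paths written.
 */
static size_t toy_convert(const struct toy_entry *e, size_t start,
			  size_t end, size_t prefix_len,
			  char out[][TOY_PATH_MAX], size_t n)
{
	size_t i = start, first = n;

	assert(e && out && start <= end);
	/* never collapse the root; otherwise try the whole directory */
	if (prefix_len && toy_can_collapse(e, start, end)) {
		snprintf(out[n], TOY_PATH_MAX, "%.*s",
			 (int)prefix_len, e[start].path);
		return 1;
	}
	while (i < end) {
		const char *slash = strchr(e[i].path + prefix_len, '/');
		size_t child_len, j;

		if (!slash) {
			/* plain file directly inside this directory */
			snprintf(out[n++], TOY_PATH_MAX, "%s", e[i].path);
			i++;
			continue;
		}
		/* find the span of entries under the same child directory */
		child_len = (size_t)(slash - e[i].path) + 1;
		for (j = i + 1; j < end; j++)
			if (strncmp(e[j].path, e[i].path, child_len))
				break;
		/* collapse the child if possible, else look deeper */
		n += toy_convert(e, i, j, child_len, out, n);
		i = j;
	}
	return n - first;
}
```

Calling toy_convert(entries, 0, nr, 0, out, 0) on a sorted list starts at the root, which is never collapsed itself. A directory containing an unmerged or non-skip-worktree entry is kept expanded, while any fully sparse subdirectory beneath it still becomes a single entry.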
We can compare the behavior of the sparse-index in t1092-sparse-checkout-compatibility.sh by using GIT_TEST_SPARSE_INDEX=1 when operating on the 'sparse-index' repo. We can also compare the two sparse repos directly, such as comparing their indexes (when expanded to full in the case of the 'sparse-index' repo). We also verify that the index is actually populated with sparse directory entries. The 'checkout and reset (mixed)' test is marked for failure when comparing a sparse repo to a full repo, but we can compare the two sparse-checkout cases directly to ensure that we are not changing the behavior when using a sparse index. Signed-off-by: Derrick Stolee --- cache-tree.c | 3 + cache.h | 2 + read-cache.c | 26 ++++- sparse-index.c | 139 +++++++++++++++++++++++ sparse-index.h | 1 + t/t1092-sparse-checkout-compatibility.sh | 61 +++++++++- 6 files changed, 227 insertions(+), 5 deletions(-) diff --git a/cache-tree.c b/cache-tree.c index 2fb483d3c083..5f07a39e501e 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -6,6 +6,7 @@ #include "object-store.h" #include "replace-object.h" #include "promisor-remote.h" +#include "sparse-index.h" #ifndef DEBUG_CACHE_TREE #define DEBUG_CACHE_TREE 0 @@ -442,6 +443,8 @@ int cache_tree_update(struct index_state *istate, int flags) if (i) return i; + ensure_full_index(istate); + if (!istate->cache_tree) istate->cache_tree = cache_tree(); diff --git a/cache.h b/cache.h index 303411726e10..9217d405b9b8 100644 --- a/cache.h +++ b/cache.h @@ -251,6 +251,8 @@ static inline unsigned int create_ce_mode(unsigned int mode) { if (S_ISLNK(mode)) return S_IFLNK; + if (mode == S_IFDIR) + return S_IFDIR; if (S_ISDIR(mode) || S_ISGITLINK(mode)) return S_IFGITLINK; return S_IFREG | ce_permissions(mode); diff --git a/read-cache.c b/read-cache.c index 97dbf2434f30..92126b9d23c9 100644 --- a/read-cache.c +++ b/read-cache.c @@ -25,6 +25,7 @@ #include "fsmonitor.h" #include "thread-utils.h" #include "progress.h" +#include "sparse-index.h" /* Mask for the name
length in ce_flags in the on-disk index */ @@ -1002,8 +1003,14 @@ int verify_path(const char *path, unsigned mode) c = *path++; if ((c == '.' && !verify_dotfile(path, mode)) || - is_dir_sep(c) || c == '\0') + is_dir_sep(c)) return 0; + /* + * allow terminating directory separators for + * sparse directory entries. + */ + if (c == '\0') + return S_ISDIR(mode); } else if (c == '\\' && protect_ntfs) { if (is_ntfs_dotgit(path)) return 0; @@ -3061,6 +3068,14 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l unsigned flags) { int ret; + int was_full = !istate->sparse_index; + + ret = convert_to_sparse(istate); + + if (ret) { + warning(_("failed to convert to a sparse-index")); + return ret; + } /* * TODO trace2: replace "the_repository" with the actual repo instance @@ -3072,6 +3087,9 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l trace2_region_leave_printf("index", "do_write_index", the_repository, "%s", get_lock_file_path(lock)); + if (was_full) + ensure_full_index(istate); + if (ret) return ret; if (flags & COMMIT_LOCK) @@ -3162,9 +3180,10 @@ static int write_shared_index(struct index_state *istate, struct tempfile **temp) { struct split_index *si = istate->split_index; - int ret; + int ret, was_full = !istate->sparse_index; move_cache_to_base_index(istate); + convert_to_sparse(istate); trace2_region_enter_printf("index", "shared/do_write_index", the_repository, "%s", get_tempfile_path(*temp)); @@ -3172,6 +3191,9 @@ static int write_shared_index(struct index_state *istate, trace2_region_leave_printf("index", "shared/do_write_index", the_repository, "%s", get_tempfile_path(*temp)); + if (was_full) + ensure_full_index(istate); + if (ret) return ret; ret = adjust_shared_perm(get_tempfile_path(*temp)); diff --git a/sparse-index.c b/sparse-index.c index 316cb949b74b..5eb561259bb1 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -4,6 +4,145 @@ #include "tree.h" #include "pathspec.h" #include 
"trace2.h" +#include "cache-tree.h" +#include "config.h" +#include "dir.h" +#include "fsmonitor.h" + +static struct cache_entry *construct_sparse_dir_entry( + struct index_state *istate, + const char *sparse_dir, + struct cache_tree *tree) +{ + struct cache_entry *de; + + de = make_cache_entry(istate, S_IFDIR, &tree->oid, sparse_dir, 0, 0); + + de->ce_flags |= CE_SKIP_WORKTREE; + return de; +} + +/* + * Returns the number of entries "inserted" into the index. + */ +static int convert_to_sparse_rec(struct index_state *istate, + int num_converted, + int start, int end, + const char *ct_path, size_t ct_pathlen, + struct cache_tree *ct) +{ + int i, can_convert = 1; + int start_converted = num_converted; + enum pattern_match_result match; + int dtype; + struct strbuf child_path = STRBUF_INIT; + struct pattern_list *pl = istate->sparse_checkout_patterns; + + /* + * Is the current path outside of the sparse cone? + * Then check if the region can be replaced by a sparse + * directory entry (everything is sparse and merged). + */ + match = path_matches_pattern_list(ct_path, ct_pathlen, + NULL, &dtype, pl, istate); + if (match != NOT_MATCHED) + can_convert = 0; + + for (i = start; can_convert && i < end; i++) { + struct cache_entry *ce = istate->cache[i]; + + if (ce_stage(ce) || + !(ce->ce_flags & CE_SKIP_WORKTREE)) + can_convert = 0; + } + + if (can_convert) { + struct cache_entry *se; + se = construct_sparse_dir_entry(istate, ct_path, ct); + + istate->cache[num_converted++] = se; + return 1; + } + + for (i = start; i < end; ) { + int count, span, pos = -1; + const char *base, *slash; + struct cache_entry *ce = istate->cache[i]; + + /* + * Detect if this is a normal entry outside of any subtree + * entry. 
+ */ + base = ce->name + ct_pathlen; + slash = strchr(base, '/'); + + if (slash) + pos = cache_tree_subtree_pos(ct, base, slash - base); + + if (pos < 0) { + istate->cache[num_converted++] = ce; + i++; + continue; + } + + strbuf_setlen(&child_path, 0); + strbuf_add(&child_path, ce->name, slash - ce->name + 1); + + span = ct->down[pos]->cache_tree->entry_count; + count = convert_to_sparse_rec(istate, + num_converted, i, i + span, + child_path.buf, child_path.len, + ct->down[pos]->cache_tree); + num_converted += count; + i += span; + } + + strbuf_release(&child_path); + return num_converted - start_converted; +} + +int convert_to_sparse(struct index_state *istate) +{ + if (istate->split_index || istate->sparse_index || + !core_apply_sparse_checkout || !core_sparse_checkout_cone) + return 0; + + /* + * For now, only create a sparse index with the + * GIT_TEST_SPARSE_INDEX environment variable. We will relax + * this once we have a proper way to opt-in (and later still, + * opt-out). + */ + if (!git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) + return 0; + + if (!istate->sparse_checkout_patterns) { + istate->sparse_checkout_patterns = xcalloc(1, sizeof(struct pattern_list)); + if (get_sparse_checkout_patterns(istate->sparse_checkout_patterns) < 0) + return 0; + } + + if (!istate->sparse_checkout_patterns->use_cone_patterns) { + warning(_("attempting to use sparse-index without cone mode")); + return -1; + } + + if (cache_tree_update(istate, 0)) { + warning(_("unable to update cache-tree, staying full")); + return -1; + } + + remove_fsmonitor(istate); + + trace2_region_enter("index", "convert_to_sparse", istate->repo); + istate->cache_nr = convert_to_sparse_rec(istate, + 0, 0, istate->cache_nr, + "", 0, istate->cache_tree); + istate->drop_cache_tree = 1; + istate->sparse_index = 1; + trace2_region_leave("index", "convert_to_sparse", istate->repo); + return 0; +} static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce) { diff --git 
a/sparse-index.h b/sparse-index.h index 09a20d036c46..64380e121d80 100644 --- a/sparse-index.h +++ b/sparse-index.h @@ -3,5 +3,6 @@ struct index_state; void ensure_full_index(struct index_state *istate); +int convert_to_sparse(struct index_state *istate); #endif diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 4d789fe86b9d..ca87033d30b0 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -2,6 +2,9 @@ test_description='compare full workdir to sparse workdir' +GIT_TEST_CHECK_CACHE_TREE=0 +GIT_TEST_SPLIT_INDEX=0 + . ./test-lib.sh test_expect_success 'setup' ' @@ -121,15 +124,49 @@ run_on_all () { test_all_match () { run_on_all "$@" && test_cmp full-checkout-out sparse-checkout-out && - test_cmp full-checkout-err sparse-checkout-err + test_cmp full-checkout-out sparse-index-out && + test_cmp full-checkout-err sparse-checkout-err && + test_cmp full-checkout-err sparse-index-err } test_sparse_match () { - run_on_sparse $* && + run_on_sparse "$@" && test_cmp sparse-checkout-out sparse-index-out && test_cmp sparse-checkout-err sparse-index-err } +test_expect_success 'sparse-index contents' ' + init_repos && + + test-tool -C sparse-index read-cache --table >cache && + for dir in folder1 folder2 x + do + TREE=$(git -C sparse-index rev-parse HEAD:$dir) && + grep "040000 tree $TREE $dir/" cache \ + || return 1 + done && + + GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set folder1 && + + test-tool -C sparse-index read-cache --table >cache && + for dir in deep folder2 x + do + TREE=$(git -C sparse-index rev-parse HEAD:$dir) && + grep "040000 tree $TREE $dir/" cache \ + || return 1 + done && + + GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep/deeper1 && + + test-tool -C sparse-index read-cache --table >cache && + for dir in deep/deeper2 folder1 folder2 x + do + TREE=$(git -C sparse-index rev-parse HEAD:$dir) && + grep "040000 tree $TREE 
$dir/" cache \ + || return 1 + done +' + test_expect_success 'expanded in-memory index matches full index' ' init_repos && test_sparse_match test-tool read-cache --expand --table @@ -137,6 +174,7 @@ test_expect_success 'expanded in-memory index matches full index' ' test_expect_success 'status with options' ' init_repos && + test_sparse_match ls && test_all_match git status --porcelain=v2 && test_all_match git status --porcelain=v2 -z -u && test_all_match git status --porcelain=v2 -uno && @@ -273,6 +311,17 @@ test_expect_failure 'checkout and reset (mixed)' ' test_all_match git reset update-folder2 ' +# Ensure that sparse-index behaves identically to +# sparse-checkout with a full index. +test_expect_success 'checkout and reset (mixed) [sparse]' ' + init_repos && + + test_sparse_match git checkout -b reset-test update-deep && + test_sparse_match git reset deepest && + test_sparse_match git reset update-folder1 && + test_sparse_match git reset update-folder2 +' + test_expect_success 'merge' ' init_repos && @@ -309,14 +358,20 @@ test_expect_success 'clean' ' test_all_match git status --porcelain=v2 && test_all_match git clean -f && test_all_match git status --porcelain=v2 && + test_sparse_match ls && + test_sparse_match ls folder1 && test_all_match git clean -xf && test_all_match git status --porcelain=v2 && + test_sparse_match ls && + test_sparse_match ls folder1 && test_all_match git clean -xdf && test_all_match git status --porcelain=v2 && + test_sparse_match ls && + test_sparse_match ls folder1 && - test_path_is_dir sparse-checkout/folder1 + test_sparse_match test_path_is_dir folder1 ' test_done From patchwork Wed Mar 10 19:30:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129177 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 
tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 57F8DC43381 for ; Wed, 10 Mar 2021 19:32:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1E86264EF6 for ; Wed, 10 Mar 2021 19:32:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233843AbhCJTbt (ORCPT ); Wed, 10 Mar 2021 14:31:49 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40750 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233753AbhCJTbQ (ORCPT ); Wed, 10 Mar 2021 14:31:16 -0500 Received: from mail-wr1-x42f.google.com (mail-wr1-x42f.google.com [IPv6:2a00:1450:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D4CF6C061764 for ; Wed, 10 Mar 2021 11:31:15 -0800 (PST) Received: by mail-wr1-x42f.google.com with SMTP id u16so24612384wrt.1 for ; Wed, 10 Mar 2021 11:31:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=M4rc6Sb5rfy2YSuNLUBtIdiMahkbaMGtXkyzjafEAdw=; b=Jj9Ztye5kRTHKbio0h7UvFcJ87PPZZHAhWE5BFkTRvZPil4cT4julxFhBBP/2R4yF6 Qx4HLYleMrsRoMRKyGO/xOU1tIlIhEzSKfCarjyhRL8PSVNNWZuwd55s6dqVK4ajq9oP zdrcSztLMLqrpi/OYSVSRf22zr9Oy7i8U6fNzGbHuXH/3UiWwpgoG4IN1XCrJimdJdkJ u9kL4p+jG7mDj7RE1NE+MgZlaY0qOkdfsg3kQ5goh6/eTFfPFPCthbu0nYsmgEfBxTVi xbhK/lkQpo6RtR6NHL+dfCHXEdTgac/Y/T8RygBYDYNRVnnCeQhSxDzHWM9Iq6MJAoIN VOjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date 
:subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=M4rc6Sb5rfy2YSuNLUBtIdiMahkbaMGtXkyzjafEAdw=; b=d7xINVUjsGuUR76u/D7Sn36/XG3JETXY0hWqAre5g/y3DD3MTK8BAoEn3Un4SC4+a0 opILSGbUkAmRNYaibQ5QdnECNZxQirmytcD1pDqAG0/H//F4esTqy/bUXROoMdVEbsIE zjgdIwWSBfZ67g2HI2urH5KO05pzLah/Y2ZoP5xV5yboGfnj5RA2tEtBPXCCjsulOOkO J1ni0hdoCJwtHIPkSgylwwqYgSxMhXiWgf7AVWWQ3ay2n6w13EZ1i2Dno+vEEeedDUv8 5BzwVC5OSAa8uYLhe7JWJDDYgkEKMItUJ1c8IrUGPRGE2te2ALzbMPB37628bTuEa8Xt WTCA== X-Gm-Message-State: AOAM531se3eX53h0c2UA7pKOGQ3q2vH1uvgJOFs+U69s18ty3u13INMg f0BnOASiWWM1Mf/1xLy0pwhvhNCyo5s= X-Google-Smtp-Source: ABdhPJzTHDBtzR+ja3m/RyIncmC2PhLv9Uc8JARiZg+vpfd4Bj7D/iYMgscjeHvRIGJeJCJ9LSVTGQ== X-Received: by 2002:adf:d1c2:: with SMTP id b2mr5026599wrd.424.1615404674682; Wed, 10 Mar 2021 11:31:14 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c131sm387832wma.37.2021.03.10.11.31.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 10 Mar 2021 11:31:14 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:55 +0000 Subject: [PATCH v2 12/20] submodule: sparse-index should not collapse links Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee A submodule is stored as a "Git link" that actually points to a commit within a submodule. Submodules are populated or not depending on submodule configuration, not sparse-checkout. To ensure that the sparse-index feature integrates correctly with submodules, we should not collapse a directory if there is a Git link within its range. 
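The rule described above can be sketched in isolation. This is only a minimal illustration, not the code in sparse-index.c: the struct, the SKETCH_* constants and both function names are hypothetical stand-ins for Git's cache_entry, S_ISGITLINK() and CE_SKIP_WORKTREE. The only value taken from Git itself is the gitlink mode 0160000, which is the same "160000 commit" mode that the test in this patch greps for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Git's internal definitions. */
#define SKETCH_S_IFMT        0170000u
#define SKETCH_S_IFGITLINK   0160000u   /* mode used for submodule links */
#define SKETCH_SKIP_WORKTREE (1u << 0)  /* assumed flag bit, not Git's real value */

struct sketch_entry {
	unsigned int mode;   /* file mode bits of the index entry */
	unsigned int flags;  /* SKETCH_SKIP_WORKTREE or 0 */
	unsigned int stage;  /* nonzero while the entry is in a merge conflict */
};

static int sketch_is_gitlink(unsigned int mode)
{
	return (mode & SKETCH_S_IFMT) == SKETCH_S_IFGITLINK;
}

/*
 * Return 1 if every entry in the range may be folded into a single
 * sparse directory entry, 0 otherwise. A conflicted entry, a Git link,
 * or an entry without the skip-worktree bit each blocks the collapse.
 */
static int sketch_can_collapse(const struct sketch_entry *entries, size_t nr)
{
	size_t i;

	assert(entries || !nr);
	for (i = 0; i < nr; i++) {
		const struct sketch_entry *e = &entries[i];

		if (e->stage ||
		    sketch_is_gitlink(e->mode) ||
		    !(e->flags & SKETCH_SKIP_WORKTREE))
			return 0;
	}
	return 1;
}
```

Under these assumptions, a range holding only skip-worktree blobs collapses, while the same range with one 0160000 entry added does not, which is the behavior the "submodule handling" test checks for the "modules" directory.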
Signed-off-by: Derrick Stolee --- sparse-index.c | 1 + t/t1092-sparse-checkout-compatibility.sh | 17 +++++++++++++++++ 2 files changed, 18 insertions(+) diff --git a/sparse-index.c b/sparse-index.c index 5eb561259bb1..36b4dde7eeda 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -52,6 +52,7 @@ static int convert_to_sparse_rec(struct index_state *istate, struct cache_entry *ce = istate->cache[i]; if (ce_stage(ce) || + S_ISGITLINK(ce->ce_mode) || !(ce->ce_flags & CE_SKIP_WORKTREE)) can_convert = 0; } diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index ca87033d30b0..b38fab6455d9 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -374,4 +374,21 @@ test_expect_success 'clean' ' test_sparse_match test_path_is_dir folder1 ' +test_expect_success 'submodule handling' ' + init_repos && + + test_all_match mkdir modules && + test_all_match touch modules/a && + test_all_match git add modules && + test_all_match git commit -m "add modules directory" && + + run_on_all git submodule add "$(pwd)/initial-repo" modules/sub && + test_all_match git commit -m "add submodule" && + + # having a submodule prevents "modules" from collapse + test-tool -C sparse-index read-cache --table >cache && + grep "100644 blob .* modules/a" cache && + grep "160000 commit $(git -C initial-repo rev-parse HEAD) modules/sub" cache +' + test_done From patchwork Wed Mar 10 19:30:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129179 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, 
MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 60662C4332B for ; Wed, 10 Mar 2021 19:32:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3E82F64FCE for ; Wed, 10 Mar 2021 19:32:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233859AbhCJTbv (ORCPT ); Wed, 10 Mar 2021 14:31:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40760 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233767AbhCJTbR (ORCPT ); Wed, 10 Mar 2021 14:31:17 -0500 Received: from mail-wm1-x32e.google.com (mail-wm1-x32e.google.com [IPv6:2a00:1450:4864:20::32e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 77F27C061765 for ; Wed, 10 Mar 2021 11:31:16 -0800 (PST) Received: by mail-wm1-x32e.google.com with SMTP id u5-20020a7bcb050000b029010e9316b9d5so8499945wmj.2 for ; Wed, 10 Mar 2021 11:31:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=BT1eQ9m7Cg+4MPuy2Wj0Rt+hvbdtI/KsEf7nXdoOSfk=; b=A0k4mh4Fl2rDk7ujMWoOUqR4GznwZ8wVSkw0slnpSosQEdD0bIfYLu3xLofkIUiy9h v4zJ5ZKWeAgwF0j1a850rH6WenZIThJcAWkb+QBkzesD++eYLAIBcVYHw1OSvvXu7ZDy lkzXtbKC9MlB6zCYbsVwMLy/5ivfdVkvwZ4OVJX91GazBGrWE2iqA4x31evQqJP3I7L0 kvbZdtJ3c3mm6oGB4RBEMC45MChW9+XMgN4OpNY5brzBvvUXHcsqjOUoR+bdAtR6DZyr FH88xmizSDglbmaZ9XDURXlHJPWf0ODf4BAEnbmt1SKR8mhlqQs8L5s7sgw972BxCNue S5Kg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=BT1eQ9m7Cg+4MPuy2Wj0Rt+hvbdtI/KsEf7nXdoOSfk=; 
b=iOiqW9fKhNbvN2fZBsFlybtRLKB+c49LSNjwO3N8ffURx14IvoxrYWGTPvzHhbp5+J vfkD24lef4PpGJMVY5SjacTeTQacMgXpge960WZ+HS9ncV6LZIpPpFUTmWHSK9NBEptD wuXjwrTDMyKMInRHbVEFjx8LE1MLzQZ57eyTtdL7/XMG6cXbrrhNnHSUU8lWzy9/EQ+A M8OENUv9jTX/RBmi/X/fEut55Xl52+HIB/x488zVTayZy6D+6C7Oj3b6hDmtguJAoEZ4 z1m6wAA2Ka4eWjQNjIO9Trn3/jdrHP99+u3mrUSGWZU2HTDvpQbQEdEzoN52CGB4cyp/ kvPw== X-Gm-Message-State: AOAM5331//PMlPWq6c/dgJcSJFXbvXevs07+toMXl3iCpHAJt32mcRuN nBzyiFtcwSJO8HfnaoFQnJoeaNjFp5k= X-Google-Smtp-Source: ABdhPJxFKMsJGhQVco13b+5wflvK5zOdHwrcuJvU+1MsFZbnQOEcrSySUT9PvWSgNXTJZb0msEarBA== X-Received: by 2002:a1c:600a:: with SMTP id u10mr4235149wmb.139.1615404675284; Wed, 10 Mar 2021 11:31:15 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r10sm450588wmh.45.2021.03.10.11.31.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 10 Mar 2021 11:31:15 -0800 (PST) Message-Id: <6f1ebe6ccc08d10715ee339fe04289db341dfdc4.1615404665.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:56 +0000 Subject: [PATCH v2 13/20] unpack-trees: allow sparse directories Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The index_pos_by_traverse_info() currently throws a BUG() when a directory entry exists exactly in the index. We need to consider that it is possible to have a directory in a sparse index as long as that entry is itself marked with the skip-worktree bit. The 'pos' variable is assigned a negative value if an exact match is not found. Since a directory name can be an exact match, it is no longer an error to have a nonnegative 'pos' value. 
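The sign convention for 'pos' may be easier to follow with a small standalone sketch. It assumes a plain strcmp()-ordered array of names in place of the real index and uses hypothetical sketch_* helpers. It is not index_name_pos() itself, and it leaves out the skip-worktree check that the patch adds for the exact-match case.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for an index lookup: binary search over a
 * sorted array of names. An exact match returns its position (>= 0).
 * A miss returns -(insertion position) - 1, which is always negative.
 */
static int sketch_name_pos(const char *const *names, int nr, const char *name)
{
	int lo = 0, hi = nr;

	assert(nr >= 0);
	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(name, names[mid]);

		if (!cmp)
			return mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -lo - 1;
}

/*
 * Position of the first entry that could live under "dir/". With a
 * full index the lookup misses and the encoded insertion point is
 * decoded. With a sparse index "dir/" can itself be an entry, so a
 * nonnegative result is used as-is.
 */
static int sketch_first_candidate(const char *const *names, int nr,
				  const char *dir)
{
	int pos = sketch_name_pos(names, nr, dir);

	if (pos < 0)
		pos = -pos - 1;
	return pos;
}
```

With the names "a", "deep/a", "deep/b", "z", looking up "deep/" misses and decodes to position 1. With "a", "deep/", "folder1/a", "folder1/b", the same lookup is an exact match at position 1, which is the case that used to trigger the BUG().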
Signed-off-by: Derrick Stolee --- unpack-trees.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/unpack-trees.c b/unpack-trees.c index 4dd99219073a..b324eec2a5d1 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -746,9 +746,12 @@ static int index_pos_by_traverse_info(struct name_entry *names, strbuf_make_traverse_path(&name, info, names->path, names->pathlen); strbuf_addch(&name, '/'); pos = index_name_pos(o->src_index, name.buf, name.len); - if (pos >= 0) - BUG("This is a directory and should not exist in index"); - pos = -pos - 1; + if (pos >= 0) { + if (!o->src_index->sparse_index || + !(o->src_index->cache[pos]->ce_flags & CE_SKIP_WORKTREE)) + BUG("This is a directory and should not exist in index"); + } else + pos = -pos - 1; if (pos >= o->src_index->cache_nr || !starts_with(o->src_index->cache[pos]->name, name.buf) || (pos > 0 && starts_with(o->src_index->cache[pos-1]->name, name.buf))) From patchwork Wed Mar 10 19:30:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129173 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3BB61C433E9 for ; Wed, 10 Mar 2021 19:32:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0391364FD7 for ; Wed, 10 Mar 2021 19:32:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233852AbhCJTbu (ORCPT 
); Wed, 10 Mar 2021 14:31:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40762 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233771AbhCJTbR (ORCPT ); Wed, 10 Mar 2021 14:31:17 -0500 Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1DF54C0613D7 for ; Wed, 10 Mar 2021 11:31:17 -0800 (PST) Received: by mail-wm1-x333.google.com with SMTP id i9so86874wml.0 for ; Wed, 10 Mar 2021 11:31:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=l5letMvz1wyLqoALj2Ll33jDZAuVfmkP2dhyJYF6caQ=; b=eO9oVswDGiOvcSQFiLwYwYKlRB7RiO7+pn/BUDAzEKeB3KKceL6G5gtsRr+l8sz4J6 QJip6VfBQTCflrynbrea1YNAb5V9cult/Z0SJO12btOiG7CCIubc4YDJFZRiwnUVYgnG tnzlIZ9uYCA8pQFGDL87uQJkj2jrf4o+L3v59TD22EnTOr/v4O/FnGM4ZfWzT0kTcxC2 vtuQ4KHQIHtCAzv2Synb01/wHKQ0Gz8+/z4Sq1ZoF+9/QmHNchMyXIGdTKONQ4rYwu/f oFe1vTJxzkPjA1n10n89giMXrqKEYufFDyaYrTyUgM55x3RZwT1+mLysr2E6i2+X6HEU K+gQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=l5letMvz1wyLqoALj2Ll33jDZAuVfmkP2dhyJYF6caQ=; b=MQWmNAPNbEFUorHEnUwk83tpHpr21qBoJmiQ+L/dWCqjF2dLHHzVSGcJ9/QkKf+0ke l8rofZN10x66v5tjB14YGX+NZ6AScYFLaBCUF7wlIjT6VHQmKzEAXOGObwAm65FFN8pS BiFcHMglDrSHhmf+yw7SJFiE5o9sF0WmzonOsyz8+9rzI752TladdRHu/tzb3OXeExmg nX+vDvM6+j+D/98nQdvCwx2Gnts06N5NK082OfmbwQ/PZhFwUg3yYqZyft0jp56V5SCz Z3FuIaYyB39KCkcLegTUvB7caqOuDNo1klTXWHt7UFNTzxN2tM204oqHBMl3dtqpyIoD +p/w== X-Gm-Message-State: AOAM533wxsUp/F+I/7Gi1p/9/x+6JgbQvSjSSCXjyGs7aYSghQhoey42 icFlfQoFtGFwJT9j2OEfrEsH1hYi3KM= X-Google-Smtp-Source: ABdhPJwlzyLmX/SlJoYwb4t81Bw7Pivd18eVcmXQe/Dn0Yfc0PVyy0LVX7YKJUGAvd0FtrHO342XVQ== X-Received: by 
2002:a1c:730a:: with SMTP id d10mr4750196wmb.53.1615404675913; Wed, 10 Mar 2021 11:31:15 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u4sm368191wrm.24.2021.03.10.11.31.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 10 Mar 2021 11:31:15 -0800 (PST) Message-Id: <3fa684b315fb02f42481182a986557e47a8cf0fd.1615404665.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:57 +0000 Subject: [PATCH v2 14/20] sparse-index: check index conversion happens Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee Add a test case that uses test_region to ensure that we are truly expanding a sparse index to a full one, then converting back to sparse when writing the index. As we integrate more Git commands with the sparse index, we will convert these commands to check that we do _not_ convert the sparse index to a full index and instead stay sparse the entire time. 
Signed-off-by: Derrick Stolee --- t/t1092-sparse-checkout-compatibility.sh | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index b38fab6455d9..bfc9e28ef0e1 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -391,4 +391,22 @@ test_expect_success 'submodule handling' ' grep "160000 commit $(git -C initial-repo rev-parse HEAD) modules/sub" cache ' +test_expect_success 'sparse-index is expanded and converted back' ' + init_repos && + + ( + GIT_TEST_SPARSE_INDEX=1 && + export GIT_TEST_SPARSE_INDEX && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ + git -C sparse-index -c core.fsmonitor="" reset --hard && + test_region index convert_to_sparse trace2.txt && + test_region index ensure_full_index trace2.txt && + + rm trace2.txt && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ + git -C sparse-index -c core.fsmonitor="" status -uno && + test_region index ensure_full_index trace2.txt + ) +' + test_done From patchwork Wed Mar 10 19:30:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129195 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 94801C4332D for ; Wed, 10 Mar 2021 19:32:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with 
ESMTP id 78E0564FD6 for ; Wed, 10 Mar 2021 19:32:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233864AbhCJTbv (ORCPT ); Wed, 10 Mar 2021 14:31:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40764 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233789AbhCJTbS (ORCPT ); Wed, 10 Mar 2021 14:31:18 -0500 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C0FFEC061760 for ; Wed, 10 Mar 2021 11:31:17 -0800 (PST) Received: by mail-wr1-x42d.google.com with SMTP id j2so24612643wrx.9 for ; Wed, 10 Mar 2021 11:31:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=ReMDFy0EhXXbtDVzHjLTI3brGms2wpO3MKO1mBMMJgE=; b=akSrB+DJtrdLy5mYxMbYI3H5qKYX6e8GpCRnGXDlEEIsHPWvFOk/igsepYGRuUuJGP zlyOHE5shlEpOj4RiuOGORJlB1ecEl+U+BKWRE4DSavEkzS9aNs9pCe7IaCeO4Dkstbf BaJLwhZaS8zzwvrEx75nTX7984befZtutjAmcFaF/m8TVpaoZIunDJ1PLnQZfT8iq47j Yyr//NRaMq9tHClQruIJTLyAPzlJQixSobH362Ve6rrhZVl+O3vUJCYBH5U1UeTxmvKT eUN97d4noHfPuDSsTOJOd3Kqin/EmKIoj2RaBoBbh+GKVJILll7MZN4+0rpNPPOzKqqK ps0g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=ReMDFy0EhXXbtDVzHjLTI3brGms2wpO3MKO1mBMMJgE=; b=ogsva7kVJhWYMEydJz3JaJeLlLERd2PA5DT4QB33wfsrwRJq4hQDs8Swftnn0kalip eG4wxmDZGkx9SdeTcOfcZckLFRhum2U7irxgq7O2TmvEeNcq6al6dql0lmxwMCZgLAx3 SPQo8sFLTzsOXq0dZ5q1K4UakQ6mbZJfggubLJFHRLru4aLAgUkOYthCM1P/m6HdGZun 5OabjqF4pTTqZakquS3yWnjm8RZS556UX64ywu5zpMHX34kkTYasOh5EJJbdw27MmFq6 ljnzmklXv6X5jvcEqUe0jOJIOuv8c8NG38XJvmv4joMZkqqlodgbCgGctO8e7ecXFAyu juBA== X-Gm-Message-State: 
AOAM5305ZBxMZx+3lQuX0xzzpFfdvf9KIpoUi9Vw/w7H5SYJ8CHkbd11 gMKlJdEfOruQfROML/C3bBlrkCTHmXg= X-Google-Smtp-Source: ABdhPJzxH3ZVHdm5Bur7m4hZLiY2awOyS7P+IDd6yB+2i84iA24afrSaIiJHjVbfNTbrEVfJiQk/nQ== X-Received: by 2002:a5d:410b:: with SMTP id l11mr5191038wrp.16.1615404676531; Wed, 10 Mar 2021 11:31:16 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id n6sm371340wrt.1.2021.03.10.11.31.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 10 Mar 2021 11:31:16 -0800 (PST) Message-Id: In-Reply-To: References: Date: Wed, 10 Mar 2021 19:30:58 +0000 Subject: [PATCH v2 15/20] sparse-index: create extension for compatibility Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee Previously, we enabled the sparse index format only using GIT_TEST_SPARSE_INDEX=1. This is not a feasible direction for users to actually select this mode. Further, sparse directory entries are not understood by the index formats as advertised. We _could_ add a new index version that explicitly adds these capabilities, but there are nuances to index formats 2, 3, and 4 that are still valuable to select as options. Until we add index format version 5, create a repo extension, "extensions.sparseIndex", that specifies that the tool reading this repository must understand sparse directory entries. This change only encodes the extension and enables it when GIT_TEST_SPARSE_INDEX=1. Later, we will add a more user-friendly CLI mechanism. 
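As a rough illustration of why a repository extension serves as a compatibility gate, the sketch below models a reader that refuses a repository as soon as it lists an extension the reader does not know. It assumes repository format version 1 semantics. The function name and the extension lists passed to it are hypothetical, and this is not the logic in setup.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if a reader that understands the extensions in 'known' may
 * open a repository that declares the extensions in 'repo_exts', and
 * 0 as soon as one declared extension is unknown to that reader.
 */
static int sketch_reader_can_open(const char *const *known, size_t nr_known,
				  const char *const *repo_exts, size_t nr_exts)
{
	size_t i, j;

	assert(known || !nr_known);
	assert(repo_exts || !nr_exts);
	for (i = 0; i < nr_exts; i++) {
		int found = 0;

		for (j = 0; j < nr_known; j++) {
			if (!strcmp(repo_exts[i], known[j])) {
				found = 1;
				break;
			}
		}
		if (!found)
			return 0;
	}
	return 1;
}
```

In this model, a reader whose list lacks "sparseindex" declines a repository that declares it instead of misreading sparse directory entries, while a reader that knows the extension proceeds.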
Signed-off-by: Derrick Stolee --- Documentation/config/extensions.txt | 8 ++++++ cache.h | 1 + repo-settings.c | 7 ++++++ repository.h | 3 ++- setup.c | 3 +++ sparse-index.c | 38 +++++++++++++++++++++++++---- 6 files changed, 54 insertions(+), 6 deletions(-) diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt index 4e23d73cdcad..c02e09af0046 100644 --- a/Documentation/config/extensions.txt +++ b/Documentation/config/extensions.txt @@ -6,3 +6,11 @@ extensions.objectFormat:: Note that this setting should only be set by linkgit:git-init[1] or linkgit:git-clone[1]. Trying to change it after initialization will not work and will produce hard-to-diagnose issues. + +extensions.sparseIndex:: + When combined with `core.sparseCheckout=true` and + `core.sparseCheckoutCone=true`, the index may contain entries + corresponding to directories outside of the sparse-checkout + definition in lieu of containing each path under such directories. + Versions of Git that do not understand this extension do not + expect directory entries in the index. diff --git a/cache.h b/cache.h index 9217d405b9b8..03f931c5f34d 100644 --- a/cache.h +++ b/cache.h @@ -1059,6 +1059,7 @@ struct repository_format { int worktree_config; int is_bare; int hash_algo; + int sparse_index; char *work_tree; struct string_list unknown_extensions; struct string_list v1_only_extensions; diff --git a/repo-settings.c b/repo-settings.c index d63569e4041e..9677d50f9238 100644 --- a/repo-settings.c +++ b/repo-settings.c @@ -85,4 +85,11 @@ void prepare_repo_settings(struct repository *r) * removed. */ r->settings.command_requires_full_index = 1; + + /* + * Initialize this as off. 
+ */ + r->settings.sparse_index = 0; + if (!repo_config_get_bool(r, "extensions.sparseindex", &value) && value) + r->settings.sparse_index = 1; } diff --git a/repository.h b/repository.h index e06a23015697..a45f7520fd9e 100644 --- a/repository.h +++ b/repository.h @@ -42,7 +42,8 @@ struct repo_settings { int core_multi_pack_index; - unsigned command_requires_full_index:1; + unsigned command_requires_full_index:1, + sparse_index:1; }; struct repository { diff --git a/setup.c b/setup.c index c04cd25a30df..cd8394564613 100644 --- a/setup.c +++ b/setup.c @@ -500,6 +500,9 @@ static enum extension_result handle_extension(const char *var, return error("invalid value for 'extensions.objectformat'"); data->hash_algo = format; return EXTENSION_OK; + } else if (!strcmp(ext, "sparseindex")) { + data->sparse_index = 1; + return EXTENSION_OK; } return EXTENSION_UNKNOWN; } diff --git a/sparse-index.c b/sparse-index.c index 36b4dde7eeda..b9c14ef7ab50 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -102,19 +102,47 @@ static int convert_to_sparse_rec(struct index_state *istate, return num_converted - start_converted; } +static int enable_sparse_index(struct repository *repo) +{ + const char *config_path = repo_git_path(repo, "config.worktree"); + + if (upgrade_repository_format(1) < 0) { + warning(_("unable to upgrade repository format to enable sparse-index")); + return -1; + } + git_config_set_in_file_gently(config_path, + "extensions.sparseIndex", + "true"); + + prepare_repo_settings(repo); + repo->settings.sparse_index = 1; + return 0; +} + int convert_to_sparse(struct index_state *istate) { if (istate->split_index || istate->sparse_index || !core_apply_sparse_checkout || !core_sparse_checkout_cone) return 0; + if (!istate->repo) + istate->repo = the_repository; + + /* + * The GIT_TEST_SPARSE_INDEX environment variable triggers the + * extensions.sparseIndex config variable to be on. 
+ */ + if (git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) { + int err = enable_sparse_index(istate->repo); + if (err < 0) + return err; + } + /* - * For now, only create a sparse index with the - * GIT_TEST_SPARSE_INDEX environment variable. We will relax - * this once we have a proper way to opt-in (and later still, - * opt-out). + * Only convert to sparse if extensions.sparseIndex is set. */ - if (!git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) + prepare_repo_settings(istate->repo); + if (!istate->repo->settings.sparse_index) return 0; if (!istate->sparse_checkout_patterns) { From patchwork Wed Mar 10 19:30:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12129197 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41F8FC4321A for ; Wed, 10 Mar 2021 19:32:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2037564FDE for ; Wed, 10 Mar 2021 19:32:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233897AbhCJTby (ORCPT ); Wed, 10 Mar 2021 14:31:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40772 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233747AbhCJTbS (ORCPT ); Wed, 10 Mar 2021 14:31:18 -0500 Received: from mail-wr1-x432.google.com (mail-wr1-x432.google.com [IPv6:2a00:1450:4864:20::432]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4CA7FC061761 for ; Wed, 10 Mar 2021 11:31:18 -0800 (PST) Received: by mail-wr1-x432.google.com with SMTP id d15so24637321wrv.5 for ; Wed, 10 Mar 2021 11:31:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=cPtFZCkvkX2KLRtyrHWEELrhALm1VwPtUV70W83IOCw=; b=f77GmvVnKR+NPTyBcz+nG6oEZJjmR7imzMRfv4A3qB5splKG1eyETy8rFALwdD11sY 4NeQwNaA7pFtrrOgb8TZo6834h+VGzvTT80MRARWZyuJgs7M2ev3yyV8ClXYmuIJRqdF KYaXAkvwRmkRynEk/R4ZA+lOXZz77wyrBIRZN4LFrk+ib1kdrcxJe2Ejir8L6eX6lRRH f0Iodo/NgpVr80fCx+DrioLZMTK107057+uuIrSQ5hBlSNZUrGNN36tiuO7rnqKhuo+v aCctt2FxA/CSusZQbZ1q2uozerzSXVfRa2O0ovyTSCCBm2gavJ6Y25FlLlIZ90dEHqSS Av3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=cPtFZCkvkX2KLRtyrHWEELrhALm1VwPtUV70W83IOCw=; b=mK/SlLXDC1yEC5gYmUMuWj8Pi4RiRV4PLluGlRKG1v9P/f7Qc5M1FbyXq8GGwRmcdZ ffA+fjxpaJqgivKOPyugdConRIHmoEltLC4B4VUzEx5Jw8h8u1gpv6LVfz5V3kEgbbrG lKiWMmygh1CtolYZZsa6/+1qmr7IL1hkrv52i5aKD8VlXoWTXhpveDJ62aEZISZqREIT 6nt/cdzxcgn72FcYIREnZXGXmgohRIS5/IkB8Xtgvm6I97fyUdtEyBJXqzUXVSV9y1AP vt1IwUGiVj/AtRKnBfnEjswvre7Ang7NC6WxrWeNtZf1qT1SDHS0cr/q0Uex18aTI4La 4G4Q== X-Gm-Message-State: AOAM530YFJzxsYTRHj8xNw2LSTjyRTMIVQXsVxulnW0tNSQjwQdQ5lPb iE+HtX778MMPZ29Gzjv1AjMTrGBeUz8= X-Google-Smtp-Source: ABdhPJwwa/LpHkfVo8O/KBVMTrwS7qO8SgPSvOfSD7QYW29EZoZRzTxfK5aggNaM+GguqXRR1pFCSA== X-Received: by 2002:a5d:6684:: with SMTP id l4mr5044469wru.381.1615404677098; Wed, 10 Mar 2021 11:31:17 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id d29sm310175wra.51.2021.03.10.11.31.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 10 Mar 2021 11:31:16 -0800 (PST) Message-Id: In-Reply-To: 
References: Date: Wed, 10 Mar 2021 19:30:59 +0000 Subject: [PATCH v2 16/20] sparse-checkout: toggle sparse index from builtin Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The sparse index extension is used to signal that index writes should be in sparse mode. This was only updated using GIT_TEST_SPARSE_INDEX=1. Add a '--[no-]sparse-index' option to 'git sparse-checkout init' that specifies if the sparse index should be used. It also updates the index to use the correct format, either way. Add a warning in the documentation that the use of a repository extension might reduce compatibility with third-party tools. 'git sparse-checkout init' already sets extensions.worktreeConfig, which places most sparse-checkout users outside of the scope of most third-party tools. Update t1092-sparse-checkout-compatibility.sh to use this CLI instead of GIT_TEST_SPARSE_INDEX=1. Signed-off-by: Derrick Stolee --- Documentation/git-sparse-checkout.txt | 14 +++++++++ builtin/sparse-checkout.c | 17 ++++++++++- sparse-index.c | 37 +++++++++++++++-------- sparse-index.h | 3 ++ t/t1092-sparse-checkout-compatibility.sh | 38 +++++++++++------------- 5 files changed, 76 insertions(+), 33 deletions(-) diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt index a0eeaeb02ee3..4a8343cf7fa4 100644 --- a/Documentation/git-sparse-checkout.txt +++ b/Documentation/git-sparse-checkout.txt @@ -45,6 +45,20 @@ To avoid interfering with other worktrees, it first enables the When `--cone` is provided, the `core.sparseCheckoutCone` setting is also set, allowing for better performance with a limited set of patterns (see 'CONE PATTERN SET' below).
++ +Use the `--[no-]sparse-index` option to toggle the use of the sparse +index format. This reduces the size of the index to be more closely +aligned with your sparse-checkout definition. This can have significant +performance advantages for commands such as `git status` or `git add`. +This feature is still experimental. Some commands might be slower with +a sparse index until they are properly integrated with the feature. ++ +**WARNING:** Using a sparse index requires modifying the index in a way +that is not completely understood by external tools. If you have trouble +with this compatibility, then run `git sparse-checkout init --no-sparse-index` +to rewrite your index to not be sparse. Older versions of Git will not +understand the `sparseIndex` repository extension and may fail to interact +with your repository until it is disabled. 'set':: Write a set of patterns to the sparse-checkout file, as given as diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index e00b82af727b..ca63e2c64e95 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -14,6 +14,7 @@ #include "unpack-trees.h" #include "wt-status.h" #include "quote.h" +#include "sparse-index.h" static const char *empty_base = ""; @@ -283,12 +284,13 @@ static int set_config(enum sparse_checkout_mode mode) } static char const * const builtin_sparse_checkout_init_usage[] = { - N_("git sparse-checkout init [--cone]"), + N_("git sparse-checkout init [--cone] [--[no-]sparse-index]"), NULL }; static struct sparse_checkout_init_opts { int cone_mode; + int sparse_index; } init_opts; static int sparse_checkout_init(int argc, const char **argv) @@ -303,11 +305,15 @@ static int sparse_checkout_init(int argc, const char **argv) static struct option builtin_sparse_checkout_init_options[] = { OPT_BOOL(0, "cone", &init_opts.cone_mode, N_("initialize the sparse-checkout in cone mode")), + OPT_BOOL(0, "sparse-index", &init_opts.sparse_index, + N_("toggle the use of a sparse index")),
 		OPT_END(),
 	};

 	repo_read_index(the_repository);

+	init_opts.sparse_index = -1;
+
 	argc = parse_options(argc, argv, NULL,
 			     builtin_sparse_checkout_init_options,
 			     builtin_sparse_checkout_init_usage, 0);
@@ -326,6 +332,15 @@ static int sparse_checkout_init(int argc, const char **argv)
 	sparse_filename = get_sparse_checkout_filename();
 	res = add_patterns_from_file_to_list(sparse_filename, "", 0, &pl, NULL);

+	if (init_opts.sparse_index >= 0) {
+		if (set_sparse_index_config(the_repository, init_opts.sparse_index) < 0)
+			die(_("failed to modify sparse-index config"));
+
+		/* force an index rewrite */
+		repo_read_index(the_repository);
+		the_repository->index->updated_workdir = 1;
+	}
+
 	/* If we already have a sparse-checkout file, use it. */
 	if (res >= 0) {
 		free(sparse_filename);
diff --git a/sparse-index.c b/sparse-index.c
index b9c14ef7ab50..1c84cac255bf 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -104,23 +104,37 @@ static int convert_to_sparse_rec(struct index_state *istate,

 static int enable_sparse_index(struct repository *repo)
 {
-	const char *config_path = repo_git_path(repo, "config.worktree");
+	int res;

 	if (upgrade_repository_format(1) < 0) {
 		warning(_("unable to upgrade repository format to enable sparse-index"));
 		return -1;
 	}
-	git_config_set_in_file_gently(config_path,
-				      "extensions.sparseIndex",
-				      "true");
+	res = git_config_set_gently("extensions.sparseindex", "true");

 	prepare_repo_settings(repo);
 	repo->settings.sparse_index = 1;
-	return 0;
+	return res;
+}
+
+int set_sparse_index_config(struct repository *repo, int enable)
+{
+	int res;
+
+	if (enable)
+		return enable_sparse_index(repo);
+
+	/* Don't downgrade repository format, just remove the extension. */
+	res = git_config_set_gently("extensions.sparseindex", NULL);
+
+	prepare_repo_settings(repo);
+	repo->settings.sparse_index = 0;
+	return res;
 }

 int convert_to_sparse(struct index_state *istate)
 {
+	int test_env;
 	if (istate->split_index || istate->sparse_index ||
 	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
 		return 0;
@@ -129,14 +143,13 @@ int convert_to_sparse(struct index_state *istate)
 		istate->repo = the_repository;

 	/*
-	 * The GIT_TEST_SPARSE_INDEX environment variable triggers the
-	 * extensions.sparseIndex config variable to be on.
+	 * If GIT_TEST_SPARSE_INDEX=1, then trigger extensions.sparseIndex
+	 * to be fully enabled. If GIT_TEST_SPARSE_INDEX=0 (set explicitly),
+	 * then purposefully disable the setting.
 	 */
-	if (git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) {
-		int err = enable_sparse_index(istate->repo);
-		if (err < 0)
-			return err;
-	}
+	test_env = git_env_bool("GIT_TEST_SPARSE_INDEX", -1);
+	if (test_env >= 0)
+		set_sparse_index_config(istate->repo, test_env);

 	/*
 	 * Only convert to sparse if extensions.sparseIndex is set.
diff --git a/sparse-index.h b/sparse-index.h
index 64380e121d80..39dcc859735e 100644
--- a/sparse-index.h
+++ b/sparse-index.h
@@ -5,4 +5,7 @@ struct index_state;
 void ensure_full_index(struct index_state *istate);
 int convert_to_sparse(struct index_state *istate);

+struct repository;
+int set_sparse_index_config(struct repository *repo, int enable);
+
 #endif
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index bfc9e28ef0e1..9c2bc4d25f66 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -4,6 +4,7 @@ test_description='compare full workdir to sparse workdir'

 GIT_TEST_CHECK_CACHE_TREE=0
 GIT_TEST_SPLIT_INDEX=0
+GIT_TEST_SPARSE_INDEX=

 . ./test-lib.sh

@@ -98,25 +99,26 @@ init_repos () {
 	# initialize sparse-checkout definitions
 	git -C sparse-checkout sparse-checkout init --cone &&
 	git -C sparse-checkout sparse-checkout set deep &&
-	GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout init --cone &&
-	GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep
+	git -C sparse-index sparse-checkout init --cone --sparse-index &&
+	test_cmp_config -C sparse-index true extensions.sparseindex &&
+	git -C sparse-index sparse-checkout set deep
 }

 run_on_sparse () {
 	(
 		cd sparse-checkout &&
-		GIT_TEST_SPARSE_INDEX=0 "$@" >../sparse-checkout-out 2>../sparse-checkout-err
+		"$@" >../sparse-checkout-out 2>../sparse-checkout-err
 	) &&
 	(
 		cd sparse-index &&
-		GIT_TEST_SPARSE_INDEX=1 "$@" >../sparse-index-out 2>../sparse-index-err
+		"$@" >../sparse-index-out 2>../sparse-index-err
 	)
 }

 run_on_all () {
 	(
 		cd full-checkout &&
-		GIT_TEST_SPARSE_INDEX=0 "$@" >../full-checkout-out 2>../full-checkout-err
+		"$@" >../full-checkout-out 2>../full-checkout-err
 	) &&
 	run_on_sparse "$@"
 }
@@ -146,7 +148,7 @@ test_expect_success 'sparse-index contents' '
 			|| return 1
 	done &&

-	GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set folder1 &&
+	git -C sparse-index sparse-checkout set folder1 &&

 	test-tool -C sparse-index read-cache --table >cache &&
 	for dir in deep folder2 x
@@ -156,7 +158,7 @@ test_expect_success 'sparse-index contents' '
 			|| return 1
 	done &&

-	GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep/deeper1 &&
+	git -C sparse-index sparse-checkout set deep/deeper1 &&

 	test-tool -C sparse-index read-cache --table >cache &&
 	for dir in deep/deeper2 folder1 folder2 x
@@ -394,19 +396,15 @@ test_expect_success 'submodule handling' '
 test_expect_success 'sparse-index is expanded and converted back' '
 	init_repos &&

-	(
-		GIT_TEST_SPARSE_INDEX=1 &&
-		export GIT_TEST_SPARSE_INDEX &&
-		GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
-			git -C sparse-index -c core.fsmonitor="" reset --hard &&
-		test_region index convert_to_sparse trace2.txt &&
-		test_region index ensure_full_index trace2.txt &&
-
-		rm trace2.txt &&
-		GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
-			git -C sparse-index -c core.fsmonitor="" status -uno &&
-		test_region index ensure_full_index trace2.txt
-	)
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index -c core.fsmonitor="" reset --hard &&
+	test_region index convert_to_sparse trace2.txt &&
+	test_region index ensure_full_index trace2.txt &&
+
+	rm trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index -c core.fsmonitor="" status -uno &&
+	test_region index ensure_full_index trace2.txt
 '

 test_done

From patchwork Wed Mar 10 19:31:00 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12129189
Message-Id: <42d0da9c5def853dcec0855d586fe3c78de4c9d5.1615404665.git.gitgitgadget@gmail.com>
Date: Wed, 10 Mar 2021 19:31:00 +0000
Subject: [PATCH v2 17/20] sparse-checkout: disable sparse-index
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor
From: Derrick Stolee

We use 'git sparse-checkout init --cone --sparse-index' to toggle the
sparse-index feature. It makes sense to also disable it when running
'git sparse-checkout disable'. This is particularly important because it
removes the extensions.sparseIndex config option, allowing other tools
to use this Git repository again.

This does mean that 'git sparse-checkout init' will not re-enable the
sparse-index feature, even if it was previously enabled.

While testing this feature, I noticed that the sparse-index was not
being written on the first run, but by a second. This was caught by the
call to 'test-tool read-cache --table'. This requires adjusting some
assignments to core_apply_sparse_checkout and pl.use_cone_patterns in
the sparse_checkout_init() logic.

Signed-off-by: Derrick Stolee
---
 builtin/sparse-checkout.c          | 10 +++++++++-
 t/t1091-sparse-checkout-builtin.sh | 13 +++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index ca63e2c64e95..585343fa1972 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -280,6 +280,9 @@ static int set_config(enum sparse_checkout_mode mode)
 				      "core.sparseCheckoutCone",
 				      mode == MODE_CONE_PATTERNS ? "true" : NULL);

+	if (mode == MODE_NO_PATTERNS)
+		set_sparse_index_config(the_repository, 0);
+
 	return 0;
 }

@@ -341,10 +344,11 @@ static int sparse_checkout_init(int argc, const char **argv)
 		the_repository->index->updated_workdir = 1;
 	}

+	core_apply_sparse_checkout = 1;
+
 	/* If we already have a sparse-checkout file, use it. */
 	if (res >= 0) {
 		free(sparse_filename);
-		core_apply_sparse_checkout = 1;
 		return update_working_directory(NULL);
 	}

@@ -366,6 +370,7 @@ static int sparse_checkout_init(int argc, const char **argv)
 	add_pattern(strbuf_detach(&pattern, NULL), empty_base, 0, &pl, 0);
 	strbuf_addstr(&pattern, "!/*/");
 	add_pattern(strbuf_detach(&pattern, NULL), empty_base, 0, &pl, 0);
+	pl.use_cone_patterns = init_opts.cone_mode;

 	return write_patterns_and_update(&pl);
 }
@@ -632,6 +637,9 @@ static int sparse_checkout_disable(int argc, const char **argv)
 	strbuf_addstr(&match_all, "/*");
 	add_pattern(strbuf_detach(&match_all, NULL), empty_base, 0, &pl, 0);

+	prepare_repo_settings(the_repository);
+	the_repository->settings.sparse_index = 0;
+
 	if (update_working_directory(&pl))
 		die(_("error while refreshing working directory"));

diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index fc64e9ed99f4..ff1ad570a255 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -205,6 +205,19 @@ test_expect_success 'sparse-checkout disable' '
 	check_files repo a deep folder1 folder2
 '

+test_expect_success 'sparse-index enabled and disabled' '
+	git -C repo sparse-checkout init --cone --sparse-index &&
+	test_cmp_config -C repo true extensions.sparseIndex &&
+	test-tool -C repo read-cache --table >cache &&
+	grep " tree " cache &&
+
+	git -C repo sparse-checkout disable &&
+	test-tool -C repo read-cache --table >cache &&
+	! grep " tree " cache &&
+	git -C repo config --list >config &&
+	! grep extensions.sparseindex config
+'
+
 test_expect_success 'cone mode: init and set' '
 	git -C repo sparse-checkout init --cone &&
 	git -C repo config --list >config &&

From patchwork Wed Mar 10 19:31:01 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12129191
Message-Id: <6bb0976a6295e4a98d050ab26a0545af8c9b5ff1.1615404665.git.gitgitgadget@gmail.com>
Date: Wed, 10 Mar 2021 19:31:01 +0000
Subject: [PATCH v2 18/20] cache-tree: integrate with sparse directory entries
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor
From: Derrick Stolee

The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.

When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.

Signed-off-by: Derrick Stolee
---
 cache-tree.c   | 18 ++++++++++++++++++
 sparse-index.c | 10 +++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/cache-tree.c b/cache-tree.c
index 5f07a39e501e..950a9615db8f 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -256,6 +256,24 @@ static int update_one(struct cache_tree *it,

 	*skip_count = 0;

+	/*
+	 * If the first entry of this region is a sparse directory
+	 * entry corresponding exactly to 'base', then this cache_tree
+	 * struct is a "leaf" in the data structure, pointing to the
+	 * tree OID specified in the entry.
+	 */
+	if (entries > 0) {
+		const struct cache_entry *ce = cache[0];
+
+		if (S_ISSPARSEDIR(ce->ce_mode) &&
+		    ce->ce_namelen == baselen &&
+		    !strncmp(ce->name, base, baselen)) {
+			it->entry_count = 1;
+			oidcpy(&it->oid, &ce->oid);
+			return 1;
+		}
+	}
+
 	if (0 <= it->entry_count && has_object_file(&it->oid))
 		return it->entry_count;

diff --git a/sparse-index.c b/sparse-index.c
index 1c84cac255bf..ea603201a323 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -180,7 +180,11 @@ int convert_to_sparse(struct index_state *istate)
 	istate->cache_nr = convert_to_sparse_rec(istate,
 						 0, 0, istate->cache_nr,
 						 "", 0, istate->cache_tree);
-	istate->drop_cache_tree = 1;
+
+	/* Clear and recompute the cache-tree */
+	cache_tree_free(&istate->cache_tree);
+	cache_tree_update(istate, 0);
+
 	istate->sparse_index = 1;
 	trace2_region_leave("index", "convert_to_sparse", istate->repo);
 	return 0;
@@ -278,5 +282,9 @@ void ensure_full_index(struct index_state *istate)

 	free(full);

+	/* Clear and recompute the cache-tree */
+	cache_tree_free(&istate->cache_tree);
+	cache_tree_update(istate, 0);
+
 	trace2_region_leave("index", "ensure_full_index", istate->repo);
 }

From patchwork Wed Mar 10 19:31:02 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12129183
Message-Id: <07f34e80609a39f7cf52d21cc8fe0d83ee728fb0.1615404665.git.gitgitgadget@gmail.com>
Date: Wed, 10 Mar 2021 19:31:02 +0000
Subject: [PATCH v2 19/20] sparse-index: loose integration with cache_tree_verify()
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor
From: Derrick Stolee

The cache_tree_verify() method is run when GIT_TEST_CHECK_CACHE_TREE
is enabled, which it is by default in the test suite. The logic must
be adjusted for the presence of these directory entries.

For now, leave the test as a simple check for whether the directory
entry is sparse. Do not go any further until needed.

This allows us to re-enable GIT_TEST_CHECK_CACHE_TREE in
t1092-sparse-checkout-compatibility.sh. Further,
p2000-sparse-operations.sh uses the test suite and hence this is enabled
for all tests. We need to integrate with it before we run our
performance tests with a sparse-index.

Signed-off-by: Derrick Stolee
---
 cache-tree.c                             | 19 +++++++++++++++++++
 t/t1092-sparse-checkout-compatibility.sh |  1 -
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/cache-tree.c b/cache-tree.c
index 950a9615db8f..11bf1fcae6e1 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -808,6 +808,19 @@ int cache_tree_matches_traversal(struct cache_tree *root,
 	return 0;
 }

+static void verify_one_sparse(struct repository *r,
+			      struct index_state *istate,
+			      struct cache_tree *it,
+			      struct strbuf *path,
+			      int pos)
+{
+	struct cache_entry *ce = istate->cache[pos];
+
+	if (!S_ISSPARSEDIR(ce->ce_mode))
+		BUG("directory '%s' is present in index, but not sparse",
+		    path->buf);
+}
+
 static void verify_one(struct repository *r,
 		       struct index_state *istate,
 		       struct cache_tree *it,
@@ -830,6 +843,12 @@ static void verify_one(struct repository *r,

 	if (path->len) {
 		pos = index_name_pos(istate, path->buf, path->len);
+
+		if (pos >= 0) {
+			verify_one_sparse(r, istate, it, path, pos);
+			return;
+		}
+
 		pos = -pos - 1;
 	} else {
 		pos = 0;
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 9c2bc4d25f66..c2624176c2e0 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -2,7 +2,6 @@

 test_description='compare full workdir to sparse workdir'

-GIT_TEST_CHECK_CACHE_TREE=0
 GIT_TEST_SPLIT_INDEX=0
 GIT_TEST_SPARSE_INDEX=


From patchwork Wed Mar 10 19:31:03 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12129193
Message-Id: <41e3b56b9c17d3c30b7a7fe79abfc43e9c45ecee.1615404665.git.gitgitgadget@gmail.com>
Date: Wed, 10 Mar 2021 19:31:03 +0000
Subject: [PATCH v2 20/20] p2000: add sparse-index repos
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor
From: Derrick Stolee

p2000-sparse-operations.sh compares different Git commands in
repositories with many files at HEAD but using sparse-checkout to focus
on a small portion of those files.

Add extra copies of the repository that use the sparse-index format so
we can track how that affects the performance of different commands.

At this point in time, the sparse-index is 100% overhead from the CPU
front, and this is measurable in these tests:

Test
---------------------------------------------------------------
2000.2: git status (full-index-v3)              0.59(0.51+0.12)
2000.3: git status (full-index-v4)              0.59(0.52+0.11)
2000.4: git status (sparse-index-v3)            1.40(1.32+0.12)
2000.5: git status (sparse-index-v4)            1.41(1.36+0.08)
2000.6: git add -A (full-index-v3)              2.32(1.97+0.19)
2000.7: git add -A (full-index-v4)              2.17(1.92+0.14)
2000.8: git add -A (sparse-index-v3)            2.31(2.21+0.15)
2000.9: git add -A (sparse-index-v4)            2.30(2.20+0.13)
2000.10: git add . (full-index-v3)              2.39(2.02+0.20)
2000.11: git add . (full-index-v4)              2.20(1.94+0.16)
2000.12: git add . (sparse-index-v3)            2.36(2.27+0.12)
2000.13: git add . (sparse-index-v4)            2.33(2.21+0.16)
2000.14: git commit -a -m A (full-index-v3)     2.47(2.12+0.20)
2000.15: git commit -a -m A (full-index-v4)     2.26(2.00+0.17)
2000.16: git commit -a -m A (sparse-index-v3)   3.01(2.92+0.16)
2000.17: git commit -a -m A (sparse-index-v4)   3.01(2.94+0.15)

Note that there is very little difference between the v3 and v4
index formats when the sparse-index is enabled. This is primarily
due to the fact that the relative file sizes are the same, and the
command time is mostly taken up by parsing tree objects to expand
the sparse index into a full one.

With the current file layout, the index file sizes are given by this
table:

       |  full index | sparse index |
       +-------------+--------------+
    v3 |     108 MiB |      1.6 MiB |
    v4 |      80 MiB |      1.2 MiB |

Future updates will improve the performance of Git commands when the
index is sparse.

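As a rough cross-check of the figures above, the following shell sketch
recomputes a few ratios from the two tables. The `ratio` helper is a
hypothetical name introduced only for this illustration (it is not part
of this patch or of Git's test suite), and the inputs are copied by hand
from the tables, so treat it as a sanity check rather than a benchmark.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): print a / b to two decimals.
ratio () {
	awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Elapsed-time ratio, sparse-index-v3 vs. full-index-v3.
echo "git status slowdown:      $(ratio 1.40 0.59)x"
echo "git commit -a slowdown:   $(ratio 3.01 2.47)x"

# v4-to-v3 index size ratio, for the full and the sparse index.
echo "v4/v3 size, full index:   $(ratio 80 108)"
echo "v4/v3 size, sparse index: $(ratio 1.2 1.6)"
```

Under one reading of "the relative file sizes are the same", the point
is that v4 is about three quarters the size of v3 in both layouts
(0.74 and 0.75), while 'git status' takes roughly 2.4 times as long
with the sparse index because the index is expanded on every run.
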
Signed-off-by: Derrick Stolee
---
 t/perf/p2000-sparse-operations.sh | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 2fbc81b22119..e527316e66d6 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -60,12 +60,29 @@ test_expect_success 'setup repo and indexes' '
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 4 &&
 		git update-index --index-version=4
+	) &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v3 &&
+	(
+		cd sparse-index-v3 &&
+		git sparse-checkout init --cone --sparse-index &&
+		git sparse-checkout set $SPARSE_CONE &&
+		git config index.version 3 &&
+		git update-index --index-version=3
+	) &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v4 &&
+	(
+		cd sparse-index-v4 &&
+		git sparse-checkout init --cone --sparse-index &&
+		git sparse-checkout set $SPARSE_CONE &&
+		git config index.version 4 &&
+		git update-index --index-version=4
 	)
 '

 test_perf_on_all () {
 	command="$@"
-	for repo in full-index-v3 full-index-v4
+	for repo in full-index-v3 full-index-v4 \
+		    sparse-index-v3 sparse-index-v4
 	do
 		test_perf "$command ($repo)" "
 			(