From: "Derrick Stolee via GitGitGadget"
Date: Fri, 25 Sep 2020 12:33:30 +0000
Subject: [PATCH v4 0/8] Maintenance II: prefetch, loose-objects, incremental-repack tasks
To: git@vger.kernel.org
Cc: sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com,
    peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com,
    emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com,
    Jonathan Tan, Derrick Stolee

This series is based on ds/maintenance-part-1 [2].

This patch series contains 8 patches that were going to be part of v4 of
ds/maintenance [1], but the discussion has gotten really long.
To help, I'm splitting out the portions that create and test the
'maintenance' builtin from the additional tasks (prefetch, loose-objects,
incremental-repack) that can be brought in later.

[1] https://lore.kernel.org/git/pull.671.git.1594131695.gitgitgadget@gmail.com/
[2] https://lore.kernel.org/git/pull.695.v3.git.1598380426.gitgitgadget@gmail.com/

As detailed in [2], the 'git maintenance run' subcommand will run certain
tasks based on config options or the --task=<task> arguments. The --auto
option tells each task to run only if an internal check finds that there
has been "enough" change in that domain to merit the work. In the case of
the 'gc' task, this also reduces the amount of work done.

The new maintenance tasks in this series are:

 * 'loose-objects' : prune packed loose objects, then create a new pack
   from a batch of loose objects.

 * 'incremental-repack' : expire redundant packs from the
   multi-pack-index, then repack using the multi-pack-index's incremental
   repack strategy.

 * 'prefetch' : fetch from each remote, storing the refs in
   'refs/prefetch/<remote>/'.

These tasks are all disabled by default, but can be enabled with config
options or run explicitly using "git maintenance run --task=<task>".

Since [2] replaced the 'git gc --auto' calls with 'git maintenance run
--auto' at the end of some Git commands, users could replace the 'gc' task
with these lighter-weight tasks for foreground maintenance.

The 'git maintenance' builtin has a 'run' subcommand so it can be extended
later with subcommands that manage background maintenance, such as 'start'
or 'stop'. These are not the subject of this series, as it is important to
focus on the maintenance activities themselves. I have an RFC series for
this available at [3].

[3] https://lore.kernel.org/git/pull.680.git.1597857408.gitgitgadget@gmail.com/

Updates in v3
=============

 * Several commit message, documentation, and test updates from Jonathan
   Tan's helpful review!

Updates since v2
================

 * Dropped "fetch: optionally allow disabling FETCH_HEAD update"

 * A lot of fallout from the change in the option parsing in v3 of
   Maintenance II.

 * Dropped the "verify, and delete and rewrite on failure" logic from the
   incremental-repack task. This might be added again later after it can
   be tested more thoroughly.

Updates since v1 (of this series)
=================================

 * PATCH 1 ("fetch: optionally allow disabling FETCH_HEAD update") was
   rewritten on-list. Getting a version out with this patch is the main
   reason for rolling a v2. (That, and Part I is re-rolled with a v2 and I
   want to make sure this series applies cleanly.)

 * The 'prefetch' and 'loose-objects' tasks had some review, but my
   proposed changes were not acked, so they may need another review.

UPDATES since v3 of [1]
=======================

 * The biggest change here is the use of "test_subcommand", based on
   Jonathan Nieder's approach. This requires having the exact command-line
   figured out, which now requires spelling out all
   --no-[quiet|progress] options. I also added a bunch of "2>/dev/null"
   checks because of the isatty(2) calls. Without that, the behavior will
   change depending on whether the test is run with -x/-v or without.

 * The 0x7FFF/0x7FFFFFFF constant problem is fixed with an EXPENSIVE test
   that verifies it.

 * The option parsing has changed to use a local struct and pass that
   struct to the helper methods. This is instead of having a global
   singleton.

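For anyone who wants to try the tasks from the overview, here is a minimal
sketch of a repository config that enables them. It assumes the
'maintenance.<task>.enabled' keys from [2] and the two 'auto' thresholds
that this series adds to Documentation/config/maintenance.txt. The numeric
values are illustrative assumptions, not recommendations.

```ini
# .git/config (illustrative values only)
[maintenance "prefetch"]
	enabled = true
[maintenance "loose-objects"]
	enabled = true
	# under --auto, run once at least this many loose objects exist
	auto = 100
[maintenance "incremental-repack"]
	enabled = true
	# under --auto, run once at least this many pack-files are
	# not covered by the multi-pack-index
	auto = 10
[maintenance "gc"]
	# optional: leave the heavier 'gc' task out of foreground maintenance
	enabled = false
```

The same tasks can also be run once, without touching config, with
"git maintenance run --task=loose-objects --task=incremental-repack".
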
Thanks,
-Stolee

Derrick Stolee (8):
  maintenance: add prefetch task
  maintenance: add loose-objects task
  maintenance: create auto condition for loose-objects
  midx: enable core.multiPackIndex by default
  midx: use start_delayed_progress()
  maintenance: add incremental-repack task
  maintenance: auto-size incremental-repack batch
  maintenance: add incremental-repack auto condition

 Documentation/config/core.txt        |   4 +-
 Documentation/config/maintenance.txt |  18 ++
 Documentation/git-maintenance.txt    |  48 ++++
 builtin/gc.c                         | 326 +++++++++++++++++++++++++++
 midx.c                               |  21 +-
 repo-settings.c                      |   6 +
 repository.h                         |   2 +
 t/t5319-multi-pack-index.sh          |  15 +-
 t/t7900-maintenance.sh               | 185 +++++++++++++++
 9 files changed, 603 insertions(+), 22 deletions(-)

base-commit: 25914c4fdeefd99b06e134496dfb9bbb58a5c417
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-696%2Fderrickstolee%2Fmaintenance%2Fgc-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-696/derrickstolee/maintenance/gc-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/696

Range-diff vs v3:

 1:  da64c51a81 ! 1:  7a62e224cf maintenance: add prefetch task
    @@ Commit message
         of a foreground fetch to make that 'git fetch' command much faster.

         However, if we simply ran 'git fetch <remote>' in the background,
    -    then the user running a foregroudn 'git fetch <remote>' would lose
    +    then the user running a foreground 'git fetch <remote>' would lose
         some important feedback when a new branch appears or an existing
         branch updates. This is especially true if a remote branch is
         force-updated and this isn't noticed by the user because it occurred

 2:  75e846456b ! 2:  f3a16fd324 maintenance: add loose-objects task
    @@ Commit message
         objects are created only by a user doing normal development.
         We noticed users with _millions_ of loose objects because VFS
         for Git downloads blobs on-demand when a file read operation
    -    requires populating a virtual file. This has potential of
    -    happening in partial clones if someone runs 'git grep' or
    -    otherwise evades the batch-download feature for requesting
    -    promisor objects.
    +    requires populating a virtual file.

         This step is based on a similar step in Scalar [1] and VFS for Git.
         [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/LooseObjectsStep.cs

 3:  d6e382c43e ! 3:  931fff4883 maintenance: create auto condition for loose-objects
    @@ t/t7900-maintenance.sh: test_expect_success 'loose-objects task' '
     +  git -c maintenance.loose-objects.auto=1 maintenance \
     +          run --auto --task=loose-objects 2>/dev/null &&
     +  test_subcommand ! git prune-packed --quiet /dev/null &&
    -+  test_subcommand ! git prune-packed --quiet /dev/null &&
    -+  test_subcommand git prune-packed --quiet /dev/null &&
    -+  test_subcommand git prune-packed --quiet /dev/null &&
    ++  test_subcommand ! git prune-packed --quiet /dev/null &&
    ++  test_subcommand git prune-packed --quiet /dev/null &&
    ++  test_subcommand git prune-packed --quiet err &&
    @@ t/t7900-maintenance.sh: test_expect_success 'maintenance.loose-objects.auto' '
     -  done
     +  test_subcommand git prune-packed --quiet /dev/null &&
     +  test_subcommand ! git multi-pack-index write --no-progress /dev/null &&
    -+  test_subcommand ! git multi-pack-index write --no-progress /dev/null &&
    -+  test_subcommand git multi-pack-index write --no-progress /dev/null &&
    ++  test_subcommand ! git multi-pack-index write --no-progress /dev/null &&
    ++  test_subcommand git multi-pack-index write --no-progress