From patchwork Wed Aug 14 10:31:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 13763259
Date: Wed, 14 Aug 2024 10:31:27 +0000
Subject: [PATCH v3 1/4] commit-reach: add get_branch_base_for_tip
Fcc: Sent
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

Add a new reachability algorithm that intends to discover (from a
heuristic) which branch was used as the starting point for a given
commit. Add focused tests using the 'test-tool reach' command.

In repositories that use pull requests (or merge requests) to advance
one or more "protected" branches, the history of that reference can be
recovered by following the first-parent history in most cases. Most are
completed using no-fast-forward merges, though squash merges are quite
common. Less common is rebase-and-merge, which still validates this
assumption. Finally, the case that breaks this assumption is the
fast-forward update (with potential rebasing). Even in this case, the
previous commit commonly appears in the first-parent history of the
branch.

Similar assumptions can be made for a topic branch created by a single
user with the intention to merge back into another branch. Using 'git
commit', 'git merge', and 'git cherry-pick' from HEAD will default to
having the first-parent commit be the previous commit at HEAD. This
history changes only with commands such as 'git reset' or 'git rebase',
where the command names also imply that the branch is starting from a
new location.

With this movement of branches in mind, the following heuristic is
proposed as a way to determine the base branch for a given source
branch:

  Among a list of candidate base branches, select the candidate that
  minimizes the number of commits in the first-parent history of the
  source that are not in the first-parent history of the candidate.

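As an illustration only (not part of this patch), the heuristic can be
written as a brute-force C sketch over a toy history. It assumes the
grid-shaped first-parent history that the t6600 tests in this patch
describe, where the first parent of (i,j) is (i,j-1) when j > 1 and
(i-1,1) otherwise. Every name in it (toy_commit, toy_cost,
toy_best_base) is hypothetical.

```c
#include <stddef.h>

/* Hypothetical toy commit: the point (i,j) of the test grid. */
struct toy_commit {
	int i, j;
};

/* Step to the first parent in place; return 0 at the root (1,1). */
static int toy_first_parent(struct toy_commit *c)
{
	if (c->j > 1) {
		c->j--;
		return 1;
	}
	if (c->i > 1) {
		c->i--;
		return 1;
	}
	return 0;
}

/* Is 'needle' in the first-parent history of 'from' (inclusive)? */
static int toy_in_first_parent_history(struct toy_commit needle,
				       struct toy_commit from)
{
	do {
		if (from.i == needle.i && from.j == needle.j)
			return 1;
	} while (toy_first_parent(&from));
	return 0;
}

/*
 * Count the commits in the first-parent history of 'source' that are
 * not in the first-parent history of 'candidate'. First-parent
 * histories are chains, so counting can stop at the first shared
 * commit. Returns -1 if the two chains never meet.
 */
static int toy_cost(struct toy_commit source, struct toy_commit candidate)
{
	int cost = 0;

	do {
		if (toy_in_first_parent_history(source, candidate))
			return cost;
		cost++;
	} while (toy_first_parent(&source));
	return -1;
}

/* Index of the cheapest candidate; ties go to the lowest index. */
int toy_best_base(struct toy_commit source,
		  const struct toy_commit *bases, size_t bases_nr)
{
	int best = -1, best_cost = -1;

	for (size_t k = 0; k < bases_nr; k++) {
		int cost = toy_cost(source, bases[k]);

		if (cost < 0)
			continue;
		if (best < 0 || cost < best_cost) {
			best = (int)k;
			best_cost = cost;
		}
	}
	return best;
}
```

With the tip (2,3) and the candidates (1,2), (1,4), (4,4), (8,4) and
(10,4), toy_best_base() selects index 2, which agrees with the value the
first new t6600 test expects from get_branch_base_for_tip. This sketch
compares whole histories per candidate, so it is far slower than the
single walk the patch implements.
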
Prior third-party solutions to this problem have used this optimization
criterion, but have relied upon extracting the first-parent history and
comparing those lists as tables instead of using commit-graph walks.

Given current command-line interface options, this optimization
criterion is not easy to detect directly. Even using the command

  git rev-list --count --first-parent <base>..<source>

does not measure this count, as it uses full reachability from <base> to
determine which commits to remove from the range '<base>..<source>'.

This may lead one to ask whether we should instead use the full
reachability of the candidate and only the first-parent history of the
source. This, unfortunately, does not work for repositories that use
long-lived branches and automation to merge across those branches.

In extremely large repositories, merging into a single trunk may not be
feasible. This is usually due to the desired frequency of updates
(thousands of engineers doing daily work) combined with the time
required to perform a validation build. These factors combine to create
significant risk of semantic merge conflicts, leading to build breaks on
the trunk.

In response, repository maintainers can create a single Level Zero (L0)
trunk and multiple Level One (L1) branches. By partitioning the
engineers by organization, these engineers may see lower risk of
semantic merge conflicts as well as be protected against build breaks in
other L1 branches. The key to making this system work is a
semi-automated process of merging L1 branches into the L0 trunk and
vice-versa. In a large enough organization, these L1 branches may
further split into L2 or L3 branches, but the same principles apply for
merging across deeper levels.

If these automated merges use a typical merge with the second parent
bringing in the "new" content, then each L0 and L1 branch can track its
previous positions by following first-parent history, which appear as
parallel paths (until reaching the first place where the branches
diverged).
If we also walk to second parents, then the histories overlap
significantly and cannot be distinguished except for very recent
changes. For this reason, the first-parent condition should be
symmetrical across the base and source branches.

Another common case for desiring the result of this optimization method
is the use of release branches. When releasing a version of a
repository, a branch can be used to track that release. Any fixes that
are worth making in that release can be merged to the release branch and
shipped with only the necessary fixes without any new features
introduced in the trunk branch.

The 'maint-2.<X>' branches represent this pattern in the Git project.
The microsoft/git fork uses 'vfs-2.<X>.<Y>' branches to track the
changes that are custom to that fork on top of each upstream Git release
2.<X>.<Y>. This application doesn't need the symmetrical first-parent
condition, but the use of first-parent histories does not change the
results for these branches.

To determine the base branch from a list of candidates, create a new
method in commit-reach.c that performs a single* commit-graph walk. The
core concept is to walk first-parents starting at the candidate bases
and the source, tracking the "best" base to reach a given commit. Use
generation numbers to ensure that a commit is walked at most once and
all children have been explored before visiting it.

When reaching a commit that is reachable from both a base and the
source, we will then have a guarantee that this is the closest
intersection of first-parent histories. Track the best base to reach
that commit and return it as a result. In rare cases involving multiple
root commits, the first-parent history of the source may never intersect
any of the candidates, and thus no base is found (the method returns -1).

* There are up to two walks, since we require all commits to have a
  computed generation number in order to avoid incorrect results.
  This is similar to the need for computed generation numbers in
  ahead_behind() as implemented in fd67d149bde (commit-reach: implement
  ahead_behind() logic, 2023-03-20).

In order to track the "best" base, use a new commit slab that stores an
integer. This value defaults to zero upon initialization, so use -1 to
track that the source commit can reach this commit and use 'i + 1' to
track that the ith base can reach this commit. When multiple bases can
reach a commit, minimize the index to break ties. This allows the caller
to specify an order to the bases that determines some amount of
preference when the heuristic does not produce a unique result.

The trickiest part of the integer slab is what happens when reaching a
collision among the histories of the bases and the history of the
source. This is noticed when viewing the first parent and seeing that it
has a slab value that differs in sign (negative or positive). In this
case, the collision commit is stored in the method variable
'branch_point' and its slab value is set to -1. The index of the best
base (so far) is stored in the method variable 'best_index'. It is
possible that there are multiple commits that have the branch_point as
their first parent, leading to multiple updates of best_index. The
result is determined when 'branch_point' is visited in the commit walk,
giving the guarantee that all commits that could reach 'branch_point'
were visited.

Several interesting cases of collisions and different results are tested
in the t6600-test-reach.sh script. Recall that this script also tests
the algorithm in three possible states involving the commit-graph file
and how many commits are written in the file. This provides some
coverage of the need (and lack of need) for the
ensure_generations_valid() method.

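The slab encoding and the walk described above can be sketched compactly
over the same grid-shaped toy history that the t6600 tests use, rather
than over real commits. This is an illustration under stated
assumptions, not the patch's code: a plain int array stands in for the
commit slab, a linear scan stands in for the priority queue, and
i + j - 1 stands in for the generation number of (i,j). All toy_* names
are hypothetical.

```c
#include <stddef.h>

#define TOY_N 12 /* the toy grid holds commits (i,j) with 1 <= i,j <= TOY_N */

/* Map the toy commit (i,j) to an array index. */
int toy_id(int i, int j)
{
	return (i - 1) * TOY_N + (j - 1);
}

/* First parent of a toy commit, or -1 at the root (1,1). */
static int toy_parent(int id)
{
	int i = id / TOY_N + 1, j = id % TOY_N + 1;

	if (j > 1)
		return toy_id(i, j - 1);
	if (i > 1)
		return toy_id(i - 1, 1);
	return -1;
}

/* Stand-in generation number: always larger than the first parent's. */
static int toy_generation(int id)
{
	return id / TOY_N + id % TOY_N + 1;
}

int toy_branch_base_for_tip(int tip, const int *bases, size_t bases_nr)
{
	/* Slab stand-in. 0: unseen, -1: reached from tip, k + 1: bases[k]. */
	int best[TOY_N * TOY_N] = { 0 };
	int queue[TOY_N * TOY_N]; /* each commit is queued at most once */
	size_t queue_nr = 0;
	int best_index = -1, branch_point = -1;

	best[tip] = -1;
	queue[queue_nr++] = tip;
	for (size_t k = 0; k < bases_nr; k++) {
		if (best[bases[k]] == -1)
			return (int)k; /* this base is the tip itself */
		if (best[bases[k]])
			continue; /* duplicate of an earlier base */
		best[bases[k]] = (int)k + 1;
		queue[queue_nr++] = bases[k];
	}

	while (queue_nr) {
		size_t top = 0;
		int c, parent, positive;

		/* Pop a queued commit with the highest generation. */
		for (size_t q = 1; q < queue_nr; q++)
			if (toy_generation(queue[q]) > toy_generation(queue[top]))
				top = q;
		c = queue[top];
		queue[top] = queue[--queue_nr];

		if (c == branch_point)
			break; /* every child of the collision was visited */
		parent = toy_parent(c);
		if (parent < 0)
			continue;
		if (!best[parent]) {
			/* New commit: pass the mark along and keep walking. */
			best[parent] = best[c];
			queue[queue_nr++] = parent;
			continue;
		}
		if (best[parent] > 0 && best[c] > 0) {
			/* Two bases collide: keep the smaller index. */
			if (best[c] < best[parent])
				best[parent] = best[c];
			continue;
		}
		/* Signs differ: the tip history meets a base history here. */
		positive = best[c] < 0 ? best[parent] : best[c];
		if (best_index < 0 || positive < best_index)
			best_index = positive;
		best[parent] = -1;
		branch_point = parent;
	}
	return best_index > 0 ? best_index - 1 : -1;
}
```

For example, with tip toy_id(2, 3) and bases (1,2), (1,4), (4,4), (8,4),
(10,4), the histories of the last three bases all collide with the tip's
history at (2,1), and the smallest of those indices, 2, is returned.
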
Signed-off-by: Derrick Stolee
---
 commit-reach.c        | 126 ++++++++++++++++++++++++++++++++++++++++++
 commit-reach.h        |  17 ++++++
 t/helper/test-reach.c |   2 +
 t/t6600-test-reach.sh |  61 ++++++++++++++++++++
 4 files changed, 206 insertions(+)

diff --git a/commit-reach.c b/commit-reach.c
index 8f9b008f876..4753197ec88 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -1222,3 +1222,129 @@ done:
 	free(commits);
 	repo_clear_commit_marks(r, SEEN);
 }
+
+/*
+ * This slab initializes integers to zero, so use "-1" for "tip is best" and
+ * "i + 1" for "bases[i] is best".
+ */
+define_commit_slab(best_branch_base, int);
+static struct best_branch_base best_branch_base;
+#define get_best(c) (*best_branch_base_at(&best_branch_base, (c)))
+#define set_best(c,v) (*best_branch_base_at(&best_branch_base, (c)) = (v))
+
+int get_branch_base_for_tip(struct repository *r,
+			    struct commit *tip,
+			    struct commit **bases,
+			    size_t bases_nr)
+{
+	int best_index = -1;
+	struct commit *branch_point = NULL;
+	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
+	int found_missing_gen = 0;
+
+	if (!bases_nr)
+		return -1;
+
+	repo_parse_commit(r, tip);
+	if (commit_graph_generation(tip) == GENERATION_NUMBER_INFINITY)
+		found_missing_gen = 1;
+
+	/* Check for missing generation numbers. */
+	for (size_t i = 0; i < bases_nr; i++) {
+		struct commit *c = bases[i];
+		repo_parse_commit(r, c);
+		if (commit_graph_generation(c) == GENERATION_NUMBER_INFINITY)
+			found_missing_gen = 1;
+	}
+
+	if (found_missing_gen) {
+		struct commit **commits;
+		size_t commits_nr = bases_nr + 1;
+
+		CALLOC_ARRAY(commits, commits_nr);
+		COPY_ARRAY(commits, bases, bases_nr);
+		commits[bases_nr] = tip;
+		ensure_generations_valid(r, commits, commits_nr);
+		free(commits);
+	}
+
+	/* Initialize queue and slab now that generations are guaranteed. */
+	init_best_branch_base(&best_branch_base);
+	set_best(tip, -1);
+	prio_queue_put(&queue, tip);
+
+	for (size_t i = 0; i < bases_nr; i++) {
+		struct commit *c = bases[i];
+		int best = get_best(c);
+
+		/* Has this already been marked as best by another commit? */
+		if (best) {
+			if (best == -1) {
+				/* We agree at this position. Stop now. */
+				best_index = i + 1;
+				goto cleanup;
+			}
+			continue;
+		}
+
+		set_best(c, i + 1);
+		prio_queue_put(&queue, c);
+	}
+
+	while (queue.nr) {
+		struct commit *c = prio_queue_get(&queue);
+		int best_for_c = get_best(c);
+		int best_for_p, positive;
+		struct commit *parent;
+
+		/* Have we reached a known branch point? It's optimal. */
+		if (c == branch_point)
+			break;
+
+		repo_parse_commit(r, c);
+		if (!c->parents)
+			continue;
+
+		parent = c->parents->item;
+		repo_parse_commit(r, parent);
+		best_for_p = get_best(parent);
+
+		if (!best_for_p) {
+			/* 'parent' is new, so pass along best_for_c. */
+			set_best(parent, best_for_c);
+			prio_queue_put(&queue, parent);
+			continue;
+		}
+
+		if (best_for_p > 0 && best_for_c > 0) {
+			/* Collision among bases. Minimize. */
+			if (best_for_c < best_for_p)
+				set_best(parent, best_for_c);
+			continue;
+		}
+
+		/*
+		 * At this point, we have reached a commit that is reachable
+		 * from the tip, either from 'c' or from an earlier commit to
+		 * have 'parent' as its first parent.
+		 *
+		 * Update 'best_index' to match the minimum of all base indices
+		 * to reach 'parent'.
+		 */
+
+		/* Exactly one is positive due to initial conditions. */
+		positive = (best_for_c < 0) ? best_for_p : best_for_c;
+
+		if (best_index < 0 || positive < best_index)
+			best_index = positive;
+
+		/* No matter what, track that the parent is reachable from tip. */
+		set_best(parent, -1);
+		branch_point = parent;
+	}
+
+cleanup:
+	clear_best_branch_base(&best_branch_base);
+	clear_prio_queue(&queue);
+	return best_index > 0 ? best_index - 1 : -1;
+}
diff --git a/commit-reach.h b/commit-reach.h
index bf63cc468fd..9a745b7e176 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -139,4 +139,21 @@ void tips_reachable_from_bases(struct repository *r,
 			       struct commit **tips, size_t tips_nr,
 			       int mark);
 
+/*
+ * Given a 'tip' commit and a list of potential 'bases', return the index 'i'
+ * that minimizes the number of commits in the first-parent history of 'tip'
+ * and not in the first-parent history of 'bases[i]'.
+ *
+ * Among a list of long-lived branches that are updated only by merges (with the
+ * first parent being the previous position of the branch), this would inform
+ * which branch was used to create the tip reference.
+ *
+ * Returns -1 if no common point is found in first-parent histories, which is
+ * rare, but possible with multiple root commits.
+ */
+int get_branch_base_for_tip(struct repository *r,
+			    struct commit *tip,
+			    struct commit **bases,
+			    size_t bases_nr);
+
 #endif
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index 1e3b431e3e7..8579b607aa5 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -114,6 +114,8 @@ int cmd__reach(int ac, const char **av)
 		       repo_in_merge_bases_many(the_repository, A, X_nr, X_array, 0));
 	else if (!strcmp(av[1], "is_descendant_of"))
 		printf("%s(A,X):%d\n", av[1], repo_is_descendant_of(r, A, X));
+	else if (!strcmp(av[1], "get_branch_base_for_tip"))
+		printf("%s(A,X):%d\n", av[1], get_branch_base_for_tip(r, A, X_array, X_nr));
 	else if (!strcmp(av[1], "get_merge_bases_many")) {
 		struct commit_list *list = NULL;
 		if (repo_get_merge_bases_many(the_repository,
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index b330945f497..e789a4720c1 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -612,4 +612,65 @@ test_expect_success 'for-each-ref merged:none' '
 		--format="%(refname)" --stdin
 '
 
+# For get_branch_base_for_tip, we only care about
+# first-parent history. Here is the test graph with
+# second parents removed:
+#
+#          (10,10)
+#         /
+#      (10,9)    (9,10)
+#     /         /
+#  (10,8)    (9,9)      (8,10)
+# /         /          /
+#         ( continued...)
+#  \       /         /         /
+#   (3,1)     (2,2)     (1,3)
+#      \     /         /
+#       (2,1)     (1,2)
+#          \     /
+#           (1,1)
+#
+# In short, for a commit (i,j), the first-parent history
+# walks all commits (i, k) with k from j to 1, then the
+# commits (l, 1) with l from i to 1.
+
+test_expect_success 'get_branch_base_for_tip: none reach' '
+	# (2,3) branched from the first tip (i,4) in X with i > 2
+	cat >input <<-\EOF &&
+	A:commit-2-3
+	X:commit-1-2
+	X:commit-1-4
+	X:commit-4-4
+	X:commit-8-4
+	X:commit-10-4
+	EOF
+	echo "get_branch_base_for_tip(A,X):2" >expect &&
+	test_all_modes get_branch_base_for_tip
+'
+
+test_expect_success 'get_branch_base_for_tip: equal to tip' '
+	# (8,4) is itself one of the candidates in X
+	cat >input <<-\EOF &&
+	A:commit-8-4
+	X:commit-1-2
+	X:commit-1-4
+	X:commit-4-4
+	X:commit-8-4
+	X:commit-10-4
+	EOF
+	echo "get_branch_base_for_tip(A,X):3" >expect &&
+	test_all_modes get_branch_base_for_tip
+'
+
+test_expect_success 'get_branch_base_for_tip: all reach tip' '
+	# (4,1) is in the first-parent history of every candidate in X,
+	# so the tie is broken by the lowest index
+	cat >input <<-\EOF &&
+	A:commit-4-1
+	X:commit-4-2
+	X:commit-5-1
+	EOF
+	echo "get_branch_base_for_tip(A,X):0" >expect &&
+	test_all_modes get_branch_base_for_tip
+'
+
 test_done

From patchwork Wed Aug 14 10:31:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 13763260
Message-Id: <5240c2a7b328e3d356574a1ab00e2faa8a71d92a.1723631490.git.gitgitgadget@gmail.com>
Date: Wed, 14 Aug 2024 10:31:28 +0000
Subject: [PATCH v3 2/4] commit: add gentle reference lookup method
Fcc: Sent
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

The lookup_commit_reference_by_name() method uses
lookup_commit_reference() without an option to use
lookup_commit_reference_gently(). Create a gentle version of the method
so it can be used in locations where non-commits may be found but error
messages should be silenced.

Signed-off-by: Derrick Stolee
---
 commit.c | 8 +++++++-
 commit.h | 2 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/commit.c b/commit.c
index 1a479a997c4..ed49be8dce5 100644
--- a/commit.c
+++ b/commit.c
@@ -82,13 +82,19 @@ struct commit *lookup_commit(struct repository *r, const struct object_id *oid)
 }
 
 struct commit *lookup_commit_reference_by_name(const char *name)
+{
+	return lookup_commit_reference_by_name_gently(name, 0);
+}
+
+struct commit *lookup_commit_reference_by_name_gently(const char *name,
+						      int quiet)
 {
 	struct object_id oid;
 	struct commit *commit;
 
 	if (repo_get_oid_committish(the_repository, name, &oid))
 		return NULL;
-	commit = lookup_commit_reference(the_repository, &oid);
+	commit = lookup_commit_reference_gently(the_repository, &oid, quiet);
 	if (repo_parse_commit(the_repository, commit))
 		return NULL;
 	return commit;
diff --git a/commit.h b/commit.h
index 62fe0d77a70..ef17668cc69 100644
--- a/commit.h
+++ b/commit.h
@@ -81,6 +81,8 @@ struct commit *lookup_commit_reference_gently(struct repository *r,
 					      const struct object_id *oid,
 					      int quiet);
 struct commit *lookup_commit_reference_by_name(const char *name);
+struct commit *lookup_commit_reference_by_name_gently(const char *name,
+						      int quiet);
 
 /*
  * Look up object named by "oid", dereference tag as necessary,

From patchwork Wed Aug 14 10:31:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 13763261
Date: Wed, 14 Aug 2024 10:31:29 +0000
Subject: [PATCH v3 3/4] for-each-ref: add 'is-base' token
Fcc: Sent
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

The previous change introduced the get_branch_base_for_tip() method
in commit-reach.c. The motivation of that change was about using a
heuristic to determine the base branch for a source commit from a list
of candidate commit tips. This change makes that algorithm visible to
users via a new atom in the 'git for-each-ref' format. This change is
very similar to the change in 49abcd21da6 (for-each-ref: add
ahead-behind format atom, 2023-03-20).

Introduce the 'is-base:<committish>' atom, which will indicate that the
algorithm should be computed and the result of the algorithm is reported
using an indicator of the form '(<committish>)'. For example, using
'%(is-base:HEAD)' would result in one line having the token '(HEAD)'.

Use the sorted order of refs included in the ref filter to break ties in
the algorithm's heuristic. In the previous change, the motivating
examples include using an L0 trunk, long-lived L1 branches, and
temporary release branches. A caller could communicate the ordered
preference among these categories using the input refspecs and avoiding
a different sort mechanism. This sorting behavior is tested in the test
scripts.

It is important to include this atom as a special case to
can_do_iterative_format() to match the expectations created in
bd98f9774e1 (ref-filter.c: filter & format refs in the same callback,
2023-11-14). The ahead-behind atom was one of the special cases, and
this similarly requires using an algorithm across all input refs before
starting the format of any single ref.

In the test script, the format tokens use colons or lack whitespace to
avoid Git complaining about trailing whitespace errors.

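To make the shape of the output concrete, here is a small C sketch of
one output line for a format such as '%(refname):%(is-base:D)'. It is
an illustration only: toy_format_is_base_line() is a hypothetical
helper that does not exist in ref-filter.c, and the caller is assumed to
already know which single ref the heuristic selected.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of Git: write "<refname>:(<tip>)" for the
 * one ref chosen as the base of <tip>, and "<refname>:" for every other
 * ref. Returns what snprintf() returns.
 */
int toy_format_is_base_line(char *out, size_t out_len,
			    const char *refname, const char *tip_name,
			    int is_chosen_base)
{
	if (is_chosen_base)
		return snprintf(out, out_len, "%s:(%s)", refname, tip_name);
	return snprintf(out, out_len, "%s:", refname);
}
```

Called for refs/heads/A, refs/heads/B and refs/heads/C with only B
marked as the chosen base of D, this yields "refs/heads/A:",
"refs/heads/B:(D)" and "refs/heads/C:", with the token appearing on at
most one line.
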
Signed-off-by: Derrick Stolee
---
 Documentation/git-for-each-ref.txt | 42 ++++++++++++++++
 ref-filter.c                       | 77 +++++++++++++++++++++++++++++-
 ref-filter.h                       | 15 ++++++
 t/t6300-for-each-ref.sh            |  9 ++++
 t/t6600-test-reach.sh              | 60 +++++++++++++++++++++++
 5 files changed, 202 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index c1dd12b93cf..d3764401a23 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -264,6 +264,48 @@ ahead-behind:<committish>::
 	commits ahead and behind, respectively, when comparing the output
 	ref to the `<committish>` specified in the format.
 
+is-base:<committish>::
+	In at most one row, `(<committish>)` will appear to indicate the ref
+	that is most likely the ref used as a starting point for the branch
+	that produced `<committish>`. This choice is made using a heuristic:
+	choose the ref that minimizes the number of commits in the
+	first-parent history of `<committish>` and not in the first-parent
+	history of the ref.
++
+For example, consider the following figure of first-parent histories of
+several refs:
++
+----
+*--*--*--*--*--* refs/heads/A
+\
+ \
+  *--*--*--* refs/heads/B
+   \     \
+    \     \
+     *     * refs/heads/C
+      \
+       \
+        *--* refs/heads/D
+----
++
+Here, if `A`, `B`, and `C` are the filtered references, and the format
+string is `%(refname):%(is-base:D)`, then the output would be
++
+----
+refs/heads/A:
+refs/heads/B:(D)
+refs/heads/C:
+----
++
+This is because the first-parent history of `D` has its earliest
+intersection with the first-parent histories of the filtered refs at a
+common first-parent ancestor of `B` and `C` and ties are broken by the
+earliest ref in the sorted order.
++
+Note that this token will not appear if the first-parent history of
+`<committish>` does not intersect the first-parent histories of the
+filtered refs.
+
 describe[:options]::
 	A human-readable name, like linkgit:git-describe[1];
 	empty string for undescribable commits. The `describe` string may
diff --git a/ref-filter.c b/ref-filter.c
index 59ad6f54ddb..3d598f6b6e6 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -167,6 +167,7 @@ enum atom_type {
 	ATOM_ELSE,
 	ATOM_REST,
 	ATOM_AHEADBEHIND,
+	ATOM_ISBASE,
 };
 
 /*
@@ -889,6 +890,23 @@ static int ahead_behind_atom_parser(struct ref_format *format,
 	return 0;
 }
 
+static int is_base_atom_parser(struct ref_format *format,
+			       struct used_atom *atom UNUSED,
+			       const char *arg, struct strbuf *err)
+{
+	struct string_list_item *item;
+
+	if (!arg)
+		return strbuf_addf_ret(err, -1, _("expected format: %%(is-base:<committish>)"));
+
+	item = string_list_append(&format->is_base_tips, arg);
+	item->util = lookup_commit_reference_by_name(arg);
+	if (!item->util)
+		die("failed to find '%s'", arg);
+
+	return 0;
+}
+
 static int head_atom_parser(struct ref_format *format UNUSED,
 			    struct used_atom *atom,
 			    const char *arg, struct strbuf *err)
@@ -952,6 +970,7 @@ static struct {
 	[ATOM_ELSE] = { "else", SOURCE_NONE },
 	[ATOM_REST] = { "rest", SOURCE_NONE, FIELD_STR, rest_atom_parser },
 	[ATOM_AHEADBEHIND] = { "ahead-behind", SOURCE_OTHER, FIELD_STR, ahead_behind_atom_parser },
+	[ATOM_ISBASE] = { "is-base", SOURCE_OTHER, FIELD_STR, is_base_atom_parser },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -2334,6 +2353,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 	int i;
 	struct object_info empty = OBJECT_INFO_INIT;
 	int ahead_behind_atoms = 0;
+	int is_base_atoms = 0;
 
 	CALLOC_ARRAY(ref->value, used_atom_cnt);
 
@@ -2475,6 +2495,15 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup("");
 			}
 			continue;
+		} else if (atom_type == ATOM_ISBASE) {
+			if (ref->is_base && ref->is_base[is_base_atoms]) {
+				v->s = xstrfmt("(%s)", ref->is_base[is_base_atoms]);
+				free(ref->is_base[is_base_atoms]);
+			} else {
+				v->s = xstrdup("");
+			}
+			is_base_atoms++;
+			continue;
 		} else
 			continue;
 
@@ -2876,6 +2905,7 @@ static void free_array_item(struct ref_array_item *item)
 		free(item->value);
 	}
 	free(item->counts);
+	free(item->is_base);
 	free(item);
 }
 
@@ -3040,6 +3070,49 @@ void filter_ahead_behind(struct repository *r,
 	free(commits);
 }
 
+void filter_is_base(struct repository *r,
+		    struct ref_format *format,
+		    struct ref_array *array)
+{
+	struct commit **bases;
+	size_t bases_nr = 0;
+	struct ref_array_item **back_index;
+
+	if (!format->is_base_tips.nr || !array->nr)
+		return;
+
+	CALLOC_ARRAY(back_index, array->nr);
+	CALLOC_ARRAY(bases, array->nr);
+
+	for (size_t i = 0; i < array->nr; i++) {
+		const char *name = array->items[i]->refname;
+		struct commit *c = lookup_commit_reference_by_name_gently(name, 1);
+
+		CALLOC_ARRAY(array->items[i]->is_base, format->is_base_tips.nr);
+
+		if (!c)
+			continue;
+
+		back_index[bases_nr] = array->items[i];
+		bases[bases_nr] = c;
+		bases_nr++;
+	}
+
+	for (size_t i = 0; i < format->is_base_tips.nr; i++) {
+		struct commit *tip = format->is_base_tips.items[i].util;
+		int base_index = get_branch_base_for_tip(r, tip, bases, bases_nr);
+
+		if (base_index < 0)
+			continue;
+
+		/* Store the string for use in output later.
*/ + back_index[base_index]->is_base[i] = xstrdup(format->is_base_tips.items[i].string); + } + + free(back_index); + free(bases); +} + static int do_filter_refs(struct ref_filter *filter, unsigned int type, each_ref_fn fn, void *cb_data) { int ret = 0; @@ -3126,7 +3199,8 @@ static inline int can_do_iterative_format(struct ref_filter *filter, return !(filter->reachable_from || filter->unreachable_from || sorting || - format->bases.nr); + format->bases.nr || + format->is_base_tips.nr); } void filter_and_format_refs(struct ref_filter *filter, unsigned int type, @@ -3150,6 +3224,7 @@ void filter_and_format_refs(struct ref_filter *filter, unsigned int type, struct ref_array array = { 0 }; filter_refs(&array, filter, type); filter_ahead_behind(the_repository, format, &array); + filter_is_base(the_repository, format, &array); ref_array_sort(sorting, &array); print_formatted_ref_array(&array, format); ref_array_clear(&array); diff --git a/ref-filter.h b/ref-filter.h index 0ca28d2bba6..20419a56218 100644 --- a/ref-filter.h +++ b/ref-filter.h @@ -48,6 +48,7 @@ struct ref_array_item { struct commit *commit; struct atom_value *value; struct ahead_behind_count **counts; + char **is_base; char refname[FLEX_ARRAY]; }; @@ -101,6 +102,9 @@ struct ref_format { /* List of bases for ahead-behind counts. */ struct string_list bases; + /* List of bases for is-base indicators. */ + struct string_list is_base_tips; + struct { int max_count; int omit_empty; @@ -114,6 +118,7 @@ struct ref_format { #define REF_FORMAT_INIT { \ .use_color = -1, \ .bases = STRING_LIST_INIT_DUP, \ + .is_base_tips = STRING_LIST_INIT_DUP, \ } /* Macros for checking --merged and --no-merged options */ @@ -203,6 +208,16 @@ void filter_ahead_behind(struct repository *r, struct ref_format *format, struct ref_array *array); +/* + * If the provided format includes is-base atoms, then compute the base checks + * for those tips against all refs. + * + * If this is not called, then any is-base atoms will be blank. 
+ */ +void filter_is_base(struct repository *r, + struct ref_format *format, + struct ref_array *array); + void ref_filter_init(struct ref_filter *filter); void ref_filter_clear(struct ref_filter *filter); diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh index eb6c8204e8b..8d15713cc67 100755 --- a/t/t6300-for-each-ref.sh +++ b/t/t6300-for-each-ref.sh @@ -1907,6 +1907,15 @@ test_expect_success 'git for-each-ref with nested tags' ' test_cmp expect actual ' +test_expect_success 'is-base atom with non-commits' ' + git for-each-ref --format="%(is-base:HEAD) %(refname)" >out 2>err && + grep "(HEAD) refs/heads/main" out && + + test_line_count = 2 err && + grep "error: object .* is a commit, not a blob" err && + grep "error: bad tag pointer to" err +' + GRADE_FORMAT="%(signature:grade)%0a%(signature:key)%0a%(signature:signer)%0a%(signature:fingerprint)%0a%(signature:primarykeyfingerprint)" TRUSTLEVEL_FORMAT="%(signature:trustlevel)%0a%(signature:key)%0a%(signature:signer)%0a%(signature:fingerprint)%0a%(signature:primarykeyfingerprint)" diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh index e789a4720c1..2591f8b8b39 100755 --- a/t/t6600-test-reach.sh +++ b/t/t6600-test-reach.sh @@ -673,4 +673,64 @@ test_expect_success 'get_branch_base_for_tip: all reach tip' ' test_all_modes get_branch_base_for_tip ' +test_expect_success 'for-each-ref is-base: none reach' ' + cat >input <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-4-2 + refs/heads/commit-4-4 + refs/heads/commit-8-4 + EOF + cat >expect <<-\EOF && + refs/heads/commit-1-1: + refs/heads/commit-4-2:(commit-2-3) + refs/heads/commit-4-4: + refs/heads/commit-8-4: + EOF + run_all_modes git for-each-ref \ + --format="%(refname):%(is-base:commit-2-3)" --stdin +' + +test_expect_success 'for-each-ref is-base: all reach' ' + cat >input <<-\EOF && + refs/heads/commit-4-2 + refs/heads/commit-5-1 + EOF + cat >expect <<-\EOF && + refs/heads/commit-4-2:(commit-4-1) + refs/heads/commit-5-1: + EOF + 
run_all_modes git for-each-ref \ + --format="%(refname):%(is-base:commit-4-1)" --stdin +' + +test_expect_success 'for-each-ref is-base: equal to tip' ' + cat >input <<-\EOF && + refs/heads/commit-4-2 + refs/heads/commit-5-1 + EOF + cat >expect <<-\EOF && + refs/heads/commit-4-2:(commit-4-2) + refs/heads/commit-5-1: + EOF + run_all_modes git for-each-ref \ + --format="%(refname):%(is-base:commit-4-2)" --stdin +' + +test_expect_success 'for-each-ref is-base:multiple' ' + cat >input <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-4-2 + refs/heads/commit-4-4 + refs/heads/commit-8-4 + EOF + cat >expect <<-\EOF && + refs/heads/commit-1-1[-] + refs/heads/commit-4-2[(commit-2-3)-] + refs/heads/commit-4-4[-] + refs/heads/commit-8-4[-(commit-6-5)] + EOF + run_all_modes git for-each-ref \ + --format="%(refname)[%(is-base:commit-2-3)-%(is-base:commit-6-5)]" --stdin +' + test_done

From patchwork Wed Aug 14 10:31:30 2024
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 13763262
Date: Wed, 14 Aug 2024 10:31:30 +0000
Subject: [PATCH v3 4/4] p1500: add is-base performance tests
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, Derrick Stolee , Derrick Stolee
From: Derrick Stolee

The previous two changes introduced a commit walking heuristic for finding the most likely base branch for a given source. This algorithm walks first-parent histories until reaching a collision. This walk _should_ be very fast. Exceptions include cases where a commit-graph file does not exist, leading to a full walk of all reachable commits to compute generation numbers, or a case where no collision in the first-parent history exists, leading to a walk of all first-parent history to the root commits. The p1500 test script guarantees a complete commit-graph file during its setup, so we will not test that scenario.
Do create a new root commit in an effort to test the scenario of parallel first-parent histories. Even with the extra root commit, these tests take no longer than 0.02 seconds on my machine for the Git repository. However, the results are slightly more interesting in a copy of the Linux kernel repository:

Test
---------------------------------------------------------------
1500.2: ahead-behind counts: git for-each-ref              0.12
1500.3: ahead-behind counts: git branch                    0.12
1500.4: ahead-behind counts: git tag                       0.12
1500.5: contains: git for-each-ref --merged                0.04
1500.6: contains: git branch --merged                      0.04
1500.7: contains: git tag --merged                         0.04
1500.8: is-base check: test-tool reach (refs)              0.03
1500.9: is-base check: test-tool reach (tags)              0.03
1500.10: is-base check: git for-each-ref                   0.03
1500.11: is-base check: git for-each-ref (disjoint-base)   0.07

Signed-off-by: Derrick Stolee --- t/perf/p1500-graph-walks.sh | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/t/perf/p1500-graph-walks.sh b/t/perf/p1500-graph-walks.sh index e14e7620cce..5b23ce5db93 100755 --- a/t/perf/p1500-graph-walks.sh +++ b/t/perf/p1500-graph-walks.sh @@ -20,6 +20,21 @@ test_expect_success 'setup' ' echo tag-$ref || return 1 done >tags && + + echo "A:HEAD" >test-tool-refs && + for line in $(cat refs) + do + echo "X:$line" >>test-tool-refs || return 1 + done && + echo "A:HEAD" >test-tool-tags && + for line in $(cat tags) + do + echo "X:$line" >>test-tool-tags || return 1 + done && + + commit=$(git commit-tree $(git rev-parse HEAD^{tree})) && + git update-ref refs/heads/disjoint-base $commit && + git commit-graph write --reachable ' @@ -47,4 +62,20 @@ test_perf 'contains: git tag --merged' ' xargs git tag --merged=HEAD