From patchwork Wed Mar 20 22:05:02 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13598213
Date: Wed, 20 Mar 2024 18:05:02 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 01/24] Documentation/technical: describe pseudo-merge bitmaps format
Message-ID: <76e7e3b9cca7fb957384ed98f2cd32cf11ff8488.1710972293.git.me@ttaylorr.com>

Prepare to implement pseudo-merge bitmaps over the next several commits
by first describing the serialization format which will store the new
pseudo-merge bitmaps themselves.

This format is implemented as an optional extension within the bitmap v1
format, making it compatible with previous versions of Git, as well as
the original .bitmap implementation within JGit.

The format (as well as a general description of pseudo-merge bitmaps,
and motivating use-case(s)) is described in detail in the patch contents
below, but the high-level description is as follows:

  - An array of pseudo-merge bitmaps, each containing a pair of EWAH
    bitmaps: one describing the set of pseudo-merge "parents", and
    another describing the set of object(s) reachable from those
    parents.

  - A lookup table to determine which pseudo-merge(s) a given commit
    appears in. An optional extended lookup table follows when there is
    at least one commit which appears in multiple pseudo-merge groups.

  - Trailing metadata, including the number of pseudo-merge(s), number
    of unique parents, the offset within the .bitmap file for the
    pseudo-merge commit lookup table, and the size of the optional
    extension itself.

Signed-off-by: Taylor Blau
---
 Documentation/technical/bitmap-format.txt | 179 ++++++++++++++++++++++
 1 file changed, 179 insertions(+)

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index f5d200939b0..63a7177ac08 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -255,3 +255,182 @@ triplet is -
 	xor_row (4 byte integer, network byte order): ::
 	The position of the triplet whose bitmap is used to compress
 	this one, or `0xffffffff` if no such bitmap exists.
+
+Pseudo-merge bitmaps
+--------------------
+
+If the `BITMAP_OPT_PSEUDO_MERGES` flag is set, a variable number of
+bytes (preceding the name-hash cache, commit lookup table, and trailing
+checksum) of the `.bitmap` file is used to store pseudo-merge bitmaps.
+
+A "pseudo-merge bitmap" is used to refer to a pair of bitmaps, as
+follows:
+
+Commit bitmap::
+
+  A bitmap whose set bits describe the set of commits included in the
+  pseudo-merge's "merge" bitmap (as below).
+
+Merge bitmap::
+
+  A bitmap whose set bits describe the reachability closure over the set
+  of commits in the pseudo-merge's "commits" bitmap (as above). An
+  identical bitmap would be generated for an octopus merge with the same
+  set of parents as described in the commits bitmap.
+
+Pseudo-merge bitmaps can accelerate bitmap traversals when all commits
+for a given pseudo-merge are listed on either side of the traversal,
+either directly (by explicitly asking for them as part of the `HAVES`
+or `WANTS`) or indirectly (by encountering them during a fill-in
+traversal).
+
+=== Use-cases
+
+For example, suppose there exists a pseudo-merge bitmap with a large
+number of commits, all of which are listed in the `WANTS` section of
+some bitmap traversal query. When pseudo-merge bitmaps are enabled, the
+bitmap machinery can quickly determine there is a pseudo-merge which
+satisfies some subset of the wanted objects on either side of the query.
+Then, we can inflate the EWAH-compressed bitmap, and `OR` it in to the
+resulting bitmap. By contrast, without pseudo-merge bitmaps, we would
+have to repeat the decompression and `OR`-ing step over a potentially
+large number of individual bitmaps, which can take proportionally more
+time.
+
+Another benefit of pseudo-merges arises when there is some combination
+of (a) a large number of references, with (b) poor bitmap coverage, and
+(c) deep, nested trees, making fill-in traversal relatively expensive.
+For example, suppose that there are a large enough number of tags where
+bitmapping each of the tags individually is infeasible. Without
+pseudo-merge bitmaps, computing the result of, say, `git rev-list
+--use-bitmap-index --count --objects --tags` would likely require a
+large amount of fill-in traversal. But when a large quantity of those
+tags are stored together in a pseudo-merge bitmap, the bitmap machinery
+can take advantage of the fact that we only care about the union of
+objects reachable from all of those tags, and answer the query much
+faster.
+
+=== File format
+
+If enabled, pseudo-merge bitmaps are stored in an optional section at
+the end of a `.bitmap` file. The format is as follows:
+
+....
++-------------------------------------------+
+|               .bitmap File                |
++-------------------------------------------+
+|                                           |
+|  Pseudo-merge bitmaps (Variable Length)   |
+|  +---------------------------+            |
+|  | commits_bitmap (EWAH)     |            |
+|  +---------------------------+            |
+|  | merge_bitmap (EWAH)       |            |
+|  +---------------------------+            |
+|                                           |
++-------------------------------------------+
+|                                           |
+|  Lookup Table                             |
+|  +------------+--------------+            |
+|  | commit_pos |    offset    |            |
+|  +------------+--------------+            |
+|  |  4 bytes   |   8 bytes    |            |
+|  +------------+--------------+            |
+|                                           |
+|  Offset Cases:                            |
+|  -------------                            |
+|                                           |
+|  1. MSB Unset: single pseudo-merge bitmap |
+|     + offset to pseudo-merge bitmap       |
+|                                           |
+|  2. MSB Set: multiple pseudo-merges       |
+|     + offset to extended lookup table     |
+|                                           |
++-------------------------------------------+
+|                                           |
+|  Extended Lookup Table (Optional)         |
+|                                           |
+|  +----+----------+----------+----------+  |
+|  | N  | Offset 1 |   ....   | Offset N |  |
+|  +----+----------+----------+----------+  |
+|  |    | 8 bytes  |   ....   | 8 bytes  |  |
+|  +----+----------+----------+----------+  |
+|                                           |
++-------------------------------------------+
+|                                           |
+|  Pseudo-merge Metadata                    |
+|  +------------------+----------------+    |
+|  | # pseudo-merges  | # Commits      |    |
+|  +------------------+----------------+    |
+|  | 4 bytes          | 4 bytes        |    |
+|  +------------------+----------------+    |
+|                                           |
+|  +------------------+----------------+    |
+|  | Lookup offset    | Extension size |    |
+|  +------------------+----------------+    |
+|  | 8 bytes          | 8 bytes        |    |
+|  +------------------+----------------+    |
+|                                           |
++-------------------------------------------+
+....
+
+* One or more pseudo-merge bitmaps, each containing:
+
+  ** `commits_bitmap`, an EWAH-compressed bitmap describing the set of
+     commits included in this pseudo-merge.
+
+  ** `merge_bitmap`, an EWAH-compressed bitmap describing the union of
+     the set of objects reachable from all commits listed in the
+     `commits_bitmap`.
+
+* A lookup table, mapping pseudo-merged commits to the pseudo-merges
+  they belong to. Entries appear in increasing order of each commit's
+  bit position. Each entry is 12 bytes wide, and is comprised of the
+  following:
+
+  ** `commit_pos`, a 4-byte unsigned value (in network byte-order)
+     containing the bit position for this commit.
+
+  ** `offset`, an 8-byte unsigned value (also in network byte-order)
+  containing either one of two possible offsets, depending on whether or
+  not the most-significant bit is set.
+
+    *** If unset (i.e. `(offset & ((uint64_t)1<<63)) == 0`), the offset
+        (relative to the beginning of the `.bitmap` file) at which the
+        pseudo-merge bitmap for this commit can be read. This indicates
+        only a single pseudo-merge bitmap contains this commit.
+
+    *** If set (i.e. `(offset & ((uint64_t)1<<63)) != 0`), the offset
+        (again relative to the beginning of the `.bitmap` file) at which
+        the extended offset table can be located describing the set of
+        pseudo-merge bitmaps which contain this commit. This indicates
+        that multiple pseudo-merge bitmaps contain this commit.
+
+* An (optional) extended lookup table (written if and only if there is
+  at least one commit which appears in more than one pseudo-merge).
+  There are as many entries as commits which appear in multiple
+  pseudo-merges. Each entry contains the following:
+
+  ** `N`, a 4-byte unsigned value equal to the number of pseudo-merges
+     which contain a given commit.
+
+  ** An array of `N` 8-byte unsigned values, each of which is
+     interpreted as an offset (relative to the beginning of the
+     `.bitmap` file) at which a pseudo-merge bitmap for this commit can
+     be read. These values occur in no particular order.
+
+* Positions for all pseudo-merges, each stored as an 8-byte unsigned
+  value (in network byte-order) containing the offset (relative to the
+  beginning of the `.bitmap` file) of each consecutive pseudo-merge.
+
+* A 4-byte unsigned value (in network byte-order) equal to the number of
+  pseudo-merges.
+
+* A 4-byte unsigned value (in network byte-order) equal to the number of
+  unique commits which appear in any pseudo-merge.
+
+* An 8-byte unsigned value (in network byte-order) equal to the number
+  of bytes between the start of the pseudo-merge section and the
+  beginning of the lookup table.
+
+* An 8-byte unsigned value (in network byte-order) equal to the number
+  of bytes in the pseudo-merge section (including this field).

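To make the 12-byte lookup table entry described in the patch above more concrete, here is a minimal sketch of how a reader might decode one entry from a raw buffer. It is not code from this series: the names `pm_lookup_entry`, `pm_decode_lookup_entry()`, `pm_get_be32()`, `pm_get_be64()`, and `PM_EXT_BIT` are hypothetical, and clearing the most-significant bit before treating the value as a file offset is an assumption the format text does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: flag bit marking an offset into the extended table. */
#define PM_EXT_BIT ((uint64_t)1 << 63)

/* Hypothetical in-memory view of one 12-byte lookup table entry. */
struct pm_lookup_entry {
	uint32_t commit_pos; /* bit position of the commit */
	uint64_t offset;     /* file offset with the flag bit cleared (assumption) */
	int extended;        /* non-zero: offset refers to the extended table */
};

/* Read a 4-byte unsigned value stored in network byte order. */
static uint32_t pm_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read an 8-byte unsigned value stored in network byte order. */
static uint64_t pm_get_be64(const unsigned char *p)
{
	return ((uint64_t)pm_get_be32(p) << 32) | pm_get_be32(p + 4);
}

/* Decode `commit_pos` (4 bytes) followed by `offset` (8 bytes). */
static struct pm_lookup_entry pm_decode_lookup_entry(const unsigned char *p)
{
	struct pm_lookup_entry e;
	uint64_t raw;

	assert(p != NULL);
	raw = pm_get_be64(p + 4);
	e.commit_pos = pm_get_be32(p);
	/* MSB set: the commit appears in more than one pseudo-merge. */
	e.extended = (raw & PM_EXT_BIT) != 0;
	e.offset = raw & ~PM_EXT_BIT;
	return e;
}
```

A caller would then branch on `extended`: when it is zero, `offset` locates a single pseudo-merge bitmap; otherwise it locates an extended lookup table entry beginning with the 4-byte count `N`.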
From patchwork Wed Mar 20 22:05:05 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13598212
Date: Wed, 20 Mar 2024 18:05:05 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 02/24] config: repo_config_get_expiry()
Message-ID: <21d8f9dc2b4ddc8ac3f4e8f6b21bfb762fc6ab77.1710972293.git.me@ttaylorr.com>

Callers interested in parsing an approxidate from configuration
currently make use of the `git_config_get_expiry()` function via the
standard `git_config()` callback.

Introduce a `repo_config_get_expiry()` variant in the style of functions
introduced by 3b256228a6 (config: read config from a repository object,
2017-06-22) to read a single value without requiring the git_config()
callback-style approach.

This new function is similar to the existing implementation in
`git_config_get_expiry()`, however it differs in that it fills out a
`timestamp_t` value through a pointer, instead of merely checking and
discarding the result (and returning it as a string).

This function will gain its first caller in a subsequent commit to parse
a "threshold" parameter for excluding too-recent commits from
pseudo-merge groups.

Signed-off-by: Taylor Blau
---
 config.c | 18 ++++++++++++++++++
 config.h |  2 ++
 2 files changed, 20 insertions(+)

diff --git a/config.c b/config.c
index 3cfeb3d8bd9..8512da92273 100644
--- a/config.c
+++ b/config.c
@@ -2627,6 +2627,24 @@ int repo_config_get_pathname(struct repository *repo,
 	return ret;
 }
 
+int repo_config_get_expiry(struct repository *repo,
+			   const char *key, const char **dest)
+{
+	int ret;
+
+	git_config_check_init(repo);
+
+	ret = repo_config_get_string(repo, key, (char **)dest);
+	if (ret)
+		return ret;
+	if (strcmp(*dest, "now")) {
+		timestamp_t now = approxidate("now");
+		if (approxidate(*dest) >= now)
+			git_die_config(key, _("Invalid %s: '%s'"), key, *dest);
+	}
+	return ret;
+}
+
 /* Read values into protected_config. */
 static void read_protected_config(void)
 {
diff --git a/config.h b/config.h
index 5dba984f770..619db01bc27 100644
--- a/config.h
+++ b/config.h
@@ -576,6 +576,8 @@ int repo_config_get_maybe_bool(struct repository *repo,
 			       const char *key, int *dest);
 int repo_config_get_pathname(struct repository *repo,
 			     const char *key, const char **dest);
+int repo_config_get_expiry(struct repository *repo,
+			   const char *key, const char **dest);
 
 /*
  * Functions for reading protected config. By definition, protected

From patchwork Wed Mar 20 22:05:08 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13598214
Date: Wed, 20 Mar 2024 18:05:08 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 03/24] ewah: implement `ewah_bitmap_is_subset()`
Message-ID: <1347571ef4ca6329de58394bfea71927c8e08151.1710972293.git.me@ttaylorr.com>

In order to know whether a given pseudo-merge (comprised of a "parents"
bitmap and an "objects" bitmap) is "satisfied" and can be OR'd into the
bitmap result, we need to be able to quickly determine whether the
"parents" bitmap is a subset of the current set of objects reachable on
either side of a traversal.

Implement a helper function to prepare for that, which determines
whether an EWAH bitmap (the parents bitmap from the pseudo-merge) is a
subset of a non-EWAH bitmap (in this case, the results bitmap from
either side of the traversal).

This function makes use of the EWAH iterator to avoid inflating any part
of the EWAH bitmap after we determine it is not a subset of the non-EWAH
bitmap. This "fail-fast" allows us to avoid a potentially large amount
of wasted effort.

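The fail-fast subset test described above can be illustrated independently of the EWAH machinery. The following is only a sketch over two plain, uncompressed word arrays; the name `words_is_subset()` is hypothetical, and it does not use the EWAH iterator that the actual helper in this patch relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return 1 if every bit set in `sub` is also set
 * in `super`, and 0 otherwise. Words beyond the end of `super` are
 * treated as all-zero.
 */
static int words_is_subset(const uint64_t *sub, size_t sub_nr,
			   const uint64_t *super, size_t super_nr)
{
	size_t i;

	assert(sub != NULL || sub_nr == 0);
	assert(super != NULL || super_nr == 0);

	for (i = 0; i < sub_nr; i++) {
		uint64_t other = i < super_nr ? super[i] : 0;

		/* Fail fast on the first word with a bit missing from `super`. */
		if (sub[i] & ~other)
			return 0;
	}
	return 1;
}
```

The early `return 0` is the point of the technique: once one word of the candidate subset has a bit that the other bitmap lacks, none of the remaining words need to be examined (or, in the EWAH case, decompressed).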
Signed-off-by: Taylor Blau
---
 ewah/bitmap.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 ewah/ewok.h   |  1 +
 2 files changed, 44 insertions(+)

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index ac7e0af622a..5bdae3fb07b 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -138,6 +138,49 @@ void bitmap_or(struct bitmap *self, const struct bitmap *other)
 		self->words[i] |= other->words[i];
 }
 
+int ewah_bitmap_is_subset(struct ewah_bitmap *self, struct bitmap *other)
+{
+	struct ewah_iterator it;
+	eword_t word;
+	size_t i;
+
+	ewah_iterator_init(&it, self);
+
+	for (i = 0; i < other->word_alloc; i++) {
+		if (!ewah_iterator_next(&word, &it)) {
+			/*
+			 * If we reached the end of `self`, and haven't
+			 * rejected `self` as a possible subset of
+			 * `other` yet, then we are done and `self` is
+			 * indeed a subset of `other`.
+			 */
+			return 1;
+		}
+		if (word & ~other->words[i]) {
+			/*
+			 * Otherwise, compare the next pair of words. If
+			 * the word from `self` has bit(s) not in the
+			 * word from `other`, `self` is not a subset of
+			 * `other`.
+			 */
+			return 0;
+		}
+	}
+
+	/*
+	 * If we got to this point, there may be zero or more words
+	 * remaining in `self`, with no remaining words left in `other`.
+	 * If there are any bits set in the remaining word(s) in `self`,
+	 * then `self` is not a subset of `other`.
+	 */
+	while (ewah_iterator_next(&word, &it))
+		if (word)
+			return 0;
+
+	/* `self` is definitely a subset of `other` */
+	return 1;
+}
+
 void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other)
 {
 	size_t original_size = self->word_alloc;
diff --git a/ewah/ewok.h b/ewah/ewok.h
index c11d76c6f33..c334833b201 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -180,6 +180,7 @@ int bitmap_get(struct bitmap *self, size_t pos);
 void bitmap_free(struct bitmap *self);
 int bitmap_equals(struct bitmap *self, struct bitmap *other);
 int bitmap_is_subset(struct bitmap *self, struct bitmap *other);
+int ewah_bitmap_is_subset(struct ewah_bitmap *self, struct bitmap *other);
 struct ewah_bitmap * bitmap_to_ewah(struct bitmap *bitmap);
 struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah);
 

From patchwork Wed Mar 20 22:05:11 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13598216
Date: Wed, 20 Mar 2024 18:05:11 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 04/24] pack-bitmap: drop unused `max_bitmaps` parameter

The `max_bitmaps` parameter in `bitmap_writer_select_commits()` was
introduced back in 7cc8f97108 (pack-objects: implement bitmap writing,
2013-12-21), making it original to the bitmap implementation in Git
itself.

When that patch was merged via 0f9e62e084 (Merge branch
'jk/pack-bitmap', 2014-02-27), its sole caller in builtin/pack-objects.c
passed a value of "-1" for `max_bitmaps`, indicating no limit.

Since then, the only other caller (in midx.c, added via c528e17966 (pack-bitmap: write multi-pack bitmaps, 2021-08-31)) also uses a value of "-1" for `max_bitmaps`. Since no callers have needed a finite limit for the `max_bitmaps` parameter in the nearly decade that has passed since 0f9e62e084, let's remove the parameter and any dead pieces of code connected to it. Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 2 +- midx.c | 2 +- pack-bitmap-write.c | 8 +------- pack-bitmap.h | 2 +- 4 files changed, 4 insertions(+), 10 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 329aeac8043..41281cae91f 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1359,7 +1359,7 @@ static void write_pack_file(void) stop_progress(&progress_state); bitmap_writer_show_progress(progress); - bitmap_writer_select_commits(indexed_commits, indexed_commits_nr, -1); + bitmap_writer_select_commits(indexed_commits, indexed_commits_nr); if (bitmap_writer_build(&to_pack) < 0) die(_("failed to write bitmap index")); bitmap_writer_finish(written_list, nr_written, diff --git a/midx.c b/midx.c index 85e1c2cd128..366bfbe18c8 100644 --- a/midx.c +++ b/midx.c @@ -1330,7 +1330,7 @@ static int write_midx_bitmap(const char *midx_name, for (i = 0; i < pdata->nr_objects; i++) index[pack_order[i]] = &pdata->objects[i].idx; - bitmap_writer_select_commits(commits, commits_nr, -1); + bitmap_writer_select_commits(commits, commits_nr); ret = bitmap_writer_build(pdata); if (ret < 0) goto cleanup; diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index 990a9498d73..3dc2408eca7 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -591,8 +591,7 @@ static int date_compare(const void *_a, const void *_b) } void bitmap_writer_select_commits(struct commit **indexed_commits, - unsigned int indexed_commits_nr, - int max_bitmaps) + unsigned int indexed_commits_nr) { unsigned int i = 0, j, next; @@ -615,11 +614,6 @@ void bitmap_writer_select_commits(struct 
commit **indexed_commits, if (i + next >= indexed_commits_nr) break; - if (max_bitmaps > 0 && writer.selected_nr >= max_bitmaps) { - writer.selected_nr = max_bitmaps; - break; - } - if (next == 0) { chosen = indexed_commits[i]; } else { diff --git a/pack-bitmap.h b/pack-bitmap.h index c7dea13217a..3f96608d5c1 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -110,7 +110,7 @@ int rebuild_bitmap(const uint32_t *reposition, struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git, struct commit *commit); void bitmap_writer_select_commits(struct commit **indexed_commits, - unsigned int indexed_commits_nr, int max_bitmaps); + unsigned int indexed_commits_nr); int bitmap_writer_build(struct packing_data *to_pack); void bitmap_writer_finish(struct pack_idx_entry **index, uint32_t index_nr, From patchwork Wed Mar 20 22:05:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13598215 Received: from mail-qk1-f173.google.com (mail-qk1-f173.google.com [209.85.222.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C0E2485C51 for ; Wed, 20 Mar 2024 22:05:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972318; cv=none; b=gboDoTyfYiJBKY81DhM/CqiwBi0VjypR9m4P2BbMqvsijEEEHCL1hK02F7jUmo+XwStRB8Ol1XMVsvbPP2SKDgsRY1WwS5F6ATXyH/raAADtO6obfh+gRvdegEH8MVEGT9hyHcgCB3+oBZlsCmzaibBpdDvadHxSvODTnh/dP+A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972318; c=relaxed/simple; bh=RS2/o5MI15Crbt3u2N2vqi9z3bnbLmffnO6rzHUUNxs=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; 
Date: Wed, 20 Mar 2024 18:05:13 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 05/24] pack-bitmap: move some initialization to `bitmap_writer_init()`

The pack-bitmap-writer machinery uses an oidmap (backed by khash.h) to
map from commits selected for bitmaps (by OID) to a bitmapped_commit
structure (containing the bitmap itself, among other things like its XOR
offset, etc.).

This map was initialized at the start of `bitmap_writer_build()`. New
entries are added in `pack-bitmap-write.c::store_selected()`, which is
called by the bitmap_builder machinery (which is responsible for
traversing history and generating the actual bitmaps).

Reorganize when this field is initialized and when entries are added to
it so that we can quickly determine whether a commit is a candidate for
pseudo-merge selection, or not (since it was already selected to receive
a bitmap, and thus is ineligible for pseudo-merge inclusion).

The changes are as follows:

  - Introduce a new `bitmap_writer_init()` function which initializes
    the `writer.bitmaps` field (instead of waiting until
    `bitmap_writer_build()` is called).

  - Add map entries in `push_bitmapped_commit()` (which is called via
    `bitmap_writer_select_commits()`) with OID keys and NULL values to
    track whether or not we *expect* to write a bitmap for some given
    commit.

  - Validate that a NULL entry is found matching the given key when we
    store a selected bitmap.

Signed-off-by: Taylor Blau
---
 builtin/pack-objects.c |  1 +
 midx.c                 |  1 +
 pack-bitmap-write.c    | 23 ++++++++++++++++++-----
 pack-bitmap.h          |  1 +
 4 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 41281cae91f..34a431e3856 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1339,6 +1339,7 @@ static void write_pack_file(void)
 				    hash_to_hex(hash));
 
 			if (write_bitmap_index) {
+				bitmap_writer_init(the_repository);
 				bitmap_writer_set_checksum(hash);
 				bitmap_writer_build_type_index(
 					&to_pack, written_list, nr_written);
diff --git a/midx.c b/midx.c
index 366bfbe18c8..24d98120852 100644
--- a/midx.c
+++ b/midx.c
@@ -1311,6 +1311,7 @@ static int write_midx_bitmap(const char *midx_name,
 	for (i = 0; i < pdata->nr_objects; i++)
 		index[i] = &pdata->objects[i].idx;
 
+	bitmap_writer_init(the_repository);
 	bitmap_writer_show_progress(flags & MIDX_PROGRESS);
 	bitmap_writer_build_type_index(pdata, index, pdata->nr_objects);
 
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 3dc2408eca7..ad768959633 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -46,6 +46,11 @@ struct bitmap_writer {
 
 static struct bitmap_writer writer;
 
+void bitmap_writer_init(struct repository *r)
+{
+	writer.bitmaps = kh_init_oid_map();
+}
+
 void bitmap_writer_show_progress(int show)
 {
 	writer.show_progress = show;
@@ -117,11 +122,20 @@ void bitmap_writer_build_type_index(struct packing_data *to_pack,
 
 static inline void push_bitmapped_commit(struct commit *commit)
 {
+	int hash_ret;
+	khiter_t hash_pos;
+
 	if (writer.selected_nr >= writer.selected_alloc) {
 		writer.selected_alloc = (writer.selected_alloc + 32) * 2;
 		REALLOC_ARRAY(writer.selected, writer.selected_alloc);
 	}
 
+	hash_pos = kh_put_oid_map(writer.bitmaps, commit->object.oid, &hash_ret);
+	if (!hash_ret)
+		die(_("duplicate entry when writing bitmap index: %s"),
+		    oid_to_hex(&commit->object.oid));
+	kh_value(writer.bitmaps, hash_pos) = NULL;
+
 	writer.selected[writer.selected_nr].commit = commit;
 	writer.selected[writer.selected_nr].bitmap = NULL;
 	writer.selected[writer.selected_nr].flags = 0;
@@ -466,14 +480,14 @@ static void store_selected(struct bb_commit *ent, struct commit *commit)
 {
 	struct bitmapped_commit *stored = &writer.selected[ent->idx];
 	khiter_t hash_pos;
-	int hash_ret;
 
 	stored->bitmap = bitmap_to_ewah(ent->bitmap);
 
-	hash_pos = kh_put_oid_map(writer.bitmaps, commit->object.oid, &hash_ret);
-	if (hash_ret == 0)
-		die("Duplicate entry when writing index: %s",
+	hash_pos = kh_get_oid_map(writer.bitmaps, commit->object.oid);
+	if (hash_pos == kh_end(writer.bitmaps))
+		die(_("attempted to store non-selected commit: '%s'"),
 		    oid_to_hex(&commit->object.oid));
+
 	kh_value(writer.bitmaps, hash_pos) = stored;
 }
 
@@ -488,7 +502,6 @@ int bitmap_writer_build(struct packing_data *to_pack)
 	uint32_t *mapping;
 	int closed = 1; /* until proven otherwise */
 
-	writer.bitmaps = kh_init_oid_map();
 	writer.to_pack = to_pack;
 
 	if (writer.show_progress)
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 3f96608d5c1..dae2d68a338 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -97,6 +97,7 @@ int bitmap_has_oid_in_uninteresting(struct bitmap_index *, const struct object_i
 
 off_t get_disk_usage_from_bitmap(struct bitmap_index *, struct rev_info *);
 
+void bitmap_writer_init(struct repository *r);
 void bitmap_writer_show_progress(int show);
 void bitmap_writer_set_checksum(const unsigned char *sha1);
 void bitmap_writer_build_type_index(struct packing_data *to_pack,

From patchwork Wed Mar 20 22:05:17 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13598217
Date: Wed, 20 Mar 2024 18:05:17 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 06/24] pseudo-merge.ch: initial commit

Add a new (empty) header file to contain the implementation for
selecting, reading, and applying pseudo-merge bitmaps.

For now this header and its corresponding implementation are left
empty, but they will evolve over the course of subsequent commit(s).

Signed-off-by: Taylor Blau
---
 Makefile       | 1 +
 pseudo-merge.c | 2 ++
 pseudo-merge.h | 6 ++++++
 3 files changed, 9 insertions(+)
 create mode 100644 pseudo-merge.c
 create mode 100644 pseudo-merge.h

diff --git a/Makefile b/Makefile
index 4e255c81f22..fd050bd9d68 100644
--- a/Makefile
+++ b/Makefile
@@ -1114,6 +1114,7 @@ LIB_OBJS += prompt.o
 LIB_OBJS += protocol.o
 LIB_OBJS += protocol-caps.o
 LIB_OBJS += prune-packed.o
+LIB_OBJS += pseudo-merge.o
 LIB_OBJS += quote.o
 LIB_OBJS += range-diff.o
 LIB_OBJS += reachable.o
diff --git a/pseudo-merge.c b/pseudo-merge.c
new file mode 100644
index 00000000000..37e037ba272
--- /dev/null
+++ b/pseudo-merge.c
@@ -0,0 +1,2 @@
+#include "git-compat-util.h"
+#include "pseudo-merge.h"
diff --git a/pseudo-merge.h b/pseudo-merge.h
new file mode 100644
index 00000000000..cab8ff6960a
--- /dev/null
+++ b/pseudo-merge.h
@@ -0,0 +1,6 @@
+#ifndef PSEUDO_MERGE_H
+#define PSEUDO_MERGE_H
+
+#include "git-compat-util.h"
+
+#endif

From patchwork Wed Mar 20 22:05:19 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13598218
:subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=JLnqUlCtra2tsnBHw7g/af4kQDZfKe5SNCJ3sxG7LG0=; b=AJmnN4bwCwVu2b+vA9vEKhiN3B3n1Vls7hbEha1LlH26CQ8sfbjjVgUm9OEqNQCVfE +pwgM9nKpdQwcdMvL0ku57R1/+3Mbz6AG+cCWBXLcZdrodHc+qykJz+oo3Q5qt7ufvU+ depMBNxtnWUdoo59Fr5ZbPtlqMYSGz1ha4ouayGalO9gP+0MxC1CT4hr/H/G54tfJ8US 53tv54V/PpoGAfxuSof4v8+82ffb+GVOUOikX2fDNhd6OurazZMyQYvEKo8axVQb9pUi SP/Zm3MGn6xXxgxT3h0M+3XGaM1406AdwVTdBgrfoWXLAb6OU6p2xyh8f1CpA7BLO1DL h26A== X-Gm-Message-State: AOJu0YyZfjYM5ARn+zdZLzTFRp4wEcfjRZ2xfcx30SmvCqC495xeWnzq rICOXe11OKZFwd+t1TZpHECET6A9a1SmeGktW68jMmdOpAKsXYDBjcrdy6xxP2kL42qBBBt9TAS KA+g= X-Google-Smtp-Source: AGHT+IGZn20D7Sg4BMbjnwsH/3gKHGLUIGwNrXEMkMX5FOpJD9W4Eq9cggQJLHC/T2unYMFFbTTllQ== X-Received: by 2002:a05:6808:bcd:b0:3c3:7e7a:8236 with SMTP id o13-20020a0568080bcd00b003c37e7a8236mr3768207oik.18.1710972321482; Wed, 20 Mar 2024 15:05:21 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id ci10-20020a05622a260a00b00430bd60afa5sm5677713qtb.48.2024.03.20.15.05.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:05:21 -0700 (PDT) Date: Wed, 20 Mar 2024 18:05:19 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 07/24] pack-bitmap-write: support storing pseudo-merge commits Message-ID: <7acdee2b5f2eddfb143afa2982f40bb0136ccdd1.1710972293.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Prepare to write pseudo-merge bitmaps by annotating individual bitmapped commits (which are represented by the `bitmapped_commit` structure) with an extra bit indicating whether or not they are a pseudo-merge. 
In subsequent commits, pseudo-merge bitmaps will be generated by allocating a fake commit node with parents covering the full set of commits represented by the pseudo-merge bitmap. These commits will be added to the set of "selected" commits as usual, but will be written specially instead of being included with the rest of the selected commits. Mechanically speaking, there are two parts of this change: - The bitmapped_commit struct gets a new bit indicating whether it is a pseudo-merge, or an ordinary commit selected for bitmaps. - A handful of changes to only write out the non-pseudo-merge commits when enumerating through the selected array (see the new `bitmap_writer_selected_nr()` function). Pseudo-merge commits appear after all non-pseudo-merge commits, so it is safe to enumerate through the selected array like so: for (i = 0; i < bitmap_writer_selected_nr(); i++) if (writer.selected[i].pseudo_merge) BUG("unexpected pseudo-merge"); without encountering the BUG(). Signed-off-by: Taylor Blau --- pack-bitmap-write.c | 100 +++++++++++++++++++++++++++++--------------- pack-bitmap.h | 1 + 2 files changed, 67 insertions(+), 34 deletions(-) diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index ad768959633..b1e8a0ad66d 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -24,7 +24,7 @@ struct bitmapped_commit { struct ewah_bitmap *write_as; int flags; int xor_offset; - uint32_t commit_pos; + unsigned pseudo_merge : 1; }; struct bitmap_writer { @@ -39,6 +39,8 @@ struct bitmap_writer { struct bitmapped_commit *selected; unsigned int selected_nr, selected_alloc; + uint32_t pseudo_merges_nr; + struct progress *progress; int show_progress; unsigned char pack_checksum[GIT_MAX_RAWSZ]; @@ -46,6 +48,11 @@ struct bitmap_writer { static struct bitmap_writer writer; +static inline int bitmap_writer_selected_nr(void) +{ + return writer.selected_nr - writer.pseudo_merges_nr; +} + void bitmap_writer_init(struct repository *r) { writer.bitmaps = kh_init_oid_map(); @@ 
-120,25 +127,30 @@ void bitmap_writer_build_type_index(struct packing_data *to_pack, * Compute the actual bitmaps */ -static inline void push_bitmapped_commit(struct commit *commit) +static void bitmap_writer_push_bitmapped_commit(struct commit *commit, + unsigned pseudo_merge) { - int hash_ret; - khiter_t hash_pos; - if (writer.selected_nr >= writer.selected_alloc) { writer.selected_alloc = (writer.selected_alloc + 32) * 2; REALLOC_ARRAY(writer.selected, writer.selected_alloc); } - hash_pos = kh_put_oid_map(writer.bitmaps, commit->object.oid, &hash_ret); - if (!hash_ret) - die(_("duplicate entry when writing bitmap index: %s"), - oid_to_hex(&commit->object.oid)); - kh_value(writer.bitmaps, hash_pos) = NULL; + if (!pseudo_merge) { + int hash_ret; + khiter_t hash_pos = kh_put_oid_map(writer.bitmaps, + commit->object.oid, + &hash_ret); + + if (!hash_ret) + die(_("duplicate entry when writing bitmap index: %s"), + oid_to_hex(&commit->object.oid)); + kh_value(writer.bitmaps, hash_pos) = NULL; + } writer.selected[writer.selected_nr].commit = commit; writer.selected[writer.selected_nr].bitmap = NULL; writer.selected[writer.selected_nr].flags = 0; + writer.selected[writer.selected_nr].pseudo_merge = pseudo_merge; writer.selected_nr++; } @@ -168,16 +180,20 @@ static void compute_xor_offsets(void) while (next < writer.selected_nr) { struct bitmapped_commit *stored = &writer.selected[next]; - int best_offset = 0; struct ewah_bitmap *best_bitmap = stored->bitmap; struct ewah_bitmap *test_xor; + if (stored->pseudo_merge) + goto next; + for (i = 1; i <= MAX_XOR_OFFSET_SEARCH; ++i) { int curr = next - i; if (curr < 0) break; + if (writer.selected[curr].pseudo_merge) + continue; test_xor = ewah_pool_new(); ewah_xor(writer.selected[curr].bitmap, stored->bitmap, test_xor); @@ -193,6 +209,7 @@ static void compute_xor_offsets(void) } } +next: stored->xor_offset = best_offset; stored->write_as = best_bitmap; @@ -205,7 +222,8 @@ struct bb_commit { struct bitmap *commit_mask; struct 
bitmap *bitmap; unsigned selected:1, - maximal:1; + maximal:1, + pseudo_merge:1; unsigned idx; /* within selected array */ }; @@ -243,17 +261,18 @@ static void bitmap_builder_init(struct bitmap_builder *bb, revs.first_parent_only = 1; for (i = 0; i < writer->selected_nr; i++) { - struct commit *c = writer->selected[i].commit; - struct bb_commit *ent = bb_data_at(&bb->data, c); + struct bitmapped_commit *bc = &writer->selected[i]; + struct bb_commit *ent = bb_data_at(&bb->data, bc->commit); ent->selected = 1; ent->maximal = 1; + ent->pseudo_merge = bc->pseudo_merge; ent->idx = i; ent->commit_mask = bitmap_new(); bitmap_set(ent->commit_mask, i); - add_pending_object(&revs, &c->object, ""); + add_pending_object(&revs, &bc->commit->object, ""); } if (prepare_revision_walk(&revs)) @@ -430,8 +449,13 @@ static int fill_bitmap_commit(struct bb_commit *ent, struct commit *c = prio_queue_get(queue); if (old_bitmap && mapping) { - struct ewah_bitmap *old = bitmap_for_commit(old_bitmap, c); + struct ewah_bitmap *old; struct bitmap *remapped = bitmap_new(); + + if (commit->object.flags & BITMAP_PSEUDO_MERGE) + old = NULL; + else + old = bitmap_for_commit(old_bitmap, c); /* * If this commit has an old bitmap, then translate that * bitmap and add its bits to this one. No need to walk @@ -450,12 +474,14 @@ static int fill_bitmap_commit(struct bb_commit *ent, * Mark ourselves and queue our tree. The commit * walk ensures we cover all parents. 
*/ - pos = find_object_pos(&c->object.oid, &found); - if (!found) - return -1; - bitmap_set(ent->bitmap, pos); - prio_queue_put(tree_queue, - repo_get_commit_tree(the_repository, c)); + if (!(c->object.flags & BITMAP_PSEUDO_MERGE)) { + pos = find_object_pos(&c->object.oid, &found); + if (!found) + return -1; + bitmap_set(ent->bitmap, pos); + prio_queue_put(tree_queue, + repo_get_commit_tree(the_repository, c)); + } for (p = c->parents; p; p = p->next) { pos = find_object_pos(&p->item->object.oid, &found); @@ -483,6 +509,9 @@ static void store_selected(struct bb_commit *ent, struct commit *commit) stored->bitmap = bitmap_to_ewah(ent->bitmap); + if (ent->pseudo_merge) + return; + hash_pos = kh_get_oid_map(writer.bitmaps, commit->object.oid); if (hash_pos == kh_end(writer.bitmaps)) die(_("attempted to store non-selected commit: '%s'"), @@ -612,7 +641,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits, if (indexed_commits_nr < 100) { for (i = 0; i < indexed_commits_nr; ++i) - push_bitmapped_commit(indexed_commits[i]); + bitmap_writer_push_bitmapped_commit(indexed_commits[i], 0); return; } @@ -645,7 +674,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits, } } - push_bitmapped_commit(chosen); + bitmap_writer_push_bitmapped_commit(chosen, 0); i += next + 1; display_progress(writer.progress, i); @@ -683,8 +712,11 @@ static void write_selected_commits_v1(struct hashfile *f, { int i; - for (i = 0; i < writer.selected_nr; ++i) { + for (i = 0; i < bitmap_writer_selected_nr(); ++i) { struct bitmapped_commit *stored = &writer.selected[i]; + if (stored->pseudo_merge) + BUG("unexpected pseudo-merge among selected: %s", + oid_to_hex(&stored->commit->object.oid)); if (offsets) offsets[i] = hashfile_total(f); @@ -718,10 +750,10 @@ static void write_lookup_table(struct hashfile *f, uint32_t i; uint32_t *table, *table_inv; - ALLOC_ARRAY(table, writer.selected_nr); - ALLOC_ARRAY(table_inv, writer.selected_nr); + ALLOC_ARRAY(table, 
bitmap_writer_selected_nr()); + ALLOC_ARRAY(table_inv, bitmap_writer_selected_nr()); - for (i = 0; i < writer.selected_nr; i++) + for (i = 0; i < bitmap_writer_selected_nr(); i++) table[i] = i; /* @@ -729,16 +761,16 @@ static void write_lookup_table(struct hashfile *f, * bitmap corresponds to j'th bitmapped commit (among the selected * commits) in lex order of OIDs. */ - QSORT_S(table, writer.selected_nr, table_cmp, commit_positions); + QSORT_S(table, bitmap_writer_selected_nr(), table_cmp, commit_positions); /* table_inv helps us discover that relationship (i'th bitmap * to j'th commit by j = table_inv[i]) */ - for (i = 0; i < writer.selected_nr; i++) + for (i = 0; i < bitmap_writer_selected_nr(); i++) table_inv[table[i]] = i; trace2_region_enter("pack-bitmap-write", "writing_lookup_table", the_repository); - for (i = 0; i < writer.selected_nr; i++) { + for (i = 0; i < bitmap_writer_selected_nr(); i++) { struct bitmapped_commit *selected = &writer.selected[table[i]]; uint32_t xor_offset = selected->xor_offset; uint32_t xor_row; @@ -809,7 +841,7 @@ void bitmap_writer_finish(struct pack_idx_entry **index, memcpy(header.magic, BITMAP_IDX_SIGNATURE, sizeof(BITMAP_IDX_SIGNATURE)); header.version = htons(default_version); header.options = htons(flags | options); - header.entry_count = htonl(writer.selected_nr); + header.entry_count = htonl(bitmap_writer_selected_nr()); hashcpy(header.checksum, writer.pack_checksum); hashwrite(f, &header, sizeof(header) - GIT_MAX_RAWSZ + the_hash_algo->rawsz); @@ -821,9 +853,9 @@ void bitmap_writer_finish(struct pack_idx_entry **index, if (options & BITMAP_OPT_LOOKUP_TABLE) CALLOC_ARRAY(offsets, index_nr); - ALLOC_ARRAY(commit_positions, writer.selected_nr); + ALLOC_ARRAY(commit_positions, bitmap_writer_selected_nr()); - for (i = 0; i < writer.selected_nr; i++) { + for (i = 0; i < bitmap_writer_selected_nr(); i++) { struct bitmapped_commit *stored = &writer.selected[i]; int commit_pos = oid_pos(&stored->commit->object.oid, index, 
index_nr, oid_access);
diff --git a/pack-bitmap.h b/pack-bitmap.h
index dae2d68a338..ca9acd2f735 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -21,6 +21,7 @@ struct bitmap_disk_header {
 	unsigned char checksum[GIT_MAX_RAWSZ];
 };
 
+#define BITMAP_PSEUDO_MERGE (1u<<21)
 #define NEEDS_BITMAP (1u<<22)
 
 /*

From patchwork Wed Mar 20 22:05:23 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13598219
Date: Wed, 20 Mar 2024 18:05:23 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 08/24] pack-bitmap: implement `bitmap_writer_has_bitmapped_object_id()`
Message-ID: <4fdd7dda2744a938f19b76324c76196b033dc2fc.1710972293.git.me@ttaylorr.com>

Prepare to implement pseudo-merge bitmap selection by implementing a
necessary new function, `bitmap_writer_has_bitmapped_object_id()`.

This function returns whether or not the bitmap_writer selected the
given object ID for bitmapping. This will allow the pseudo-merge
machinery to reject candidates for pseudo-merges if they have already
been selected as an ordinary bitmap tip.
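A rough sketch of how a caller might use such a membership check to
bucket ref tips, mirroring the date comparisons that patch 10 of this
series makes in find_pseudo_merge_group_for_ref(). The enum, the
function name, and the plain integer timestamps are illustrative
assumptions and not part of the series; `has_bitmap` stands in for the
result of `bitmap_writer_has_bitmapped_object_id()`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical buckets; the series stores commits in two arrays instead. */
enum sketch_bucket { SKETCH_SKIP, SKETCH_STABLE, SKETCH_UNSTABLE };

/*
 * Decide where a ref tip with the given commit date would land.
 * Assumes stable_threshold <= threshold, which the series enforces
 * when loading the configuration.
 */
static enum sketch_bucket sketch_classify(uint64_t commit_date,
					  uint64_t threshold,
					  uint64_t stable_threshold,
					  int has_bitmap)
{
	assert(stable_threshold <= threshold);

	/* Old enough to be "stable": taken regardless of has_bitmap. */
	if (commit_date <= stable_threshold)
		return SKETCH_STABLE;

	/* Old enough to be a candidate, and not already a bitmap tip. */
	if (commit_date <= threshold && !has_bitmap)
		return SKETCH_UNSTABLE;

	/* Too recent, or already selected as an ordinary bitmap. */
	return SKETCH_SKIP;
}
```

Only the "unstable" bucket consults the membership check here, which
matches the branch order in patch 10 as posted.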
Signed-off-by: Taylor Blau
---
 pack-bitmap-write.c | 5 +++++
 pack-bitmap.h       | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index b1e8a0ad66d..cd528f89a76 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -123,6 +123,11 @@ void bitmap_writer_build_type_index(struct packing_data *to_pack,
 	}
 }
 
+int bitmap_writer_has_bitmapped_object_id(const struct object_id *oid)
+{
+	return kh_get_oid_map(writer.bitmaps, *oid) != kh_end(writer.bitmaps);
+}
+
 /**
  * Compute the actual bitmaps
  */
diff --git a/pack-bitmap.h b/pack-bitmap.h
index ca9acd2f735..995d664cc89 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -98,6 +98,8 @@ int bitmap_has_oid_in_uninteresting(struct bitmap_index *, const struct object_i
 
 off_t get_disk_usage_from_bitmap(struct bitmap_index *, struct rev_info *);
 
+int bitmap_writer_has_bitmapped_object_id(const struct object_id *oid);
+
 void bitmap_writer_init(struct repository *r);
 void bitmap_writer_show_progress(int show);
 void bitmap_writer_set_checksum(const unsigned char *sha1);

From patchwork Wed Mar 20 22:05:26 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13598220
Date: Wed, 20 Mar 2024 18:05:26 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 09/24] pack-bitmap: make `bitmap_writer_push_bitmapped_commit()` public

The pseudo-merge selection code will be added in a subsequent commit,
and will need a way to push the allocated commit structures into the
bitmap writer from a separate compilation unit.

Make the `bitmap_writer_push_bitmapped_commit()` function part of the
pack-bitmap.h header in order to make this possible.
Signed-off-by: Taylor Blau
---
 pack-bitmap-write.c | 4 ++--
 pack-bitmap.h       | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index cd528f89a76..e46978d494c 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -132,8 +132,8 @@ int bitmap_writer_has_bitmapped_object_id(const struct object_id *oid)
  * Compute the actual bitmaps
  */
 
-static void bitmap_writer_push_bitmapped_commit(struct commit *commit,
-						unsigned pseudo_merge)
+void bitmap_writer_push_bitmapped_commit(struct commit *commit,
+					 unsigned pseudo_merge)
 {
 	if (writer.selected_nr >= writer.selected_alloc) {
 		writer.selected_alloc = (writer.selected_alloc + 32) * 2;
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 995d664cc89..0f539d79cfd 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -99,6 +99,8 @@ int bitmap_has_oid_in_uninteresting(struct bitmap_index *, const struct object_i
 off_t get_disk_usage_from_bitmap(struct bitmap_index *, struct rev_info *);
 
 int bitmap_writer_has_bitmapped_object_id(const struct object_id *oid);
+void bitmap_writer_push_bitmapped_commit(struct commit *commit,
+					 unsigned pseudo_merge);
 
 void bitmap_writer_init(struct repository *r);
 void bitmap_writer_show_progress(int show);

From patchwork Wed Mar 20 22:05:29 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13598221
Date: Wed, 20 Mar 2024 18:05:29 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 10/24] pseudo-merge: implement support for selecting pseudo-merge commits
Message-ID: <323e1250b247f44212879c37092e1c4cbdc0b310.1710972293.git.me@ttaylorr.com>

Teach the new pseudo-merge machinery how to select non-bitmapped commits
for inclusion in different pseudo-merge group(s) based on a handful of
criteria.

Pseudo-merges are derived first from named pseudo-merge groups (see the
`bitmapPseudoMerge.<name>.*` configuration options). They are
(optionally) further segmented within an individual pseudo-merge group
based on any capture group(s) within the pseudo-merge group's pattern.

For example, a configuration like so:

    [bitmapPseudoMerge "all"]
      pattern = "refs/"
      threshold = now
      stableThreshold = never
      sampleRate = 100
      maxMerges = 64

would group all non-bitmapped commits into up to 64 individual
pseudo-merge commits.

If you wanted to separate tags from branches when generating
pseudo-merge commits, and further segment them by which fork they
originate from (using the same "refs/virtual/" scheme as in the delta
islands documentation), you would instead write something like:

    [bitmapPseudoMerge "all"]
      pattern = "refs/virtual/([0-9]+)/(heads|tags)/"
      threshold = now
      stableThreshold = never
      sampleRate = 100
      maxMerges = 64

which would generate pseudo-merge group identifiers like "1234-heads"
and "5678-tags" (for branches in fork "1234" and tags in fork "5678",
respectively).
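To make the naming concrete, here is a minimal, self-contained sketch of
how capture groups could be joined into a group identifier. It is not
the code from this series: `sketch_group_name()` and `MAX_CAPTURES` are
hypothetical names, the fixed-size output buffer is an assumption made
for brevity, and only POSIX `regcomp()`/`regexec()` are used. As in the
patch, the whole match is used when the pattern has no capture groups.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define MAX_CAPTURES 16

/*
 * Write the group identifier for `refname` into `out` (size `outsz`).
 * Returns 0 on a match; -1 if the pattern does not compile, does not
 * match, or the identifier does not fit in `out`.
 */
static int sketch_group_name(const char *pattern, const char *refname,
			     char *out, size_t outsz)
{
	regex_t re;
	regmatch_t m[MAX_CAPTURES];
	size_t j, len = 0;
	int ret = -1;

	assert(pattern && refname && out);
	if (!outsz)
		return -1;
	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;
	if (regexec(&re, refname, MAX_CAPTURES, m, 0))
		goto done;

	/* With capture groups present, skip m[0] (the whole match). */
	for (j = re.re_nsub ? 1 : 0; j < MAX_CAPTURES; j++) {
		size_t n;

		if (m[j].rm_so == -1)
			continue; /* unused or non-participating group */

		n = (size_t)(m[j].rm_eo - m[j].rm_so);
		/* Room for an optional '-', the capture, and the NUL. */
		if (len + n + 2 > outsz)
			goto done;
		if (len)
			out[len++] = '-';
		memcpy(out + len, refname + m[j].rm_so, n);
		len += n;
	}
	out[len] = '\0';
	ret = 0;
done:
	regfree(&re);
	return ret;
}
```

With the pattern from the second example (anchored with '^') and the ref
"refs/virtual/1234/heads/main", the sketch yields "1234-heads".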
Within pseudo-merge groups, there are a handful of other options used to
control the distribution of matching commits among individual
pseudo-merge commits:

  - bitmapPseudoMerge.<name>.decay
  - bitmapPseudoMerge.<name>.sampleRate
  - bitmapPseudoMerge.<name>.threshold
  - bitmapPseudoMerge.<name>.maxMerges
  - bitmapPseudoMerge.<name>.stableThreshold
  - bitmapPseudoMerge.<name>.stableSize

The decay parameter roughly corresponds to "k" in `f(n) = C*n^(-k/100)`,
where `f(n)` describes the size of the `n`-th pseudo-merge group. The
sample rate controls what percentage of eligible commits are considered
as candidates. The threshold parameter indicates the minimum age (so as
to avoid including too-recent commits in a pseudo-merge group, making it
less likely to be valid). The "maxMerges" parameter sets an upper-bound
on the number of pseudo-merge commits an individual group may generate.

The latter two "stable"-related parameters control "stable" pseudo-merge
groups, comprised of a fixed number of commits which are older than the
configured "stable threshold" value and may be grouped together in
chunks of "stableSize" in order of age.

This patch implements the aforementioned selection routine, as well as
parsing the relevant configuration options.
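As a worked illustration of the decay parameter, the following sketch
computes the size of each pseudo-merge under a power-law split. It is an
approximation for discussion, not the function from this patch: the
names `int_pow()` and `sketch_group_size()` are hypothetical, and it
assumes an integer `k` and the `f(n) = C / n^k` form used in the code
comment of pseudo_merge_group_size() (the message above writes the
exponent as -k/100 instead).

```c
#include <assert.h>
#include <stdint.h>

/* Integer power by repeated squaring. */
static double int_pow(double base, unsigned exp)
{
	double result = 1.0;

	while (exp) {
		if (exp & 1)
			result *= base;
		base *= base;
		exp >>= 1;
	}
	return result;
}

/*
 * Size of the i-th (zero-based) pseudo-merge when `total` commits are
 * split across `max_merges` pseudo-merges with decay rate `k`:
 *
 *   f(n) = C / n^k,  where  C = total / sum_{n=1}^{max_merges} 1/n^k
 */
static uint32_t sketch_group_size(uint32_t total, uint32_t max_merges,
				  unsigned k, uint32_t i)
{
	double sum = 0.0;
	uint32_t n;

	assert(max_merges > 0 && i < max_merges);

	for (n = 1; n <= max_merges; n++)
		sum += 1.0 / int_pow(n, k);

	/* Round to the nearest whole commit. */
	return (uint32_t)((total / sum) / int_pow(i + 1, k) + 0.5);
}
```

For 10,000 commits, six pseudo-merges, and k = 1, this gives sizes of
4082, 2041, 1361, 1020, 816, and 680; with k = 0 every pseudo-merge gets
an equal share.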
Signed-off-by: Taylor Blau --- pseudo-merge.c | 441 +++++++++++++++++++++++++++++++++++++++++++++++++ pseudo-merge.h | 96 +++++++++++ 2 files changed, 537 insertions(+) diff --git a/pseudo-merge.c b/pseudo-merge.c index 37e037ba272..caccef942a1 100644 --- a/pseudo-merge.c +++ b/pseudo-merge.c @@ -1,2 +1,443 @@ #include "git-compat-util.h" #include "pseudo-merge.h" +#include "date.h" +#include "oid-array.h" +#include "strbuf.h" +#include "config.h" +#include "string-list.h" +#include "refs.h" +#include "pack-bitmap.h" +#include "commit.h" +#include "alloc.h" +#include "progress.h" + +#define DEFAULT_PSEUDO_MERGE_DECAY 1.0f +#define DEFAULT_PSEUDO_MERGE_MAX_MERGES 64 +#define DEFAULT_PSEUDO_MERGE_SAMPLE_RATE 100 +#define DEFAULT_PSEUDO_MERGE_THRESHOLD approxidate("1.week.ago") +#define DEFAULT_PSEUDO_MERGE_STABLE_THRESHOLD approxidate("1.month.ago") +#define DEFAULT_PSEUDO_MERGE_STABLE_SIZE 512 + +static float gitexp(float base, int exp) +{ + float result = 1; + while (1) { + if (exp % 2) + result *= base; + exp >>= 1; + if (!exp) + break; + base *= base; + } + return result; +} + +static uint32_t pseudo_merge_group_size(const struct pseudo_merge_group *group, + const struct pseudo_merge_matches *matches, + uint32_t i) +{ + float C = 0.0f; + uint32_t n; + + /* + * The size of pseudo-merge groups decays according to a power series, + * which looks like: + * + * f(n) = C * n^-k + * + * , where 'n' is the n-th pseudo-merge group, 'f(n)' is its size, 'k' + * is the decay rate, and 'C' is a scaling value. + * + * The value of C depends on the number of groups, decay rate, and total + * number of commits. It is computed such that if there are M and N + * total groups and commits, respectively, that: + * + * N = f(0) + f(1) + ... 
f(M-1) + * + * Rearranging to isolate C, we get: + * + * N = \sum_{n=1}^M C / n^k + * + * N / C = \sum_{n=1}^M n^-k + * + * C = N / \sum_{n=1}^M n^-k + * + * For example, if we have a decay rate of 'k' being equal to 1.5, 'N' + * total commits equal to 10,000, and 'M' being equal to 6 groups, then + * the (rounded) group sizes are: + * + * { 5469, 1934, 1053, 684, 489, 372 } + * + * increasing the number of total groups, say to 10, scales the group + * sizes appropriately: + * + * { 5012, 1772, 964, 626, 448, 341, 271, 221, 186, 158 } + */ + for (n = 0; n < group->max_merges; n++) + C += 1.0f / gitexp(n + 1, group->decay); + C = matches->unstable_nr / C; + + return (int)((C / gitexp(i + 1, group->decay)) + 0.5); +} + +static void init_pseudo_merge_group(struct pseudo_merge_group *group) +{ + memset(group, 0, sizeof(struct pseudo_merge_group)); + + strmap_init_with_options(&group->matches, NULL, 0); + + group->decay = DEFAULT_PSEUDO_MERGE_DECAY; + group->max_merges = DEFAULT_PSEUDO_MERGE_MAX_MERGES; + group->sample_rate = DEFAULT_PSEUDO_MERGE_SAMPLE_RATE; + group->threshold = DEFAULT_PSEUDO_MERGE_THRESHOLD; + group->stable_threshold = DEFAULT_PSEUDO_MERGE_STABLE_THRESHOLD; + group->stable_size = DEFAULT_PSEUDO_MERGE_STABLE_SIZE; +} + +static int pseudo_merge_config(const char *var, const char *value, + const struct config_context *ctx, + void *cb_data) +{ + struct string_list *list = cb_data; + struct string_list_item *item; + struct pseudo_merge_group *group; + struct strbuf buf = STRBUF_INIT; + const char *sub, *key; + size_t sub_len; + + if (parse_config_key(var, "bitmappseudomerge", &sub, &sub_len, &key)) + return 0; + + if (!sub_len) + return 0; + + strbuf_add(&buf, sub, sub_len); + + item = string_list_lookup(list, buf.buf); + if (!item) { + item = string_list_insert(list, buf.buf); + + item->util = xmalloc(sizeof(struct pseudo_merge_group)); + init_pseudo_merge_group(item->util); + } + + group = item->util; + + if (!strcmp(key, "pattern")) { + struct strbuf 
re = STRBUF_INIT; + + free(group->pattern); + if (*value != '^') + strbuf_addch(&re, '^'); + strbuf_addstr(&re, value); + + group->pattern = xcalloc(1, sizeof(regex_t)); + if (regcomp(group->pattern, re.buf, REG_EXTENDED)) + die(_("failed to load pseudo-merge regex for %s: '%s'"), + sub, re.buf); + + strbuf_release(&re); + } else if (!strcmp(key, "decay")) { + group->decay = git_config_int(var, value, ctx->kvi); + if (group->decay < 0) { + warning(_("%s must be non-negative, using default"), var); + group->decay = DEFAULT_PSEUDO_MERGE_DECAY; + } + } else if (!strcmp(key, "samplerate")) { + group->sample_rate = git_config_int(var, value, ctx->kvi); + if (!(0 <= group->sample_rate && group->sample_rate <= 100)) { + warning(_("%s must be between 0 and 100, using default"), var); + group->sample_rate = DEFAULT_PSEUDO_MERGE_SAMPLE_RATE; + } + } else if (!strcmp(key, "threshold")) { + if (git_config_expiry_date(&group->threshold, var, value)) { + strbuf_release(&buf); + return -1; + } + } else if (!strcmp(key, "maxmerges")) { + group->max_merges = git_config_int(var, value, ctx->kvi); + if (group->max_merges < 0) { + warning(_("%s must be non-negative, using default"), var); + group->max_merges = DEFAULT_PSEUDO_MERGE_MAX_MERGES; + } + } else if (!strcmp(key, "stablethreshold")) { + if (git_config_expiry_date(&group->stable_threshold, var, value)) { + strbuf_release(&buf); + return -1; + } + } else if (!strcmp(key, "stablesize")) { + group->stable_size = git_config_int(var, value, ctx->kvi); + if (group->stable_size <= 0) { + warning(_("%s must be positive, using default"), var); + group->stable_size = DEFAULT_PSEUDO_MERGE_STABLE_SIZE; + } + } + + strbuf_release(&buf); + + return 0; +} + +void load_pseudo_merges_from_config(struct string_list *list) +{ + struct string_list_item *item; + + git_config(pseudo_merge_config, list); + + for_each_string_list_item(item, list) { + struct pseudo_merge_group *group = item->util; + if (!group->pattern) + die(_("pseudo-merge group 
'%s' missing required pattern"), + item->string); + if (group->threshold < group->stable_threshold) + die(_("pseudo-merge group '%s' has unstable threshold " + "before stable one"), item->string); + } +} + +static int find_pseudo_merge_group_for_ref(const char *refname, + const struct object_id *oid, + int flags UNUSED, + void *_data) +{ + struct string_list *list = _data; + struct object_id peeled; + struct commit *c; + uint32_t i; + int has_bitmap; + + if (!peel_iterated_oid(oid, &peeled)) + oid = &peeled; + + c = lookup_commit(the_repository, oid); + if (!c) + return 0; + + has_bitmap = bitmap_writer_has_bitmapped_object_id(oid); + + for (i = 0; i < list->nr; i++) { + struct pseudo_merge_group *group; + struct pseudo_merge_matches *matches; + struct strbuf group_name = STRBUF_INIT; + regmatch_t captures[16]; + size_t j; + + group = list->items[i].util; + if (regexec(group->pattern, refname, ARRAY_SIZE(captures), + captures, 0)) + continue; + + if (captures[ARRAY_SIZE(captures) - 1].rm_so != -1) + warning(_("pseudo-merge regex from config has too many capture " + "groups (max=%"PRIuMAX")"), + (uintmax_t)ARRAY_SIZE(captures) - 2); + + for (j = !!group->pattern->re_nsub; j < ARRAY_SIZE(captures); j++) { + regmatch_t *match = &captures[j]; + if (match->rm_so == -1) + continue; + + if (group_name.len) + strbuf_addch(&group_name, '-'); + + strbuf_add(&group_name, refname + match->rm_so, + match->rm_eo - match->rm_so); + } + + matches = strmap_get(&group->matches, group_name.buf); + if (!matches) { + matches = xcalloc(1, sizeof(*matches)); + strmap_put(&group->matches, strbuf_detach(&group_name, NULL), + matches); + } + + if (c->date <= group->stable_threshold) { + ALLOC_GROW(matches->stable, matches->stable_nr + 1, + matches->stable_alloc); + matches->stable[matches->stable_nr++] = c; + } else if (c->date <= group->threshold && !has_bitmap) { + ALLOC_GROW(matches->unstable, matches->unstable_nr + 1, + matches->unstable_alloc); + 
matches->unstable[matches->unstable_nr++] = c; + } + + strbuf_release(&group_name); + } + + return 0; +} + +static struct commit *push_pseudo_merge(struct pseudo_merge_group *group) +{ + struct commit *merge; + + ALLOC_GROW(group->merges, group->merges_nr + 1, group->merges_alloc); + + merge = alloc_commit_node(the_repository); + merge->object.parsed = 1; + merge->object.flags |= BITMAP_PSEUDO_MERGE; + + group->merges[group->merges_nr++] = merge; + + return merge; +} + +static struct pseudo_merge_commit_idx *pseudo_merge_idx(kh_oid_map_t *pseudo_merge_commits, + const struct object_id *oid) + +{ + struct pseudo_merge_commit_idx *pmc; + khiter_t hash_pos; + + hash_pos = kh_get_oid_map(pseudo_merge_commits, *oid); + if (hash_pos == kh_end(pseudo_merge_commits)) { + int hash_ret; + hash_pos = kh_put_oid_map(pseudo_merge_commits, *oid, &hash_ret); + + CALLOC_ARRAY(pmc, 1); + + kh_value(pseudo_merge_commits, hash_pos) = pmc; + } else { + pmc = kh_value(pseudo_merge_commits, hash_pos); + } + + return pmc; +} + +#define MIN_PSEUDO_MERGE_SIZE 8 + +static void select_pseudo_merges_1(struct pseudo_merge_group *group, + struct pseudo_merge_matches *matches, + kh_oid_map_t *pseudo_merge_commits, + uint32_t *pseudo_merges_nr) +{ + uint32_t i, j; + uint32_t stable_merges_nr; + + if (!matches->stable_nr && !matches->unstable_nr) + return; /* all tips in this group already have bitmaps */ + + stable_merges_nr = matches->stable_nr / group->stable_size; + if (matches->stable_nr % group->stable_size) + stable_merges_nr++; + + /* make stable_merges_nr pseudo merges for stable commits */ + for (i = 0, j = 0; i < stable_merges_nr; i++) { + struct commit *merge; + struct commit_list **p; + + merge = push_pseudo_merge(group); + p = &merge->parents; + + do { + struct commit *c; + struct pseudo_merge_commit_idx *pmc; + + if (j >= matches->stable_nr) + break; + + c = matches->stable[j++]; + pmc = pseudo_merge_idx(pseudo_merge_commits, + &c->object.oid); + + ALLOC_GROW(pmc->pseudo_merge, 
pmc->nr + 1, pmc->alloc); + + pmc->pseudo_merge[pmc->nr++] = *pseudo_merges_nr; + p = commit_list_append(c, p); + } while (j % group->stable_size); + + bitmap_writer_push_bitmapped_commit(merge, 1); + (*pseudo_merges_nr)++; + } + + /* make up to group->max_merges pseudo merges for unstable commits */ + for (i = 0, j = 0; i < group->max_merges; i++) { + struct commit *merge; + struct commit_list **p; + uint32_t size, end; + + merge = push_pseudo_merge(group); + p = &merge->parents; + + size = pseudo_merge_group_size(group, matches, i); + end = size < MIN_PSEUDO_MERGE_SIZE ? matches->unstable_nr : j + size; + + for (; j < end && j < matches->unstable_nr; j++) { + struct commit *c = matches->unstable[j]; + struct pseudo_merge_commit_idx *pmc; + + if (j % (100 / group->sample_rate)) + continue; + + pmc = pseudo_merge_idx(pseudo_merge_commits, + &c->object.oid); + + ALLOC_GROW(pmc->pseudo_merge, pmc->nr + 1, pmc->alloc); + + pmc->pseudo_merge[pmc->nr++] = *pseudo_merges_nr; + p = commit_list_append(c, p); + } + + bitmap_writer_push_bitmapped_commit(merge, 1); + (*pseudo_merges_nr)++; + if (end >= matches->unstable_nr) + break; + } +} + +static int commit_date_cmp(const void *va, const void *vb) +{ + timestamp_t a = (*(const struct commit **)va)->date; + timestamp_t b = (*(const struct commit **)vb)->date; + + if (a < b) + return -1; + else if (a > b) + return 1; + return 0; +} + +static void sort_pseudo_merge_matches(struct pseudo_merge_matches *matches) +{ + QSORT(matches->stable, matches->stable_nr, commit_date_cmp); + QSORT(matches->unstable, matches->unstable_nr, commit_date_cmp); +} + +void select_pseudo_merges(struct string_list *list, + struct commit **commits, size_t commits_nr, + kh_oid_map_t *pseudo_merge_commits, + uint32_t *pseudo_merges_nr, + unsigned show_progress) +{ + struct progress *progress = NULL; + uint32_t i; + + if (!list->nr) + return; + + if (show_progress) + progress = start_progress("Selecting pseudo-merge commits", list->nr); + + 
for_each_ref(find_pseudo_merge_group_for_ref, list);
+
+	for (i = 0; i < list->nr; i++) {
+		struct pseudo_merge_group *group;
+		struct hashmap_iter iter;
+		struct strmap_entry *e;
+
+		group = list->items[i].util;
+		strmap_for_each_entry(&group->matches, &iter, e) {
+			struct pseudo_merge_matches *matches = e->value;
+
+			sort_pseudo_merge_matches(matches);
+
+			select_pseudo_merges_1(group, matches,
+					       pseudo_merge_commits,
+					       pseudo_merges_nr);
+		}
+
+		display_progress(progress, i + 1);
+	}
+
+	stop_progress(&progress);
+}
diff --git a/pseudo-merge.h b/pseudo-merge.h
index cab8ff6960a..81888731864 100644
--- a/pseudo-merge.h
+++ b/pseudo-merge.h
@@ -2,5 +2,101 @@
 #define PSEUDO_MERGE_H
 
 #include "git-compat-util.h"
+#include "strmap.h"
+#include "khash.h"
+#include "ewah/ewok.h"
+
+struct commit;
+struct string_list;
+struct bitmap_index;
+
+/*
+ * A pseudo-merge group tracks the set of non-bitmapped reference tips
+ * that match the given pattern.
+ *
+ * Within those matches, they are further segmented by
+ * separating consecutive capture groups with '-' dash
+ * characters.
+ *
+ * Those groups are then ordered by committer date and partitioned
+ * into individual pseudo-merge(s) according to the decay, max_merges,
+ * sample_rate, and threshold parameters.
+ */
+struct pseudo_merge_group {
+	regex_t *pattern;
+
+	/* capture group(s) -> struct pseudo_merge_matches */
+	struct strmap matches;
+
+	/*
+	 * The individual pseudo-merge(s) that are generated from the
+	 * above array of matches, partitioned according to the below
+	 * parameters.
+	 */
+	struct commit **merges;
+	size_t merges_nr;
+	size_t merges_alloc;
+
+	/*
+	 * Pseudo-merge grouping parameters. See git-config(1) for
+	 * more information.
+ */ + float decay; + int max_merges; + int sample_rate; + int stable_size; + timestamp_t threshold; + timestamp_t stable_threshold; +}; + +struct pseudo_merge_matches { + struct commit **stable; + struct commit **unstable; + size_t stable_nr, stable_alloc; + size_t unstable_nr, unstable_alloc; +}; + +/* + * Reads the repository's configuration: + * + * - bitmapPseudoMerge.<name>.pattern + * - bitmapPseudoMerge.<name>.decay + * - bitmapPseudoMerge.<name>.sampleRate + * - bitmapPseudoMerge.<name>.threshold + * - bitmapPseudoMerge.<name>.maxMerges + * - bitmapPseudoMerge.<name>.stableThreshold + * - bitmapPseudoMerge.<name>.stableSize + * + * and populates the given `list` with pseudo-merge groups. String + * entry keys are the pseudo-merge group names, and the values are + * pointers to the pseudo_merge_group structure itself. + */ +void load_pseudo_merges_from_config(struct string_list *list); + +/* + * A pseudo-merge commit index (pseudo_merge_commit_idx) maps a + * particular (non-pseudo-merge) commit to the list of pseudo-merge(s) + * it appears in. + */ +struct pseudo_merge_commit_idx { + uint32_t *pseudo_merge; + size_t nr, alloc; +}; + +/* + * Selects pseudo-merges from a list of commits, populating the given + * string_list of pseudo-merge groups. + * + * Populates the pseudo_merge_commits map with a commit_idx + * corresponding to each commit in the list. Counts the total number + * of pseudo-merges generated. + * + * Optionally shows a progress meter. 
+ */ +void select_pseudo_merges(struct string_list *list, + struct commit **commits, size_t commits_nr, + kh_oid_map_t *pseudo_merge_commits, + uint32_t *pseudo_merges_nr, + unsigned show_progress); #endif From patchwork Wed Mar 20 22:05:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13598222
Date: Wed, 20 Mar 2024 18:05:32 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 11/24] pack-bitmap-write.c: select pseudo-merge commits

Now that the pseudo-merge machinery has learned how to select non-bitmapped commits and assign them into different pseudo-merge group(s), invoke this new API from within the pack-bitmap internals and store the results off.

Note that the selected pseudo-merge commits aren't actually used or written anywhere yet. This will be done in the following commit.
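As a rough illustration of what "storing the results off" amounts to, the sketch below mimics the per-commit list that `struct pseudo_merge_commit_idx` keeps (the indexes of every pseudo-merge a commit was placed in). The names `commit_idx`, `commit_idx_append()` and `commit_idx_release()` are hypothetical, and the doubling growth policy is an assumption standing in for Git's `ALLOC_GROW()` macro; this is not the code from the series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for `struct pseudo_merge_commit_idx`. */
struct commit_idx {
	uint32_t *pseudo_merge; /* indexes of pseudo-merges containing the commit */
	size_t nr, alloc;
};

/*
 * Record that the commit belongs to pseudo-merge number `merge_id`.
 * Assumption: grow by doubling, roughly what ALLOC_GROW() would do.
 */
static void commit_idx_append(struct commit_idx *idx, uint32_t merge_id)
{
	assert(idx != NULL);
	if (idx->nr + 1 > idx->alloc) {
		size_t alloc = idx->alloc ? idx->alloc * 2 : 4;
		uint32_t *p = realloc(idx->pseudo_merge, alloc * sizeof(uint32_t));
		if (!p)
			abort(); /* a real implementation would report the error */
		idx->pseudo_merge = p;
		idx->alloc = alloc;
	}
	idx->pseudo_merge[idx->nr++] = merge_id;
}

/* Free the list and reset the structure to its empty state. */
static void commit_idx_release(struct commit_idx *idx)
{
	free(idx->pseudo_merge);
	idx->pseudo_merge = NULL;
	idx->nr = idx->alloc = 0;
}
```

A commit that lands in several pseudo-merges simply accumulates several entries, which is the case the later table-writing patch has to handle with an extended lookup table.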
Signed-off-by: Taylor Blau --- Documentation/config.txt | 2 + Documentation/config/bitmap-pseudo-merge.txt | 75 ++++++++++++++++++++ Documentation/technical/bitmap-format.txt | 26 +++++++ pack-bitmap-write.c | 14 ++++ 4 files changed, 117 insertions(+) create mode 100644 Documentation/config/bitmap-pseudo-merge.txt diff --git a/Documentation/config.txt b/Documentation/config.txt index 782c2bab906..e5a7170c9e0 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -381,6 +381,8 @@ include::config/apply.txt[] include::config/attr.txt[] +include::config/bitmap-pseudo-merge.txt[] + include::config/blame.txt[] include::config/branch.txt[] diff --git a/Documentation/config/bitmap-pseudo-merge.txt b/Documentation/config/bitmap-pseudo-merge.txt new file mode 100644 index 00000000000..90b72522046 --- /dev/null +++ b/Documentation/config/bitmap-pseudo-merge.txt @@ -0,0 +1,75 @@ +bitmapPseudoMerge.<name>.pattern:: + Regular expression used to match reference names. Commits + pointed to by references matching this pattern (and meeting + the below criteria, like `bitmapPseudoMerge.<name>.sampleRate` + and `bitmapPseudoMerge.<name>.threshold`) will be considered + for inclusion in a pseudo-merge bitmap. ++ +Commits are grouped into pseudo-merge groups based on whether or not +any reference(s) that point at a given commit match the pattern, which +is an extended regular expression. ++ +Within a pseudo-merge group, commits may be further grouped into +sub-groups based on the capture groups in the pattern. These +sub-groupings are formed from the regular expressions by concatenating +any capture groups from the regular expression, with a '-' dash in +between. ++ +For example, if the pattern is `refs/tags/`, then all tags (provided +they meet the below criteria) will be considered candidates for the +same pseudo-merge group. 
However, if the pattern is instead +`refs/remotes/([0-9]+)/tags/`, then tags from different remotes will +be grouped into separate pseudo-merge groups, based on the remote +number. + +bitmapPseudoMerge.<name>.decay:: + Determines the rate at which consecutive pseudo-merge bitmap + groups decrease in size. Must be non-negative. This parameter + can be thought of as `k` in the function `f(n) = C * + n^(-k/100)`, where `f(n)` is the size of the `n`th group. ++ +Setting the decay rate equal to `0` will cause all groups to be the +same size. Setting the decay rate equal to `100` will cause the `n`th +group to be `1/n` the size of the initial group. Higher values of the +decay rate cause consecutive groups to shrink at an increasing rate. +The default is `100`. + +bitmapPseudoMerge.<name>.sampleRate:: + Determines the proportion of non-bitmapped commits (among + reference tips) which are selected for inclusion in an + unstable pseudo-merge bitmap. Must be between `0` and `100` + (inclusive). The default is `100`. + +bitmapPseudoMerge.<name>.threshold:: + Determines the minimum age of non-bitmapped commits (among + reference tips, as above) which are candidates for inclusion + in an unstable pseudo-merge bitmap. The default is + `1.week.ago`. + +bitmapPseudoMerge.<name>.maxMerges:: + Determines the maximum number of pseudo-merge commits among + which commits may be distributed. ++ +For pseudo-merge groups whose pattern does not contain any capture +groups, this setting is applied for all commits matching the regular +expression. For patterns that have one or more capture groups, this +setting is applied for each distinct capture group. ++ +For example, if your pattern is `refs/tags/`, then this setting +will distribute all tags into a maximum of `maxMerges` pseudo-merge +commits. However, if your pattern is, say, +`refs/remotes/([0-9]+)/tags/`, then this setting will be applied to +each remote's set of tags individually. ++ +Must be non-negative. The default value is 64. 
+ +bitmapPseudoMerge.<name>.stableThreshold:: + Determines the minimum age of commits (among reference tips, + as above, however stable commits are still considered + candidates even when they have been covered by a bitmap) which + are candidates for a stable pseudo-merge bitmap. The default + is `1.month.ago`. + +bitmapPseudoMerge.<name>.stableSize:: + Determines the size (in number of commits) of a stable + pseudo-merge bitmap. The default is `512`. diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt index 63a7177ac08..ed7edf98034 100644 --- a/Documentation/technical/bitmap-format.txt +++ b/Documentation/technical/bitmap-format.txt @@ -434,3 +434,29 @@ the end of a `.bitmap` file. The format is as follows: * An 8-byte unsigned value (in network byte-order) equal to the number of bytes in the pseudo-merge section (including this field). + +=== Pseudo-merge selection + +Pseudo-merge commits are selected among non-bitmapped commits at the +tip of one or more reference(s). In addition, there are a handful of +constraints to further refine this selection: + +`bitmapPseudoMerge.<name>.decay`:: Defines the "decay rate", which +corresponds to how quickly (or not) consecutive pseudo-merges decrease +in size relative to one another. + +`bitmapPseudoMerge.<name>.maxMerges`:: Defines the maximum number of +pseudo-merge groups. + +`bitmapPseudoMerge.<name>.sampleRate`:: Defines the percentage of commits +(matching the above criteria) which are selected. + +`bitmapPseudoMerge.<name>.threshold`:: Defines the minimum age of a commit +in order to be considered for inclusion within one or more pseudo-merge +bitmaps. + +The size of consecutive pseudo-merge groups decays according to a +power-law decay function, where the size of the `n`-th group is `f(n) = +C*n^-k`. The value of `C` is chosen to match the number of +desired groups, and `k` is 1/100th of the value of +`bitmapPseudoMerge.<name>.decay`. 
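To make the power-law sizing above concrete, here is a small worked case. It rests on one assumption that the text only hints at: that `C` is picked so the `M` group sizes add up to the `N` candidate commits. The actual code may round differently, and the patch also special-cases sizes below `MIN_PSEUDO_MERGE_SIZE`, so treat the numbers as illustrative only.

```latex
% Assumption: C normalizes the M group sizes so that they sum to N.
f(n) = C\,n^{-k}, \qquad C = \frac{N}{\sum_{m=1}^{M} m^{-k}}

% Worked case: decay = 100 (so k = 1), N = 1000 commits, M = 4 groups.
\sum_{m=1}^{4} \frac{1}{m} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} = \frac{25}{12},
\qquad C = 1000 \cdot \frac{12}{25} = 480

f(1) = 480, \quad f(2) = 240, \quad f(3) = 160, \quad f(4) = 120,
\qquad 480 + 240 + 160 + 120 = 1000
```

With a decay of `100`, the second group is half the size of the first and the fourth is a quarter of it, matching the "`1/n` the size of the initial group" description. A decay of `0` would give four groups of 250 each.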
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index e46978d494c..db1c38f4e46 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -17,6 +17,7 @@ #include "trace2.h" #include "tree.h" #include "tree-walk.h" +#include "pseudo-merge.h" struct bitmapped_commit { struct commit *commit; @@ -39,6 +40,8 @@ struct bitmap_writer { struct bitmapped_commit *selected; unsigned int selected_nr, selected_alloc; + struct string_list pseudo_merge_groups; + kh_oid_map_t *pseudo_merge_commits; /* oid -> pseudo merge(s) */ uint32_t pseudo_merges_nr; struct progress *progress; @@ -56,6 +59,11 @@ static inline int bitmap_writer_selected_nr(void) void bitmap_writer_init(struct repository *r) { writer.bitmaps = kh_init_oid_map(); + writer.pseudo_merge_commits = kh_init_oid_map(); + + string_list_init_dup(&writer.pseudo_merge_groups); + + load_pseudo_merges_from_config(&writer.pseudo_merge_groups); } void bitmap_writer_show_progress(int show) @@ -686,6 +694,12 @@ void bitmap_writer_select_commits(struct commit **indexed_commits, } stop_progress(&writer.progress); + + select_pseudo_merges(&writer.pseudo_merge_groups, + indexed_commits, indexed_commits_nr, + writer.pseudo_merge_commits, + &writer.pseudo_merges_nr, + writer.show_progress); } From patchwork Wed Mar 20 22:05:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13598223
Date: Wed, 20 Mar 2024 18:05:35 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 12/24] pack-bitmap-write.c: write pseudo-merge table
Message-ID: <4c594f3faa875a6f54a801daf4250e2f8750a87c.1710972293.git.me@ttaylorr.com>

Now that the pack-bitmap writer machinery understands how to select and store pseudo-merge commits, teach it how to write the new optional pseudo-merge .bitmap extension.

No readers yet exist for this new extension to the .bitmap format. The following commits will take any preparatory step(s) necessary before then implementing the routines necessary to read this new table.

In the meantime, the new `write_pseudo_merges()` function implements writing this new format as described by a previous commit in Documentation/technical/bitmap-format.txt.
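For readers who want a feel for the on-disk shape before reading the diff, the following sketch encodes a single lookup-table entry the way `write_pseudo_merges()` appears to: a 4-byte big-endian object position followed by an 8-byte big-endian offset, with the most-significant bit of the offset set when it points at an extended table rather than at a pseudo-merge. The helper names (`put_be32()`, `put_be64()`, `encode_lookup_entry()`, `EXTENDED_FLAG`) are hypothetical, and unlike the patch this sketch rejects an offset with the top bit already set in both cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for the "this offset is an extended table" marker. */
#define EXTENDED_FLAG ((uint64_t)1 << 63)

/* Store `v` at `out` in network (big-endian) byte order. */
static void put_be32(unsigned char *out, uint32_t v)
{
	int i;
	for (i = 0; i < 4; i++)
		out[i] = (unsigned char)(v >> (24 - 8 * i));
}

static void put_be64(unsigned char *out, uint64_t v)
{
	int i;
	for (i = 0; i < 8; i++)
		out[i] = (unsigned char)(v >> (56 - 8 * i));
}

/*
 * Encode one 12-byte entry. `ofs` is the pseudo-merge's own offset when
 * the commit belongs to exactly one pseudo-merge, or the offset of the
 * commit's extended table when it belongs to several.
 * Returns 0 on success, -1 if `ofs` already occupies the flag bit.
 */
static int encode_lookup_entry(unsigned char out[12], uint32_t commit_pos,
			       uint64_t ofs, size_t nr_merges)
{
	assert(nr_merges >= 1);
	if (ofs & EXTENDED_FLAG)
		return -1;
	if (nr_merges > 1)
		ofs |= EXTENDED_FLAG;
	put_be32(out, commit_pos);
	put_be64(out + 4, ofs);
	return 0;
}
```

A reader can then tell the two cases apart by testing the top bit of the 8-byte field and masking it off to recover the offset.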
Writing this table is fairly straightforward and consists of a few sub-components: - a pair of bitmaps for each pseudo-merge (one for the pseudo-merge "parents", and another for the objects reachable from those parents) - for each commit, the offset of either (a) the pseudo-merge it belongs to, or (b) an extended lookup table if it belongs to >1 pseudo-merge groups - if there are any commits belonging to >1 pseudo-merge group, the extended lookup tables (which each consist of the number of pseudo-merge groups a commit appears in, and then that many 4-byte unsigned ) Signed-off-by: Taylor Blau --- pack-bitmap-write.c | 128 ++++++++++++++++++++++++++++++++++++++++++++ pack-bitmap.h | 1 + 2 files changed, 129 insertions(+) diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index db1c38f4e46..2d1b202fcd9 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -18,6 +18,7 @@ #include "tree.h" #include "tree-walk.h" #include "pseudo-merge.h" +#include "oid-array.h" struct bitmapped_commit { struct commit *commit; @@ -748,6 +749,127 @@ static void write_selected_commits_v1(struct hashfile *f, } } +static void write_pseudo_merges(struct hashfile *f) +{ + struct oid_array commits = OID_ARRAY_INIT; + struct bitmap **commits_bitmap = NULL; + off_t *pseudo_merge_ofs = NULL; + off_t start, table_start, next_ext; + + uint32_t base = bitmap_writer_selected_nr(); + size_t i, j = 0; + + CALLOC_ARRAY(commits_bitmap, writer.pseudo_merges_nr); + CALLOC_ARRAY(pseudo_merge_ofs, writer.pseudo_merges_nr); + + for (i = 0; i < writer.pseudo_merges_nr; i++) { + struct bitmapped_commit *merge = &writer.selected[base + i]; + struct commit_list *p; + + if (!merge->pseudo_merge) + BUG("found non-pseudo merge commit at %"PRIuMAX, (uintmax_t)i); + + commits_bitmap[i] = bitmap_new(); + + for (p = merge->commit->parents; p; p = p->next) + bitmap_set(commits_bitmap[i], + find_object_pos(&p->item->object.oid, NULL)); + } + + start = hashfile_total(f); + + for (i = 0; i < 
writer.pseudo_merges_nr; i++) { + struct ewah_bitmap *commits_ewah = bitmap_to_ewah(commits_bitmap[i]); + + pseudo_merge_ofs[i] = hashfile_total(f); + + dump_bitmap(f, commits_ewah); + dump_bitmap(f, writer.selected[base+i].write_as); + + ewah_free(commits_ewah); + } + + next_ext = st_add(hashfile_total(f), + st_mult(kh_size(writer.pseudo_merge_commits), + sizeof(uint64_t))); + + table_start = hashfile_total(f); + + commits.alloc = kh_size(writer.pseudo_merge_commits); + CALLOC_ARRAY(commits.oid, commits.alloc); + + for (i = kh_begin(writer.pseudo_merge_commits); i != kh_end(writer.pseudo_merge_commits); i++) { + if (!kh_exist(writer.pseudo_merge_commits, i)) + continue; + oid_array_append(&commits, &kh_key(writer.pseudo_merge_commits, i)); + } + + oid_array_sort(&commits); + + /* write lookup table (non-extended) */ + for (i = 0; i < commits.nr; i++) { + int hash_pos; + struct pseudo_merge_commit_idx *c; + + hash_pos = kh_get_oid_map(writer.pseudo_merge_commits, + commits.oid[i]); + if (hash_pos == kh_end(writer.pseudo_merge_commits)) + BUG("could not find pseudo-merge commit %s", + oid_to_hex(&commits.oid[i])); + + c = kh_value(writer.pseudo_merge_commits, hash_pos); + + hashwrite_be32(f, find_object_pos(&commits.oid[i], NULL)); + if (c->nr == 1) + hashwrite_be64(f, pseudo_merge_ofs[c->pseudo_merge[0]]); + else if (c->nr > 1) { + if (next_ext & ((uint64_t)1<<63)) + die(_("too many pseudo-merges")); + hashwrite_be64(f, next_ext | ((uint64_t)1<<63)); + next_ext = st_add3(next_ext, + sizeof(uint32_t), + st_mult(c->nr, sizeof(uint64_t))); + } else + BUG("expected commit '%s' to have at least one " + "pseudo-merge", oid_to_hex(&commits.oid[i])); + } + + /* write lookup table (extended) */ + for (i = 0; i < commits.nr; i++) { + int hash_pos; + struct pseudo_merge_commit_idx *c; + + hash_pos = kh_get_oid_map(writer.pseudo_merge_commits, + commits.oid[i]); + if (hash_pos == kh_end(writer.pseudo_merge_commits)) + BUG("could not find pseudo-merge commit %s", + 
oid_to_hex(&commits.oid[i])); + + c = kh_value(writer.pseudo_merge_commits, hash_pos); + if (c->nr == 1) + continue; + + hashwrite_be32(f, c->nr); + for (j = 0; j < c->nr; j++) + hashwrite_be64(f, pseudo_merge_ofs[c->pseudo_merge[j]]); + } + + /* write positions for all pseudo merges */ + for (i = 0; i < writer.pseudo_merges_nr; i++) + hashwrite_be64(f, pseudo_merge_ofs[i]); + + hashwrite_be32(f, writer.pseudo_merges_nr); + hashwrite_be32(f, kh_size(writer.pseudo_merge_commits)); + hashwrite_be64(f, table_start - start); + hashwrite_be64(f, hashfile_total(f) - start + sizeof(uint64_t)); + + for (i = 0; i < writer.pseudo_merges_nr; i++) + bitmap_free(commits_bitmap[i]); + + free(pseudo_merge_ofs); + free(commits_bitmap); +} + static int table_cmp(const void *_va, const void *_vb, void *_data) { uint32_t *commit_positions = _data; @@ -855,6 +977,9 @@ void bitmap_writer_finish(struct pack_idx_entry **index, int fd = odb_mkstemp(&tmp_file, "pack/tmp_bitmap_XXXXXX"); + if (writer.pseudo_merges_nr) + options |= BITMAP_OPT_PSEUDO_MERGES; + f = hashfd(fd, tmp_file.buf); memcpy(header.magic, BITMAP_IDX_SIGNATURE, sizeof(BITMAP_IDX_SIGNATURE)); @@ -886,6 +1011,9 @@ void bitmap_writer_finish(struct pack_idx_entry **index, write_selected_commits_v1(f, commit_positions, offsets); + if (options & BITMAP_OPT_PSEUDO_MERGES) + write_pseudo_merges(f); + if (options & BITMAP_OPT_LOOKUP_TABLE) write_lookup_table(f, commit_positions, offsets); diff --git a/pack-bitmap.h b/pack-bitmap.h index 0f539d79cfd..55527f61cd9 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -37,6 +37,7 @@ enum pack_bitmap_opts { BITMAP_OPT_FULL_DAG = 0x1, BITMAP_OPT_HASH_CACHE = 0x4, BITMAP_OPT_LOOKUP_TABLE = 0x10, + BITMAP_OPT_PSEUDO_MERGES = 0x20, }; enum pack_bitmap_flags { From patchwork Wed Mar 20 22:05:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13598224
Date: Wed, 20 Mar 2024 18:05:38 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 13/24] pack-bitmap: extract `read_bitmap()` function
Message-ID: <7a31a932ab327681e7d91454e5dee3f903f279d9.1710972293.git.me@ttaylorr.com>

The pack-bitmap machinery uses the `read_bitmap_1()` function to read a bitmap from within the mmap'd region corresponding to the .bitmap file. As a side-effect of calling this function, `read_bitmap_1()` increments the `index->map_pos` variable to reflect the number of bytes read.

Extract the core of this routine to a separate function (that operates over a `const unsigned char *`, a `size_t` and a `size_t *` pointer) instead of a `struct bitmap_index *` pointer.

This function (called `read_bitmap()`) is part of the pack-bitmap.h API so that it can be used within the upcoming portion of the implementation in pseudo-merge.[ch].

Rewrite the existing function, `read_bitmap_1()`, in terms of its more generic counterpart.
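The shape of this refactoring can be shown in miniature. The sketch below is not the patch's code: it reads a big-endian 32-bit value instead of an EWAH bitmap, and `mapped_index`, `read_be32_at()` and `index_read_be32()` are hypothetical names. What it shares with the patch is the split between a generic reader that takes a map pointer, a size and a cursor it advances, and a thin wrapper that passes in the fields of the owning structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical owner of a mapped region, standing in for `struct bitmap_index`. */
struct mapped_index {
	const unsigned char *map;
	size_t map_size;
	size_t map_pos;
};

/*
 * Generic reader: decode a big-endian u32 at `*map_pos` and advance the
 * cursor on success. Returns 0 on success, or -1 (leaving the cursor
 * untouched) when fewer than 4 bytes remain.
 */
static int read_be32_at(const unsigned char *map, size_t map_size,
			size_t *map_pos, uint32_t *out)
{
	const unsigned char *p;

	if (*map_pos > map_size || map_size - *map_pos < 4)
		return -1;
	p = map + *map_pos;
	*out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	*map_pos += 4;
	return 0;
}

/* Thin wrapper, in the spirit of `read_bitmap_1()` after the refactoring. */
static int index_read_be32(struct mapped_index *index, uint32_t *out)
{
	assert(index != NULL);
	return read_be32_at(index->map, index->map_size, &index->map_pos, out);
}
```

Because the generic function never sees the owning structure, code in another compilation unit can reuse it on its own mapped region without learning that structure's layout.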
Signed-off-by: Taylor Blau --- pack-bitmap.c | 24 +++++++++++++++--------- pack-bitmap.h | 2 ++ 2 files changed, 17 insertions(+), 9 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 2baeabacee1..b3b6f9aad21 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -129,17 +129,13 @@ static struct ewah_bitmap *lookup_stored_bitmap(struct stored_bitmap *st) return composed; } -/* - * Read a bitmap from the current read position on the mmaped - * index, and increase the read position accordingly - */ -static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) +struct ewah_bitmap *read_bitmap(const unsigned char *map, + size_t map_size, size_t *map_pos) { struct ewah_bitmap *b = ewah_pool_new(); - ssize_t bitmap_size = ewah_read_mmap(b, - index->map + index->map_pos, - index->map_size - index->map_pos); + ssize_t bitmap_size = ewah_read_mmap(b, map + *map_pos, + map_size - *map_pos); if (bitmap_size < 0) { error(_("failed to load bitmap index (corrupted?)")); @@ -147,10 +143,20 @@ static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) return NULL; } - index->map_pos += bitmap_size; + *map_pos += bitmap_size; + return b; } +/* + * Read a bitmap from the current read position on the mmaped + * index, and increase the read position accordingly + */ +static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) +{ + return read_bitmap(index->map, index->map_size, &index->map_pos); +} + static uint32_t bitmap_num_objects(struct bitmap_index *index) { if (index->midx) diff --git a/pack-bitmap.h b/pack-bitmap.h index 55527f61cd9..a5fe4f305ef 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -133,4 +133,6 @@ int bitmap_is_preferred_refname(struct repository *r, const char *refname); int verify_bitmap_files(struct repository *r); +struct ewah_bitmap *read_bitmap(const unsigned char *map, + size_t map_size, size_t *map_pos); #endif From patchwork Wed Mar 20 22:05:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13598225
2024 15:05:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1710972342; x=1711577142; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=0JFyZvkyvarvvwkoe9l/kl0kJb7pObru261GyhxLWwI=; b=JYp1rIlQcU3sQWYp2zKZwzhnkU5kp3CAkPLkj3t10bjDTjmgg1pUXU2DKgAK5+l+ps tNaEx14SceveIdhb4lmpftVlXS2TXjnCyPctFfb2AkHtdlaR7q15kH6XDxc64wzYBPAT 4U/ctfhE2WL0eOoFS8zZHBzFEG/bKppa0O0bZ3n6sha1ISYblGhxbUqn5kbWZwt/fnoN GhFK6Y0yBZKC+A9JWAojMxq2FpY4pnYn5k8c106CEzgRLVntHWWnJ2+M5nvfhTFwkPGF s2dNOV/f/awFNnzmOnL4TvZwqRnCf6O0lJ8OQOFXyz56NXsv5iddyn7sut5Xq/OngcOc atGA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710972342; x=1711577142; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=0JFyZvkyvarvvwkoe9l/kl0kJb7pObru261GyhxLWwI=; b=noRbNjPU7tzYchW4iFhv0LU2mla1ofyOsQL50i/I5Hz+HTxa8VRmZoVFVPEhqe/KgV qWoWjVIJf6jenAQE/fKxJVUIyb0WDL1HE5OkU7pj42aGU+tWqMX4aLgbyLOGgqkx9dJf oOLIDWZD23xF5Q+WyJd36J5oWEU/y/6fLSyCzuOVUKTaMMjgXHWCl9oH0dMFXc8AU/w1 6zxbxrYhAgzatMpQ8jtmGoJFf/v4FVXT8Fy2D+SEgaEI5nBTEaNOkfkl8iGXyDOu1H6u hmElIbPA652HmGrZL+qyFXkhYmU0BWxISQYMDiyfIbBadaunRrHPGIhuG2qiwh7G5llz xcKQ== X-Gm-Message-State: AOJu0YyKK0jz92NQ8um8YgVOuRSwSfAzgADeY/p7anoPv4W1L/E5MG0s apwFFNyyEayM06FcStjQ8ctLlusDSPU6GmMfOrPOfGMuK+N+HS3t0jLWOSlonZzv/b+Eb15HTR1 T0GE= X-Google-Smtp-Source: AGHT+IEeEUXaEMW1G6k8eULJzYPhy5QeWlUJ53/W6a+4NTxmYUb6l2TFs9PfYLIROqtXf/S16aF6Mw== X-Received: by 2002:a05:620a:20ca:b0:789:d0ca:82d1 with SMTP id f10-20020a05620a20ca00b00789d0ca82d1mr246393qka.35.1710972342256; Wed, 20 Mar 2024 15:05:42 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id c14-20020a37e10e000000b007883a49baeesm6945635qkm.4.2024.03.20.15.05.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:05:42 -0700 (PDT) Date: Wed, 20 Mar 2024 18:05:41 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 14/24] pseudo-merge: scaffolding for reads Message-ID: <7e4d051f37a42e9e44b13acfc60b42fbb2a891b5.1710972293.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To:

Implement scaffolding within the new pseudo-merge compilation unit
necessary to use the pseudo-merge API from within the pack-bitmap.c
machinery.

The core of this scaffolding is two-fold:

  - The `pseudo_merge` structure itself, which represents an
    individual pseudo-merge bitmap. It has fields for both bitmaps, as
    well as metadata about its position within the memory-mapped
    region, and a few extra bits indicating whether or not it is
    satisfied, and which bitmap(s), if any, have been read, since they
    are initialized lazily.

  - The `pseudo_merge_map` structure, which holds an array of
    pseudo_merges, as well as a pointer to the memory-mapped region
    containing the pseudo-merge serialization from within a .bitmap
    file.

    Note that the `bitmap_index` structure is defined statically
    within the pack-bitmap.o compilation unit, so we can't take in a
    `struct bitmap_index *`. Instead, wrap the primary components
    necessary to read the pseudo-merges in this new structure to avoid
    exposing the implementation details of the `bitmap_index`
    structure.

Signed-off-by: Taylor Blau --- pseudo-merge.c | 10 ++++++++ pseudo-merge.h | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 75 insertions(+) diff --git a/pseudo-merge.c b/pseudo-merge.c index caccef942a1..d18de0a266b 100644 --- a/pseudo-merge.c +++ b/pseudo-merge.c @@ -441,3 +441,13 @@ void select_pseudo_merges(struct string_list *list, stop_progress(&progress); } + +void free_pseudo_merge_map(struct pseudo_merge_map *pm) +{ + uint32_t i; + for (i = 0; i < pm->nr; i++) { + ewah_pool_free(pm->v[i].commits); + ewah_pool_free(pm->v[i].bitmap); + } + free(pm->v); +} diff --git a/pseudo-merge.h b/pseudo-merge.h index 81888731864..2f652fc6767 100644 --- a/pseudo-merge.h +++ b/pseudo-merge.h @@ -99,4 +99,69 @@ void select_pseudo_merges(struct string_list *list, uint32_t *pseudo_merges_nr, unsigned show_progress); +/* + * Represents a serialized view of a file containing pseudo-merge(s) + * (see Documentation/technical/bitmap-format.txt for a specification + * of the format). + */ +struct pseudo_merge_map { + /* + * An array of pseudo-merge(s), lazily loaded from the .bitmap + * file. + */ + struct pseudo_merge *v; + size_t nr; + size_t commits_nr; + + /* + * Pointers into a memory-mapped view of the .bitmap file: + * + * - map: the beginning of the .bitmap file + * - commits: the beginning of the pseudo-merge commit index + * - map_size: the size of the .bitmap file + */ + const unsigned char *map; + const unsigned char *commits; + + size_t map_size; +}; + +/* + * An individual pseudo-merge, storing a pair of lazily-loaded + * bitmaps: + * + * - commits: the set of commit(s) that are part of the pseudo-merge + * - bitmap: the set of object(s) reachable from the above set of + * commits. + * + * The `at` and `bitmap_at` fields are used to store the locations of + * each of the above bitmaps in the .bitmap file. 
+ */ +struct pseudo_merge { + struct ewah_bitmap *commits; + struct ewah_bitmap *bitmap; + + off_t at; + off_t bitmap_at; + + /* + * `satisfied` indicates whether the given pseudo-merge has been + * used. + * + * `loaded_commits` and `loaded_bitmap` indicate whether the + * respective bitmaps have been loaded and read from the + * .bitmap file. + */ + unsigned satisfied : 1, + loaded_commits : 1, + loaded_bitmap : 1; +}; + +/* + * Frees the given pseudo-merge map, releasing any memory held by (a) + * parsed EWAH bitmaps, or (b) the array of pseudo-merges itself. Does + * not free the memory-mapped view of the .bitmap file. + */ +void free_pseudo_merge_map(struct pseudo_merge_map *pm); + #endif From patchwork Wed Mar 20 22:05:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13598226 Received: from mail-qk1-f181.google.com (mail-qk1-f181.google.com [209.85.222.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A077685C76 for ; Wed, 20 Mar 2024 22:05:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972348; cv=none; b=S18kahye8RPaALbrhUFwIro+qaGUell667sGITLlBUzWXZUbobJ9Xh8L+BBZce+2tfK4JlZZjkuLRk/wg8e8wOFyRnM4OzF8bJYDsrXb4pnNxvKA3COf05OX9yaL9jxmCeYgeN4Rez7ch//FEeXPb9Gz+cCOhGdqCbxOaaPyf60= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972348; c=relaxed/simple; bh=6+96b/5sUjjFsIAMFH1PO40G1FFuWL/jgT3XvZj3H/w=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Om+kUd7W3Nyv7RrEzQr8SijmXiJ4aPldp105vG8pIq7IH+te8oJInST5/hFeHIiQrTKhJW6/bn3qOwIN3BnZGUYfFzq9m87HbZxBOyspivsoYlENzBnQXxAYGwbrGAC4J83H4Yg5qfFgDvtt8VJCFQYR34O5ltNEmMara0keBVo= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=m1CjYVa7; arc=none smtp.client-ip=209.85.222.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="m1CjYVa7" Received: by mail-qk1-f181.google.com with SMTP id af79cd13be357-789e209544eso20960585a.0 for ; Wed, 20 Mar 2024 15:05:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1710972345; x=1711577145; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=S6qVirMX4vU5terMRqnTxnGDczVXhtaoK5hQiSz628s=; b=m1CjYVa7MEKQncNnav5Ej6fry8G0jaPCyM1eDYNRAIsJeF4ejQPCJwI2hmIB+NRDS9 B1+ottCyN8Jkz2k6YvzygQlXwSLMvE022Nvdf4Xcfa2osrT+6yJMMUA/n0vDbn6i0LLM eA8cdYowVyDqpTYK3eDQBASizKmThTPIyMVzgSArOl80Z9ViDxnmyEu7vjugWjeoGtMm yjSq07MLhvaH/U6eP89VtB5XSym3q2cztUrjp6CQvxQbrw7NbhPAxRGYrLEaoYK09+z7 gWRDY3kVXsYIJods5ppg2cjfk7Md/B0ygcdsCKQ/vGI/ziD97KM+hV+w6TnULOx85af3 6SIA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710972345; x=1711577145; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=S6qVirMX4vU5terMRqnTxnGDczVXhtaoK5hQiSz628s=; b=lu38H82Rcm2FisS8mmHZwyVWLVSdXEywt6sAtw1R1YvA3ZCj5V/PqINfXb2nGdN6Gm vG+NJfPYMAUcCR8Wmyi/BvrSUzjG1SQusV16bSMeCIUbrU3WG9pD2s4gLxyS5VlOc/t2 
Rkbl3ztumDYf8a++bDr5+8E4oDD20wJNkgUDq5LaKcf3WgzfT1QcZ0At4Cl/oXPJpgOE jN7Yfw7s+BM+Ubhhyd8jlxm2eNd4QzJShDIrNQH45LKNJCqHAJyC9evp8AEa1WbQM6Uo h19JVPxpqRJIutpHLjbKCWxCZ/ZShDACIhPNdoHXHKm3/sWVshSILOECkxGehCXYmDgz 2kMg== X-Gm-Message-State: AOJu0YwPdo5K3QuMxCBHQLfGEVexT+aRDaTL4lkaIUS8khkwAYhDfpOd m2I3BJpNjd52S9xTlE9uZMDt72QwnwgKEZqBGF9CD1THom+6ASU9GiuLJiTCCM4ieis6eptkCrb PMg0= X-Google-Smtp-Source: AGHT+IFhqdAC8wLa61kYoXnrm4Eqzv+I1cqstzH4SIxLtrfr2kI89020iESMBMeWH/5bqzktcOygMg== X-Received: by 2002:a05:620a:5819:b0:789:ea36:c461 with SMTP id wm25-20020a05620a581900b00789ea36c461mr16148059qkn.74.1710972345339; Wed, 20 Mar 2024 15:05:45 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id z27-20020a05620a101b00b007887d30dbb7sm6901918qkj.60.2024.03.20.15.05.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:05:45 -0700 (PDT) Date: Wed, 20 Mar 2024 18:05:44 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 15/24] pack-bitmap.c: read pseudo-merge extension Message-ID: <7bb644b2b0c3478c65e16c355be41127f32c9787.1710972293.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To:

Now that the scaffolding for reading the pseudo-merge extension has been
laid, teach the pack-bitmap machinery to read the pseudo-merge extension
when present.

Note that pseudo-merges themselves are not yet used during traversal;
that step will be taken by a future commit.

In the meantime, read the table and initialize the pseudo_merge_map
structure introduced by a previous commit.

When the pseudo-merge extension is present, `load_bitmap_header()`
performs basic sanity checks to make sure that the table is well-formed.

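For illustration only (not part of this patch): the fixed-size fields
at the end of the pseudo-merge extension are read back-to-front from
the end of the table. The offsets in the sketch below mirror the ones
`load_bitmap_header()` uses in the diff that follows. The `toy_`
helpers and `struct toy_trailer` are hypothetical names standing in
for `get_be32()`, `get_be64()`, and the relevant fields of
`struct pseudo_merge_map`, and the final bounds check on the offset
array is an extra assumption rather than something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical big-endian readers, standing in for get_be32()/get_be64(). */
static uint32_t toy_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t toy_be64(const unsigned char *p)
{
	return ((uint64_t)toy_be32(p) << 32) | toy_be32(p + 4);
}

/* Hypothetical container for the fixed-size trailer fields. */
struct toy_trailer {
	uint64_t table_size;  /* last 8 bytes of the extension */
	uint64_t commits_ofs; /* 8 bytes, ending 8 bytes before the end */
	uint32_t commits_nr;  /* 4 bytes, ending 16 bytes before the end */
	uint32_t nr;          /* 4 bytes, ending 20 bytes before the end */
};

/* Returns 0 on success, -1 if the buffer cannot hold what it claims. */
static int toy_read_trailer(const unsigned char *buf, size_t len,
			    struct toy_trailer *out)
{
	const unsigned char *end = buf + len;

	assert(buf && out);

	if (len < 24)
		return -1; /* too short for the fixed-size fields */

	out->table_size = toy_be64(end - 8);
	if (out->table_size > len)
		return -1; /* table claims to be larger than the buffer */

	out->commits_ofs = toy_be64(end - 16);
	out->commits_nr = toy_be32(end - 20);
	out->nr = toy_be32(end - 24);

	/* Extra check (an assumption, not in the patch): the array of
	 * `nr` 8-byte offsets preceding the trailer must fit too. */
	if (out->nr > (len - 24) / 8)
		return -1;

	return 0;
}
```

Reading from the end means the caller only needs to know where the
extension stops, not where it starts.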
Signed-off-by: Taylor Blau --- pack-bitmap.c | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/pack-bitmap.c b/pack-bitmap.c index b3b6f9aad21..e0f191b7581 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -20,6 +20,7 @@ #include "list-objects-filter-options.h" #include "midx.h" #include "config.h" +#include "pseudo-merge.h" /* * An entry on the bitmap index, representing the bitmap for a given @@ -86,6 +87,9 @@ struct bitmap_index { */ unsigned char *table_lookup; + /* This contains the pseudo-merge cache within 'map' (if found). */ + struct pseudo_merge_map pseudo_merges; + /* * Extended index. * @@ -205,6 +209,41 @@ static int load_bitmap_header(struct bitmap_index *index) index->table_lookup = (void *)(index_end - table_size); index_end -= table_size; } + + if (flags & BITMAP_OPT_PSEUDO_MERGES) { + unsigned char *pseudo_merge_ofs; + size_t table_size; + uint32_t i; + + if (sizeof(table_size) > index_end - index->map - header_size) + return error(_("corrupted bitmap index file (too short to fit pseudo-merge table header)")); + + table_size = get_be64(index_end - 8); + if (table_size > index_end - index->map - header_size) + return error(_("corrupted bitmap index file (too short to fit pseudo-merge table)")); + + if (git_env_bool("GIT_TEST_USE_PSEUDO_MERGES", 1)) { + const unsigned char *ext = (index_end - table_size); + + index->pseudo_merges.map = index->map; + index->pseudo_merges.map_size = index->map_size; + index->pseudo_merges.commits = ext + get_be64(index_end - 16); + index->pseudo_merges.commits_nr = get_be32(index_end - 20); + index->pseudo_merges.nr = get_be32(index_end - 24); + + CALLOC_ARRAY(index->pseudo_merges.v, + index->pseudo_merges.nr); + + pseudo_merge_ofs = index_end - 24 - + (index->pseudo_merges.nr * sizeof(uint64_t)); + for (i = 0; i < index->pseudo_merges.nr; i++) { + index->pseudo_merges.v[i].at = get_be64(pseudo_merge_ofs); + pseudo_merge_ofs += sizeof(uint64_t); + } + } + + index_end -= 
table_size; + } } index->entry_count = ntohl(header->entry_count); From patchwork Wed Mar 20 22:05:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13598227 Received: from mail-ot1-f49.google.com (mail-ot1-f49.google.com [209.85.210.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6B34986654 for ; Wed, 20 Mar 2024 22:05:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972351; cv=none; b=LBV5UU7J8inUz3vmY80tHeb/qOGm42DPpD2d3TkiDMtrxJ9oZkanWtIiK2gpAYrLCCGVa7WLEB9QGjJKMVL9AGnnZSNXveB6pSONxU8CPgXkwJXP6xTYif2s0EppiUB8LE093a/AQM6nydzj0EyhBbVIaJ35N+/aiPSQsU/Kvvo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972351; c=relaxed/simple; bh=1ZQhnRFa5UwsyELJcRKrUwbOBzscU/hKWt1nOdUUvAA=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=EtLMBsFuIKK0jsraoR4lEDm7tyktrxz85NM88kpjprqaQSYkO6q6sNnYfeXGpi3cTqG/IMAhvFkpW96V65Y7oZ9zskoDr3s1SV3knZtKz9t0Ic92+bYdfi/8Vd0ydmIrCc1NjTXY/fZuyG5cZ/5nubosjWwHGEznJvUSGsZzN0E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=JyhsvAfz; arc=none smtp.client-ip=209.85.210.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com 
header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="JyhsvAfz" Received: by mail-ot1-f49.google.com with SMTP id 46e09a7af769-6e68d358974so131591a34.3 for ; Wed, 20 Mar 2024 15:05:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1710972348; x=1711577148; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=kRqQzagdvs80+WYg58dduEQlvRfzCbgxQR/3j0LxbYE=; b=JyhsvAfzrsStANU3LIM24zE4iMYo0Ada67z+ldAPNN6r8rVbHSC7OWSPdq/gX7VygU vWH1PWYD+rSm/v1kmY/HO6EH//vh0qMwNwgCo+1r9phuomcpRvzYvOZ/BTaIo2iLVfaC HvEeIS09ZDLYTrAqsyW2AJfqDo7i3aRskV6hRPFlz7Hf8IiN97xOqbr5OAgQMNq/7ixg 1N79QlFVHEvTsOa0Eo8JNqnW6Bj21xBBUisaoRY+CqhdLDJkr26m6MvFqXZSjPKWmWd6 ZFxr41fObvMwFscJbPS9DU8m6DqjzBtkPVu1vR6Rtr2vsNpYPNVMJZz7jzalH6mGrRVb eeUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710972348; x=1711577148; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=kRqQzagdvs80+WYg58dduEQlvRfzCbgxQR/3j0LxbYE=; b=oh6zYNRfGK7+Ey0Ky50lYxQIF/i6Wgx4eWpmqy8raKb/8MAFJG+hbr0My+XlKUnVmw fgyz7vkigdWBYinrb3DydkGTPQear6p84j2FWy89zT590Yd20OmrlMYSHF62gxNsA7m4 WUJ1YTKqJQuZxyjfHkejrQvacq4HMR0ZAPRoGtdWkxCfWFySL/ge3irj4N9HwiJvja45 UQZg8q5PHE3g9RFnL/DNy/vfeAPY69xCEvY3a2zg+dJU2RW9Wl8gAeAwkkQTzygIJ6Vq XNhTStdRK/MWLB0twasy7X0z/GE8uB0Vev0XMR2jRfazQ1TettuvoIiSmdiOLHUqRz0/ 4vWQ== X-Gm-Message-State: AOJu0YwBAVi+QxHRJu9wPDzL8mzvezZVzG68IlYooCfT1pkSckm7gk3H IwdSV3/WOpG10Nm8o+huTJsrlZpOBNveJSMszCTZT8AIVGSJCrBkq2tNkbhHmIJT2FQjXArcJIT XmyY= X-Google-Smtp-Source: AGHT+IEbmrkMowQpSL6/wwoRhXzNgLE11hZyX+2ZSHTzbWLblqi83w7x94NacaUhIlac3Sz/sOW0Pg== X-Received: by 2002:a05:6830:108:b0:6e6:991f:16be with SMTP id i8-20020a056830010800b006e6991f16bemr9419238otp.0.1710972348249; Wed, 20 Mar 2024 15:05:48 -0700 (PDT) 
Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id i9-20020ad45c69000000b00690d26a6b20sm8268337qvh.130.2024.03.20.15.05.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:05:48 -0700 (PDT) Date: Wed, 20 Mar 2024 18:05:47 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 16/24] pseudo-merge: implement support for reading pseudo-merge commits Message-ID: <792cc863154a2671291efad6a64ac8a034dd4bb4.1710972293.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To:

Implement the basic API for reading pseudo-merge bitmaps, which consists
of four basic functions:

  - pseudo_merge_bitmap()
  - use_pseudo_merge()
  - apply_pseudo_merges_for_commit()
  - cascade_pseudo_merges()

These functions are all documented in pseudo-merge.h, but their rough
descriptions are as follows:

  - pseudo_merge_bitmap() reads and inflates the objects EWAH bitmap
    for a given pseudo-merge

  - use_pseudo_merge() does the same as pseudo_merge_bitmap(), but on
    the commits EWAH bitmap, not the objects bitmap

  - apply_pseudo_merges_for_commit() applies all satisfied pseudo-merge
    commits for a given result set, and cascades any yet-unsatisfied
    pseudo-merges if any were applied in the previous step

  - cascade_pseudo_merges() applies all pseudo-merges which are
    satisfied but have not been previously applied, repeating this
    process until no more pseudo-merges can be applied

The core of the API is the latter two functions, which are responsible
for applying pseudo-merges during the object traversal implemented in
the pack-bitmap machinery.

The other two functions (pseudo_merge_bitmap(), and use_pseudo_merge())
are low-level ways to interact with the pseudo-merge machinery, which
will be useful in future commits.

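For illustration only (not part of this patch): the cascading step is
a fixed-point loop, which can be modelled with plain 64-bit masks in
place of EWAH bitmaps. Everything named `toy_` below is hypothetical.
A pseudo-merge is treated as satisfied once every one of its commits
is already set in the result, at which point its objects bitmap is
OR-ed in. The optional `roots` bitmap is left out to keep the sketch
short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for `struct pseudo_merge`: 64-bit masks replace
 * the EWAH bitmaps, so at most 64 objects can be modelled. */
struct toy_merge {
	uint64_t commits;       /* commits that make up the pseudo-merge */
	uint64_t bitmap;        /* objects reachable from those commits */
	unsigned satisfied : 1; /* set once the merge has been applied */
};

/* Apply one pseudo-merge if all of its commits are already in *result. */
static int toy_apply(struct toy_merge *m, uint64_t *result)
{
	if (m->satisfied)
		return 0;
	if ((m->commits & *result) != m->commits)
		return 0; /* not a subset of the result yet */

	*result |= m->bitmap;
	m->satisfied = 1;
	return 1;
}

/* Keep making passes until one completes without applying anything.
 * Returns the total number of pseudo-merges applied. */
static int toy_cascade(struct toy_merge *v, size_t nr, uint64_t *result)
{
	int applied = 0;
	int any;

	assert(result);

	do {
		size_t i;

		any = 0;
		for (i = 0; i < nr; i++) {
			if (toy_apply(&v[i], result)) {
				any = 1;
				applied++;
			}
		}
	} while (any);

	return applied;
}
```

The loop terminates because each pseudo-merge can be applied at most
once, so there are at most `nr + 1` passes.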
Signed-off-by: Taylor Blau --- pseudo-merge.c | 231 +++++++++++++++++++++++++++++++++++++++++++++++++ pseudo-merge.h | 44 ++++++++++ 2 files changed, 275 insertions(+) diff --git a/pseudo-merge.c b/pseudo-merge.c index d18de0a266b..e111c9cd1a6 100644 --- a/pseudo-merge.c +++ b/pseudo-merge.c @@ -10,6 +10,7 @@ #include "commit.h" #include "alloc.h" #include "progress.h" +#include "hex.h" #define DEFAULT_PSEUDO_MERGE_DECAY 1.0f #define DEFAULT_PSEUDO_MERGE_MAX_MERGES 64 @@ -451,3 +452,233 @@ void free_pseudo_merge_map(struct pseudo_merge_map *pm) } free(pm->v); } + +struct pseudo_merge_commit_ext { + uint32_t nr; + const unsigned char *ptr; +}; + +static int pseudo_merge_ext_at(const struct pseudo_merge_map *pm, + struct pseudo_merge_commit_ext *ext, size_t at) +{ + if (at >= pm->map_size) + return error(_("extended pseudo-merge read out-of-bounds " + "(%"PRIuMAX" >= %"PRIuMAX")"), + (uintmax_t)at, (uintmax_t)pm->map_size); + + ext->nr = get_be32(pm->map + at); + ext->ptr = pm->map + at + sizeof(uint32_t); + + return 0; +} + +struct ewah_bitmap *pseudo_merge_bitmap(const struct pseudo_merge_map *pm, + struct pseudo_merge *merge) +{ + if (!merge->loaded_commits) + BUG("cannot use unloaded pseudo-merge bitmap"); + + if (!merge->loaded_bitmap) { + size_t at = merge->bitmap_at; + + merge->bitmap = read_bitmap(pm->map, pm->map_size, &at); + merge->loaded_bitmap = 1; + } + + return merge->bitmap; +} + +struct pseudo_merge *use_pseudo_merge(const struct pseudo_merge_map *pm, + struct pseudo_merge *merge) +{ + if (!merge->loaded_commits) { + size_t pos = merge->at; + + merge->commits = read_bitmap(pm->map, pm->map_size, &pos); + merge->bitmap_at = pos; + merge->loaded_commits = 1; + } + return merge; +} + +static struct pseudo_merge *pseudo_merge_at(const struct pseudo_merge_map *pm, + struct object_id *oid, + size_t want) +{ + size_t lo = 0; + size_t hi = pm->nr; + + while (lo < hi) { + size_t mi = lo + (hi - lo) / 2; + size_t got = pm->v[mi].at; + + if (got == want) + 
return use_pseudo_merge(pm, &pm->v[mi]); + else if (got > want) + hi = mi; + else + lo = mi + 1; + } + + warning(_("could not find pseudo-merge for commit %s at offset %"PRIuMAX), + oid_to_hex(oid), (uintmax_t)want); + + return NULL; +} + +struct pseudo_merge_commit { + uint32_t commit_pos; + uint64_t pseudo_merge_ofs; +}; + +#define PSEUDO_MERGE_COMMIT_RAWSZ (sizeof(uint32_t)+sizeof(uint64_t)) + +static void read_pseudo_merge_commit_at(struct pseudo_merge_commit *merge, + const unsigned char *at) +{ + merge->commit_pos = get_be32(at); + merge->pseudo_merge_ofs = get_be64(at + sizeof(uint32_t)); +} + +static int nth_pseudo_merge_ext(const struct pseudo_merge_map *pm, + struct pseudo_merge_commit_ext *ext, + struct pseudo_merge_commit *merge, + uint32_t n) +{ + size_t ofs; + + if (n >= ext->nr) + return error(_("extended pseudo-merge lookup out-of-bounds " + "(%"PRIu32" >= %"PRIu32")"), n, ext->nr); + + ofs = get_be64(ext->ptr + st_mult(n, sizeof(uint64_t))); + if (ofs >= pm->map_size) + return error(_("out-of-bounds read: (%"PRIuMAX" >= %"PRIuMAX")"), + (uintmax_t)ofs, (uintmax_t)pm->map_size); + + read_pseudo_merge_commit_at(merge, pm->map + ofs); + + return 0; +} + +static unsigned apply_pseudo_merge(const struct pseudo_merge_map *pm, + struct pseudo_merge *merge, + struct bitmap *result, + struct bitmap *roots) +{ + if (merge->satisfied) + return 0; + + if (!ewah_bitmap_is_subset(merge->commits, roots ? 
roots : result)) + return 0; + + bitmap_or_ewah(result, pseudo_merge_bitmap(pm, merge)); + if (roots) + bitmap_or_ewah(roots, pseudo_merge_bitmap(pm, merge)); + merge->satisfied = 1; + + return 1; +} + +static int pseudo_merge_commit_cmp(const void *va, const void *vb) +{ + struct pseudo_merge_commit merge; + uint32_t key = *(uint32_t*)va; + + read_pseudo_merge_commit_at(&merge, vb); + + if (key < merge.commit_pos) + return -1; + if (key > merge.commit_pos) + return 1; + return 0; +} + +static struct pseudo_merge_commit *find_pseudo_merge(const struct pseudo_merge_map *pm, + uint32_t pos) +{ + if (!pm->commits_nr) + return NULL; + + return bsearch(&pos, pm->commits, pm->commits_nr, + PSEUDO_MERGE_COMMIT_RAWSZ, pseudo_merge_commit_cmp); +} + +int apply_pseudo_merges_for_commit(const struct pseudo_merge_map *pm, + struct bitmap *result, + struct commit *commit, uint32_t commit_pos) +{ + struct pseudo_merge *merge; + struct pseudo_merge_commit *merge_commit; + int ret = 0; + + merge_commit = find_pseudo_merge(pm, commit_pos); + if (!merge_commit) + return 0; + + if (merge_commit->pseudo_merge_ofs & ((uint64_t)1<<63)) { + struct pseudo_merge_commit_ext ext = { 0 }; + off_t ofs = merge_commit->pseudo_merge_ofs & ~((uint64_t)1<<63); + uint32_t i; + + if (pseudo_merge_ext_at(pm, &ext, ofs) < 0) { + warning(_("could not read extended pseudo-merge table " + "for commit %s"), + oid_to_hex(&commit->object.oid)); + return ret; + } + + for (i = 0; i < ext.nr; i++) { + if (nth_pseudo_merge_ext(pm, &ext, merge_commit, i) < 0) + return ret; + + merge = pseudo_merge_at(pm, &commit->object.oid, + merge_commit->pseudo_merge_ofs); + + if (!merge) + return ret; + + if (apply_pseudo_merge(pm, merge, result, NULL)) + ret++; + } + } else { + merge = pseudo_merge_at(pm, &commit->object.oid, + merge_commit->pseudo_merge_ofs); + + if (!merge) + return ret; + + if (apply_pseudo_merge(pm, merge, result, NULL)) + ret++; + } + + if (ret) + cascade_pseudo_merges(pm, result, NULL); + + return 
ret; +} + +int cascade_pseudo_merges(const struct pseudo_merge_map *pm, + struct bitmap *result, + struct bitmap *roots) +{ + unsigned any_satisfied; + int ret = 0; + + do { + struct pseudo_merge *merge; + uint32_t i; + + any_satisfied = 0; + + for (i = 0; i < pm->nr; i++) { + merge = use_pseudo_merge(pm, &pm->v[i]); + if (apply_pseudo_merge(pm, merge, result, roots)) { + any_satisfied |= 1; + ret++; + } + } + } while (any_satisfied); + + return ret; +} diff --git a/pseudo-merge.h b/pseudo-merge.h index 2f652fc6767..cc14e947e86 100644 --- a/pseudo-merge.h +++ b/pseudo-merge.h @@ -164,4 +164,48 @@ struct pseudo_merge { */ void free_pseudo_merge_map(struct pseudo_merge_map *pm); +/* + * Loads the bitmap corresponding to the given pseudo-merge from the + * map, if it has not already been loaded. + */ +struct ewah_bitmap *pseudo_merge_bitmap(const struct pseudo_merge_map *pm, + struct pseudo_merge *merge); + +/* + * Loads the pseudo-merge and its commits bitmap from the given + * pseudo-merge map, if it has not already been loaded. + */ +struct pseudo_merge *use_pseudo_merge(const struct pseudo_merge_map *pm, + struct pseudo_merge *merge); + +/* + * Applies pseudo-merge(s) containing the given commit to the bitmap + * "result". + * + * If any pseudo-merge(s) were satisfied, returns the number + * satisfied, otherwise returns 0. If any were satisfied, the + * remaining unsatisfied pseudo-merges are cascaded (see below). + */ +int apply_pseudo_merges_for_commit(const struct pseudo_merge_map *pm, + struct bitmap *result, + struct commit *commit, uint32_t commit_pos); + +/* + * Applies pseudo-merge(s) which are satisfied according to the + * current bitmap in result (or roots, see below). If any + * pseudo-merges were satisfied, repeat the process over unsatisfied + * pseudo-merge commits until no more pseudo-merges are satisfied. + * + * Result is the bitmap to which the pseudo-merge(s) are applied. 
+ * Roots (if given) is a bitmap of the traversal tip(s) for either + * side of a reachability traversal. + * + * Roots may be given instead of a populated results bitmap at the + * beginning of a traversal on either side where the reachability + * closure over tips is not yet known. + */ +int cascade_pseudo_merges(const struct pseudo_merge_map *pm, + struct bitmap *result, + struct bitmap *roots); + #endif From patchwork Wed Mar 20 22:05:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13598228 Received: from mail-qt1-f170.google.com (mail-qt1-f170.google.com [209.85.160.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 36FBA8665A for ; Wed, 20 Mar 2024 22:05:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972353; cv=none; b=gvUBdbUFwcznLJkk4xvcqZIpkjGyp4oqG9eE8uZZ582nA7r78WdlbWErtPMM2hapNjpjYsDJ21ypa4ebjCWjDWb1kTkxnLXYxuq56MfAjqWJjtVZSqyJQ2N4BcyRQqVOoD4iWNUqW9W6wfaYLidU/0s5Qa7m2PmCMJkxo/4TBj4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972353; c=relaxed/simple; bh=SjEM7ajDVP32iB2JQg8l89qX9zrDrdYGlXdCHf8bgOk=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=CAIGrZVInl4Yp15f4iMhrQ7suZvFkx3Dc9t7Lxv3Cm3K357GJv1ROgFOwdxwoXgoJ76cVv50G+c3OuagaMOhRwV4pKoYyT+KXkzt/hRh94UElF4+67Bb6qPSDBZ0+EHGkAc7/Z9GtZ/Wsrz3nSJb5mzY3dYUD2jqVxwQBRDMFwc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=k4HlN6PK; arc=none 
smtp.client-ip=209.85.160.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="k4HlN6PK" Received: by mail-qt1-f170.google.com with SMTP id d75a77b69052e-430c63d4da9so2354281cf.0 for ; Wed, 20 Mar 2024 15:05:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1710972351; x=1711577151; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=o+yzlGzgl5SgHehfth0gTlFAAejNUnU0g/Yw2kPx8pQ=; b=k4HlN6PKWeh7flqd9D60hTHe4oW0ZnSosOnMSe+i1sVfcvCgvTGCa9jP+YWaR2mzcB 8S/rdHMEqQROaPIEVdg/egRrE+x26wt8ADtBw5eAHrrMkIQ7kScUhOePmz1ncSkzs9U8 SiW1YXCp5NIbncC9kWUowPcB2gFjwHU0vcpKh81n5/dDYYCGSICSiblVApNYF39vrfel Hz0j4lpZNhgEaX0A1k3IxOVvkGK37Iqb/LSoeglgjrOXDlnBwMOP9lkDw01Vcd8Cjk7s PsWgrsburHFS3oYdgWDCCHewstGqq1C4KLT9lLucj4TRsPfP0qknWPg8/o2BvjnvimkP jPcQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710972351; x=1711577151; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=o+yzlGzgl5SgHehfth0gTlFAAejNUnU0g/Yw2kPx8pQ=; b=hgVwsByWu+FXvgsnZHkv8LpbXSCBAQxeYOOboCc8D5AqEoZH0oUEU1fbv8x0syASN0 v9paBxLk0TR7nGGxNa5pdW6NVv7Rr4NFd8PClfmGzZatucmnMMzH2jK/dANRe/HR4JIL cSwB29xHbZ++ewVV9qRMGc0xZnXe7GZIoBlhORge49ZSwWkHv0cyyCY7rOAulsIJudC2 5pZc9JAvLIq2XIxx3yTEJ/YebxpVTleNsuBbo6U1Zup/kW9cm+MOZP9sk8pJfTBxwc88 UEmm4I+XtM2uf6ff3hWeWx7hISCE3Jcp12mvMnWxvX9HNmhEC1QozvnByZaB70Xlq9BL au3w== X-Gm-Message-State: AOJu0YxmWg9lZuz1w99XZpGvbWz9wM0kLTWLDZmXmcM6XP6ohy041Rp4 
ryOncBBjSsfnxJiQfz+gdObJ8zmI8kYQvuFqbPr7GVdDwwZYQDV4aU2ebs8dhglu/xPlsH4KNCj CIZQ= X-Google-Smtp-Source: AGHT+IH6F8AwTaBlVHLbLp0R5cYz301I2wcJGsGEyM5HZbBkezRu43UKYHsaNOYVXcHbeu6xJtvtfw== X-Received: by 2002:a0c:8e0c:0:b0:691:59ad:ff46 with SMTP id v12-20020a0c8e0c000000b0069159adff46mr17709599qvb.30.1710972351124; Wed, 20 Mar 2024 15:05:51 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id kj25-20020a056214529900b0069183a8de64sm6494675qvb.75.2024.03.20.15.05.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:05:50 -0700 (PDT) Date: Wed, 20 Mar 2024 18:05:49 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 17/24] ewah: implement `ewah_bitmap_popcount()` Message-ID: <8fb7f7ab37b467bb5026296e9c7ae10632163525.1710972293.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Some of the pseudo-merge test helpers (which will be introduced in the following commit) will want to indicate the total number of commits in or objects reachable from a pseudo-merge. Implement a popcount() function that operates on EWAH bitmaps to quickly determine how many bits are set in each of the respective bitmaps. 
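
For illustration only (not part of this patch): once the iterator has
handed back uncompressed 64-bit words, the total is just the sum of the
per-word popcounts. The sketch below works on a plain array of words
instead of a `struct ewah_bitmap`; `toy_popcount64()` and
`toy_words_popcount()` are hypothetical stand-ins for
`ewah_bit_popcount64()` and `ewah_bitmap_popcount()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count set bits by repeatedly clearing the lowest set bit. */
static unsigned toy_popcount64(uint64_t x)
{
	unsigned n = 0;

	while (x) {
		x &= x - 1;
		n++;
	}
	return n;
}

/* Sum the per-word popcounts over `nr` already-decompressed words. */
static size_t toy_words_popcount(const uint64_t *words, size_t nr)
{
	size_t count = 0;
	size_t i;

	assert(words || !nr);

	for (i = 0; i < nr; i++)
		count += toy_popcount64(words[i]);

	return count;
}
```

A real implementation would likely prefer a compiler builtin for the
per-word count where one is available; the loop above is only the
portable fallback shape.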
Signed-off-by: Taylor Blau --- ewah/bitmap.c | 14 ++++++++++++++ ewah/ewok.h | 1 + 2 files changed, 15 insertions(+) diff --git a/ewah/bitmap.c b/ewah/bitmap.c index 5bdae3fb07b..a41fa152cbd 100644 --- a/ewah/bitmap.c +++ b/ewah/bitmap.c @@ -212,6 +212,20 @@ size_t bitmap_popcount(struct bitmap *self) return count; } +size_t ewah_bitmap_popcount(struct ewah_bitmap *self) +{ + struct ewah_iterator it; + eword_t word; + size_t count = 0; + + ewah_iterator_init(&it, self); + + while (ewah_iterator_next(&word, &it)) + count += ewah_bit_popcount64(word); + + return count; +} + int bitmap_is_empty(struct bitmap *self) { size_t i; diff --git a/ewah/ewok.h b/ewah/ewok.h index c334833b201..d7e9fb67715 100644 --- a/ewah/ewok.h +++ b/ewah/ewok.h @@ -190,6 +190,7 @@ void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other); void bitmap_or(struct bitmap *self, const struct bitmap *other); size_t bitmap_popcount(struct bitmap *self); +size_t ewah_bitmap_popcount(struct ewah_bitmap *self); int bitmap_is_empty(struct bitmap *self); #endif From patchwork Wed Mar 20 22:05:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13598229 Received: from mail-qv1-f51.google.com (mail-qv1-f51.google.com [209.85.219.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 68FB48665A for ; Wed, 20 Mar 2024 22:05:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972357; cv=none; b=EY5ntjLi0X/LT/fpXkjMzWo9/u7igsY/sjoCUm7PmWDAUKzjZqFeoDhlW1+xYJumkOEuoYRvFRx80O0gHRi9CQ3TctaVWtTkq3sYWjXnjRsGkVeiRVRqEP3MYfTdyEHD90eaTPq3ttY7GjVDMfnr44gJl8zzHOKncxMrLz1ETyk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972357; 
c=relaxed/simple; bh=STc801ngdsw4grWj/2fCuhZ5g2oSOIiTqFMWy9yv+gQ=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Zrxz9NYHS0jOfU+kqvFW6+g0sa6w1Gv3AqbDEkECYl77lbQKPfwz0fjW2eC6Jx4/GxYbetKnp1hBkSpl/sVebW4ZhnieyVVzyHSTqxEVYVmU9f2Tuzg80imHQGiB4Q+kKMkXQhadHRurMmYLPzcr2UbpvJ+sABBtbCo/J+nwTG4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=Pcr359d0; arc=none smtp.client-ip=209.85.219.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="Pcr359d0" Received: by mail-qv1-f51.google.com with SMTP id 6a1803df08f44-6963c0c507eso4457206d6.1 for ; Wed, 20 Mar 2024 15:05:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1710972354; x=1711577154; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=m3PcbGEDqgL/0XLppcmfbabNLxqQPslp9rWFGst9FQY=; b=Pcr359d0OEpEQzDH6s3rjJUDWPap7y5qrtUILBJbtqeU4xpANiMJwAmf/Xy5pSGj5O //ncNJcuZACiMT/VnwtlqFalrG8IzKEazHR3dwNU8nN1g8wUzeWp+tCCMpP7cIKKoEU1 1DdrbWBIxi4Ma3ZQPRjDuScU0/HekcISEJaQ0qMae+GgpaU9ssMxWQcDpquxXjPfzBAQ Th/0nN8WkHQ6K612TbQgNDFRz+qGJc4h0HxV1a5ichQcpIUhEQA359q5T8gQl/p9quHY BdNFd2pGL2NROZgFlhq62Kx86Cn2ZVLgt6xckZ4L8r0bIX7CZ9SgtDxaJs/lEL2+Ejko m4bw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710972354; x=1711577154; 
h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=m3PcbGEDqgL/0XLppcmfbabNLxqQPslp9rWFGst9FQY=; b=tQ2dn7EYgt1EYSeb3BlWt6dNJav1YwcRZflQqHh1AQ9RUc13J0voh7OXs8IGzEbqhi uMl3ZG1M33sly0yUrrCddkLWr3L965KnFx5HOvppDR0rT5T2gq4MRMH4aCeOOr6QK6GZ Qt2bbnt1XvitPg80wFAjRdfimJZnHFvjehi48TeaNKh0eXp9KgYzQsgfA+fPMg6zfyct 8aSQHGdlvqmZ4zMt9kEENK3O+ZzCrxBYe0H/BwX0uaDok5CXu5kVlqcfg0vH+mqyyf9D MIefVLglf7UzTHA4g5hamKBjcp3ldw2CEFMwpmyHCijUYNgAiNVqnJaGdHx1lTFDmcRK Y7xA== X-Gm-Message-State: AOJu0Yw6p2skgnDJhazvJCRefdFxr60irBYUqRJ7RmljFVddZUtsg8EC uQ37dCiiSJudheo2xhPgH8kNwrgLqe2RflYd+D0c06UrOameBydESvtmIOpwpWeQJyHcOiZhjkg 2f00= X-Google-Smtp-Source: AGHT+IEiY2t0WiVhFXi/vjPPwMoD8GO5ET9g/xq/FolakoT/c25DJm4FsMVIgBMyOFdWdLTyYYQ6cA== X-Received: by 2002:a05:6214:5294:b0:696:4771:9b57 with SMTP id kj20-20020a056214529400b0069647719b57mr1572984qvb.23.1710972354217; Wed, 20 Mar 2024 15:05:54 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id m10-20020a0562141bca00b00690b21ff926sm8224811qvc.137.2024.03.20.15.05.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:05:53 -0700 (PDT) Date: Wed, 20 Mar 2024 18:05:52 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 18/24] pack-bitmap: implement test helpers for pseudo-merge Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Implement three new sub-commands for the "bitmap" test-helper: - t/helper test-tool bitmap dump-pseudo-merges - t/helper test-tool bitmap dump-pseudo-merge-commits - t/helper test-tool bitmap dump-pseudo-merge-objects These three helpers dump the list of pseudo merges, the "parents" of the nth pseudo-merges, and the set of objects reachable from those parents, respectively. These helpers will be useful in subsequent patches when we add test coverage for pseudo-merge bitmaps. 
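To dump commits or objects, the helpers have to turn every set bit of a bitmap into a bit position before looking up the object ID. The sketch below shows that step in isolation, assuming a plain array of 64-bit words rather than an EWAH iterator. `ctz64` and `collect_set_bits` are hypothetical names, and the mapping from position to object ID, which the patch does in `bit_pos_to_object_id()`, is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Index of the lowest set bit; the caller must pass a non-zero word. */
static unsigned ctz64(uint64_t word)
{
	unsigned n = 0;

	while (!(word & 1)) {
		word >>= 1;
		n++;
	}
	return n;
}

/*
 * Write the absolute positions of all set bits in words[0..nr) to out[]
 * and return how many were written. out[] needs room for 64 * nr entries.
 */
static size_t collect_set_bits(const uint64_t *words, size_t nr, uint32_t *out)
{
	size_t found = 0;
	uint32_t base = 0;
	size_t i;

	for (i = 0; i < nr; i++, base += 64) {
		uint64_t word = words[i];

		while (word) {
			out[found++] = base + ctz64(word);
			word &= word - 1; /* drop the bit just reported */
		}
	}
	return found;
}
```

Skipping straight to the next set bit keeps the work proportional to the number of set bits rather than the number of bit positions, which matters for sparse bitmaps.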
Signed-off-by: Taylor Blau
---
 pack-bitmap.c          | 126 +++++++++++++++++++++++++++++++++++++++++
 pack-bitmap.h          |   3 +
 t/helper/test-bitmap.c |  34 ++++++++---
 3 files changed, 156 insertions(+), 7 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index e0f191b7581..7188dd75eaf 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -2442,6 +2442,132 @@ int test_bitmap_hashes(struct repository *r)
 	return 0;
 }
 
+static void bit_pos_to_object_id(struct bitmap_index *bitmap_git,
+				 uint32_t bit_pos,
+				 struct object_id *oid)
+{
+	uint32_t index_pos;
+
+	if (bitmap_is_midx(bitmap_git))
+		index_pos = pack_pos_to_midx(bitmap_git->midx, bit_pos);
+	else
+		index_pos = pack_pos_to_index(bitmap_git->pack, bit_pos);
+
+	nth_bitmap_object_oid(bitmap_git, oid, index_pos);
+}
+
+int test_bitmap_pseudo_merges(struct repository *r)
+{
+	struct bitmap_index *bitmap_git;
+	uint32_t i;
+
+	bitmap_git = prepare_bitmap_git(r);
+	if (!bitmap_git || !bitmap_git->pseudo_merges.nr)
+		goto cleanup;
+
+	for (i = 0; i < bitmap_git->pseudo_merges.nr; i++) {
+		struct pseudo_merge *merge;
+		struct ewah_bitmap *commits_bitmap, *merge_bitmap;
+
+		merge = use_pseudo_merge(&bitmap_git->pseudo_merges,
+					 &bitmap_git->pseudo_merges.v[i]);
+		commits_bitmap = merge->commits;
+		merge_bitmap = pseudo_merge_bitmap(&bitmap_git->pseudo_merges,
+						   merge);
+
+		printf("at=%"PRIuMAX", commits=%"PRIuMAX", objects=%"PRIuMAX"\n",
+		       (uintmax_t)merge->at,
+		       (uintmax_t)ewah_bitmap_popcount(commits_bitmap),
+		       (uintmax_t)ewah_bitmap_popcount(merge_bitmap));
+	}
+
+cleanup:
+	free_bitmap_index(bitmap_git);
+	return 0;
+}
+
+static void dump_ewah_object_ids(struct bitmap_index *bitmap_git,
+				 struct ewah_bitmap *bitmap)
+
+{
+	struct ewah_iterator it;
+	eword_t word;
+	uint32_t pos = 0;
+
+	ewah_iterator_init(&it, bitmap);
+
+	while (ewah_iterator_next(&word, &it)) {
+		struct object_id oid;
+		uint32_t offset;
+
+		for (offset = 0; offset < BITS_IN_EWORD; offset++) {
+			if (!(word >> offset))
+				break;
+
+			offset += ewah_bit_ctz64(word >> offset);
+
+			bit_pos_to_object_id(bitmap_git, pos + offset, &oid);
+			printf("%s\n", oid_to_hex(&oid));
+		}
+		pos += BITS_IN_EWORD;
+	}
+}
+
+int test_bitmap_pseudo_merge_commits(struct repository *r, uint32_t n)
+{
+	struct bitmap_index *bitmap_git;
+	struct pseudo_merge *merge;
+	int ret = 0;
+
+	bitmap_git = prepare_bitmap_git(r);
+	if (!bitmap_git || !bitmap_git->pseudo_merges.nr)
+		goto cleanup;
+
+	if (n >= bitmap_git->pseudo_merges.nr) {
+		ret = error(_("pseudo-merge index out of range "
+			      "(%"PRIu32" >= %"PRIuMAX")"),
+			    n, (uintmax_t)bitmap_git->pseudo_merges.nr);
+		goto cleanup;
+	}
+
+	merge = use_pseudo_merge(&bitmap_git->pseudo_merges,
+				 &bitmap_git->pseudo_merges.v[n]);
+	dump_ewah_object_ids(bitmap_git, merge->commits);
+
+cleanup:
+	free_bitmap_index(bitmap_git);
+	return ret;
+}
+
+int test_bitmap_pseudo_merge_objects(struct repository *r, uint32_t n)
+{
+	struct bitmap_index *bitmap_git;
+	struct pseudo_merge *merge;
+	int ret = 0;
+
+	bitmap_git = prepare_bitmap_git(r);
+	if (!bitmap_git || !bitmap_git->pseudo_merges.nr)
+		goto cleanup;
+
+	if (n >= bitmap_git->pseudo_merges.nr) {
+		ret = error(_("pseudo-merge index out of range "
+			      "(%"PRIu32" >= %"PRIuMAX")"),
+			    n, (uintmax_t)bitmap_git->pseudo_merges.nr);
+		goto cleanup;
+	}
+
+	merge = use_pseudo_merge(&bitmap_git->pseudo_merges,
+				 &bitmap_git->pseudo_merges.v[n]);
+
+	dump_ewah_object_ids(bitmap_git,
+			     pseudo_merge_bitmap(&bitmap_git->pseudo_merges,
+						 merge));
+
+cleanup:
+	free_bitmap_index(bitmap_git);
+	return ret;
+}
+
 int rebuild_bitmap(const uint32_t *reposition,
 		   struct ewah_bitmap *source,
 		   struct bitmap *dest)
diff --git a/pack-bitmap.h b/pack-bitmap.h
index a5fe4f305ef..25d3b8e604a 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -73,6 +73,9 @@ void traverse_bitmap_commit_list(struct bitmap_index *,
 void test_bitmap_walk(struct rev_info *revs);
 int test_bitmap_commits(struct repository *r);
 int test_bitmap_hashes(struct repository *r);
+int test_bitmap_pseudo_merges(struct repository *r);
+int test_bitmap_pseudo_merge_commits(struct repository *r, uint32_t n);
+int test_bitmap_pseudo_merge_objects(struct repository *r, uint32_t n);
 
 #define GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL \
 	"GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL"
diff --git a/t/helper/test-bitmap.c b/t/helper/test-bitmap.c
index af43ee1cb5e..6af2b42678f 100644
--- a/t/helper/test-bitmap.c
+++ b/t/helper/test-bitmap.c
@@ -13,21 +13,41 @@ static int bitmap_dump_hashes(void)
 	return test_bitmap_hashes(the_repository);
 }
 
+static int bitmap_dump_pseudo_merges(void)
+{
+	return test_bitmap_pseudo_merges(the_repository);
+}
+
+static int bitmap_dump_pseudo_merge_commits(uint32_t n)
+{
+	return test_bitmap_pseudo_merge_commits(the_repository, n);
+}
+
+static int bitmap_dump_pseudo_merge_objects(uint32_t n)
+{
+	return test_bitmap_pseudo_merge_objects(the_repository, n);
+}
+
 int cmd__bitmap(int argc, const char **argv)
 {
 	setup_git_directory();
 
-	if (argc != 2)
-		goto usage;
-
-	if (!strcmp(argv[1], "list-commits"))
+	if (argc == 2 && !strcmp(argv[1], "list-commits"))
 		return bitmap_list_commits();
-	if (!strcmp(argv[1], "dump-hashes"))
+	if (argc == 2 && !strcmp(argv[1], "dump-hashes"))
 		return bitmap_dump_hashes();
+	if (argc == 2 && !strcmp(argv[1], "dump-pseudo-merges"))
+		return bitmap_dump_pseudo_merges();
+	if (argc == 3 && !strcmp(argv[1], "dump-pseudo-merge-commits"))
+		return bitmap_dump_pseudo_merge_commits(atoi(argv[2]));
+	if (argc == 3 && !strcmp(argv[1], "dump-pseudo-merge-objects"))
+		return bitmap_dump_pseudo_merge_objects(atoi(argv[2]));
 
-usage:
 	usage("\ttest-tool bitmap list-commits\n"
-	      "\ttest-tool bitmap dump-hashes");
+	      "\ttest-tool bitmap dump-hashes\n"
+	      "\ttest-tool bitmap dump-pseudo-merges\n"
+	      "\ttest-tool bitmap dump-pseudo-merge-commits <n>\n"
+	      "\ttest-tool bitmap dump-pseudo-merge-objects <n>");
 
 	return -1;
 }

From patchwork Wed Mar 20 22:05:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter:
Taylor Blau X-Patchwork-Id: 13598230 Received: from mail-qk1-f174.google.com (mail-qk1-f174.google.com [209.85.222.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6532686ACB for ; Wed, 20 Mar 2024 22:05:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972359; cv=none; b=M5viu4txdGdRdSZpzR0b/LPNUSzsZM2AqEy7x7KM14Ff3LVjwVVEvM7ENvIDCWhey/QJJRZGeJ+gmEhS5vPgK5IDpBn+r+p/BxiviA18D5meNQEgFh4SuGHQgxPSs+FzcTSQJBwRMWzSgoOSWW1zk4/gSvx0ziv50VP+vwyQlGY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972359; c=relaxed/simple; bh=GlY92Z4ds5nCsWYJoGQMI6CgCmEoD2F7qw05/3ktzI8=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=N52T5T5q3GAPxiB95sTi7kITtfr773h3qpm/dRdGi7c1h9srhCIh7OjHLvXvOwbadMbSGS95MqdzVt6Dt+510/2vTENIw+QlEikiewwD+07gbMDHmP4wImOGNG/X5KvuvMiY4Yrue7td7xv3HyAw+wvkF89XueSFh+6Wt6nyA1I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=uXBYO0F0; arc=none smtp.client-ip=209.85.222.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="uXBYO0F0" Received: by mail-qk1-f174.google.com with SMTP id af79cd13be357-789e4a4d3a5so23346785a.1 for ; Wed, 20 Mar 2024 15:05:58 -0700 (PDT) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1710972357; x=1711577157; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=wsap2U0PPPij8jTyogEOiI4xnjSxw193AnQxICyLyzQ=; b=uXBYO0F0H5PNyQqQy038H3MkIHcDqjkv9niacvVlevZ5JRG7p2bma+dzT0FM0BbE0A PmW3sIrMyCEJTw3mQIowr2eSGTYLqP4yzxtb7PDsDB7mndCgOYvbeUBKdTYYJkz2ds5o wu46148tKgSs8vi29h4GxqMAuCdSSvjBmnu+QWLudS7mU67+1eFsKMYL8oHmKCWAAiDx wWGpI3/QvUC7X1iJ3FSTvg6yXJwkyO0QV1Iz1w01dC0XxgDi2HDcnhu6n7P2XZGjqLjI gR6v/kTfpZwot+Ok0kVjvlKwUMS3YBx5PIObNUz/QUTyZqFigImpaDSQWmdxjp01luNQ 43Yw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710972357; x=1711577157; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=wsap2U0PPPij8jTyogEOiI4xnjSxw193AnQxICyLyzQ=; b=f4gVbfStG6fPmHCVtQ+O94jQEO44JVor+lPNPiEfwiyoDz+IU3iDwaS3aEf307LJFx WYafPj9e7lOVWtHyFfWv6vYO1KMFmIrJDL7RRjI3TBZtGHVeS1L5K5CxLkhR6OcbahHg 1vAIImLCAeC6/jRrhEuMMJQX5wF/GSiJ4O/rAF3yFJSHAX0w8gMfYBG8f6VZa0GMgklg BLxVOfZ6OhnLHyOhGhL0R3Ap/sVE2FfF/MxlA2lTfn8Onm8MevdXkP0ccbF3VQpcHdFv rEq0oP9ffwt/+cN0tG8wxZCgPKib9iTp8O7s1ks0xU1pHg/k/2vvvs7dzuKkJ7zCQDbs nnGw== X-Gm-Message-State: AOJu0YyFgHZzrllygHgTIV9j5MDvSDrqWH/587Q5oU79J1aeaLKm8SF+ dwEMaKHh3Fdizhsb00OG9u2WDOaehX2St+doCOYssey4IUXrTxdalQmEFBoN5DtP7C6LdiGQ3pI zxN0= X-Google-Smtp-Source: AGHT+IEeOHJTXhWkiebLJGetAl1qSnFy3bwJPkFHllifmAy8811j4zJ0nGCiK3gaNxe93RxisGSgww== X-Received: by 2002:a05:622a:58a:b0:42e:def6:3d31 with SMTP id c10-20020a05622a058a00b0042edef63d31mr4064052qtb.49.1710972357096; Wed, 20 Mar 2024 15:05:57 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id s5-20020ac85cc5000000b0042f01390d5csm7898784qta.30.2024.03.20.15.05.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:05:56 -0700 (PDT) Date: Wed, 20 Mar 2024 18:05:55 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 19/24] t/test-lib-functions.sh: support `--date` in `test_commit_bulk()` Message-ID: <7d3b88e6fd68724c1a984396be64e1043411dcb1.1710972293.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: One of the tests we'll want to add for pseudo-merge bitmaps needs to be able to generate a large number of commits at a specific date. Support the `--date` option (with identical semantics to the `--date` option for `test_commit()`) within `test_commit_bulk` as a prerequisite for that. Signed-off-by: Taylor Blau --- t/test-lib-functions.sh | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 6eaf116346b..312cc5d4c79 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -458,6 +458,7 @@ test_commit_bulk () { indir=. 
ref=HEAD n=1 + notick= message='commit %s' filename='%s.t' contents='content %s' @@ -488,6 +489,12 @@ test_commit_bulk () { filename="${1#--*=}-%s.t" contents="${1#--*=} %s" ;; + --date) + notick=yes + GIT_COMMITTER_DATE="$2" + GIT_AUTHOR_DATE="$2" + shift + ;; -*) BUG "invalid test_commit_bulk option: $1" ;; @@ -507,7 +514,10 @@ test_commit_bulk () { while test "$total" -gt 0 do - test_tick && + if test -z "$notick" + then + test_tick + fi && echo "commit $ref" printf 'author %s <%s> %s\n' \ "$GIT_AUTHOR_NAME" \ From patchwork Wed Mar 20 22:05:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13598231 Received: from mail-oi1-f182.google.com (mail-oi1-f182.google.com [209.85.167.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 825EB8625E for ; Wed, 20 Mar 2024 22:06:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972363; cv=none; b=R6xOKIQDXuqXmHxS+QWdGLqJOOfvyo8N2aHRFJjN4PlfL3DT8o9myzE2aRDoZuewXAp8s3ughTPpttAnwNw4qWGagj5TA/LSjjmjuRlItJXzIiGZKV8fFKhL+Ix/c8kLMByQqjt7aJhFixOjH3CVgOMmc4FrPO98NSO5gBIclmw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710972363; c=relaxed/simple; bh=PY41+F1DFmQ5nqTuqAg41a9mU1Iz/gBjhodOqWC7KAY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=fPUiMcknyzQSBeMeWDMZK2XrrusL5fS6xSDVAHMb3bd3PRBG22S8N3HB3br/td5NA2SQxITqiHhHcMHPfTjywNekQuvXIHsXk9/VbxIpltaExMWX2qSPuk13vFbwaW1G0jp4JAFYp34Pp/EBcCxvfnqh4ixZESLuHZkMAqfQZZY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) 
header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=c+dsgWU9; arc=none smtp.client-ip=209.85.167.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="c+dsgWU9" Received: by mail-oi1-f182.google.com with SMTP id 5614622812f47-3c36f882372so262319b6e.2 for ; Wed, 20 Mar 2024 15:06:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1710972360; x=1711577160; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=IadnKAN+B8cH3+lZ0cWP+g7SLHeZEQdyNFue662v1kA=; b=c+dsgWU90bGZf+lumDjPpC0knv/Tu4XcpnBwECl4UQvIn+qeyP8LHhmCs/rvLdSiWj vLwpxp9vrjAB2qvvj7sCv8FDAp6Kk8tZgNYrN6/PPY3yC+STqRhpm4IwXE/xNOk8gkLb AtqQhjVp0NaWjhq4qUN+a6S1Ymdvh6+EsK66hz1Zs9B0G9lGHSYVVC5izUa4XMoWx0WD xFUbSS61GZ7akdMnso2+j5sdYEHSgsazsb7UMLrewuE60HYeiiCxy6XqHWO+DujGXGzL sGxcGTH5GG9HSaiq8D8mWl06ZCPfRdO+4Vfy2vBj5B+uZhapkZYf7k6EnvaixLymr3xt t+iA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710972360; x=1711577160; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=IadnKAN+B8cH3+lZ0cWP+g7SLHeZEQdyNFue662v1kA=; b=O9m/KHXZcVukOFio/TKl6nm3Z8tuDM04vN3TkzBKinv0pRV0e22fggaRrOhgyviLsZ GjFvWoOws0F0jSYC7rzyWDitjdn2Iyu2OUurbVowP4zN3CaRXO4B/Fj+DGP/RVXur0/2 /7H8m1N6VcDQv5EKmCYbhL5Y9XcExaHmjI6IMF28Pxo86vmRn6gfJTPBKHShJHmxfcPv F7Xril5c8aE3V6hzNZiws1ltaB+pMn9eQk3M0xP89wsqltZC1g8v+inJTWqxyicPjlQT 
tuaOqqVJUeZVbUWqjPHFlQliP7Nv/rniScVzi+6NaEyJExrDFzdVdrBSyfJbA7ZctFOb WE2g== X-Gm-Message-State: AOJu0YxMrqvPIyR/pbhgfrnpTRRaA4QL/zQyxJ/jvuAkN/6L9DxHhpkU FV2fyAp4gOFkZDKQB23tK2V4/HUa7o/fVcuPtB005sAYDT1eIt2PPKQW4ODxdAkyGVdcCux+0+I kzD0= X-Google-Smtp-Source: AGHT+IE4Pwxe3VeZf8ZPwBmkrTO8GgtKpEmxL9IgDtNFKperJbCLTmbGirJZ5lwaP67Uleti3RGkRQ== X-Received: by 2002:a54:4116:0:b0:3c3:82c4:4f96 with SMTP id l22-20020a544116000000b003c382c44f96mr6427403oic.28.1710972360117; Wed, 20 Mar 2024 15:06:00 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id cn7-20020a05622a248700b00430b60698e9sm6348912qtb.32.2024.03.20.15.05.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:05:59 -0700 (PDT) Date: Wed, 20 Mar 2024 18:05:58 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 20/24] pack-bitmap.c: use pseudo-merges during traversal Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Now that all of the groundwork has been laid to support reading and using pseudo-merges, make use of that work in this commit by teaching the pack-bitmap machinery to use pseudo-merge(s) when available during traversal. The basic operation is as follows: - When enumerating objects on either side of a reachability query, first see if any subset of the roots satisfies some pseudo-merge bitmap. If it does, apply that pseudo-merge bitmap. - If any pseudo-merge bitmap(s) were applied in the previous step, OR them into the result[^1]. Then repeat the process over all pseudo-merge bitmaps (we'll refer to this as "cascading" pseudo-merges). Once this is done, OR in the resulting bitmap. - If there is no fill-in traversal to be done, return the bitmap for that side of the reachability query. 
If there is fill-in traversal, then for each commit we encounter via show_commit(), check to see if any unsatisfied pseudo-merges containing that commit as one of its parents has been made satisfied by the presence of that commit. If so, OR in the object set from that pseudo-merge bitmap, and then cascade. If not, continue traversal. A similar implementation is present in the boundary-based bitmap traversal routines. [^1]: Importantly, we cannot OR in the entire set of roots along with the objects reachable from whatever pseudo-merge bitmaps were satisfied. This may leave some dangling bits corresponding to any unsatisfied root(s) getting OR'd into the resulting bitmap, tricking other parts of the traversal into thinking we already have a reachability closure over those commit(s) when we do not. Signed-off-by: Taylor Blau --- pack-bitmap.c | 112 ++++++++++- t/t5333-pseudo-merge-bitmaps.sh | 325 ++++++++++++++++++++++++++++++++ 2 files changed, 436 insertions(+), 1 deletion(-) create mode 100755 t/t5333-pseudo-merge-bitmaps.sh diff --git a/pack-bitmap.c b/pack-bitmap.c index 7188dd75eaf..a7c36a977bd 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -114,6 +114,9 @@ struct bitmap_index { unsigned int version; }; +static int pseudo_merges_satisfied_nr; +static int pseudo_merges_cascades_nr; + static struct ewah_bitmap *lookup_stored_bitmap(struct stored_bitmap *st) { struct ewah_bitmap *parent; @@ -1006,6 +1009,22 @@ static void show_commit(struct commit *commit UNUSED, { } +static unsigned apply_pseudo_merges_for_commit_1(struct bitmap_index *bitmap_git, + struct bitmap *result, + struct commit *commit, + uint32_t commit_pos) +{ + int ret; + + ret = apply_pseudo_merges_for_commit(&bitmap_git->pseudo_merges, + result, commit, commit_pos); + + if (ret) + pseudo_merges_satisfied_nr += ret; + + return ret; +} + static int add_to_include_set(struct bitmap_index *bitmap_git, struct include_data *data, struct commit *commit, @@ -1026,6 +1045,10 @@ static int 
add_to_include_set(struct bitmap_index *bitmap_git, } bitmap_set(data->base, bitmap_pos); + if (apply_pseudo_merges_for_commit_1(bitmap_git, data->base, commit, + bitmap_pos)) + return 0; + return 1; } @@ -1151,6 +1174,20 @@ static void show_boundary_object(struct object *object UNUSED, BUG("should not be called"); } +static unsigned cascade_pseudo_merges_1(struct bitmap_index *bitmap_git, + struct bitmap *result, + struct bitmap *roots) +{ + int ret = cascade_pseudo_merges(&bitmap_git->pseudo_merges, + result, roots); + if (ret) { + pseudo_merges_cascades_nr++; + pseudo_merges_satisfied_nr += ret; + } + + return ret; +} + static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git, struct rev_info *revs, struct object_list *roots) @@ -1160,6 +1197,7 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git, unsigned int i; unsigned int tmp_blobs, tmp_trees, tmp_tags; int any_missing = 0; + int existing_bitmaps = 0; cb.bitmap_git = bitmap_git; cb.base = bitmap_new(); @@ -1167,6 +1205,25 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git, revs->ignore_missing_links = 1; + if (bitmap_git->pseudo_merges.nr) { + struct bitmap *roots_bitmap = bitmap_new(); + struct object_list *objects = NULL; + + for (objects = roots; objects; objects = objects->next) { + struct object *object = objects->item; + int pos; + + pos = bitmap_position(bitmap_git, &object->oid); + if (pos < 0) + continue; + + bitmap_set(roots_bitmap, pos); + } + + if (!cascade_pseudo_merges_1(bitmap_git, cb.base, roots_bitmap)) + bitmap_free(roots_bitmap); + } + /* * OR in any existing reachability bitmaps among `roots` into * `cb.base`. 
@@ -1178,8 +1235,10 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git, continue; if (add_commit_to_bitmap(bitmap_git, &cb.base, - (struct commit *)object)) + (struct commit *)object)) { + existing_bitmaps = 1; continue; + } any_missing = 1; } @@ -1187,6 +1246,9 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git, if (!any_missing) goto cleanup; + if (existing_bitmaps) + cascade_pseudo_merges_1(bitmap_git, cb.base, NULL); + tmp_blobs = revs->blob_objects; tmp_trees = revs->tree_objects; tmp_tags = revs->blob_objects; @@ -1242,6 +1304,13 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git, return cb.base; } +static void unsatisfy_all_pseudo_merges(struct bitmap_index *bitmap_git) +{ + uint32_t i; + for (i = 0; i < bitmap_git->pseudo_merges.nr; i++) + bitmap_git->pseudo_merges.v[i].satisfied = 0; +} + static struct bitmap *find_objects(struct bitmap_index *bitmap_git, struct rev_info *revs, struct object_list *roots, @@ -1249,9 +1318,32 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git, { struct bitmap *base = NULL; int needs_walk = 0; + unsigned existing_bitmaps = 0; struct object_list *not_mapped = NULL; + unsatisfy_all_pseudo_merges(bitmap_git); + + if (bitmap_git->pseudo_merges.nr) { + struct bitmap *roots_bitmap = bitmap_new(); + struct object_list *objects = NULL; + + for (objects = roots; objects; objects = objects->next) { + struct object *object = objects->item; + int pos; + + pos = bitmap_position(bitmap_git, &object->oid); + if (pos < 0) + continue; + + bitmap_set(roots_bitmap, pos); + } + + base = bitmap_new(); + if (!cascade_pseudo_merges_1(bitmap_git, base, roots_bitmap)) + bitmap_free(roots_bitmap); + } + /* * Go through all the roots for the walk. 
The ones that have bitmaps * on the bitmap index will be `or`ed together to form an initial @@ -1262,11 +1354,21 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git, */ while (roots) { struct object *object = roots->item; + roots = roots->next; + if (base) { + int pos = bitmap_position(bitmap_git, &object->oid); + if (pos > 0 && bitmap_get(base, pos)) { + object->flags |= SEEN; + continue; + } + } + if (object->type == OBJ_COMMIT && add_commit_to_bitmap(bitmap_git, &base, (struct commit *)object)) { object->flags |= SEEN; + existing_bitmaps = 1; continue; } @@ -1282,6 +1384,9 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git, roots = not_mapped; + if (existing_bitmaps) + cascade_pseudo_merges_1(bitmap_git, base, NULL); + /* * Let's iterate through all the roots that don't have bitmaps to * check if we can determine them to be reachable from the existing @@ -1866,6 +1971,11 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, object_list_free(&wants); object_list_free(&haves); + trace2_data_intmax("bitmap", the_repository, "pseudo_merges_satisfied", + pseudo_merges_satisfied_nr); + trace2_data_intmax("bitmap", the_repository, "pseudo_merges_cascades", + pseudo_merges_cascades_nr); + return bitmap_git; cleanup: diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh new file mode 100755 index 00000000000..909c17e301e --- /dev/null +++ b/t/t5333-pseudo-merge-bitmaps.sh @@ -0,0 +1,325 @@ +#!/bin/sh + +test_description='pseudo-merge bitmaps' + +GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 + +. 
./test-lib.sh + +test_pseudo_merges () { + test-tool bitmap dump-pseudo-merges +} + +test_pseudo_merge_commits () { + test-tool bitmap dump-pseudo-merge-commits "$1" +} + +test_pseudo_merges_satisfied () { + test_trace2_data bitmap pseudo_merges_satisfied "$1" +} + +test_pseudo_merges_cascades () { + test_trace2_data bitmap pseudo_merges_cascades "$1" +} + +tag_everything () { + git rev-list --all --no-object-names >in && + perl -lne ' + print "create refs/tags/" . $. . " " . $1 if /([0-9a-f]+)/ + ' expect && + + : >trace2.txt && + GIT_TRACE2_EVENT=$PWD/trace2.txt \ + git rev-list --count --all --objects --use-bitmap-index >actual && + + test_pseudo_merges_satisfied 0 merges && + test_must_be_empty merges && + test_cmp expect actual +' + +test_expect_success 'pseudo-merges accurately represent their objects' ' + test_config bitmapPseudoMerge.test.pattern "refs/tags/" && + test_config bitmapPseudoMerge.test.maxMerges 8 && + test_config bitmapPseudoMerge.test.stableThreshold never && + + git repack -adb && + + test_pseudo_merges >merges && + test_line_count = 8 merges && + + for i in $(test_seq 0 $(($(wc -l commits && + + git rev-list --objects --no-object-names --stdin expect.raw && + test-tool bitmap dump-pseudo-merge-objects $i >actual.raw && + + sort -u expect && + sort -u actual && + + test_cmp expect actual || return 1 + done +' + +test_expect_success 'bitmap traversal with pseudo-merges' ' + : >trace2.txt && + GIT_TRACE2_EVENT=$PWD/trace2.txt \ + git rev-list --count --all --objects --use-bitmap-index >actual && + git rev-list --count --all --objects >expect && + + test_pseudo_merges_satisfied 8 trace2.txt && + GIT_TRACE2_EVENT=$PWD/trace2.txt \ + git rev-list --count --all --objects --use-bitmap-index >actual && + git rev-list --count --all --objects >expect && + + test_pseudo_merges_satisfied 8 merges && + for i in $(test_seq 0 $(($(wc -l commits && + + test-tool bitmap list-commits >bitmaps && + bitmaps_nr="$(wc -l expect && + + test $(cat expect) -eq $(wc 
-l merges && + test_line_count = 1 merges && + + test_pseudo_merge_commits 0 >oids && + git cat-file --batch commits && + + test $(wc -l in && + git update-ref --stdin merges && + merges_nr="$(wc -l oids && + git cat-file --batch commits && + + expect="$(grep -c "^committer.*$old +0000$" commits)" && + actual="$(wc -l oids && + git cat-file --batch commits && + test $(wc -l err && + + cat >expect <<-EOF && + fatal: pseudo-merge group ${SQ}test${SQ} has unstable threshold before stable one + EOF + + test_cmp expect err +' + +test_expect_success 'pseudo-merge pattern with capture groups' ' + git init pseudo-merge-captures && + ( + cd pseudo-merge-captures && + + test_commit_bulk 128 && + tag_everything && + + for r in $(test_seq 8) + do + test_commit_bulk 16 && + + git rev-list HEAD~16.. >in && + + perl -lne "print \"create refs/remotes/$r/tags/\$. \$_\"" refs && + + test_pseudo_merges >merges && + for m in $(test_seq 0 $(($(wc -l oids && + grep -f oids refs | + perl -lne "print \$1 if /refs\/remotes\/([0-9]+)/" | + sort -u || return 1 + done >remotes && + + test $(wc -l merges && + test_line_count = 2 merges && + + test_pseudo_merge_commits 0 >commits-0.raw && + test_pseudo_merge_commits 1 >commits-1.raw && + + sort commits-0.raw >commits-0 && + sort commits-1.raw >commits-1 && + + comm -12 commits-0 commits-1 >overlap && + + test_line_count -gt 0 overlap + ) +' + +test_expect_success 'pseudo-merge overlap traversal' ' + ( + cd pseudo-merge-overlap && + + : >trace2.txt && + GIT_TRACE2_EVENT=$PWD/trace2.txt \ + git rev-list --count --all --objects --use-bitmap-index >actual && + git rev-list --count --all --objects >expect && + + test_pseudo_merges_satisfied 2 trace2.txt && + GIT_TRACE2_EVENT=$PWD/trace2.txt \ + git rev-list --count --all --objects --use-bitmap-index >actual && + git rev-list --count --all --objects >expect && + + test_pseudo_merges_satisfied 2 X-Patchwork-Id: 13598233 Received: from mail-yw1-f173.google.com (mail-yw1-f173.google.com 
Date: Wed, 20 Mar 2024 18:06:01 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 21/24] pack-bitmap: extra trace2 information
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
MIME-Version: 1.0
Content-Disposition: inline

Add some extra trace2 lines to capture the number of bitmap lookups
that are hits versus misses, as well as the number of reachability
roots that have bitmap coverage (versus those that do not).

Signed-off-by: Taylor Blau
---
 pack-bitmap.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index a7c36a977bd..be65f637cf5 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -116,6 +116,10 @@ struct bitmap_index {
 
 static int pseudo_merges_satisfied_nr;
 static int pseudo_merges_cascades_nr;
+static int existing_bitmaps_hits_nr;
+static int existing_bitmaps_misses_nr;
+static int roots_with_bitmaps_nr;
+static int roots_without_bitmaps_nr;
 
 static struct ewah_bitmap *lookup_stored_bitmap(struct stored_bitmap *st)
 {
@@ -1040,10 +1044,14 @@ static int add_to_include_set(struct bitmap_index *bitmap_git,
 
 	partial = bitmap_for_commit(bitmap_git, commit);
 	if (partial) {
+		existing_bitmaps_hits_nr++;
+
 		bitmap_or_ewah(data->base, partial);
 		return 0;
 	}
 
+	existing_bitmaps_misses_nr++;
+
 	bitmap_set(data->base, bitmap_pos);
 	if (apply_pseudo_merges_for_commit_1(bitmap_git, data->base, commit,
 					     bitmap_pos))
@@ -1099,8 +1107,12 @@ static int add_commit_to_bitmap(struct bitmap_index *bitmap_git,
 {
 	struct ewah_bitmap *or_with = bitmap_for_commit(bitmap_git, commit);
 
-	if (!or_with)
+	if (!or_with) {
+		existing_bitmaps_misses_nr++;
 		return 0;
+	}
+
+	existing_bitmaps_hits_nr++;
 
 	if (!*base)
 		*base = ewah_to_bitmap(or_with);
@@ -1407,8 +1419,12 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
 			object->flags &= ~UNINTERESTING;
 			add_pending_object(revs, object, "");
 			needs_walk = 1;
+
+			roots_without_bitmaps_nr++;
 		} else {
 			object->flags |= SEEN;
+
+			roots_with_bitmaps_nr++;
 		}
 	}
 
@@ -1975,6 +1991,14 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs,
 			   pseudo_merges_satisfied_nr);
 	trace2_data_intmax("bitmap", the_repository, "pseudo_merges_cascades",
 			   pseudo_merges_cascades_nr);
+	trace2_data_intmax("bitmap", the_repository, "bitmap/hits",
+			   existing_bitmaps_hits_nr);
+	trace2_data_intmax("bitmap", the_repository, "bitmap/misses",
+			   existing_bitmaps_misses_nr);
+	trace2_data_intmax("bitmap", the_repository, "bitmap/roots_with_bitmap",
+			   roots_with_bitmaps_nr);
+	trace2_data_intmax("bitmap", the_repository, "bitmap/roots_without_bitmap",
+			   roots_without_bitmaps_nr);
 
 	return bitmap_git;
 
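
Reader's note on [PATCH 21/24]: the counting pattern in this patch (bump
one of two file-scope counters at every lookup, then report each counter
once at the end) can be sketched outside of Git roughly as follows. This
is only an illustration under stated assumptions: `counted_lookup()`,
`report_lookup_stats()` and the counter names are hypothetical, and
`fprintf()` merely stands in for `trace2_data_intmax()`.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counters, mirroring the hit/miss pair the patch adds. */
static int lookup_hits_nr;
static int lookup_misses_nr;

/*
 * Toy lookup: returns the cached entry for `pos`, or NULL when nothing
 * is cached there. Every call bumps exactly one of the two counters.
 */
static const int *counted_lookup(const int *const *cache, size_t nr,
				 size_t pos)
{
	if (pos < nr && cache[pos]) {
		lookup_hits_nr++;
		return cache[pos];
	}

	lookup_misses_nr++;
	return NULL;
}

/* Stand-in for trace2_data_intmax(): emit each counter once, at the end. */
static void report_lookup_stats(FILE *out)
{
	fprintf(out, "lookup/hits %d\n", lookup_hits_nr);
	fprintf(out, "lookup/misses %d\n", lookup_misses_nr);
}
```

Reporting once at the end, rather than at every lookup, keeps the trace
output small no matter how many lookups a traversal performs.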
Date: Wed, 20 Mar 2024 18:06:04 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 22/24] ewah: `bitmap_equals_ewah()`
Message-ID: <1eb10c190ba1f045b3eab8c1975a77e6a046c8ee.1710972293.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
MIME-Version: 1.0
Content-Disposition: inline

Prepare to reuse existing pseudo-merge bitmaps by implementing a
`bitmap_equals_ewah()` helper.

This helper will be used to see if a raw bitmap (containing the set of
parents for some pseudo-merge) is equal to any existing pseudo-merge's
commits bitmap (which are stored as EWAH-compressed bitmaps on disk).

Signed-off-by: Taylor Blau
---
 ewah/bitmap.c | 19 +++++++++++++++++++
 ewah/ewok.h   |  1 +
 2 files changed, 20 insertions(+)

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index a41fa152cbd..59dc77a08f6 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -261,6 +261,25 @@ int bitmap_equals(struct bitmap *self, struct bitmap *other)
 	return 1;
 }
 
+int bitmap_equals_ewah(struct bitmap *self, struct ewah_bitmap *other)
+{
+	struct ewah_iterator it;
+	eword_t word;
+	size_t i = 0;
+
+	ewah_iterator_init(&it, other);
+
+	while (ewah_iterator_next(&word, &it))
+		if (word != (i < self->word_alloc ? self->words[i++] : 0))
+			return 0;
+
+	for (; i < self->word_alloc; i++)
+		if (self->words[i])
+			return 0;
+
+	return 1;
+}
+
 int bitmap_is_subset(struct bitmap *self, struct bitmap *other)
 {
 	size_t common_size, i;
diff --git a/ewah/ewok.h b/ewah/ewok.h
index d7e9fb67715..0d49ec00618 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -179,6 +179,7 @@ void bitmap_unset(struct bitmap *self, size_t pos);
 int bitmap_get(struct bitmap *self, size_t pos);
 void bitmap_free(struct bitmap *self);
 int bitmap_equals(struct bitmap *self, struct bitmap *other);
+int bitmap_equals_ewah(struct bitmap *self, struct ewah_bitmap *other);
 int bitmap_is_subset(struct bitmap *self, struct bitmap *other);
 int ewah_bitmap_is_subset(struct ewah_bitmap *self, struct bitmap *other);
 
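
Reader's note on [PATCH 22/24]: the comparison in `bitmap_equals_ewah()`
walks the decoded words of the compressed side and compares them against
the dense array, treating words missing from either side as zero. Below
is a minimal, self-contained sketch of the same idea. It does not use
Git's EWAH code: `struct word_run` is a hypothetical stand-in for a
compressed bitmap (a run of `count` copies of `word`), and
`dense_equals_runs()` is an illustrative name, not a Git function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a compressed bitmap: `count` copies of `word`. */
struct word_run {
	uint64_t word;
	size_t count;
};

/*
 * Return 1 if the dense array words[0..nwords) holds the same bits as
 * the decoded run stream, 0 otherwise. Words missing from either side
 * are treated as zero, so trailing all-zero words never cause a
 * mismatch.
 */
static int dense_equals_runs(const uint64_t *words, size_t nwords,
			     const struct word_run *runs, size_t nruns)
{
	size_t i = 0, r, k;

	for (r = 0; r < nruns; r++) {
		for (k = 0; k < runs[r].count; k++) {
			uint64_t mine = i < nwords ? words[i] : 0;

			if (mine != runs[r].word)
				return 0; /* first differing word: stop early */
			i++;
		}
	}

	/* The compressed side ran out; leftover dense words must be zero. */
	for (; i < nwords; i++)
		if (words[i])
			return 0;

	return 1;
}
```

The early return on the first differing word is what keeps rejecting a
non-matching candidate cheap, which the next patch in the series relies
on.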
Date: Wed, 20 Mar 2024 18:06:07 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 23/24] pseudo-merge: implement support for finding existing merges
Message-ID: <4ae4f0eaae5ebe9495968e8585f4b2692d2cbec2.1710972293.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
MIME-Version: 1.0
Content-Disposition: inline

This patch implements support for reusing existing pseudo-merge commits
when writing bitmaps, provided that an existing pseudo-merge bitmap has
exactly the same set of parents as one that we are about to write.

Note that unstable pseudo-merges are likely to change between
consecutive repacks, and so are generally poor candidates for reuse.
However, stable pseudo-merges (see the configuration option
'bitmapPseudoMerge.<name>.stableThreshold') are by definition unlikely
to change between runs (as they represent long-running branches).

Because there is no index from a *set* of pseudo-merge parents to a
matching pseudo-merge bitmap, we have to construct the bitmap
corresponding to the set of parents for each pending pseudo-merge commit
and see if a matching bitmap exists.

This is technically quadratic in the number of pseudo-merges, but is OK
in practice for a few reasons:

  - non-matching pseudo-merge bitmaps are rejected quickly as soon as
    they differ in a single bit

  - already-matched pseudo-merge bitmaps are discarded from subsequent
    rounds of search

  - the number of pseudo-merges is generally small, even for large
    repositories

In order to do this, implement (a) a function that finds a matching
pseudo-merge given some uncompressed bitset describing its parents, (b)
a function that computes the bitset of parents for a given pseudo-merge
commit, and (c) a call to that function before computing the set of
reachable objects for some pending pseudo-merge.

Signed-off-by: Taylor Blau
---
 pack-bitmap-write.c             | 15 ++++++--
 pack-bitmap.c                   | 32 +++++++++++++++++
 pack-bitmap.h                   |  2 ++
 pseudo-merge.c                  | 55 ++++++++++++++++++++++++++++
 pseudo-merge.h                  |  7 ++++
 t/t5333-pseudo-merge-bitmaps.sh | 64 +++++++++++++++++++++++++++++++++
 6 files changed, 173 insertions(+), 2 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 2d1b202fcd9..fdd84d31a68 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -19,6 +19,10 @@
 #include "tree-walk.h"
 #include "pseudo-merge.h"
 #include "oid-array.h"
+#include "config.h"
+#include "alloc.h"
+#include "refs.h"
+#include "strmap.h"
 
 struct bitmapped_commit {
 	struct commit *commit;
@@ -443,6 +447,7 @@ static int fill_bitmap_tree(struct bitmap *bitmap,
 }
 
 static int reused_bitmaps_nr;
+static int reused_pseudo_merge_bitmaps_nr;
 
 static int fill_bitmap_commit(struct bb_commit *ent,
 			      struct commit *commit,
@@ -467,7 +472,7 @@ static int fill_bitmap_commit(struct bb_commit *ent,
 			struct bitmap *remapped = bitmap_new();
 
 			if (commit->object.flags & BITMAP_PSEUDO_MERGE)
-				old = NULL;
+				old = pseudo_merge_bitmap_for_commit(old_bitmap, c);
 			else
 				old = bitmap_for_commit(old_bitmap, c);
 			/*
@@ -478,7 +483,10 @@ static int fill_bitmap_commit(struct bb_commit *ent,
 			if (old && !rebuild_bitmap(mapping, old, remapped)) {
 				bitmap_or(ent->bitmap, remapped);
 				bitmap_free(remapped);
-				reused_bitmaps_nr++;
+				if (commit->object.flags & BITMAP_PSEUDO_MERGE)
+					reused_pseudo_merge_bitmaps_nr++;
+				else
+					reused_bitmaps_nr++;
 				continue;
 			}
 			bitmap_free(remapped);
@@ -604,6 +612,9 @@ int bitmap_writer_build(struct packing_data *to_pack)
 			    the_repository);
 	trace2_data_intmax("pack-bitmap-write", the_repository,
 			   "building_bitmaps_reused", reused_bitmaps_nr);
+	trace2_data_intmax("pack-bitmap-write", the_repository,
+			   "building_bitmaps_pseudo_merge_reused",
+			   reused_pseudo_merge_bitmaps_nr);
 
 	stop_progress(&writer.progress);
 
diff --git a/pack-bitmap.c b/pack-bitmap.c
index be65f637cf5..5a5f8b7e69f 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1316,6 +1316,37 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git,
 	return cb.base;
 }
 
+struct ewah_bitmap *pseudo_merge_bitmap_for_commit(struct bitmap_index *bitmap_git,
+						   struct commit *commit)
+{
+	struct commit_list *p;
+	struct bitmap *parents;
+	struct pseudo_merge *match = NULL;
+
+	if (!bitmap_git->pseudo_merges.nr)
+		return NULL;
+
+	parents = bitmap_new();
+
+	for (p = commit->parents; p; p = p->next) {
+		int pos = bitmap_position(bitmap_git, &p->item->object.oid);
+		if (pos < 0 || pos >= bitmap_num_objects(bitmap_git))
+			goto done;
+
+		bitmap_set(parents, pos);
+	}
+
+	match = pseudo_merge_for_parents(&bitmap_git->pseudo_merges,
+					 parents);
+
+done:
+	bitmap_free(parents);
+	if (match)
+		return pseudo_merge_bitmap(&bitmap_git->pseudo_merges, match);
+
+	return NULL;
+}
+
 static void unsatisfy_all_pseudo_merges(struct bitmap_index *bitmap_git)
 {
 	uint32_t i;
@@ -2808,6 +2839,7 @@ void free_bitmap_index(struct bitmap_index *b)
 		 */
 		close_midx_revindex(b->midx);
 	}
+	free_pseudo_merge_map(&b->pseudo_merges);
 	free(b);
 }
 
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 25d3b8e604a..0fefef39bec 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -119,6 +119,8 @@ int rebuild_bitmap(const uint32_t *reposition,
 		   struct bitmap *dest);
 struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
 				      struct commit *commit);
+struct ewah_bitmap *pseudo_merge_bitmap_for_commit(struct bitmap_index *bitmap_git,
+						   struct commit *commit);
 void bitmap_writer_select_commits(struct commit **indexed_commits,
 				  unsigned int indexed_commits_nr);
 int bitmap_writer_build(struct packing_data *to_pack);
diff --git a/pseudo-merge.c b/pseudo-merge.c
index e111c9cd1a6..9e21fbb5062 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -682,3 +682,58 @@ int cascade_pseudo_merges(const struct pseudo_merge_map *pm,
 
 	return ret;
 }
+
+struct pseudo_merge *pseudo_merge_for_parents(const struct pseudo_merge_map *pm,
+					      struct bitmap *parents)
+{
+	struct pseudo_merge *match = NULL;
+	size_t i;
+
+	if (!pm->nr)
+		return NULL;
+
+	/*
+	 * NOTE: this loop is quadratic in the worst-case (where no
+	 * matching pseudo-merge bitmaps are found), but in practice
+	 * this is OK for a few reasons:
+	 *
+	 *   - Rejecting pseudo-merge bitmaps that do not match the
+	 *     given commit is done quickly (i.e. `bitmap_equals_ewah()`
+	 *     returns early when we know the two bitmaps aren't equal).
+	 *
+	 *   - Already matched pseudo-merge bitmaps (which we track with
+	 *     the `->satisfied` bit here) are skipped as potential
+	 *     candidates.
+	 *
+	 *   - The number of pseudo-merges should be small (in the
+	 *     hundreds for most repositories).
+	 *
+	 * If in the future this semi-quadratic behavior does become a
+	 * problem, another approach would be to keep track of which
+	 * pseudo-merges are still "viable" after enumerating the
+	 * pseudo-merge commit's parents:
+	 *
+	 *   - A pseudo-merge bitmap becomes non-viable when the bit(s)
+	 *     corresponding to one or more parent(s) of the given
+	 *     commit are not set in a candidate pseudo-merge's commits
+	 *     bitmap.
+	 *
+	 *   - After processing all bits, enumerate the remaining set of
+	 *     viable pseudo-merge bitmaps, and check that their
+	 *     popcount() matches the number of parents in the given
+	 *     commit.
+	 */
+	for (i = 0; i < pm->nr; i++) {
+		struct pseudo_merge *candidate = use_pseudo_merge(pm, &pm->v[i]);
+		if (!candidate || candidate->satisfied)
+			continue;
+		if (!bitmap_equals_ewah(parents, candidate->commits))
+			continue;
+
+		match = candidate;
+		match->satisfied = 1;
+		break;
+	}
+
+	return match;
+}
diff --git a/pseudo-merge.h b/pseudo-merge.h
index cc14e947e86..33acd00a3e5 100644
--- a/pseudo-merge.h
+++ b/pseudo-merge.h
@@ -208,4 +208,11 @@ int cascade_pseudo_merges(const struct pseudo_merge_map *pm,
 			  struct bitmap *result,
 			  struct bitmap *roots);
 
+/*
+ * Returns a pseudo-merge which contains the exact set of commits
+ * listed in the "parents" bitmap, or NULL if none could be found.
+ */
+struct pseudo_merge *pseudo_merge_for_parents(const struct pseudo_merge_map *pm,
+					      struct bitmap *parents);
+
 #endif
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 909c17e301e..531f1924af4 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -22,6 +22,10 @@ test_pseudo_merges_cascades () {
 	test_trace2_data bitmap pseudo_merges_cascades "$1"
 }
 
+test_pseudo_merges_reused () {
+	test_trace2_data pack-bitmap-write building_bitmaps_pseudo_merge_reused "$1"
+}
+
 tag_everything () {
 	git rev-list --all --no-object-names >in &&
 	perl -lne '
@@ -322,4 +326,64 @@ test_expect_success 'pseudo-merge overlap stale traversal' '
 	)
 '
 
+test_expect_success 'pseudo-merge reuse' '
+	git init pseudo-merge-reuse &&
+	(
+		cd pseudo-merge-reuse &&
+
+		stable="1641013200" && # 2022-01-01
+		unstable="1672549200" && # 2023-01-01
+
+		for date in $stable $unstable
+		do
+			test_commit_bulk --date "$date +0000" 128 &&
+			test_tick || return 1
+		done &&
+
+		tag_everything &&
+
+		git \
+			-c bitmapPseudoMerge.test.pattern="refs/tags/" \
+			-c bitmapPseudoMerge.test.maxMerges=1 \
+			-c bitmapPseudoMerge.test.threshold=now \
+			-c bitmapPseudoMerge.test.stableThreshold=$(($unstable - 1)) \
+			-c bitmapPseudoMerge.test.stableSize=512 \
+			repack -adb &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 2 merges &&
+
+		test_pseudo_merge_commits 0 >stable-oids.before &&
+		test_pseudo_merge_commits 1 >unstable-oids.before &&
+
+		: >trace2.txt &&
+		GIT_TRACE2_EVENT=$PWD/trace2.txt git \
+			-c bitmapPseudoMerge.test.pattern="refs/tags/" \
+			-c bitmapPseudoMerge.test.maxMerges=2 \
+			-c bitmapPseudoMerge.test.threshold=now \
+			-c bitmapPseudoMerge.test.stableThreshold=$(($unstable - 1)) \
+			-c bitmapPseudoMerge.test.stableSize=512 \
+			repack -adb &&
+
+		test_pseudo_merges_reused 1 <trace2.txt &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 3 merges &&
+
+		test_pseudo_merge_commits 0 >stable-oids.after &&
+		for i in 1 2
+		do
+			test_pseudo_merge_commits $i || return 1
+		done >unstable-oids.after &&
+
+		sort -u <stable-oids.before >expect &&
+		sort -u <stable-oids.after >actual &&
+		test_cmp expect actual &&
+
+		sort -u <unstable-oids.before >expect &&
+		sort -u <unstable-oids.after >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done
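
Reader's note on [PATCH 23/24]: the search in `pseudo_merge_for_parents()`
is a linear scan that skips candidates already claimed by an earlier
match and stops at the first exact match. A minimal sketch of that shape
follows, assuming a toy representation: `struct candidate` uses a single
64-bit word as its parent set, and `find_exact_match()` is a hypothetical
name. Neither is part of Git; the real code compares a `struct bitmap`
against an EWAH-compressed bitmap and tracks reuse with `->satisfied`.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy candidate: bit N set in `commits` means commit N is a parent. */
struct candidate {
	uint64_t commits;
	unsigned matched; /* set once this candidate has been reused */
};

/*
 * Return the first not-yet-matched candidate whose parent set equals
 * `parents` exactly, marking it as matched, or NULL if there is none.
 */
static struct candidate *find_exact_match(struct candidate *v, size_t nr,
					  uint64_t parents)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (v[i].matched)
			continue; /* already claimed by an earlier search */
		if (v[i].commits != parents)
			continue; /* not an exact match */

		v[i].matched = 1;
		return &v[i];
	}

	return NULL;
}
```

Because a matched candidate is never returned twice, each successful
search shrinks the set of candidates that later searches have to
compare against.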
bh=cA0lReE301x14H+9nv5o6mAI4tz1gT48nr+JRRADIVw=; b=qC3G1DWubhfSbP4KOTziKENvV4dbEHuIfRcjoTtnX5IjjRXt2uKQ7x5ZUrcwPNfrfY f9xaj0fhE7HtHCiWe3XnZkDbZAc2NTmTX47Wq8TZ7DLL86lfWf5WC4D71/KcWbmDUY0/ qcNG4FsjTFgGP/fG5+Ku3mx/YabpyOqACe/BmiVN6JF6KFSC4v4QJEpVAXLKmP6vXz36 UO8nkYRAdI50IWuL5p4iuTBkJ3WihOriqqqP4WCXDgUZl4Wdci2rdtzT5RpV8ANHGrIf RsIMaJsXpTny0qA5VWCZfQxwqY+liE4s++SFwJShaTxt0AAJUZMzupStZd/G5aEPbYLq yWgg==
X-Gm-Message-State: AOJu0Yy3xVVLiJjpWDroWLVCcFrM1xp8yC7Ma2rsbUdpUxcgbG3vLxNb 7RTOrV5pl69pd2KvR2r9vxxSKDxh4+udlOTfx8EjP3J6Vmo6wIP/UDfExARrU14Mb/xTOOYynro M0W4=
X-Google-Smtp-Source: AGHT+IHIq5MZ7zI+CvCpPC9De6AvL9rzcS4GJObaeO2ayv/eeplfEyoURrUmmJSrSNHuXDkp7D+vnQ==
X-Received: by 2002:a0c:ac0e:0:b0:690:b2c3:bfd0 with SMTP id l14-20020a0cac0e000000b00690b2c3bfd0mr19944785qvb.57.1710972372022; Wed, 20 Mar 2024 15:06:12 -0700 (PDT)
Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id g12-20020a0caacc000000b0069186a078b3sm6237622qvb.143.2024.03.20.15.06.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:06:11 -0700 (PDT)
Date: Wed, 20 Mar 2024 18:06:10 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 24/24] t/perf: implement performance tests for pseudo-merge bitmaps
Message-ID:
References:
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

Implement a straightforward performance test demonstrating the benefit
of pseudo-merge bitmaps by measuring how long it takes to count
reachable objects in a few different scenarios:

  - without bitmaps, to demonstrate a reasonable baseline
  - with bitmaps, but without pseudo-merges
  - with bitmaps and pseudo-merges

Results from running this test on git.git are as follows:

    Test                                                                this tree
    -----------------------------------------------------------------------------------
    5333.2: git rev-list --count --all --objects (no bitmaps)           3.46(3.37+0.09)
    5333.3: git rev-list --count --all --objects (no pseudo-merges)     0.13(0.11+0.01)
    5333.4: git rev-list --count --all --objects (with pseudo-merges)   0.12(0.11+0.01)

Signed-off-by: Taylor Blau
---
 t/perf/p5333-pseudo-merge-bitmaps.sh | 32 ++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100755 t/perf/p5333-pseudo-merge-bitmaps.sh

diff --git a/t/perf/p5333-pseudo-merge-bitmaps.sh b/t/perf/p5333-pseudo-merge-bitmaps.sh
new file mode 100755
index 00000000000..4bec409d10e
--- /dev/null
+++ b/t/perf/p5333-pseudo-merge-bitmaps.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description='pseudo-merge bitmaps'
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test_expect_success 'setup' '
+	git \
+		-c bitmapPseudoMerge.all.pattern="refs/" \
+		-c bitmapPseudoMerge.all.threshold=now \
+		-c bitmapPseudoMerge.all.stableThreshold=never \
+		-c bitmapPseudoMerge.all.maxMerges=64 \
+		-c pack.writeBitmapLookupTable=true \
+		repack -adb
+'
+
+test_perf 'git rev-list --count --all --objects (no bitmaps)' '
+	git rev-list --objects --all
+'
+
+test_perf 'git rev-list --count --all --objects (no pseudo-merges)' '
+	GIT_TEST_USE_PSEDUO_MERGES=0 \
+		git rev-list --objects --all --use-bitmap-index
+'
+
+test_perf 'git rev-list --count --all --objects (with pseudo-merges)' '
+	GIT_TEST_USE_PSEDUO_MERGES=1 \
+		git rev-list --objects --all --use-bitmap-index
+'
+
+test_done
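
As a rough reading of the results quoted in the commit message, the sketch below turns the three elapsed times into speedup ratios. The `speedup` helper is hypothetical and is not part of perf-lib. The timings are copied from the 5333.2 to 5333.4 rows, and the snippet assumes a POSIX `awk` is available.

```shell
#!/bin/sh
# Hypothetical helper, not part of t/perf: ratio of two elapsed times,
# printed with one decimal place.
speedup () {
	awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", base / new }'
}

# Elapsed seconds copied from the 5333.2, 5333.3 and 5333.4 rows.
no_bitmaps=3.46
no_pseudo_merges=0.13
with_pseudo_merges=0.12

echo "plain bitmaps vs. no bitmaps: $(speedup $no_bitmaps $no_pseudo_merges)x"
echo "pseudo-merges vs. no bitmaps: $(speedup $no_bitmaps $with_pseudo_merges)x"
echo "pseudo-merges vs. plain bitmaps: $(speedup $no_pseudo_merges $with_pseudo_merges)x"
```

In this single run on git.git, most of the improvement comes from using bitmaps at all. The step from plain bitmaps to pseudo-merges (0.13s to 0.12s) is small by comparison.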