From patchwork Tue May 21 19:01:48 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13669659
Date: Tue, 21 May 2024 15:01:48 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 01/30] object.h: add flags allocated by pack-bitmap.h
Message-ID: <38c96fc1909162a4d9c188f55b7c708cfc1b14b9.1716318089.git.me@ttaylorr.com>

In commit 7cc8f971085 (pack-objects: implement bitmap writing,
2013-12-21) the NEEDS_BITMAP flag was introduced into pack-bitmap.h, but
no object flags allocation table existed at the time.

In 208acbfb82f (object.h: centralize object flag allocation, 2014-03-25)
when that table was first introduced, we never added the flags from
7cc8f971085, which has remained the case since.

Rectify this by including the flag bit used by pack-bitmap.h into the
centralized table in object.h.

Signed-off-by: Taylor Blau
---
 object.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/object.h b/object.h
index 9293e703ccc..99b9c8f114c 100644
--- a/object.h
+++ b/object.h
@@ -81,6 +81,7 @@ void object_array_init(struct object_array *array);
  * reflog.c:                           10--12
  * builtin/show-branch.c:    0-----------------------------------------26
  * builtin/unpack-objects.c:                               2021
+ * pack-bitmap.h:                                              22
  */
 #define FLAG_BITS  28

From patchwork Tue May 21 19:01:59 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13669660
Date: Tue, 21 May 2024 15:01:59 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 07/30] Documentation/gitpacking.txt: initial commit
Message-ID: <0f20c9becf452ef7a7e931b36336ccddba0f1d13.1716318089.git.me@ttaylorr.com>

Introduce a new manual page, gitpacking(7) to collect useful information
about advanced packing concepts in Git.

In future commits in this series, this manual page will expand to
describe the new pseudo-merge bitmaps feature, as well as include
examples, relevant configuration bits, use-cases, and so on.

Outside of this series, this manual page may absorb similar pieces from
other parts of Git's documentation about packing.

Signed-off-by: Taylor Blau
---
 Documentation/Makefile       |  1 +
 Documentation/gitpacking.txt | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)
 create mode 100644 Documentation/gitpacking.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 3f2383a12c7..920b6248aa4 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -51,6 +51,7 @@ MAN7_TXT += gitdiffcore.txt
 MAN7_TXT += giteveryday.txt
 MAN7_TXT += gitfaq.txt
 MAN7_TXT += gitglossary.txt
+MAN7_TXT += gitpacking.txt
 MAN7_TXT += gitnamespaces.txt
 MAN7_TXT += gitremote-helpers.txt
 MAN7_TXT += gitrevisions.txt
diff --git a/Documentation/gitpacking.txt b/Documentation/gitpacking.txt
new file mode 100644
index 00000000000..50e9900d845
--- /dev/null
+++ b/Documentation/gitpacking.txt
@@ -0,0 +1,34 @@
+gitpacking(7)
+=============
+
+NAME
+----
+gitpacking - Advanced concepts related to packing in Git
+
+SYNOPSIS
+--------
+gitpacking
+
+DESCRIPTION
+-----------
+
+This document aims to describe some advanced concepts related to packing
+in Git.
+
+Many concepts are currently described scattered between manual pages of
+various Git commands, including linkgit:git-pack-objects[1],
+linkgit:git-repack[1], and others, as well as linkgit:gitformat-pack[5],
+and parts of the `Documentation/technical` tree.
+
+There are many aspects of packing in Git that are not covered in this
+document that instead live in the aforementioned areas. Over time, those
+scattered bits may coalesce into this document.
+
+SEE ALSO
+--------
+linkgit:git-pack-objects[1]
+linkgit:git-repack[1]
+
+GIT
+---
+Part of the linkgit:git[1] suite

From patchwork Tue May 21 19:02:02 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13669661
Date: Tue, 21 May 2024 15:02:02 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 08/30] Documentation/gitpacking.txt: describe pseudo-merge bitmaps
Message-ID: <528b591bd84e189d7d48e01deb8e876f74fc471b.1716318089.git.me@ttaylorr.com>

Add some details to the gitpacking(7) manual page which motivate and
describe pseudo-merge bitmaps.

The exact on-disk format and many of the configuration knobs will be
described in subsequent commits.

Helped-by: Jeff King
Signed-off-by: Taylor Blau
---
 Documentation/gitpacking.txt | 69 ++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/Documentation/gitpacking.txt b/Documentation/gitpacking.txt
index 50e9900d845..ff18077129b 100644
--- a/Documentation/gitpacking.txt
+++ b/Documentation/gitpacking.txt
@@ -24,6 +24,75 @@ There are many aspects of packing in Git that are not covered in this
 document that instead live in the aforementioned areas. Over time, those
 scattered bits may coalesce into this document.

+== Pseudo-merge bitmaps
+
+=== Background
+
+Reachability bitmaps are most efficient when we have on-disk stored
+bitmaps for one or more of the starting points of a traversal. For this
+reason, Git prefers storing bitmaps for commits at the tips of refs,
+because traversals tend to start with those points.
+
+But if you have a large number of refs, it's not feasible to store a
+bitmap for _every_ ref tip. It takes up space, and just OR-ing all of
+those bitmaps together is expensive.
+
+One way we can deal with that is to create bitmaps that represent
+_groups_ of refs. When a traversal asks about the entire group, then we
+can use this single bitmap instead of considering each ref individually.
+Because these bitmaps represent the set of objects which would be
+reachable in a hypothetical merge of all of the commits, we call them
+pseudo-merge bitmaps.
+
+=== Overview
+
+A "pseudo-merge bitmap" is used to refer to a pair of bitmaps, as
+follows:
+
+Commit bitmap::
+
+  A bitmap whose set bits describe the set of commits included in the
+  pseudo-merge's "merge" bitmap (as below).
+
+Merge bitmap::
+
+  A bitmap whose set bits describe the reachability closure over the set
+  of commits in the pseudo-merge's "commits" bitmap (as above). An
+  identical bitmap would be generated for an octopus merge with the same
+  set of parents as described in the commits bitmap.
+
+Pseudo-merge bitmaps can accelerate bitmap traversals when all commits
+for a given pseudo-merge are listed on either side of the traversal,
+either directly (by explicitly asking for them as part of the `HAVES`
+or `WANTS`) or indirectly (by encountering them during a fill-in
+traversal).
+
+=== Use-cases
+
+For example, suppose there exists a pseudo-merge bitmap with a large
+number of commits, all of which are listed in the `WANTS` section of
+some bitmap traversal query. When pseudo-merge bitmaps are enabled, the
+bitmap machinery can quickly determine there is a pseudo-merge which
+satisfies some subset of the wanted objects on either side of the query.
+Then, we can inflate the EWAH-compressed bitmap, and `OR` it in to the
+resulting bitmap. By contrast, without pseudo-merge bitmaps, we would
+have to repeat the decompression and `OR`-ing step over a potentially
+large number of individual bitmaps, which can take proportionally more
+time.
+
+Another benefit of pseudo-merges arises when there is some combination
+of (a) a large number of references, with (b) poor bitmap coverage, and
+(c) deep, nested trees, making fill-in traversal relatively expensive.
+For example, suppose that there are a large enough number of tags where
+bitmapping each of the tags individually is infeasible. Without
+pseudo-merge bitmaps, computing the result of, say, `git rev-list
+--use-bitmap-index --count --objects --tags` would likely require a
+large amount of fill-in traversal. But when a large quantity of those
+tags are stored together in a pseudo-merge bitmap, the bitmap machinery
+can take advantage of the fact that we only care about the union of
+objects reachable from all of those tags, and answer the query much
+faster.
+
 SEE ALSO
 --------
 linkgit:git-pack-objects[1]

From patchwork Tue May 21 19:02:06 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13669662
Date: Tue, 21 May 2024 15:02:06 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 09/30] Documentation/technical: describe pseudo-merge bitmaps format
Message-ID: <12f318b3d7e20ff0b6038d6be7f2534f885f3220.1716318089.git.me@ttaylorr.com>

Prepare to implement pseudo-merge bitmaps over the next several commits
by first describing the serialization format which will store the new
pseudo-merge bitmaps themselves.

This format is implemented as an optional extension within the bitmap v1
format, making it compatible with previous versions of Git, as well as
the original .bitmap implementation within JGit.

The format is described in detail in the patch contents below, but the
high-level description is as follows:

  - An array of pseudo-merge bitmaps, each containing a pair of EWAH
    bitmaps: one describing the set of pseudo-merge "parents", and
    another describing the set of object(s) reachable from those
    parents.

  - A lookup table to determine which pseudo-merge(s) a given commit
    appears in. An optional extended lookup table follows when there is
    at least one commit which appears in multiple pseudo-merge groups.

  - Trailing metadata, including the number of pseudo-merge(s), number
    of unique parents, the offset within the .bitmap file for the
    pseudo-merge commit lookup table, and the size of the optional
    extension itself.

Signed-off-by: Taylor Blau
---
 Documentation/technical/bitmap-format.txt | 132 ++++++++++++++++++++++
 1 file changed, 132 insertions(+)

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index f5d200939b0..ee7775a2586 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -255,3 +255,135 @@ triplet is -
 	xor_row (4 byte integer, network byte order): ::
 	The position of the triplet whose bitmap is used to compress
 	this one, or `0xffffffff` if no such bitmap exists.
+
+Pseudo-merge bitmaps
+--------------------
+
+If the `BITMAP_OPT_PSEUDO_MERGES` flag is set, a variable number of
+bytes (preceding the name-hash cache, commit lookup table, and trailing
+checksum) of the `.bitmap` file is used to store pseudo-merge bitmaps.
+
+For more information on what pseudo-merges are, why they are useful, and
+how to configure them, see the information in linkgit:gitpacking[7].
+
+=== File format
+
+If enabled, pseudo-merge bitmaps are stored in an optional section at
+the end of a `.bitmap` file. The format is as follows:
+
+....
++-------------------------------------------+
+|               .bitmap File                |
++-------------------------------------------+
+|                                           |
+|  Pseudo-merge bitmaps (Variable Length)   |
+|  +---------------------------+            |
+|  | commits_bitmap (EWAH)     |            |
+|  +---------------------------+            |
+|  | merge_bitmap (EWAH)       |            |
+|  +---------------------------+            |
+|                                           |
++-------------------------------------------+
+|                                           |
+|  Lookup Table                             |
+|  +---------------------------+            |
+|  | commit_pos (4 bytes)      |            |
+|  +---------------------------+            |
+|  | offset (8 bytes)          |            |
+|  +------------+--------------+            |
+|                                           |
+|  Offset Cases:                            |
+|  -------------                            |
+|                                           |
+|  1. MSB Unset: single pseudo-merge bitmap |
+|     + offset to pseudo-merge bitmap       |
+|                                           |
+|  2. MSB Set: multiple pseudo-merges       |
+|     + offset to extended lookup table     |
+|                                           |
++-------------------------------------------+
+|                                           |
+|  Extended Lookup Table (Optional)         |
+|  +----+----------+----------+----------+  |
+|  | N  | Offset 1 |   ....   | Offset N |  |
+|  +----+----------+----------+----------+  |
+|  |    |  8 bytes |   ....   |  8 bytes |  |
+|  +----+----------+----------+----------+  |
+|                                           |
++-------------------------------------------+
+|                                           |
+|  Pseudo-merge Metadata                    |
+|  +-----------------------------------+    |
+|  | # pseudo-merges (4 bytes)         |    |
+|  +-----------------------------------+    |
+|  | # commits (4 bytes)               |    |
+|  +-----------------------------------+    |
+|  | Lookup offset (8 bytes)           |    |
+|  +-----------------------------------+    |
+|  | Extension size (8 bytes)          |    |
+|  +-----------------------------------+    |
+|                                           |
++-------------------------------------------+
+....
+
+* One or more pseudo-merge bitmaps, each containing:
+
+  ** `commits_bitmap`, an EWAH-compressed bitmap describing the set of
+     commits included in this pseudo-merge.
+
+  ** `merge_bitmap`, an EWAH-compressed bitmap describing the union of
+     the set of objects reachable from all commits listed in the
+     `commits_bitmap`.
+
+* A lookup table, mapping pseudo-merged commits to the pseudo-merges
+  they belong to. Entries appear in increasing order of each commit's
+  bit position. Each entry is 12 bytes wide, and is comprised of the
+  following:
+
+  ** `commit_pos`, a 4-byte unsigned value (in network byte-order)
+     containing the bit position for this commit.
+
+  ** `offset`, an 8-byte unsigned value (also in network byte-order)
+     containing either one of two possible offsets, depending on whether or
+     not the most-significant bit is set.
+
+    *** If unset (i.e. `(offset & ((uint64_t)1<<63)) == 0`), the offset
+	(relative to the beginning of the `.bitmap` file) at which the
+	pseudo-merge bitmap for this commit can be read. This indicates
+	only a single pseudo-merge bitmap contains this commit.
+
+    *** If set (i.e. `(offset & ((uint64_t)1<<63)) != 0`), the offset
+	(again relative to the beginning of the `.bitmap` file) at which
+	the extended offset table can be located describing the set of
+	pseudo-merge bitmaps which contain this commit. This indicates
+	that multiple pseudo-merge bitmaps contain this commit.
+
+* An (optional) extended lookup table (written if and only if there is
+  at least one commit which appears in more than one pseudo-merge).
+  There are as many entries as commits which appear in multiple
+  pseudo-merges. Each entry contains the following:
+
+  ** `N`, a 4-byte unsigned value equal to the number of pseudo-merges
+     which contain a given commit.
+
+  ** An array of `N` 8-byte unsigned values, each of which is
+     interpreted as an offset (relative to the beginning of the
+     `.bitmap` file) at which a pseudo-merge bitmap for this commit can
+     be read. These values occur in no particular order.
+
+* Positions for all pseudo-merges, each stored as an 8-byte unsigned
+  value (in network byte-order) containing the offset (relative to the
+  beginning of the `.bitmap` file) of each consecutive pseudo-merge.
+
+* A 4-byte unsigned value (in network byte-order) equal to the number of
+  pseudo-merges.
+ +* A 4-byte unsigned value (in network byte-order) equal to the number of + unique commits which appear in any pseudo-merge. + +* An 8-byte unsigned value (in network byte-order) equal to the number + of bytes between the start of the pseudo-merge section and the + beginning of the lookup table. + +* An 8-byte unsigned value (in network byte-order) equal to the number + of bytes in the pseudo-merge section (including this field). From patchwork Tue May 21 19:02:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13669663 Received: from mail-qk1-f171.google.com (mail-qk1-f171.google.com [209.85.222.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B657E1494BF for ; Tue, 21 May 2024 19:02:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318134; cv=none; b=mkPZbGlGa4yZFuYCOPpivra55I4/dAPCwWAlGSs9rDlH0u4+7EXWISpbzYvA2eDbV7EKPgDytvAzRCRCjEq2lqASZX9mi4H5sNcNekul8GZnuqO+SWTqQIifJCHEcm3xkHrfAFfIvlmbQULJgY8jr+diO+q5u+0GixaRLy3mnGA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318134; c=relaxed/simple; bh=GyQgZ1oHifaF1aA6YJ79/i8EiqDL5cRCWzEt6XGJhAE=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=ZRRljM74i94S5UB+JeZ/ifmsHLOpptSuVlCulB+d6wzdWTzLASLlpx7RZMCLVtJuTFjeeZg3xzpjl5zasakGUuaADhWvm+Tw9zoB1Apjsb+xxe8lpy4dGScBzWo6yfiTpYEB0MJCwIGylIs1O97G5qBIqKLTMsUwFR0i+ODL1EE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com 
Date: Tue, 21 May 2024 15:02:09 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 10/30] ewah: implement `ewah_bitmap_is_subset()`
Message-ID: <40eb6137618ad4c648eaafd720a9678d8d84c96a.1716318089.git.me@ttaylorr.com>

In order to know whether a given pseudo-merge (comprising a "parents"
and an "objects" bitmap) is "satisfied" and can be OR'd into the bitmap
result, we need to be able to quickly determine whether the "parents"
bitmap is a subset of the current set of objects reachable on either
side of a traversal.

Implement a helper function to prepare for that, which determines
whether an EWAH bitmap (the parents bitmap from the pseudo-merge) is a
subset of a non-EWAH bitmap (in this case, the results bitmap from
either side of the traversal).

This function makes use of the EWAH iterator to avoid inflating any part
of the EWAH bitmap after we determine it is not a subset of the non-EWAH
bitmap. This "fail-fast" behavior allows us to avoid a potentially large
amount of wasted effort.
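As a rough illustration of the subset test described above (and not the
code from this patch), the same fail-fast idea can be sketched on two
plain word arrays. The name `words_is_subset()` and the use of
uncompressed `uint64_t` arrays in place of an EWAH iterator are
assumptions made only for this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the EWAH-vs-bitmap comparison: both sides
 * are plain arrays of 64-bit words here, so nothing is decompressed.
 * Returns 1 if every bit set in `self` is also set in `other`.
 */
int words_is_subset(const uint64_t *self, size_t self_nr,
		    const uint64_t *other, size_t other_nr)
{
	size_t i;

	assert(self || !self_nr);
	assert(other || !other_nr);

	for (i = 0; i < self_nr; i++) {
		/* Words past the end of `other` are treated as all-zero. */
		uint64_t theirs = i < other_nr ? other[i] : 0;

		/* Fail fast: stop at the first word with an extra bit. */
		if (self[i] & ~theirs)
			return 0;
	}

	return 1;
}
```

With an EWAH bitmap on the left-hand side, the early return matters
more than it does here, since each word not visited is a word that
never has to be produced by the iterator.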
Signed-off-by: Taylor Blau
---
 ewah/bitmap.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 ewah/ewok.h   |  6 ++++++
 2 files changed, 49 insertions(+)

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index ac7e0af622a..d352fec54ce 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -138,6 +138,49 @@ void bitmap_or(struct bitmap *self, const struct bitmap *other)
 		self->words[i] |= other->words[i];
 }
 
+int ewah_bitmap_is_subset(struct ewah_bitmap *self, struct bitmap *other)
+{
+	struct ewah_iterator it;
+	eword_t word;
+	size_t i;
+
+	ewah_iterator_init(&it, self);
+
+	for (i = 0; i < other->word_alloc; i++) {
+		if (!ewah_iterator_next(&word, &it)) {
+			/*
+			 * If we reached the end of `self`, and haven't
+			 * rejected `self` as a possible subset of
+			 * `other` yet, then we are done and `self` is
+			 * indeed a subset of `other`.
+			 */
+			return 1;
+		}
+		if (word & ~other->words[i]) {
+			/*
+			 * Otherwise, compare the next two pairs of
+			 * words. If the word from `self` has bit(s) not
+			 * in the word from `other`, `self` is not a
+			 * subset of `other`.
+			 */
+			return 0;
+		}
+	}
+
+	/*
+	 * If we got to this point, there may be zero or more words
+	 * remaining in `self`, with no remaining words left in `other`.
+	 * If there are any bits set in the remaining word(s) in `self`,
+	 * then `self` is not a subset of `other`.
+	 */
+	while (ewah_iterator_next(&word, &it))
+		if (word)
+			return 0;
+
+	/* `self` is definitely a subset of `other` */
+	return 1;
+}
+
 void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other)
 {
 	size_t original_size = self->word_alloc;
diff --git a/ewah/ewok.h b/ewah/ewok.h
index c11d76c6f33..2b6c4ac499c 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -179,7 +179,13 @@ void bitmap_unset(struct bitmap *self, size_t pos);
 int bitmap_get(struct bitmap *self, size_t pos);
 void bitmap_free(struct bitmap *self);
 int bitmap_equals(struct bitmap *self, struct bitmap *other);
+
+/*
+ * Both `bitmap_is_subset()` and `ewah_bitmap_is_subset()` return 1 if the set
+ * of bits in 'self' are a subset of the bits in 'other'. Returns 0 otherwise.
+ */
 int bitmap_is_subset(struct bitmap *self, struct bitmap *other);
+int ewah_bitmap_is_subset(struct ewah_bitmap *self, struct bitmap *other);
 
 struct ewah_bitmap * bitmap_to_ewah(struct bitmap *bitmap);
 struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah);

From patchwork Tue May 21 19:02:12 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13669664
Date: Tue, 21 May 2024 15:02:12 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 11/30] pack-bitmap: move some initialization to `bitmap_writer_init()`
Message-ID: <487fb7c6e9c760f3febb1d795b9b6f408f67ea0d.1716318089.git.me@ttaylorr.com>

The pack-bitmap-writer machinery uses an oidmap (backed by khash.h) to
map from commits selected for bitmaps (by OID) to a bitmapped_commit
structure (containing the bitmap itself, among other things like its XOR
offset, etc.).

This map was initialized at the start of `bitmap_writer_build()`. New
entries are added in `pack-bitmap-write.c::store_selected()`, which is
called by the bitmap_builder machinery (which is responsible for
traversing history and generating the actual bitmaps).

Reorganize when this field is initialized and when entries are added to
it so that we can quickly determine whether a commit is a candidate for
pseudo-merge selection, or not (since it was already selected to receive
a bitmap, and thus storing it in a pseudo-merge would be redundant).

The changes are as follows:

  - Teach `bitmap_writer_init()` to initialize the `writer.bitmaps`
    field (instead of waiting until `bitmap_writer_build()` is called).

  - Add map entries in `push_bitmapped_commit()` (which is called via
    `bitmap_writer_select_commits()`) with OID keys and NULL values to
    track whether or not we *expect* to write a bitmap for some given
    commit.

  - Validate that a NULL entry is found matching the given key when we
    store a selected bitmap.
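The "reserve a NULL entry at selection time, fill it in at store time"
scheme listed above can be sketched without khash. Everything in this
block is hypothetical and exists only for illustration: the `toy_map`
type, its fixed-size array, and the hex strings standing in for object
IDs are assumptions, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SELECTED 16

/* One map entry: a key plus a value that stays NULL until stored. */
struct toy_slot {
	const char *oid;	/* hex string standing in for an object ID */
	const void *bitmap;	/* NULL means "selected, not yet written" */
};

struct toy_map {
	struct toy_slot slots[TOY_MAX_SELECTED];
	size_t nr;
};

static struct toy_slot *toy_find(struct toy_map *map, const char *oid)
{
	size_t i;

	for (i = 0; i < map->nr; i++)
		if (!strcmp(map->slots[i].oid, oid))
			return &map->slots[i];
	return NULL;
}

/*
 * Selection time: reserve the key with a NULL value. Returns -1 on a
 * duplicate key (or when this toy table is full), 0 otherwise.
 */
int toy_select(struct toy_map *map, const char *oid)
{
	assert(map && oid);
	if (toy_find(map, oid) || map->nr == TOY_MAX_SELECTED)
		return -1;
	map->slots[map->nr].oid = oid;
	map->slots[map->nr].bitmap = NULL;
	map->nr++;
	return 0;
}

/* Store time: only keys reserved earlier may receive a bitmap. */
int toy_store(struct toy_map *map, const char *oid, const void *bitmap)
{
	struct toy_slot *slot = toy_find(map, oid);

	if (!slot)
		return -1;	/* attempted to store a non-selected commit */
	slot->bitmap = bitmap;
	return 0;
}

/* Cheap membership test, usable before any bitmap has been built. */
int toy_is_selected(struct toy_map *map, const char *oid)
{
	return toy_find(map, oid) != NULL;
}
```

The point of the split is that `toy_is_selected()` gives a meaningful
answer as soon as selection has happened, which is what a later
pseudo-merge selection step would need in order to skip commits that
already receive a bitmap of their own.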
Signed-off-by: Taylor Blau
---
 builtin/pack-objects.c |  3 ++-
 midx-write.c           |  2 +-
 pack-bitmap-write.c    | 24 ++++++++++++++++++------
 pack-bitmap.h          |  2 +-
 4 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 26a6d0d7919..6209264e60c 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1340,7 +1340,8 @@ static void write_pack_file(void)
 				    hash_to_hex(hash));
 
 			if (write_bitmap_index) {
-				bitmap_writer_init(&bitmap_writer);
+				bitmap_writer_init(&bitmap_writer,
+						   the_repository);
 				bitmap_writer_set_checksum(&bitmap_writer, hash);
 				bitmap_writer_build_type_index(&bitmap_writer,
 							       &to_pack, written_list, nr_written);
diff --git a/midx-write.c b/midx-write.c
index 7c0c08c64b2..c747d1a6af3 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -820,7 +820,7 @@ static int write_midx_bitmap(const char *midx_name,
 	for (i = 0; i < pdata->nr_objects; i++)
 		index[i] = &pdata->objects[i].idx;
 
-	bitmap_writer_init(&writer);
+	bitmap_writer_init(&writer, the_repository);
 	bitmap_writer_show_progress(&writer, flags & MIDX_PROGRESS);
 	bitmap_writer_build_type_index(&writer, pdata, index,
 				       pdata->nr_objects);
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 6cae670412c..d8870155831 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -27,9 +27,12 @@ struct bitmapped_commit {
 	uint32_t commit_pos;
 };
 
-void bitmap_writer_init(struct bitmap_writer *writer)
+void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r)
 {
 	memset(writer, 0, sizeof(struct bitmap_writer));
+	if (writer->bitmaps)
+		BUG("bitmap writer already initialized");
+	writer->bitmaps = kh_init_oid_map();
 }
 
 void bitmap_writer_free(struct bitmap_writer *writer)
@@ -128,11 +131,21 @@ void bitmap_writer_build_type_index(struct bitmap_writer *writer,
 static inline void push_bitmapped_commit(struct bitmap_writer *writer,
 					 struct commit *commit)
 {
+	int hash_ret;
+	khiter_t hash_pos;
+
 	if (writer->selected_nr >= writer->selected_alloc) {
 		writer->selected_alloc = (writer->selected_alloc + 32) * 2;
 		REALLOC_ARRAY(writer->selected, writer->selected_alloc);
 	}
 
+	hash_pos = kh_put_oid_map(writer->bitmaps, commit->object.oid,
+				  &hash_ret);
+	if (!hash_ret)
+		die(_("duplicate entry when writing bitmap index: %s"),
+		    oid_to_hex(&commit->object.oid));
+	kh_value(writer->bitmaps, hash_pos) = NULL;
+
 	writer->selected[writer->selected_nr].commit = commit;
 	writer->selected[writer->selected_nr].bitmap = NULL;
 	writer->selected[writer->selected_nr].write_as = NULL;
@@ -483,14 +496,14 @@ static void store_selected(struct bitmap_writer *writer,
 {
 	struct bitmapped_commit *stored = &writer->selected[ent->idx];
 	khiter_t hash_pos;
-	int hash_ret;
 
 	stored->bitmap = bitmap_to_ewah(ent->bitmap);
 
-	hash_pos = kh_put_oid_map(writer->bitmaps, commit->object.oid, &hash_ret);
-	if (hash_ret == 0)
-		die("Duplicate entry when writing index: %s",
+	hash_pos = kh_get_oid_map(writer->bitmaps, commit->object.oid);
+	if (hash_pos == kh_end(writer->bitmaps))
+		die(_("attempted to store non-selected commit: '%s'"),
 		    oid_to_hex(&commit->object.oid));
+
 	kh_value(writer->bitmaps, hash_pos) = stored;
 }
 
@@ -506,7 +519,6 @@ int bitmap_writer_build(struct bitmap_writer *writer,
 	uint32_t *mapping;
 	int closed = 1; /* until proven otherwise */
 
-	writer->bitmaps = kh_init_oid_map();
 	writer->to_pack = to_pack;
 
 	if (writer->show_progress)
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 3091095f336..f87e60153dd 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -114,7 +114,7 @@ struct bitmap_writer {
 	unsigned char pack_checksum[GIT_MAX_RAWSZ];
 };
 
-void bitmap_writer_init(struct bitmap_writer *writer);
+void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r);
 void bitmap_writer_show_progress(struct bitmap_writer *writer, int show);
 void bitmap_writer_set_checksum(struct bitmap_writer *writer,
 				const unsigned char *sha1);

From patchwork Tue May 21 19:02:15 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13669665
Date: Tue, 21 May 2024 15:02:15 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 12/30] pseudo-merge.ch: initial commit
Message-ID: <827732acf99406fe30e87efd679fb6631de9c130.1716318089.git.me@ttaylorr.com>

Add a new (empty) header file to contain the implementation for
selecting, reading, and applying pseudo-merge bitmaps.

For now this header and its corresponding implementation are left
empty, but they will evolve over the course of subsequent commit(s).

Signed-off-by: Taylor Blau
---
 Makefile       | 1 +
 pseudo-merge.c | 2 ++
 pseudo-merge.h | 6 ++++++
 3 files changed, 9 insertions(+)
 create mode 100644 pseudo-merge.c
 create mode 100644 pseudo-merge.h

diff --git a/Makefile b/Makefile
index 0285db56306..4705a69f57f 100644
--- a/Makefile
+++ b/Makefile
@@ -1105,6 +1105,7 @@ LIB_OBJS += prompt.o
 LIB_OBJS += protocol.o
 LIB_OBJS += protocol-caps.o
 LIB_OBJS += prune-packed.o
+LIB_OBJS += pseudo-merge.o
 LIB_OBJS += quote.o
 LIB_OBJS += range-diff.o
 LIB_OBJS += reachable.o
diff --git a/pseudo-merge.c b/pseudo-merge.c
new file mode 100644
index 00000000000..37e037ba272
--- /dev/null
+++ b/pseudo-merge.c
@@ -0,0 +1,2 @@
+#include "git-compat-util.h"
+#include "pseudo-merge.h"
diff --git a/pseudo-merge.h b/pseudo-merge.h
new file mode 100644
index 00000000000..cab8ff6960a
--- /dev/null
+++ b/pseudo-merge.h
@@ -0,0 +1,6 @@
+#ifndef PSEUDO_MERGE_H
+#define PSEUDO_MERGE_H
+
+#include "git-compat-util.h"
+
+#endif

From patchwork Tue May 21 19:02:19 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13669666
Date: Tue, 21 May 2024 15:02:19 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 13/30] pack-bitmap-write: support storing pseudo-merge commits
Message-ID: <8608dd1860fd332655b7977b30a6cb79e78b692b.1716318089.git.me@ttaylorr.com>

Prepare to write pseudo-merge bitmaps by annotating individual bitmapped
commits (which are represented by the `bitmapped_commit` structure) with
an extra bit indicating whether or not they are a pseudo-merge.

In subsequent commits, pseudo-merge bitmaps will be generated by
allocating a fake commit node with parents covering the full set of
commits represented by the pseudo-merge bitmap. These commits will be
added to the set of "selected" commits as usual, but will be written
specially instead of being included with the rest of the selected
commits.

Mechanically speaking, there are two parts of this change:

  - The bitmapped_commit struct gets a new bit indicating whether it is
    a pseudo-merge, or an ordinary commit selected for bitmaps.

  - A handful of changes to only write out the non-pseudo-merge commits
    when enumerating through the selected array (see the new
    `bitmap_writer_nr_selected_commits()` function). Pseudo-merge
    commits appear after all non-pseudo-merge commits, so it is safe to
    enumerate through the selected array like so:

        for (i = 0; i < bitmap_writer_nr_selected_commits(&writer); i++)
            if (writer.selected[i].pseudo_merge)
                BUG("unexpected pseudo-merge");

    without encountering the BUG().
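To illustrate why the enumeration above is safe, here is a minimal
sketch, assuming (as the message states) that pseudo-merges are only
ever appended after the ordinary commits. The `toy_writer` and
`toy_entry` types and both helper names are stand-ins invented for this
example, not the structures touched by the patch:

```c
#include <assert.h>
#include <stddef.h>

struct toy_entry {
	const char *name;
	unsigned pseudo_merge : 1;
};

struct toy_writer {
	const struct toy_entry *selected;
	size_t selected_nr;		/* ordinary commits + pseudo-merges */
	size_t pseudo_merges_nr;	/* trailing pseudo-merge entries */
};

/* Number of ordinary (non-pseudo-merge) entries at the front. */
size_t toy_nr_selected_commits(const struct toy_writer *w)
{
	assert(w->pseudo_merges_nr <= w->selected_nr);
	return w->selected_nr - w->pseudo_merges_nr;
}

/* Returns 1 if no pseudo-merge shows up in the ordinary prefix. */
int toy_prefix_is_ordinary(const struct toy_writer *w)
{
	size_t i;

	for (i = 0; i < toy_nr_selected_commits(w); i++)
		if (w->selected[i].pseudo_merge)
			return 0;
	return 1;
}
```

The subtraction is only a valid prefix length because of the ordering
invariant. If a pseudo-merge were ever inserted ahead of an ordinary
commit, the check in `toy_prefix_is_ordinary()` would fail, which is
the situation the BUG() in the snippet above guards against.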
Signed-off-by: Taylor Blau
---
 object.h            |  2 +-
 pack-bitmap-write.c | 96 +++++++++++++++++++++++++++++----------------
 pack-bitmap.h       |  3 ++
 3 files changed, 67 insertions(+), 34 deletions(-)

diff --git a/object.h b/object.h
index 99b9c8f114c..e6f9e89d3c5 100644
--- a/object.h
+++ b/object.h
@@ -81,7 +81,7 @@ void object_array_init(struct object_array *array);
  * reflog.c:                           10--12
  * builtin/show-branch.c:    0-------------------------------------------26
  * builtin/unpack-objects.c:                                 2021
- * pack-bitmap.h:                                              22
+ * pack-bitmap.h:                                            2122
  */
 #define FLAG_BITS  28
 
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index d8870155831..60eb1e71c98 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -25,8 +25,14 @@ struct bitmapped_commit {
 	int flags;
 	int xor_offset;
 	uint32_t commit_pos;
+	unsigned pseudo_merge : 1;
 };
 
+static inline int bitmap_writer_nr_selected_commits(struct bitmap_writer *writer)
+{
+	return writer->selected_nr - writer->pseudo_merges_nr;
+}
+
 void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r)
 {
 	memset(writer, 0, sizeof(struct bitmap_writer));
@@ -129,27 +135,31 @@ void bitmap_writer_build_type_index(struct bitmap_writer *writer,
  */
 
 static inline void push_bitmapped_commit(struct bitmap_writer *writer,
-					 struct commit *commit)
+					 struct commit *commit,
+					 unsigned pseudo_merge)
 {
-	int hash_ret;
-	khiter_t hash_pos;
-
 	if (writer->selected_nr >= writer->selected_alloc) {
 		writer->selected_alloc = (writer->selected_alloc + 32) * 2;
 		REALLOC_ARRAY(writer->selected, writer->selected_alloc);
 	}
 
-	hash_pos = kh_put_oid_map(writer->bitmaps, commit->object.oid,
-				  &hash_ret);
-	if (!hash_ret)
-		die(_("duplicate entry when writing bitmap index: %s"),
-		    oid_to_hex(&commit->object.oid));
-	kh_value(writer->bitmaps, hash_pos) = NULL;
+	if (!pseudo_merge) {
+		int hash_ret;
+		khiter_t hash_pos = kh_put_oid_map(writer->bitmaps,
+						   commit->object.oid,
+						   &hash_ret);
+
+		if (!hash_ret)
+			die(_("duplicate entry when writing bitmap index: %s"),
+			    oid_to_hex(&commit->object.oid));
+		kh_value(writer->bitmaps, hash_pos) = NULL;
+	}
 
 	writer->selected[writer->selected_nr].commit = commit;
 	writer->selected[writer->selected_nr].bitmap = NULL;
 	writer->selected[writer->selected_nr].write_as = NULL;
 	writer->selected[writer->selected_nr].flags = 0;
+	writer->selected[writer->selected_nr].pseudo_merge = pseudo_merge;
 
 	writer->selected_nr++;
 }
@@ -180,16 +190,20 @@ static void compute_xor_offsets(struct bitmap_writer *writer)
 
 	while (next < writer->selected_nr) {
 		struct bitmapped_commit *stored = &writer->selected[next];
-
 		int best_offset = 0;
 		struct ewah_bitmap *best_bitmap = stored->bitmap;
 		struct ewah_bitmap *test_xor;
 
+		if (stored->pseudo_merge)
+			goto next;
+
 		for (i = 1; i <= MAX_XOR_OFFSET_SEARCH; ++i) {
 			int curr = next - i;
 
 			if (curr < 0)
 				break;
+			if (writer->selected[curr].pseudo_merge)
+				continue;
 
 			test_xor = ewah_pool_new();
 			ewah_xor(writer->selected[curr].bitmap, stored->bitmap, test_xor);
@@ -205,6 +219,7 @@ static void compute_xor_offsets(struct bitmap_writer *writer)
 			}
 		}
 
+next:
 		stored->xor_offset = best_offset;
 		stored->write_as = best_bitmap;
 
@@ -217,7 +232,8 @@ struct bb_commit {
 	struct bitmap *commit_mask;
 	struct bitmap *bitmap;
 	unsigned selected:1,
-		 maximal:1;
+		 maximal:1,
+		 pseudo_merge:1;
 	unsigned idx; /* within selected array */
 };
 
@@ -255,17 +271,18 @@ static void bitmap_builder_init(struct bitmap_builder *bb,
 	revs.first_parent_only = 1;
 
 	for (i = 0; i < writer->selected_nr; i++) {
-		struct commit *c = writer->selected[i].commit;
-		struct bb_commit *ent = bb_data_at(&bb->data, c);
+		struct bitmapped_commit *bc = &writer->selected[i];
+		struct bb_commit *ent = bb_data_at(&bb->data, bc->commit);
 
 		ent->selected = 1;
 		ent->maximal = 1;
+		ent->pseudo_merge = bc->pseudo_merge;
 		ent->idx = i;
 
 		ent->commit_mask = bitmap_new();
 		bitmap_set(ent->commit_mask, i);
 
-		add_pending_object(&revs, &c->object, "");
+		add_pending_object(&revs, &bc->commit->object, "");
 	}
 
 	if (prepare_revision_walk(&revs))
@@ -444,8 +461,13 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 		struct commit *c = prio_queue_get(queue);
 
 		if (old_bitmap && mapping) {
-			struct ewah_bitmap *old = bitmap_for_commit(old_bitmap, c);
+			struct ewah_bitmap *old;
 			struct bitmap *remapped = bitmap_new();
+
+			if (commit->object.flags & BITMAP_PSEUDO_MERGE)
+				old = NULL;
+			else
+				old = bitmap_for_commit(old_bitmap, c);
 			/*
 			 * If this commit has an old bitmap, then translate that
 			 * bitmap and add its bits to this one. No need to walk
@@ -464,12 +486,14 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 		 * Mark ourselves and queue our tree. The commit
 		 * walk ensures we cover all parents.
 		 */
-		pos = find_object_pos(writer, &c->object.oid, &found);
-		if (!found)
-			return -1;
-		bitmap_set(ent->bitmap, pos);
-		prio_queue_put(tree_queue,
-			       repo_get_commit_tree(the_repository, c));
+		if (!(c->object.flags & BITMAP_PSEUDO_MERGE)) {
+			pos = find_object_pos(writer, &c->object.oid, &found);
+			if (!found)
+				return -1;
+			bitmap_set(ent->bitmap, pos);
+			prio_queue_put(tree_queue,
+				       repo_get_commit_tree(the_repository, c));
+		}
 
 		for (p = c->parents; p; p = p->next) {
 			pos = find_object_pos(writer, &p->item->object.oid,
@@ -499,6 +523,9 @@ static void store_selected(struct bitmap_writer *writer,
 
 	stored->bitmap = bitmap_to_ewah(ent->bitmap);
 
+	if (ent->pseudo_merge)
+		return;
+
 	hash_pos = kh_get_oid_map(writer->bitmaps, commit->object.oid);
 	if (hash_pos == kh_end(writer->bitmaps))
 		die(_("attempted to store non-selected commit: '%s'"),
@@ -631,7 +658,7 @@ void bitmap_writer_select_commits(struct bitmap_writer *writer,
 
 	if (indexed_commits_nr < 100) {
 		for (i = 0; i < indexed_commits_nr; ++i)
-			push_bitmapped_commit(writer, indexed_commits[i]);
+			push_bitmapped_commit(writer, indexed_commits[i], 0);
 		return;
 	}
 
@@ -664,7 +691,7 @@ void bitmap_writer_select_commits(struct bitmap_writer *writer,
 			}
 		}
 
-		push_bitmapped_commit(writer, chosen);
+		push_bitmapped_commit(writer, chosen, 0);
 
 		i += next + 1;
 		display_progress(writer->progress, i);
@@ -701,8 +728,11 @@ static void write_selected_commits_v1(struct bitmap_writer *writer,
 {
 	int i;
 
-	for (i = 0; i < writer->selected_nr; ++i) {
+	for (i = 0; i < bitmap_writer_nr_selected_commits(writer); ++i) {
 		struct bitmapped_commit *stored = &writer->selected[i];
+		if (stored->pseudo_merge)
+			BUG("unexpected pseudo-merge among selected: %s",
+			    oid_to_hex(&stored->commit->object.oid));
 
 		if (offsets)
 			offsets[i] = hashfile_total(f);
@@ -735,10 +765,10 @@ static void write_lookup_table(struct bitmap_writer *writer, struct hashfile *f,
 	uint32_t i;
 	uint32_t *table, *table_inv;
 
-	ALLOC_ARRAY(table, writer->selected_nr);
-	ALLOC_ARRAY(table_inv, writer->selected_nr);
+	ALLOC_ARRAY(table, bitmap_writer_nr_selected_commits(writer));
+	ALLOC_ARRAY(table_inv, bitmap_writer_nr_selected_commits(writer));
 
-	for (i = 0; i < writer->selected_nr; i++)
+	for (i = 0; i < bitmap_writer_nr_selected_commits(writer); i++)
 		table[i] = i;
 
 	/*
@@ -746,16 +776,16 @@ static void write_lookup_table(struct bitmap_writer *writer, struct hashfile *f,
 	 * bitmap corresponds to j'th bitmapped commit (among the selected
 	 * commits) in lex order of OIDs.
 	 */
-	QSORT_S(table, writer->selected_nr, table_cmp, writer);
+	QSORT_S(table, bitmap_writer_nr_selected_commits(writer), table_cmp, writer);
 
 	/* table_inv helps us discover that relationship (i'th bitmap
 	 * to j'th commit by j = table_inv[i])
 	 */
-	for (i = 0; i < writer->selected_nr; i++)
+	for (i = 0; i < bitmap_writer_nr_selected_commits(writer); i++)
 		table_inv[table[i]] = i;
 
 	trace2_region_enter("pack-bitmap-write", "writing_lookup_table", the_repository);
-	for (i = 0; i < writer->selected_nr; i++) {
+	for (i = 0; i < bitmap_writer_nr_selected_commits(writer); i++) {
 		struct bitmapped_commit *selected = &writer->selected[table[i]];
 		uint32_t xor_offset = selected->xor_offset;
 		uint32_t xor_row;
@@ -827,7 +857,7 @@ void bitmap_writer_finish(struct bitmap_writer *writer,
 	memcpy(header.magic, BITMAP_IDX_SIGNATURE, sizeof(BITMAP_IDX_SIGNATURE));
 	header.version = htons(default_version);
 	header.options = htons(flags | options);
-	header.entry_count = htonl(writer->selected_nr);
+	header.entry_count = htonl(bitmap_writer_nr_selected_commits(writer));
 	hashcpy(header.checksum, writer->pack_checksum);
 
 	hashwrite(f, &header, sizeof(header) - GIT_MAX_RAWSZ + the_hash_algo->rawsz);
@@ -839,7 +869,7 @@ void bitmap_writer_finish(struct bitmap_writer *writer,
 	if (options & BITMAP_OPT_LOOKUP_TABLE)
 		CALLOC_ARRAY(offsets, index_nr);
 
-	for (i = 0; i < writer->selected_nr; i++) {
+	for (i = 0; i < bitmap_writer_nr_selected_commits(writer); i++) {
 		struct bitmapped_commit *stored = &writer->selected[i];
 		int commit_pos = oid_pos(&stored->commit->object.oid, index,
 					 index_nr, oid_access);
diff --git a/pack-bitmap.h b/pack-bitmap.h
index f87e60153dd..6937a0f090f 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -21,6 +21,7 @@ struct bitmap_disk_header {
 	unsigned char checksum[GIT_MAX_RAWSZ];
 };
 
+#define BITMAP_PSEUDO_MERGE (1u<<21)
 #define NEEDS_BITMAP (1u<<22)
 
 /*
@@ -109,6 +110,8 @@ struct bitmap_writer {
 	struct bitmapped_commit *selected;
 	unsigned int selected_nr, selected_alloc;
 
+	uint32_t
pseudo_merges_nr; + struct progress *progress; int show_progress; unsigned char pack_checksum[GIT_MAX_RAWSZ]; From patchwork Tue May 21 19:02:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13669667 Received: from mail-oi1-f174.google.com (mail-oi1-f174.google.com [209.85.167.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EBC96149C60 for ; Tue, 21 May 2024 19:02:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318148; cv=none; b=qCsQ0VERDGF2dD1AwO/zZd/fd7o6grvi4gU0diacCLiPU8j9rnv0wKpDyTgCi4Qhays77oKUaB1ZfKK1EYcv/l06Kc2Ct7cOgJiks7HOpdDpf0Tzr0mN8sSh8hkn6XvdyHNGSybJl62B33Y8eqgsx5QXTo2Cdzyml8RjOoG44X8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318148; c=relaxed/simple; bh=ElSEYqMG35FVztiwNx65MugJny7Y5G4YINRmErucZwY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=ijKvMfngd58GSiv3gFK5amIVkN2WrUczdes+40GnT40PPG1XN30OJuUlXKNW+LWhItl4LDUCqhe2XTJAEZoM71IBsKUESvCNDnPQt+ZYeGHV5WZ6R7SWBYtR+JovmgR/D+bbsdepy5lTtEwKnhoCkPysNK7iV50gfcLM5fus86o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=B4CcCeYo; arc=none smtp.client-ip=209.85.167.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) 
header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="B4CcCeYo" Received: by mail-oi1-f174.google.com with SMTP id 5614622812f47-3c9c41cdc4cso1589337b6e.2 for ; Tue, 21 May 2024 12:02:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1716318146; x=1716922946; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=BvL1r1kutsC8n40mTvcpvKDxHEn+2zuNKJXrnKFiFIY=; b=B4CcCeYoji8SM1fMwO5ijD9aeKNpM6WVgMn8SEiDzWFyKubkOcyyOuzPRVps8RPNWo fGYs+3GjMF7stX8KDj7VYYjaWZ1TEjp6j/pOlAj9S4ke4MbZGU5UB8FLIVfjEIn9haKI 5v2DP5PaqN44gToo9+cbBWgiv6DcEc2jVfzow+7lyZu1Oy0PEhDUQxUFn2cnvQ0H4/X6 P3hsyZ3q6jm5Lvyd07Srt26D6hEZXLs1ynNvUmCtZgKEDQjLOMghU6IDh6G5LmMxZRob Np7j69lQZKoHo56IsVYl5Rlgv3y8x6Bg3M8eNMyXbUUnXmPzx4hVuu5XbaqWDt/ZeABU nBEg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716318146; x=1716922946; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=BvL1r1kutsC8n40mTvcpvKDxHEn+2zuNKJXrnKFiFIY=; b=xDHJ0f6fxra1eIZVTzKUkw+H/Eg8lLNB0li3XYcbNSOlcxw3meNl8QTVbxzewAMD+B CB0J0hNWDfDA17HkVrQzt4+5Z1JyhjGp1FSOj+e+nykPoqwU6ay7AJDek7lOvUqluU2p cR7FZsLUglZafXVv6Pns4lnVU8Ahs4ynJhtUJdTtJKptvYH4hK//AR22aA1c4zMHtL02 Bc1Z18KgC4N91zGnvV+3G/+hi8wGqSNQFpnm4g1bOep9nr2+cQJNIn7gNa9luakzis+Z 353Fs40kFVnYCKncEe9o2xcroY2Eb7M+mSdQKRlJBL4q2CvfACmEIL9oQREajYi+KYKh 2OPw== X-Gm-Message-State: AOJu0YzXxNNrrH89829n18mMz2UnriX1DNfNKuPJ4l687SOz7QFF00G9 n59ftGoFRLajlq7wH+WMXikR78i+EW+OrHTvJYC/KJpv/ZSnypi5zWEazQQ+1KfAnk4Ee6bMnhx v X-Google-Smtp-Source: AGHT+IFWnHmOuxHYpzWjWsBgFcCSgvoyFincw2eLrH1cPj3upS7E+QzIfP6oebx0O3RXiocz+hxvKQ== X-Received: by 2002:a54:4890:0:b0:3c9:668b:c9ca with SMTP id 5614622812f47-3c9970cec87mr42351741b6e.39.1716318144330; Tue, 
21 May 2024 12:02:24 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6ab0f86c3adsm19446316d6.28.2024.05.21.12.02.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 May 2024 12:02:23 -0700 (PDT) Date: Tue, 21 May 2024 15:02:22 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v3 14/30] pack-bitmap: implement `bitmap_writer_has_bitmapped_object_id()` Message-ID: <99d2b6872baf5702aca74429c7c31b6fab5b1ec0.1716318089.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Prepare to implement pseudo-merge bitmap selection by implementing a necessary new function, `bitmap_writer_has_bitmapped_object_id()`. This function returns whether or not the bitmap_writer selected the given object ID for bitmapping. This will allow the pseudo-merge machinery to reject candidates for pseudo-merges if they have already been selected as an ordinary bitmap tip. 
Signed-off-by: Taylor Blau --- pack-bitmap-write.c | 6 ++++++ pack-bitmap.h | 2 ++ 2 files changed, 8 insertions(+) diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index 60eb1e71c98..299aa8af6f5 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -130,6 +130,12 @@ void bitmap_writer_build_type_index(struct bitmap_writer *writer, } } +int bitmap_writer_has_bitmapped_object_id(struct bitmap_writer *writer, + const struct object_id *oid) +{ + return kh_get_oid_map(writer->bitmaps, *oid) != kh_end(writer->bitmaps); +} + /** * Compute the actual bitmaps */ diff --git a/pack-bitmap.h b/pack-bitmap.h index 6937a0f090f..e175f28e0de 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -125,6 +125,8 @@ void bitmap_writer_build_type_index(struct bitmap_writer *writer, struct packing_data *to_pack, struct pack_idx_entry **index, uint32_t index_nr); +int bitmap_writer_has_bitmapped_object_id(struct bitmap_writer *writer, + const struct object_id *oid); uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, struct packing_data *mapping); int rebuild_bitmap(const uint32_t *reposition, From patchwork Tue May 21 19:02:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13669668 Received: from mail-qk1-f182.google.com (mail-qk1-f182.google.com [209.85.222.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E02AD149C76 for ; Tue, 21 May 2024 19:02:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318150; cv=none; b=itjh1NYSP/iSUqZ98kqLeJkhAy9OHWkUns67uVnTdX6LG1x2ORlSd6aKN9BmGTDjf8AdLiBW5RHTgZigGxXKnVYF72pVX+BleDS36xeH8BRZaKglWRUkoYQKMn6Q3Ur43Mv60tqKnEVCsHAuS2pUIW+6V/vOTq9CPXMvjqW+D+0= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1716318150; c=relaxed/simple; bh=zL05U430trYyComy5YdiFkSKQLavhhq4cnAd8gCqg8A=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=u/82JiX+TyUebUvDn+2oomwIt78BO1K1+luMv3/akwsoWweMtbVDrIf/7yCnudRr8AO01Sj20D7o0u5mfHr+FFPAyUA78v1rF2Ob7Y8+QR5MUSbYtLPNZZvMRAFiigZuADNSIZTz8WIVSIolBXObAyrWndYKWZyfvIoK9n3CPyQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=cpAp8WOH; arc=none smtp.client-ip=209.85.222.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="cpAp8WOH" Received: by mail-qk1-f182.google.com with SMTP id af79cd13be357-792b8ebc4eeso327781685a.1 for ; Tue, 21 May 2024 12:02:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1716318147; x=1716922947; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=5LVBX3huhz0FjivVPIfYgyVZqnbwrn+QLqnbungBDDY=; b=cpAp8WOHD08Axn2DBD15qpzMx/c8Q5epLWXd7Y5WfM/950pc8BSD7nr0ayFN4aEAkY JiM5Z8nPPwJNNHRuYnVF4ugwDNnculKP/fmySruB/IttAPkIwj5q9SyH3yjPiNZo+i6L arjAjXpa1x2IcjcR4XXMORQFRImOSL7avWynovrfQOLzGGX6hk9+i3CBZV8D7Ef/Uc/l z5r+U+Y623bbzkmu1AN46V+VoCTcJz8fqo81ouSmEUBu3LK1ShnPKOiRro+JfZIZph0M Zz05W8N4dleukzpshwhheVTmHBmaLMBpN8sK7X0UPL3+NLxhyojuA1cAqu77/vOuGgFr 0mQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20230601; t=1716318147; x=1716922947; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=5LVBX3huhz0FjivVPIfYgyVZqnbwrn+QLqnbungBDDY=; b=FgMhmvduDuX2lr3zL8Bg7PaIh51eNLdyy/wzW8ta4o9L+nGPVafv7AaAky73tJfYKj Glr/fG/s5ngrt1t1nakzOQT8NqmeeVzzYxQ3Y87dNbjL5MOQ0Yd3V+NpkKfAo8JKN2Ge UryAn8mnyTgM7lw6GLQqSbas7vpX6O6TtJaRe3tU/zqxusZnyym2M9X+6nHAOP79LRQc srw7AXDMvMRARjfhsIAVr1WRJGkW3WT/RNB+AgEa0OjCfuh0f/D+hnaUEv69XkDf+Jkr 73NXxt1FFPUW8j0HsYkqs1Hb1i2kVCLuC7FAUPjhUpGPveAFYkMuTa67fl6U+JN5z60d Py4Q== X-Gm-Message-State: AOJu0Yy+yTHFnSO37W0lN0hgDKfbKnH8BmCcawtLPRZYsa5QmPCL5NVk dqQ/mDiem/UrB2y+eHM2iCHI53h1gvbioF2x8fBUFDfXZ0ofyn3ZQX240qYzB4KoZAiJiZw5mlL U X-Google-Smtp-Source: AGHT+IF6Ur8LVPX6MP40q9VCIxjlrLCZTtg95Si4aZspDnikJI5elZcYYVk111qQEWW/TKpzYrPRug== X-Received: by 2002:a05:620a:118a:b0:790:6aa4:d068 with SMTP id af79cd13be357-792c75763b0mr3498910785a.15.1716318147490; Tue, 21 May 2024 12:02:27 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-792bf27fdf8sm1313638385a.44.2024.05.21.12.02.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 May 2024 12:02:27 -0700 (PDT) Date: Tue, 21 May 2024 15:02:26 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v3 15/30] pack-bitmap: make `bitmap_writer_push_bitmapped_commit()` public Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The pseudo-merge selection code will be added in a subsequent commit, and will need a way to push the allocated commit structures into the bitmap writer from a separate compilation unit. 
Make the `bitmap_writer_push_bitmapped_commit()` function part of the pack-bitmap.h header in order to make this possible. Signed-off-by: Taylor Blau --- pack-bitmap-write.c | 9 ++++----- pack-bitmap.h | 2 ++ 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index 299aa8af6f5..bc19b33ad16 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -140,9 +140,8 @@ int bitmap_writer_has_bitmapped_object_id(struct bitmap_writer *writer, * Compute the actual bitmaps */ -static inline void push_bitmapped_commit(struct bitmap_writer *writer, - struct commit *commit, - unsigned pseudo_merge) +void bitmap_writer_push_commit(struct bitmap_writer *writer, + struct commit *commit, unsigned pseudo_merge) { if (writer->selected_nr >= writer->selected_alloc) { writer->selected_alloc = (writer->selected_alloc + 32) * 2; @@ -664,7 +663,7 @@ void bitmap_writer_select_commits(struct bitmap_writer *writer, if (indexed_commits_nr < 100) { for (i = 0; i < indexed_commits_nr; ++i) - push_bitmapped_commit(writer, indexed_commits[i], 0); + bitmap_writer_push_commit(writer, indexed_commits[i], 0); return; } @@ -697,7 +696,7 @@ void bitmap_writer_select_commits(struct bitmap_writer *writer, } } - push_bitmapped_commit(writer, chosen, 0); + bitmap_writer_push_commit(writer, chosen, 0); i += next + 1; display_progress(writer->progress, i); diff --git a/pack-bitmap.h b/pack-bitmap.h index e175f28e0de..a7e2f56c971 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -127,6 +127,8 @@ void bitmap_writer_build_type_index(struct bitmap_writer *writer, uint32_t index_nr); int bitmap_writer_has_bitmapped_object_id(struct bitmap_writer *writer, const struct object_id *oid); +void bitmap_writer_push_commit(struct bitmap_writer *writer, + struct commit *commit, unsigned pseudo_merge); uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, struct packing_data *mapping); int rebuild_bitmap(const uint32_t *reposition, From patchwork Tue May 21 
19:02:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13669669 Received: from mail-oa1-f44.google.com (mail-oa1-f44.google.com [209.85.160.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3499D1494B7 for ; Tue, 21 May 2024 19:02:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.44 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318153; cv=none; b=TM9KilQDm4PZ4QMKXAEv1Ugi3QvV/gjqBrVZ/RKmriFfUvOc6VlJo1pS1pMwMa90B7aF8i7pYrvjrTVpyJjXMAS6bs0UB7Bz/dsbeCeVCNx1j2ETPpeiygFOddTs3Es7TKcYTRjsBnF09+nwN6lbdLFqGlWaOw2hUeA1tHu3uUk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318153; c=relaxed/simple; bh=8hHu9KwFLQ7vz7SK56W3iSbAgdPyyqvkExomSPkkEBY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=YbIB+Nxg+vb73QX8jgnNCbxD3nJIqj9maxNZXolQvXv2y+csmf3ypWl7PxlgASZcqwHwQbj4Cur0MMyI8yqCyMTFI66eJr0PQbvWS/SHXMsVI8lC/1Yz6R2htm1SNxEyk56RZO4vAbSe462elTgCa8ipJRBYZ6P0Q85LlpZwc+I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=Y+3p9uaO; arc=none smtp.client-ip=209.85.160.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="Y+3p9uaO" Received: by mail-oa1-f44.google.com 
with SMTP id 586e51a60fabf-23f0d54c5ffso2484227fac.1 for ; Tue, 21 May 2024 12:02:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1716318151; x=1716922951; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=uUPYn6RxoH8N1CxZ5+akeynFHJMEMPuogtfmSZQvq2U=; b=Y+3p9uaOJ/C2THwHgGB2UlmHcAu6CQgrMH19awaONCuyG67bNTx61eqQswhPW6PyNg imlTzu9aO4YFBbnF7qH/3FdnsbP8vW2+XTrV+jQufhZaJ4BqwKQQ5FyLYwJXZZSoeXQW CGHJt8uVphaxby7Aw7nPtqB8igZJq8owGwuoDKzBoSaCSdAFDC2TPkh2p5WhYoTDm+Hi BkkeaMlmAXglMKsfFNELkJFDn2TBOxPN6HgH/oS/GrfuRs4Ir4tTpGAMheBzTkrZefQj LM9cHGfkLQ2/gl/LIQRm0pbsk+/oAXTjSKBd6JZwT5tNSdVMXxWpU0OZ6VdKXrSewtzE 980Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716318151; x=1716922951; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=uUPYn6RxoH8N1CxZ5+akeynFHJMEMPuogtfmSZQvq2U=; b=C1VJKJKwvM0aEAaJ4xu1If3e86bb3qdj/0ht0fJj9SQFYLF50Ijy9OmAUiStTM1w0p mFDesrYbL3pupqVTIO8kJPKVB/o036REQ9xkUbSENryqnam+c3goApo/G5KPiV08vPBD e7rWYSV0sxqkPNMTfE0cpGr1cOwryVnjYsUtB7ZOP6Eo1Lh2NFH7jf2x11uK2anm4wOz toAQo8oWXmvTB3nK5ScXtSmiXhvKz9VE4Nw6VaadJJHZk+wcAaS7TyjN6kwyJAPc81Ax bewZtSIY0dzDbUoTh+a+QNz5QaEssRGr1kVs0G8UrqL5yHfG35dTbL9RdRxWv26fB7rV 3qhg== X-Gm-Message-State: AOJu0Ywy/CR4+fe8A8GaSoMqqS6odlnszCPQUDVOtRFZ3QZtrSqpFHkF +Z1oknhFyNlQuMaj5WGv/zuuEvzWmjLMMAq/KmKwykAT51uFXSjOO8F0FuhgE3RIEE0ZOtP0Ya+ b X-Google-Smtp-Source: AGHT+IHO4ISwX7n3blXsLdP1YVlwvKeDpwEN8l9ktJLJT6iCa52WEOhgBeWf4o40J6wpb6bC2ZnCfg== X-Received: by 2002:a05:6870:e40f:b0:221:bf34:b15f with SMTP id 586e51a60fabf-24c68b2732bmr10923fac.25.1716318150822; Tue, 21 May 2024 12:02:30 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6aa10d89780sm25894896d6.94.2024.05.21.12.02.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 May 2024 12:02:30 -0700 (PDT) Date: Tue, 21 May 2024 15:02:29 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v3 16/30] config: introduce git_config_float() Message-ID: <3070135eb4b9bd16117e82f1817c112c56a24b55.1716318089.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Future commits will want to parse a floating point value from configuration, but we have no way to parse such a value prior to this patch. The core of the routine is implemented in git_parse_float(). Unlike git_parse_unsigned() and git_parse_signed(), however, the function implemented here only works on type "float", and not related types like "double", or "long double". This is because "double" and "long double" use different functions to convert from ASCII strings to floating point values (strtod() and strtold(), respectively). Likewise, there is no pointer type that can assign to any of these values (except for "void *"), so the only way to define this trio of functions would be with a macro expansion that is parameterized over the floating point type and conversion function. That is all doable, but likely to be overkill given our current needs, which is only to parse floats. 
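For reference, the macro-based alternative described above might look roughly like the following sketch. It is illustrative only and not part of this patch: the `DEFINE_PARSE_FP` macro and the `sketch_parse_*` names are hypothetical, and unit-suffix handling is left out, so trailing characters are simply rejected.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical macro that stamps out one parser per floating point
 * type. NAME is the generated function, TYPE the floating point type,
 * and CONV the matching libc conversion function. Each generated
 * function returns 1 on success and 0 (with errno set) on failure.
 */
#define DEFINE_PARSE_FP(NAME, TYPE, CONV) \
	static int NAME(const char *value, TYPE *ret) \
	{ \
		char *end; \
		TYPE val; \
		if (!value || !*value) { \
			errno = EINVAL; \
			return 0; \
		} \
		errno = 0; \
		val = CONV(value, &end); \
		if (errno == ERANGE) \
			return 0; \
		if (end == value || *end) { \
			errno = EINVAL; \
			return 0; \
		} \
		*ret = val; \
		return 1; \
	}

/* One instantiation per type, each paired with its own converter. */
DEFINE_PARSE_FP(sketch_parse_float, float, strtof)
DEFINE_PARSE_FP(sketch_parse_double, double, strtod)
DEFINE_PARSE_FP(sketch_parse_long_double, long double, strtold)
```

The macro has to take both the type and the conversion function because no single pointer type (other than "void *") can stand in for all three destinations, which is the extra machinery this patch avoids by supporting only "float".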
Signed-off-by: Taylor Blau --- config.c | 9 +++++++++ config.h | 6 ++++++ parse.c | 29 +++++++++++++++++++++++++++++ parse.h | 1 + 4 files changed, 45 insertions(+) diff --git a/config.c b/config.c index 77a0fd2d80e..ee681fda34b 100644 --- a/config.c +++ b/config.c @@ -1243,6 +1243,15 @@ ssize_t git_config_ssize_t(const char *name, const char *value, return ret; } +float git_config_float(const char *name, const char *value, + const struct key_value_info *kvi) +{ + float ret; + if (!git_parse_float(value, &ret)) + die_bad_number(name, value, kvi); + return ret; +} + static const struct fsync_component_name { const char *name; enum fsync_component component_bits; diff --git a/config.h b/config.h index f4966e37494..b0d1baba95a 100644 --- a/config.h +++ b/config.h @@ -261,6 +261,12 @@ unsigned long git_config_ulong(const char *, const char *, ssize_t git_config_ssize_t(const char *, const char *, const struct key_value_info *); +/** + * Identical to `git_config_int`, but for floating point values. + */ +float git_config_float(const char *, const char *, + const struct key_value_info *); + /** * Same as `git_config_bool`, except that integers are returned as-is, and * an `is_bool` flag is unset. 
diff --git a/parse.c b/parse.c index 42d691a0fbb..a5967e80910 100644 --- a/parse.c +++ b/parse.c @@ -125,6 +125,35 @@ int git_parse_ssize_t(const char *value, ssize_t *ret) return 1; } +int git_parse_float(const char *value, float *ret) +{ + char *end; + float val; + uintmax_t factor; + + if (!value || !*value) { + errno = EINVAL; + return 0; + } + + errno = 0; + val = strtof(value, &end); + if (errno == ERANGE) + return 0; + if (end == value) { + errno = EINVAL; + return 0; + } + factor = get_unit_factor(end); + if (!factor) { + errno = EINVAL; + return 0; + } + val *= factor; + *ret = val; + return 1; +} + int git_parse_maybe_bool_text(const char *value) { if (!value) diff --git a/parse.h b/parse.h index 07d2193d698..7df82c5f5b8 100644 --- a/parse.h +++ b/parse.h @@ -6,6 +6,7 @@ int git_parse_ssize_t(const char *, ssize_t *); int git_parse_ulong(const char *, unsigned long *); int git_parse_int(const char *value, int *ret); int git_parse_int64(const char *value, int64_t *ret); +int git_parse_float(const char *value, float *ret); /** * Same as `git_config_bool`, except that it returns -1 on error rather From patchwork Tue May 21 19:02:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13669670 Received: from mail-qv1-f47.google.com (mail-qv1-f47.google.com [209.85.219.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 03D8514884B for ; Tue, 21 May 2024 19:02:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318158; cv=none; b=WLMIcHcFJ3ofhxqUPGWoYRXNyqN8QA6jpMYnqoBTb5c1UvVDEGA0f7M7zJhdoWLCsy/bBpQU0/boxz4cUsKcHunXq3RWYmC/4WjpEtZ8NOr6KKKdm+bOKg+i/kFRbbwwg3R7ZOgiWAlsl5zZt2D2Pxh7dRrsdjh7GG44QEFYNaA= ARC-Message-Signature: i=1; 
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318158; c=relaxed/simple; bh=vWk/sEfgQPIvzCfgQg2bp3QYsFa9ak1EB3QfHq9BDE8=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=m+RBiDk2KaUsKd8Y8NWTlRk22JJSjowNNHQOBHlJcK0UnzCb3PNRkRZU/oYuu3UVOHz4G/uELYz62gZ39SZ7i/GjkMHUTCgRJq1mkmHWfnlBLjdZ9KlsbLgbzCcLZvseljtEb/IkZCMYM0xQmKUelQzWLcxgEVec2zkTurJc3SM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=PDBbxumt; arc=none smtp.client-ip=209.85.219.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="PDBbxumt" Received: by mail-qv1-f47.google.com with SMTP id 6a1803df08f44-6a3652a732fso2612766d6.3 for ; Tue, 21 May 2024 12:02:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1716318154; x=1716922954; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=/EiGYhQfKLO+QHFrbYO3o52QrEoAbeU2UaKTxFFH/sU=; b=PDBbxumtuaNuM99zKMzoL23pl7b3o37W5CXbabylWHe+6nxYZY9CG6iY1NJF124WAb fcNU5xejJy8EAS3ZXTeecvhFfNAn71drNrsacYw4FTeWD1iIRQRX+7x+H9UdETis9/wx Rrva3hq1tFGxowf7vQZoHqkDh/hIJnkfGEydY6rGc8lylEq7p7ItI44yrfjxM/lUSnch vlm4QRrdy2LcLI3UoRvAne4dkD8cnOEPRX0RkmLKC8kxc5wDFaGNNPam6UsAKE8j5pHo qDlaPUbsmL1/KI7rresrNqpd23dCVCuXRsIq/LXoQka5UKYR3hvBUVmKQN1JW5Gd0nQo kohA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716318154; x=1716922954; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=/EiGYhQfKLO+QHFrbYO3o52QrEoAbeU2UaKTxFFH/sU=; b=oTKfws48l87pkNOrMEcPHsP9MEKj8tcETU4vuOAR0jEoWarRojUrkFBFmIJjJkoFJ1 S5sztB/DMU+MF0XN7VP59LEMiR1ONG7c7AGXMS7i07VzaKO6XuRUXxBjL4TBPX8RRFiD gUHviv3ogY/5RZtvCtFD9l2i5Fjch3oQwcB8kYa09PaZ8EFthrbmHi8U8jITdoZ2Mha1 FF836/OjEej5YkyJZapCKLOvmZPySMRUtiv90BPHYXK19MDHsOHjoGIT8ziav5CzNhdn CxYoNfU3U5z+76//9btqM0yJ90uH0wbhNoJhyWAZuiCRqK2pY6H0z5OGjiuA889Ig1Df W7oA== X-Gm-Message-State: AOJu0Yyjj9HoOkOb2GwGF9zVZQJRa9nVjH/Y+DttR1gRaSaBMPMoX3fn hg/FxQ5vRt8fVaEw8WgnEDONVUF9e4EdlrvMufQJk8Ep+DZ0O3MCbm7BMy5tfJfQZkUzNJflGn1 M X-Google-Smtp-Source: AGHT+IEIQsQ6fGNfknHxj5FdW9sozQOjC38izDnJDeQgioRcPqba9gEzPJc8A/FdGSTbr2Mgd/SV5A== X-Received: by 2002:a05:6214:3b87:b0:6a9:7a39:c115 with SMTP id 6a1803df08f44-6ab7067e981mr38100966d6.26.1716318153988; Tue, 21 May 2024 12:02:33 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6a15f1d6f30sm125636386d6.110.2024.05.21.12.02.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 May 2024 12:02:33 -0700 (PDT) Date: Tue, 21 May 2024 15:02:32 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v3 17/30] pseudo-merge: implement support for selecting pseudo-merge commits Message-ID: <3029473c094b3edf51828b7a1d1acfc8e959ece6.1716318089.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Teach the new pseudo-merge machinery how to select non-bitmapped commits for inclusion in different pseudo-merge group(s) based on a handful of criteria. 
Note that the selected pseudo-merge commits aren't actually used or written anywhere yet. This will be done in the following commit. Signed-off-by: Taylor Blau --- Documentation/config.txt | 2 + Documentation/config/bitmap-pseudo-merge.txt | 90 ++++ Documentation/gitpacking.txt | 83 ++++ pack-bitmap-write.c | 21 + pack-bitmap.h | 2 + pseudo-merge.c | 454 +++++++++++++++++++ pseudo-merge.h | 94 ++++ 7 files changed, 746 insertions(+) create mode 100644 Documentation/config/bitmap-pseudo-merge.txt diff --git a/Documentation/config.txt b/Documentation/config.txt index 6f649c997c0..caa34311214 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -384,6 +384,8 @@ include::config/apply.txt[] include::config/attr.txt[] +include::config/bitmap-pseudo-merge.txt[] + include::config/blame.txt[] include::config/branch.txt[] diff --git a/Documentation/config/bitmap-pseudo-merge.txt b/Documentation/config/bitmap-pseudo-merge.txt new file mode 100644 index 00000000000..d4a2023b84a --- /dev/null +++ b/Documentation/config/bitmap-pseudo-merge.txt @@ -0,0 +1,90 @@ +NOTE: The configuration options in `bitmapPseudoMerge.*` are considered +EXPERIMENTAL and may be subject to change or be removed entirely in the +future. + +bitmapPseudoMerge.<name>.pattern:: + Regular expression used to match reference names. Commits + pointed to by references matching this pattern (and meeting + the below criteria, like `bitmapPseudoMerge.<name>.sampleRate` + and `bitmapPseudoMerge.<name>.threshold`) will be considered + for inclusion in a pseudo-merge bitmap. ++ +Commits are grouped into pseudo-merge groups based on whether or not +any reference(s) that point at a given commit match the pattern, which +is an extended regular expression. ++ +Within a pseudo-merge group, commits may be further grouped into +sub-groups based on the capture groups in the pattern.
These +sub-groupings are formed from the regular expressions by concatenating +any capture groups from the regular expression, with a '-' dash in +between. ++ +For example, if the pattern is `refs/tags/`, then all tags (provided +they meet the below criteria) will be considered candidates for the +same pseudo-merge group. However, if the pattern is instead +`refs/remotes/([0-9]+)/tags/`, then tags from different remotes will +be grouped into separate pseudo-merge groups, based on the remote +number. + +bitmapPseudoMerge.<name>.decay:: + Determines the rate at which consecutive pseudo-merge bitmap + groups decrease in size. Must be non-negative. This parameter + can be thought of as `k` in the function `f(n) = C * n^-k`, + where `f(n)` is the size of the `n`th group. ++ +Setting the decay rate equal to `0` will cause all groups to be the +same size. Setting the decay rate equal to `1` will cause the `n`th +group to be `1/n` the size of the initial group. Higher values of the +decay rate cause consecutive groups to shrink at an increasing rate. +The default is `1`. ++ +If all groups are the same size, it is possible that groups containing +newer commits will be able to be used less often than earlier groups, +since it is more likely that the references pointing at newer commits +will be updated more often than a reference pointing at an old commit. + +bitmapPseudoMerge.<name>.sampleRate:: + Determines the proportion of non-bitmapped commits (among + reference tips) which are selected for inclusion in an + unstable pseudo-merge bitmap. Must be between `0` and `1` + (inclusive). The default is `1`. + +bitmapPseudoMerge.<name>.threshold:: + Determines the minimum age of non-bitmapped commits (among + reference tips, as above) which are candidates for inclusion + in an unstable pseudo-merge bitmap. The default is + `1.week.ago`. + +bitmapPseudoMerge.<name>.maxMerges:: + Determines the maximum number of pseudo-merge commits among + which commits may be distributed.
++ +For pseudo-merge groups whose pattern does not contain any capture +groups, this setting is applied for all commits matching the regular +expression. For patterns that have one or more capture groups, this +setting is applied for each distinct capture group. ++ +For example, if your pattern is `refs/tags/`, then this setting +will distribute all tags into a maximum of `maxMerges` pseudo-merge +commits. However, if your pattern is, say, +`refs/remotes/([0-9]+)/tags/`, then this setting will be applied to +each remote's set of tags individually. ++ +Must be non-negative. The default value is 64. + +bitmapPseudoMerge.<name>.stableThreshold:: + Determines the minimum age of commits (among reference tips, + as above, however stable commits are still considered + candidates even when they have been covered by a bitmap) which + are candidates for a stable pseudo-merge bitmap. The default + is `1.month.ago`. ++ +Setting this threshold to a smaller value (e.g., 1.week.ago) will cause +more stable groups to be generated (which impose a one-time generation +cost) but those groups will likely become stale over time. Using a +larger value incurs the opposite penalty (fewer stable groups which are +more useful). + +bitmapPseudoMerge.<name>.stableSize:: + Determines the size (in number of commits) of a stable + pseudo-merge bitmap. The default is `512`. diff --git a/Documentation/gitpacking.txt b/Documentation/gitpacking.txt index ff18077129b..1ed645ff910 100644 --- a/Documentation/gitpacking.txt +++ b/Documentation/gitpacking.txt @@ -93,6 +93,89 @@ can take advantage of the fact that we only care about the union of objects reachable from all of those tags, and answer the query much faster. +=== Configuration + +Reference tips are grouped into different pseudo-merge groups according +to two criteria. A reference name matches one or more of the defined +pseudo-merge patterns, and optionally one or more capture groups within +that pattern which further partition the group.
+ +Within a group, commits may be considered "stable" or "unstable" +depending on their age. These are adjusted by setting the +`bitmapPseudoMerge.<name>.stableThreshold` and +`bitmapPseudoMerge.<name>.threshold` configuration values, respectively. + +All stable commits are grouped into pseudo-merges of equal size +(`bitmapPseudoMerge.<name>.stableSize`). If the `stableSize` +configuration is set to, say, 100, then the first 100 commits (ordered +by committer date) which are older than the `stableThreshold` value will +form one group, the next 100 commits will form another group, and so on. + +Among unstable commits, the pseudo-merge machinery will attempt to +combine older commits into large groups as opposed to newer commits +which will appear in smaller groups. This is based on the heuristic that +references whose tip commit is older are less likely to be modified to +point at a different commit than a reference whose tip commit is newer. + +The size of groups is determined by a power-law decay function, and the +decay parameter roughly corresponds to "k" in `f(n) = C*n^(-k)`, +where `f(n)` describes the size of the `n`-th pseudo-merge group. The +sample rate controls what proportion of eligible commits are considered +as candidates. The threshold parameter indicates the minimum age (so as +to avoid including too-recent commits in a pseudo-merge group, making it +less likely to be valid). The "maxMerges" parameter sets an upper-bound +on the number of pseudo-merge commits an individual group may contain. + +The "stable"-related parameters control "stable" pseudo-merge groups, +comprised of a fixed number of commits which are older than the +configured "stable threshold" value and may be grouped together in +chunks of "stableSize" in order of age.
+ +The exact configuration for pseudo-merges is as follows: + +include::config/bitmap-pseudo-merge.txt[] + +=== Examples + +Suppose that you have a repository with a large number of references, +and you want a bare-bones configuration of pseudo-merge bitmaps that +will enhance bitmap coverage of the `refs/` namespace. You may start +with a configuration like so: + + [bitmapPseudoMerge "all"] + pattern = "refs/" + threshold = now + stableThreshold = never + sampleRate = 1 + maxMerges = 64 + +This will create pseudo-merge bitmaps for all references, regardless of +their age, and group them into at most 64 pseudo-merge commits. + +If you wanted to separate tags from branches when generating +pseudo-merge commits, you would instead define the pattern with a +capture group, like so: + + [bitmapPseudoMerge "all"] + pattern = "refs/(heads|tags)/" + +Suppose instead that you are working in a fork-network repository, with +each fork specified by some numeric ID, and whose refs reside in +`refs/virtual/NNN/` (where `NNN` is the numeric ID corresponding to some +fork) in the network. In this instance, you may instead write something +like: + + [bitmapPseudoMerge "all"] + pattern = "refs/virtual/([0-9]+)/(heads|tags)/" + threshold = now + stableThreshold = never + sampleRate = 1 + maxMerges = 64 + +Which would generate pseudo-merge group identifiers like "1234-heads", +and "5678-tags" (for branches in fork "1234", and tags in fork "5678", +respectively).
+ SEE ALSO -------- linkgit:git-pack-objects[1] diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index bc19b33ad16..d5884ea5e9c 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -17,6 +17,7 @@ #include "trace2.h" #include "tree.h" #include "tree-walk.h" +#include "pseudo-merge.h" struct bitmapped_commit { struct commit *commit; @@ -39,11 +40,25 @@ void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r) if (writer->bitmaps) BUG("bitmap writer already initialized"); writer->bitmaps = kh_init_oid_map(); + writer->pseudo_merge_commits = kh_init_oid_map(); + + string_list_init_dup(&writer->pseudo_merge_groups); + + load_pseudo_merges_from_config(&writer->pseudo_merge_groups); +} + +static void free_pseudo_merge_commit_idx(struct pseudo_merge_commit_idx *idx) +{ + if (!idx) + return; + free(idx->pseudo_merge); + free(idx); } void bitmap_writer_free(struct bitmap_writer *writer) { uint32_t i; + struct pseudo_merge_commit_idx *idx; if (!writer) return; @@ -55,6 +70,10 @@ void bitmap_writer_free(struct bitmap_writer *writer) kh_destroy_oid_map(writer->bitmaps); + kh_foreach_value(writer->pseudo_merge_commits, idx, + free_pseudo_merge_commit_idx(idx)); + kh_destroy_oid_map(writer->pseudo_merge_commits); + for (i = 0; i < writer->selected_nr; i++) { struct bitmapped_commit *bc = &writer->selected[i]; if (bc->write_as != bc->bitmap) @@ -703,6 +722,8 @@ void bitmap_writer_select_commits(struct bitmap_writer *writer, } stop_progress(&writer->progress); + + select_pseudo_merges(writer, indexed_commits, indexed_commits_nr); } diff --git a/pack-bitmap.h b/pack-bitmap.h index a7e2f56c971..1e730ea1e54 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -110,6 +110,8 @@ struct bitmap_writer { struct bitmapped_commit *selected; unsigned int selected_nr, selected_alloc; + struct string_list pseudo_merge_groups; + kh_oid_map_t *pseudo_merge_commits; /* oid -> pseudo merge(s) */ uint32_t pseudo_merges_nr; struct progress *progress; diff --git 
a/pseudo-merge.c b/pseudo-merge.c index 37e037ba272..4be730563eb 100644 --- a/pseudo-merge.c +++ b/pseudo-merge.c @@ -1,2 +1,456 @@ #include "git-compat-util.h" #include "pseudo-merge.h" +#include "date.h" +#include "oid-array.h" +#include "strbuf.h" +#include "config.h" +#include "string-list.h" +#include "refs.h" +#include "pack-bitmap.h" +#include "commit.h" +#include "alloc.h" +#include "progress.h" + +#define DEFAULT_PSEUDO_MERGE_DECAY 1.0f +#define DEFAULT_PSEUDO_MERGE_MAX_MERGES 64 +#define DEFAULT_PSEUDO_MERGE_SAMPLE_RATE 1 +#define DEFAULT_PSEUDO_MERGE_THRESHOLD approxidate("1.week.ago") +#define DEFAULT_PSEUDO_MERGE_STABLE_THRESHOLD approxidate("1.month.ago") +#define DEFAULT_PSEUDO_MERGE_STABLE_SIZE 512 + +static float gitexp(float base, int exp) +{ + float result = 1; + while (1) { + if (exp % 2) + result *= base; + exp >>= 1; + if (!exp) + break; + base *= base; + } + return result; +} + +static uint32_t pseudo_merge_group_size(const struct pseudo_merge_group *group, + const struct pseudo_merge_matches *matches, + uint32_t i) +{ + float C = 0.0f; + uint32_t n; + + /* + * The size of pseudo-merge groups decays according to a power series, + * which looks like: + * + * f(n) = C * n^-k + * + * , where 'n' is the n-th pseudo-merge group, 'f(n)' is its size, 'k' + * is the decay rate, and 'C' is a scaling value. + * + * The value of C depends on the number of groups, decay rate, and total + * number of commits. It is computed such that if there are M and N + * total groups and commits, respectively, that: + * + * N = f(0) + f(1) + ... 
f(M-1) + * + * Rearranging to isolate C, we get: + * + * N = \sum_{n=1}^M C / n^k + * + * N / C = \sum_{n=1}^M n^-k + * + * C = N / \sum_{n=1}^M n^-k + * + * For example, if we have a decay rate of 'k' being equal to 1.5, 'N' + * total commits equal to 10,000, and 'M' being equal to 6 groups, then + * the (rounded) group sizes are: + * + * { 5469, 1934, 1053, 684, 489, 372 } + * + * increasing the number of total groups, say to 10, scales the group + * sizes appropriately: + * + * { 5012, 1772, 964, 626, 448, 341, 271, 221, 186, 158 } + */ + for (n = 0; n < group->max_merges; n++) + C += 1.0f / gitexp(n + 1, group->decay); + C = matches->unstable_nr / C; + + return (uint32_t)((C / gitexp(i + 1, group->decay)) + 0.5); +} + +static void pseudo_merge_group_init(struct pseudo_merge_group *group) +{ + memset(group, 0, sizeof(struct pseudo_merge_group)); + + strmap_init_with_options(&group->matches, NULL, 0); + + group->decay = DEFAULT_PSEUDO_MERGE_DECAY; + group->max_merges = DEFAULT_PSEUDO_MERGE_MAX_MERGES; + group->sample_rate = DEFAULT_PSEUDO_MERGE_SAMPLE_RATE; + group->threshold = DEFAULT_PSEUDO_MERGE_THRESHOLD; + group->stable_threshold = DEFAULT_PSEUDO_MERGE_STABLE_THRESHOLD; + group->stable_size = DEFAULT_PSEUDO_MERGE_STABLE_SIZE; +} + +static int pseudo_merge_config(const char *var, const char *value, + const struct config_context *ctx, + void *cb_data) +{ + struct string_list *list = cb_data; + struct string_list_item *item; + struct pseudo_merge_group *group; + struct strbuf buf = STRBUF_INIT; + const char *sub, *key; + size_t sub_len; + int ret = 0; + + if (parse_config_key(var, "bitmappseudomerge", &sub, &sub_len, &key)) + goto done; + + if (!sub_len) + goto done; + + strbuf_add(&buf, sub, sub_len); + + item = string_list_lookup(list, buf.buf); + if (!item) { + item = string_list_insert(list, buf.buf); + + item->util = xmalloc(sizeof(struct pseudo_merge_group)); + pseudo_merge_group_init(item->util); + } + + group = item->util; + + if (!strcmp(key, 
"pattern")) { + struct strbuf re = STRBUF_INIT; + + free(group->pattern); + if (*value != '^') + strbuf_addch(&re, '^'); + strbuf_addstr(&re, value); + + group->pattern = xcalloc(1, sizeof(regex_t)); + if (regcomp(group->pattern, re.buf, REG_EXTENDED)) + die(_("failed to load pseudo-merge regex for %s: '%s'"), + sub, re.buf); + + strbuf_release(&re); + } else if (!strcmp(key, "decay")) { + group->decay = git_config_float(var, value, ctx->kvi); + if (group->decay < 0) { + warning(_("%s must be non-negative, using default"), var); + group->decay = DEFAULT_PSEUDO_MERGE_DECAY; + } + } else if (!strcmp(key, "samplerate")) { + group->sample_rate = git_config_float(var, value, ctx->kvi); + if (!(0 <= group->sample_rate && group->sample_rate <= 1)) { + warning(_("%s must be between 0 and 1, using default"), var); + group->sample_rate = DEFAULT_PSEUDO_MERGE_SAMPLE_RATE; + } + } else if (!strcmp(key, "threshold")) { + if (git_config_expiry_date(&group->threshold, var, value)) { + ret = -1; + goto done; + } + } else if (!strcmp(key, "maxmerges")) { + group->max_merges = git_config_int(var, value, ctx->kvi); + if (group->max_merges < 0) { + warning(_("%s must be non-negative, using default"), var); + group->max_merges = DEFAULT_PSEUDO_MERGE_MAX_MERGES; + } + } else if (!strcmp(key, "stablethreshold")) { + if (git_config_expiry_date(&group->stable_threshold, var, value)) { + ret = -1; + goto done; + } + } else if (!strcmp(key, "stablesize")) { + group->stable_size = git_config_int(var, value, ctx->kvi); + if (group->stable_size <= 0) { + warning(_("%s must be positive, using default"), var); + group->stable_size = DEFAULT_PSEUDO_MERGE_STABLE_SIZE; + } + } + +done: + strbuf_release(&buf); + + return ret; +} + +void load_pseudo_merges_from_config(struct string_list *list) +{ + struct string_list_item *item; + + git_config(pseudo_merge_config, list); + + for_each_string_list_item(item, list) { + struct pseudo_merge_group *group = item->util; + if (!group->pattern) + 
die(_("pseudo-merge group '%s' missing required pattern"), + item->string); + if (group->threshold < group->stable_threshold) + die(_("pseudo-merge group '%s' has unstable threshold " + "before stable one"), item->string); + } +} + +static int find_pseudo_merge_group_for_ref(const char *refname, + const struct object_id *oid, + int flags UNUSED, + void *_data) +{ + struct bitmap_writer *writer = _data; + struct object_id peeled; + struct commit *c; + uint32_t i; + int has_bitmap; + + if (!peel_iterated_oid(oid, &peeled)) + oid = &peeled; + + c = lookup_commit(the_repository, oid); + if (!c) + return 0; + + has_bitmap = bitmap_writer_has_bitmapped_object_id(writer, oid); + + for (i = 0; i < writer->pseudo_merge_groups.nr; i++) { + struct pseudo_merge_group *group; + struct pseudo_merge_matches *matches; + struct strbuf group_name = STRBUF_INIT; + regmatch_t captures[16]; + size_t j; + + group = writer->pseudo_merge_groups.items[i].util; + if (regexec(group->pattern, refname, ARRAY_SIZE(captures), + captures, 0)) + continue; + + if (captures[ARRAY_SIZE(captures) - 1].rm_so != -1) + warning(_("pseudo-merge regex from config has too many capture " + "groups (max=%"PRIuMAX")"), + (uintmax_t)ARRAY_SIZE(captures) - 2); + + for (j = !!group->pattern->re_nsub; j < ARRAY_SIZE(captures); j++) { + regmatch_t *match = &captures[j]; + if (match->rm_so == -1) + continue; + + if (group_name.len) + strbuf_addch(&group_name, '-'); + + strbuf_add(&group_name, refname + match->rm_so, + match->rm_eo - match->rm_so); + } + + matches = strmap_get(&group->matches, group_name.buf); + if (!matches) { + matches = xcalloc(1, sizeof(*matches)); + strmap_put(&group->matches, strbuf_detach(&group_name, NULL), + matches); + } + + if (c->date <= group->stable_threshold) { + ALLOC_GROW(matches->stable, matches->stable_nr + 1, + matches->stable_alloc); + matches->stable[matches->stable_nr++] = c; + } else if (c->date <= group->threshold && !has_bitmap) { + ALLOC_GROW(matches->unstable, 
matches->unstable_nr + 1, + matches->unstable_alloc); + matches->unstable[matches->unstable_nr++] = c; + } + + strbuf_release(&group_name); + } + + return 0; +} + +static struct commit *push_pseudo_merge(struct pseudo_merge_group *group) +{ + struct commit *merge; + + ALLOC_GROW(group->merges, group->merges_nr + 1, group->merges_alloc); + + merge = alloc_commit_node(the_repository); + merge->object.parsed = 1; + merge->object.flags |= BITMAP_PSEUDO_MERGE; + + group->merges[group->merges_nr++] = merge; + + return merge; +} + +static struct pseudo_merge_commit_idx *pseudo_merge_idx(kh_oid_map_t *pseudo_merge_commits, + const struct object_id *oid) + +{ + struct pseudo_merge_commit_idx *pmc; + int hash_ret; + khiter_t hash_pos = kh_put_oid_map(pseudo_merge_commits, *oid, + &hash_ret); + + if (hash_ret) { + CALLOC_ARRAY(pmc, 1); + kh_value(pseudo_merge_commits, hash_pos) = pmc; + } else { + pmc = kh_value(pseudo_merge_commits, hash_pos); + } + + return pmc; +} + +#define MIN_PSEUDO_MERGE_SIZE 8 + +static void select_pseudo_merges_1(struct bitmap_writer *writer, + struct pseudo_merge_group *group, + struct pseudo_merge_matches *matches) +{ + uint32_t i, j; + uint32_t stable_merges_nr; + + if (!matches->stable_nr && !matches->unstable_nr) + return; /* all tips in this group already have bitmaps */ + + stable_merges_nr = matches->stable_nr / group->stable_size; + if (matches->stable_nr % group->stable_size) + stable_merges_nr++; + + /* make stable_merges_nr pseudo merges for stable commits */ + for (i = 0, j = 0; i < stable_merges_nr; i++) { + struct commit *merge; + struct commit_list **p; + + merge = push_pseudo_merge(group); + p = &merge->parents; + + /* + * For each pseudo-merge created above, add parents to the + * allocated commit node from the stable set of commits + * (un-bitmapped, newer than the stable threshold). 
+ */ + do { + struct commit *c; + struct pseudo_merge_commit_idx *pmc; + + if (j >= matches->stable_nr) + break; + + c = matches->stable[j++]; + /* + * Here and below, make sure that we keep our mapping of + * commits -> pseudo-merge(s) which include the key'd + * commit up-to-date. + */ + pmc = pseudo_merge_idx(writer->pseudo_merge_commits, + &c->object.oid); + + ALLOC_GROW(pmc->pseudo_merge, pmc->nr + 1, pmc->alloc); + + pmc->pseudo_merge[pmc->nr++] = writer->pseudo_merges_nr; + p = commit_list_append(c, p); + } while (j % group->stable_size); + + bitmap_writer_push_commit(writer, merge, 1); + writer->pseudo_merges_nr++; + } + + /* make up to group->max_merges pseudo merges for unstable commits */ + for (i = 0, j = 0; i < group->max_merges; i++) { + struct commit *merge; + struct commit_list **p; + uint32_t size, end; + + merge = push_pseudo_merge(group); + p = &merge->parents; + + size = pseudo_merge_group_size(group, matches, i); + end = size < MIN_PSEUDO_MERGE_SIZE ? matches->unstable_nr : j + size; + + /* + * For each pseudo-merge commit created above, add parents to + * the allocated commit node from the unstable set of commits + * (newer than the stable threshold). + * + * Account for the sample rate, since not every candidate from + * the set of stable commits will be included as a pseudo-merge + * parent. 
+ */ + for (; j < end && j < matches->unstable_nr; j++) { + struct commit *c = matches->unstable[j]; + struct pseudo_merge_commit_idx *pmc; + + if (j % (uint32_t)(1.0f / group->sample_rate)) + continue; + + pmc = pseudo_merge_idx(writer->pseudo_merge_commits, + &c->object.oid); + + ALLOC_GROW(pmc->pseudo_merge, pmc->nr + 1, pmc->alloc); + + pmc->pseudo_merge[pmc->nr++] = writer->pseudo_merges_nr; + p = commit_list_append(c, p); + } + + bitmap_writer_push_commit(writer, merge, 1); + writer->pseudo_merges_nr++; + if (end >= matches->unstable_nr) + break; + } +} + +static int commit_date_cmp(const void *va, const void *vb) +{ + timestamp_t a = (*(const struct commit **)va)->date; + timestamp_t b = (*(const struct commit **)vb)->date; + + if (a < b) + return -1; + else if (a > b) + return 1; + return 0; +} + +static void sort_pseudo_merge_matches(struct pseudo_merge_matches *matches) +{ + QSORT(matches->stable, matches->stable_nr, commit_date_cmp); + QSORT(matches->unstable, matches->unstable_nr, commit_date_cmp); +} + +void select_pseudo_merges(struct bitmap_writer *writer, + struct commit **commits, size_t commits_nr) +{ + struct progress *progress = NULL; + uint32_t i; + + if (!writer->pseudo_merge_groups.nr) + return; + + if (writer->show_progress) + progress = start_progress("Selecting pseudo-merge commits", + writer->pseudo_merge_groups.nr); + + for_each_ref(find_pseudo_merge_group_for_ref, writer); + + for (i = 0; i < writer->pseudo_merge_groups.nr; i++) { + struct pseudo_merge_group *group; + struct hashmap_iter iter; + struct strmap_entry *e; + + group = writer->pseudo_merge_groups.items[i].util; + strmap_for_each_entry(&group->matches, &iter, e) { + struct pseudo_merge_matches *matches = e->value; + + sort_pseudo_merge_matches(matches); + + select_pseudo_merges_1(writer, group, matches); + } + + display_progress(progress, i + 1); + } + + stop_progress(&progress); +} diff --git a/pseudo-merge.h b/pseudo-merge.h index cab8ff6960a..cab54daf14b 100644 --- 
a/pseudo-merge.h +++ b/pseudo-merge.h @@ -2,5 +2,99 @@ #define PSEUDO_MERGE_H #include "git-compat-util.h" +#include "strmap.h" +#include "khash.h" +#include "ewah/ewok.h" + +struct commit; +struct string_list; +struct bitmap_index; +struct bitmap_writer; + +/* + * A pseudo-merge group tracks the set of non-bitmapped reference tips + * that match the given pattern. + * + * Within those matches, they are further segmented by the pattern's + * capture groups, whose values are concatenated with '-' dash + * characters in between. + * + * Those groups are then ordered by committer date and partitioned + * into individual pseudo-merge(s) according to the decay, max_merges, + * sample_rate, and threshold parameters. + */ +struct pseudo_merge_group { + regex_t *pattern; + + /* capture group(s) -> struct pseudo_merge_matches */ + struct strmap matches; + + /* + * The individual pseudo-merge(s) that are generated from the + * above array of matches, partitioned according to the below + * parameters. + */ + struct commit **merges; + size_t merges_nr; + size_t merges_alloc; + + /* + * Pseudo-merge grouping parameters. See git-config(1) for + * more information. + */ + float decay; + int max_merges; + float sample_rate; + int stable_size; + timestamp_t threshold; + timestamp_t stable_threshold; +}; + +struct pseudo_merge_matches { + struct commit **stable; + struct commit **unstable; + size_t stable_nr, stable_alloc; + size_t unstable_nr, unstable_alloc; +}; + +/* + * Read the repository's configuration: + * + * - bitmapPseudoMerge.<name>.pattern + * - bitmapPseudoMerge.<name>.decay + * - bitmapPseudoMerge.<name>.sampleRate + * - bitmapPseudoMerge.<name>.threshold + * - bitmapPseudoMerge.<name>.maxMerges + * - bitmapPseudoMerge.<name>.stableThreshold + * - bitmapPseudoMerge.<name>.stableSize + * + * and populates the given `list` with pseudo-merge groups. String + * entry keys are the pseudo-merge group names, and the values are + * pointers to the pseudo_merge_group structure itself.
+ */ +void load_pseudo_merges_from_config(struct string_list *list); + +/* + * A pseudo-merge commit index (pseudo_merge_commit_idx) maps a + * particular (non-pseudo-merge) commit to the list of pseudo-merge(s) + * it appears in. + */ +struct pseudo_merge_commit_idx { + uint32_t *pseudo_merge; + size_t nr, alloc; +}; + +/* + * Selects pseudo-merges from a list of commits, populating the given + * string_list of pseudo-merge groups. + * + * Populates the pseudo_merge_commits map with a commit_idx + * corresponding to each commit in the list. Counts the total number + * of pseudo-merges generated. + * + * Optionally shows a progress meter. + */ +void select_pseudo_merges(struct bitmap_writer *writer, + struct commit **commits, size_t commits_nr); #endif From patchwork Tue May 21 19:02:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13669671 Received: from mail-il1-f174.google.com (mail-il1-f174.google.com [209.85.166.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6E300148FF4 for ; Tue, 21 May 2024 19:02:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.166.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318160; cv=none; b=GqHFHCmPTeCuETtI9kl+VVbmv61+0y6PDhp5cjC1BJ/3z6kXogsy1wKN2mt7LcFz0Zo2BKfZqbpZd8hm0Lb+iHre6MlXNrMpCrP0pe55K6gl1qVmRO7VPomoUW9IQLlIQCHi20rfyQyfeiwbxFz3G27GKontkFhcfNXPt4uHhmo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318160; c=relaxed/simple; bh=72OY/KzEYF6JmbGPUaSgQv3aUGh3LYS9jLBSxSpuKD0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; 
b=sIxeZW3FzUDh4zc1Qg4k/gJS43n8OPTGyUy3s+4QGNPw+IOrhudX1YnLg796jGe0O1NcWdBWq9cFaglbLAIKdk+Jg0IKpVAUgUm3rD2Cns7TvCRdZ5s8ENOlLjB2JRIZcbX7QWEQvA4QB0VNYyjFEjRWuiJDFoQT4OQAQ5ZV/HM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=SoAmiQVh; arc=none smtp.client-ip=209.85.166.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="SoAmiQVh" Received: by mail-il1-f174.google.com with SMTP id e9e14a558f8ab-3713ff97cf5so3332195ab.2 for ; Tue, 21 May 2024 12:02:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1716318157; x=1716922957; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=Vubycmys41sbVyGLY0zuuN09K5YNF0WjIn8Lo46HYdw=; b=SoAmiQVhKjlF8vwBJxP8inpndUMo8RO7ZeBu7+VAPubkpNCKKa5hwROvyZ2bTuagDS cIRydePo2sEDtI4yTSalsAvskwsFGFy3++2Uy/ERxUWI2fYrF58CN/C2OF58GWGONQQT 2g+h/xhGoyKiXyj3DjaAkxBL3A3GDbTjZiAnyK8962bdgerAgMlJJMACUkfAuu89q3Wu 6LgWktL+Gr9dYE5tQv7ahHFM3KvVhAF8b03M1EqFGwVtFdA2Ks76+YaKa3gHj2o/oc/7 kFl+KWmpGCUfet9CmkX+FVXyQuMNvCtrHbHmb6K6ROwh2LP0iKM2ZLzJik840+3iclQy BjKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716318157; x=1716922957; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; 
bh=Vubycmys41sbVyGLY0zuuN09K5YNF0WjIn8Lo46HYdw=; b=TYU/6Uor4aRbHlqZgfWvTza4zTdQDQ6Bmmh9fCqqBW2wL68q/Q/wn34XaqrikOJDYd gPZjmkZPcGtIIMuH1A5VAz65SPn/LJ7XAzxH9OIUPzQwoxWOmxu+NhuX6p8mMu8IdB+d S3rit48NGvDKrI998zALK3mgtihUdIJJGxeH5Piyju0oloRPHk170wqDos1w+wBjZBHi Isv+H+lkpjImpLPecFIpEDCAvgUXgRnT08wSn8S7eU6h304Aax7xpgqoJfWT/YPL01El CsBWFsQRIBX4pIIs3+IVjwOm2RUszyqTLB8zLkQGv2PWB4o+Js8utWya9mwmzEPoef4I TZBw== X-Gm-Message-State: AOJu0YxI/Kf3fXfXG6qlaqrXu5OkmNz7k4zujaF1Hp0njzXzb31aDnb3 UOEZOSTDveS7Igh9YFk2BOElQdEW6qsDkF2mrEGsyMrViO/6+XhgJHiI3PyorB+bS4fD5X4ENNE z X-Google-Smtp-Source: AGHT+IEJqa2NzFaCJhWyxXdg1qM/+XGJgHEJYiuvBOkACNy5jDV/HWwl9dbLmbJU2cKUAFrg0MHsmQ== X-Received: by 2002:a05:6e02:1786:b0:36c:46bf:4afa with SMTP id e9e14a558f8ab-36cc14406famr432413815ab.6.1716318157143; Tue, 21 May 2024 12:02:37 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6a15f09847csm125976756d6.0.2024.05.21.12.02.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 May 2024 12:02:36 -0700 (PDT) Date: Tue, 21 May 2024 15:02:35 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v3 18/30] pack-bitmap-write.c: write pseudo-merge table Message-ID: <311226f65c27295aff159fa741e1e6a44ade4b8b.1716318089.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Now that the pack-bitmap writer machinery understands how to select and store pseudo-merge commits, teach it how to write the new optional pseudo-merge .bitmap extension. No readers yet exist for this new extension to the .bitmap format. The following commits will take any preparatory step(s) necessary before then implementing the routines necessary to read this new table. 
In the meantime, the new `write_pseudo_merges()` function implements writing this new format as described by a previous commit in Documentation/technical/bitmap-format.txt. Writing this table is fairly straightforward and consists of a few sub-components: - a pair of bitmaps for each pseudo-merge (one for the pseudo-merge "parents", and another for the objects reachable from those parents) - for each commit, the offset of either (a) the pseudo-merge it belongs to, or (b) an extended lookup table if it belongs to >1 pseudo-merge groups - if there are any commits belonging to >1 pseudo-merge group, the extended lookup tables (which each consist of the number of pseudo-merge groups a commit appears in, and then that many 4-byte unsigned ) Signed-off-by: Taylor Blau --- pack-bitmap-write.c | 131 ++++++++++++++++++++++++++++++++++++++++++++ pack-bitmap.h | 1 + 2 files changed, 132 insertions(+) diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index d5884ea5e9c..47250398aa2 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -18,6 +18,7 @@ #include "tree.h" #include "tree-walk.h" #include "pseudo-merge.h" +#include "oid-array.h" struct bitmapped_commit { struct commit *commit; @@ -771,6 +772,130 @@ static void write_selected_commits_v1(struct bitmap_writer *writer, } } +static void write_pseudo_merges(struct bitmap_writer *writer, + struct hashfile *f) +{ + struct oid_array commits = OID_ARRAY_INIT; + struct bitmap **commits_bitmap = NULL; + off_t *pseudo_merge_ofs = NULL; + off_t start, table_start, next_ext; + + uint32_t base = bitmap_writer_nr_selected_commits(writer); + size_t i, j = 0; + + CALLOC_ARRAY(commits_bitmap, writer->pseudo_merges_nr); + CALLOC_ARRAY(pseudo_merge_ofs, writer->pseudo_merges_nr); + + for (i = 0; i < writer->pseudo_merges_nr; i++) { + struct bitmapped_commit *merge = &writer->selected[base + i]; + struct commit_list *p; + + if (!merge->pseudo_merge) + BUG("found non-pseudo merge commit at %"PRIuMAX, (uintmax_t)i); + + 
commits_bitmap[i] = bitmap_new(); + + for (p = merge->commit->parents; p; p = p->next) + bitmap_set(commits_bitmap[i], + find_object_pos(writer, &p->item->object.oid, + NULL)); + } + + start = hashfile_total(f); + + for (i = 0; i < writer->pseudo_merges_nr; i++) { + struct ewah_bitmap *commits_ewah = bitmap_to_ewah(commits_bitmap[i]); + + pseudo_merge_ofs[i] = hashfile_total(f); + + dump_bitmap(f, commits_ewah); + dump_bitmap(f, writer->selected[base+i].write_as); + + ewah_free(commits_ewah); + } + + next_ext = st_add(hashfile_total(f), + st_mult(kh_size(writer->pseudo_merge_commits), + sizeof(uint64_t))); + + table_start = hashfile_total(f); + + commits.alloc = kh_size(writer->pseudo_merge_commits); + CALLOC_ARRAY(commits.oid, commits.alloc); + + for (i = kh_begin(writer->pseudo_merge_commits); i != kh_end(writer->pseudo_merge_commits); i++) { + if (!kh_exist(writer->pseudo_merge_commits, i)) + continue; + oid_array_append(&commits, &kh_key(writer->pseudo_merge_commits, i)); + } + + oid_array_sort(&commits); + + /* write lookup table (non-extended) */ + for (i = 0; i < commits.nr; i++) { + int hash_pos; + struct pseudo_merge_commit_idx *c; + + hash_pos = kh_get_oid_map(writer->pseudo_merge_commits, + commits.oid[i]); + if (hash_pos == kh_end(writer->pseudo_merge_commits)) + BUG("could not find pseudo-merge commit %s", + oid_to_hex(&commits.oid[i])); + + c = kh_value(writer->pseudo_merge_commits, hash_pos); + + hashwrite_be32(f, find_object_pos(writer, &commits.oid[i], + NULL)); + if (c->nr == 1) + hashwrite_be64(f, pseudo_merge_ofs[c->pseudo_merge[0]]); + else if (c->nr > 1) { + if (next_ext & ((uint64_t)1<<63)) + die(_("too many pseudo-merges")); + hashwrite_be64(f, next_ext | ((uint64_t)1<<63)); + next_ext = st_add3(next_ext, + sizeof(uint32_t), + st_mult(c->nr, sizeof(uint64_t))); + } else + BUG("expected commit '%s' to have at least one " + "pseudo-merge", oid_to_hex(&commits.oid[i])); + } + + /* write lookup table (extended) */ + for (i = 0; i < commits.nr; 
i++) { + int hash_pos; + struct pseudo_merge_commit_idx *c; + + hash_pos = kh_get_oid_map(writer->pseudo_merge_commits, + commits.oid[i]); + if (hash_pos == kh_end(writer->pseudo_merge_commits)) + BUG("could not find pseudo-merge commit %s", + oid_to_hex(&commits.oid[i])); + + c = kh_value(writer->pseudo_merge_commits, hash_pos); + if (c->nr == 1) + continue; + + hashwrite_be32(f, c->nr); + for (j = 0; j < c->nr; j++) + hashwrite_be64(f, pseudo_merge_ofs[c->pseudo_merge[j]]); + } + + /* write positions for all pseudo merges */ + for (i = 0; i < writer->pseudo_merges_nr; i++) + hashwrite_be64(f, pseudo_merge_ofs[i]); + + hashwrite_be32(f, writer->pseudo_merges_nr); + hashwrite_be32(f, kh_size(writer->pseudo_merge_commits)); + hashwrite_be64(f, table_start - start); + hashwrite_be64(f, hashfile_total(f) - start + sizeof(uint64_t)); + + for (i = 0; i < writer->pseudo_merges_nr; i++) + bitmap_free(commits_bitmap[i]); + + free(pseudo_merge_ofs); + free(commits_bitmap); +} + static int table_cmp(const void *_va, const void *_vb, void *_data) { struct bitmap_writer *writer = _data; @@ -878,6 +1003,9 @@ void bitmap_writer_finish(struct bitmap_writer *writer, int fd = odb_mkstemp(&tmp_file, "pack/tmp_bitmap_XXXXXX"); + if (writer->pseudo_merges_nr) + options |= BITMAP_OPT_PSEUDO_MERGES; + f = hashfd(fd, tmp_file.buf); memcpy(header.magic, BITMAP_IDX_SIGNATURE, sizeof(BITMAP_IDX_SIGNATURE)); @@ -907,6 +1035,9 @@ void bitmap_writer_finish(struct bitmap_writer *writer, write_selected_commits_v1(writer, f, offsets); + if (options & BITMAP_OPT_PSEUDO_MERGES) + write_pseudo_merges(writer, f); + if (options & BITMAP_OPT_LOOKUP_TABLE) write_lookup_table(writer, f, offsets); diff --git a/pack-bitmap.h b/pack-bitmap.h index 1e730ea1e54..db9ae554fa8 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -37,6 +37,7 @@ enum pack_bitmap_opts { BITMAP_OPT_FULL_DAG = 0x1, BITMAP_OPT_HASH_CACHE = 0x4, BITMAP_OPT_LOOKUP_TABLE = 0x10, + BITMAP_OPT_PSEUDO_MERGES = 0x20, }; enum pack_bitmap_flags { 
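For readers following the table layout in the patch above: each entry of the non-extended lookup table, as written by `write_pseudo_merges()`, is a 4-byte big-endian object position followed by an 8-byte big-endian offset, and the high bit of that offset is set when it points into the extended table rather than directly at a pseudo-merge. The following is only a standalone sketch of that entry encoding, not code from the series. The names `PM_EXT_FLAG`, `PM_ENTRY_SIZE`, `encode_lookup_entry()`, `entry_is_extended()` and `entry_offset()` are hypothetical, and the reader-side helpers are an assumption about how such an entry could be decoded, since no reader exists at this point in the series.

```c
#include <stddef.h>
#include <stdint.h>

/* High bit of the 8-byte offset: "this offset points into the extended table". */
#define PM_EXT_FLAG ((uint64_t)1 << 63)
/* 4-byte object position + 8-byte offset (hypothetical constant, not in git). */
#define PM_ENTRY_SIZE 12

/* Store v at out[0..3] in big-endian byte order. */
static void put_be32(unsigned char *out, uint32_t v)
{
	out[0] = (unsigned char)(v >> 24);
	out[1] = (unsigned char)(v >> 16);
	out[2] = (unsigned char)(v >> 8);
	out[3] = (unsigned char)v;
}

/* Store v at out[0..7] in big-endian byte order. */
static void put_be64(unsigned char *out, uint64_t v)
{
	int i;
	for (i = 0; i < 8; i++)
		out[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Read a big-endian 64-bit value from in[0..7]. */
static uint64_t get_be64(const unsigned char *in)
{
	uint64_t v = 0;
	int i;
	for (i = 0; i < 8; i++)
		v = (v << 8) | in[i];
	return v;
}

/*
 * Hypothetical helper: encode one lookup-table entry into out[0..11].
 * Returns -1 if the offset already has the flag bit set; the patch
 * calls die() in the corresponding case for extended entries.
 */
static int encode_lookup_entry(unsigned char *out, uint32_t commit_pos,
			       uint64_t ofs, int extended)
{
	if (ofs & PM_EXT_FLAG)
		return -1;
	put_be32(out, commit_pos);
	put_be64(out + 4, extended ? (ofs | PM_EXT_FLAG) : ofs);
	return 0;
}

/* Assumed reader side: does this entry point into the extended table? */
static int entry_is_extended(const unsigned char *entry)
{
	return (entry[4] & 0x80) != 0;
}

/* Assumed reader side: the offset with the flag bit masked off. */
static uint64_t entry_offset(const unsigned char *entry)
{
	return get_be64(entry + 4) & ~PM_EXT_FLAG;
}
```

Keeping the flag in the top bit lets a single fixed-width field serve both cases, so the table stays a flat array of equal-sized entries that can be binary searched by object position.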
From patchwork Tue May 21 19:02:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13669672 Received: from mail-qk1-f171.google.com (mail-qk1-f171.google.com [209.85.222.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5A7F0148FF4 for ; Tue, 21 May 2024 19:02:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318163; cv=none; b=IG4F34PELVs1dvAyttNiG/bX5eU7UTZRpAqWDxwEZ98Yfn18QJZQBsYs1O/LOgnor3bYAdIQKLsJ7rMDZR2d+j+k34aLhIP7UtXaH6/yOr4N01d++aThpnDI6UkNknmF7xE9TDiDr6j7cPhfoj0IrUgk+7O711WnIbhjiPX+3uQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318163; c=relaxed/simple; bh=v3HyzGw4s4uOasLIiFBfYg1S7Bq3Ia6rouObgPHZSEY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Pntkj1b2pdZDNBaPGCfyjpfIAZbZkCByLtX50npZkbauOCWGf7gs7IPIo1mHggCd9iJpwud5jkp7LlytsLcYOiWkAsZNPw/VC2zj2Hd1x2Uw82esoDXe2FHLTP9jVbObJB5TjTKsUKfJRJsPTTPUfNlyBrF/ynMlvTxScg2raTg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=vvEvRJjw; arc=none smtp.client-ip=209.85.222.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="vvEvRJjw" 
Received: by mail-qk1-f171.google.com with SMTP id af79cd13be357-792b8bf806fso30820585a.0 for ; Tue, 21 May 2024 12:02:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1716318160; x=1716922960; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=Mag3WXEGw9TgE7Fy+nYodFXV59VLIcDLrJ4LZS5CV0c=; b=vvEvRJjwOBRlRTbwqU4pdpJRx5uCnhXY145ZuynUXVldy2SYy6n95dOX8/Jpe3rGNL 7Ee/Agvwcu2Wz+DSQ+dIDZWt8M3xSgDADsv+voVDMTDlijbGAq0/KPb2Cnvt9HibfK6F 2aeU7rTfgUd3ZMd61jXeFty9P3+dLuit22avvXMcaU3SV3evXis+fbqK2uwIGhHsvk+N EwrB5EGsqj1UxnyTeYFQmApknnYW/PfZ95bjsPUDGV0rS5eJLe8k6PZiLH1mpQdOs0vd gDAtMvmgvQSPpzw/E7/uPOIAcBL/ct9L052Y2vINcWxwCXKL+JDrjCjBJsmoRTpFnQ3c aolQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716318160; x=1716922960; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=Mag3WXEGw9TgE7Fy+nYodFXV59VLIcDLrJ4LZS5CV0c=; b=wp8SCW7UFn1ioEEQWbzYwbAJ0Y4CREh+Oo64iDLSeTCZrzP7/YwDRqTGfsw1H3Ukxn EGtcKJ9JMfaF4UCgXrWYC00hIK9oaUp5DHQvi9UJ8d8VSpqFI7EL7XN5xDPifMKC0XNs GXCUfD9SY0mNIAk33kgaaX38QdW5jeBKeijexC6BmYmt+e14Hla+FAegfU2nz9lYPZ/4 4eyKponxKNIcwPlYUZ9tu88sJI1TqQQ6h5h/DdQTeeYvka08yE8M2yPaf3rwa2KFigbF dfyhJ/8AC9cfFMFUTcKfxha8Z0IALNMVAa08t2zo5LEs+BoNJT7b2ycg8i/c1TUSjdtV nnVA== X-Gm-Message-State: AOJu0YyKfYM3ZeNfBnCOgI7ZJiE5GyeaTi/Prd56FX+Qgw02n44VmL97 hcGPc6N6QzIL0z+h61x6T6MZQ7BdRHwchAyXuUOjqrqM5OZ91iCjXvJGUbSRTIiUcCumnDVNGnD r X-Google-Smtp-Source: AGHT+IF0LLN9sNC+PoeQ7ElRI5A/0+Rl4kde5PcYvGfwFrahbCo/KqvikDL6e6WJf9q5qbuGRvClGQ== X-Received: by 2002:a05:620a:46a0:b0:792:a8d2:83d0 with SMTP id af79cd13be357-792c75af2a4mr4005484485a.39.1716318160168; Tue, 21 May 2024 12:02:40 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-7948e9baa51sm124659285a.5.2024.05.21.12.02.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 May 2024 12:02:39 -0700 (PDT)
Date: Tue, 21 May 2024 15:02:38 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 19/30] pack-bitmap: extract `read_bitmap()` function
Message-ID: <55dd7a8023e78d187c3f71164537f49af07110bf.1716318089.git.me@ttaylorr.com>
References:
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

The pack-bitmap machinery uses the `read_bitmap_1()` function to read a
bitmap from within the mmap'd region corresponding to the .bitmap file.
As a side-effect of calling this function, `read_bitmap_1()` increments
the `index->map_pos` variable to reflect the number of bytes read.

Extract the core of this routine to a separate function that operates
over a `const unsigned char *`, a `size_t`, and a `size_t *` pointer
instead of a `struct bitmap_index *` pointer.

This function (called `read_bitmap()`) is part of the pack-bitmap.h API
so that it can be used within the upcoming portion of the implementation
in pseudo-merge.[ch].

Rewrite the existing function, `read_bitmap_1()`, in terms of its more
generic counterpart.
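As an illustration of the refactoring pattern described in this message (a generic reader that takes a buffer, its size, and a caller-owned cursor, plus a thin wrapper that supplies those from a struct), here is a minimal sketch. It does not use Git's EWAH code: `demo_index`, `demo_read_record()`, and `demo_read_record_1()` are hypothetical names, and the one-byte length prefix is an invented record format chosen only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for `struct bitmap_index`: a buffer plus a cursor. */
struct demo_index {
	const unsigned char *map;
	size_t map_size;
	size_t map_pos;
};

/*
 * Generic reader (assumed record format: 1 length byte, then payload).
 * On success, returns the payload length and advances *pos past the
 * record. On a short read, returns -1 and leaves *pos untouched.
 */
static long demo_read_record(const unsigned char *map, size_t map_size,
			     size_t *pos)
{
	size_t len;

	assert(map && pos);
	if (*pos >= map_size)
		return -1;

	len = map[*pos];
	if (len > map_size - *pos - 1)
		return -1;

	*pos += 1 + len;
	return (long)len;
}

/* Thin wrapper: same behavior, but the cursor lives in the struct. */
static long demo_read_record_1(struct demo_index *index)
{
	return demo_read_record(index->map, index->map_size, &index->map_pos);
}
```

Because the generic function only sees a pointer, a size, and a cursor, a caller that cannot see the struct definition can still use it with its own local `size_t` position.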
Signed-off-by: Taylor Blau --- pack-bitmap.c | 24 +++++++++++++++--------- pack-bitmap.h | 2 ++ 2 files changed, 17 insertions(+), 9 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 35c5ef9d3cd..3519edb896b 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -129,17 +129,13 @@ static struct ewah_bitmap *lookup_stored_bitmap(struct stored_bitmap *st) return composed; } -/* - * Read a bitmap from the current read position on the mmaped - * index, and increase the read position accordingly - */ -static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) +struct ewah_bitmap *read_bitmap(const unsigned char *map, + size_t map_size, size_t *map_pos) { struct ewah_bitmap *b = ewah_pool_new(); - ssize_t bitmap_size = ewah_read_mmap(b, - index->map + index->map_pos, - index->map_size - index->map_pos); + ssize_t bitmap_size = ewah_read_mmap(b, map + *map_pos, + map_size - *map_pos); if (bitmap_size < 0) { error(_("failed to load bitmap index (corrupted?)")); @@ -147,10 +143,20 @@ static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) return NULL; } - index->map_pos += bitmap_size; + *map_pos += bitmap_size; + return b; } +/* + * Read a bitmap from the current read position on the mmaped + * index, and increase the read position accordingly + */ +static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) +{ + return read_bitmap(index->map, index->map_size, &index->map_pos); +} + static uint32_t bitmap_num_objects(struct bitmap_index *index) { if (index->midx) diff --git a/pack-bitmap.h b/pack-bitmap.h index db9ae554fa8..21aabf805ea 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -160,4 +160,6 @@ int bitmap_is_preferred_refname(struct repository *r, const char *refname); int verify_bitmap_files(struct repository *r); +struct ewah_bitmap *read_bitmap(const unsigned char *map, + size_t map_size, size_t *map_pos); #endif From patchwork Tue May 21 19:02:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13669673 Received: from mail-qt1-f178.google.com (mail-qt1-f178.google.com [209.85.160.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1A073149E1B for ; Tue, 21 May 2024 19:02:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.178 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318166; cv=none; b=WkwmkEugj0/H1Y5cQy/qQ6lIM43k3/y5Od++oiPPbZmGsMfnMpDAlEEXHJSYb35BHWhUzVnPehKAsbLiSheNRLZVOhfiDzUnkukA6Y35xjH5W2sVKE1qNoCjIMYDb5pHL7xCBjCsR2q5f2J3pW9+l/j42zPn8J7VmU+COFHeGWs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318166; c=relaxed/simple; bh=6gPiXTQbzqTM2VOoEzLtIO7Cfo8n/fDoeFiydNoFK4M=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=MX3D7diNFw33gOI8ekxIasHCTAdI2JWTvkRYyKMv5QtATURw/eHHGrTxq10dkCXRPYKEnB7nzZ6C7xOBmce6UpCUKN5N0Ozdfaa/cxwE4RxfA6M0D22DVUo3rJ9zCI56dpU+NAeRNNdccWsK+PG2Sivy005q3JAPFsPiGC/lV1Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=dle+Cemx; arc=none smtp.client-ip=209.85.160.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="dle+Cemx" Received: by mail-qt1-f178.google.com with SMTP id d75a77b69052e-43df23e034cso1628411cf.0 for ; Tue, 21 May 
2024 12:02:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1716318164; x=1716922964; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=m/3J8+gYtOZkY1JR9A4bOqljMajPYqybQ3k/nNPdyKI=; b=dle+CemxEJRwhnA/BmAx1TfZAG5trGYpqu03sgNL3u+RuHSLAJhVitLHrLmEgIP29+ tLXtVkuAM1lvPaEvD+UbBhlJSxWtn5a/gkbSE3ZIXX8EtlAH6NNW+P6VQdvESTHsTqaV 4BwC+7Rosvrx0r+bs7+kSwsjmqjsV/3fP8gl6LvZR+R+ig6v57DPu0tE3TV409AHG4hb m45OpGdQ+txa3YwnwahsrFCZnQrYiHZpFHBSAu1WiqPpQN4eOljJKWblS8OXZtJJZqyv PPVSpOib7ZpZ27hP+d/mSmN+JS/Pphs09L0TWvlr7bHKDfIDPaqogI1s9ludenUVtmD0 L7pQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716318164; x=1716922964; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=m/3J8+gYtOZkY1JR9A4bOqljMajPYqybQ3k/nNPdyKI=; b=H2yiCu5MVnR2/EivssrSUSeGlrWHznLGMD+bGBLMEzL63OuQ46tUwnogQUusBhG77G ThKbsLyIfKO8J8ArEHnPsZElXMrZesD/YD/nbRD6/J7kKn+FN0zPmF4B9FC66HN006AX jobzkM4SRPmwwIeJ1CLkbwJGNh2/0+MS6eEr42pMZuzG2AO76kBE7d1xLe8zx8dNoRRG 4gV6jp9qMVShaeehqzo5Jcu2wbpdGeIoc6ekwMljWWX0wp/rsXpOQfqrHZ+zDBuZvv0T 4vOko9XxoGp3bmunuN1ugFp3XhD8r0H02V9ya0wAre0XvE26cOlKkjOoKiKwVqc0rG16 DrTg== X-Gm-Message-State: AOJu0Yx1FKIVF2BTQqboarxhzyi1XmzFm+PEJnh5R91NVKoGIGCCj1hQ F2hiNb41Z0WRZuvGxvXK67gseYDOkKL4jATPjq8EpmjbV9KWxAyESfAW1q3piLrVrYAX/BMyx04 W X-Google-Smtp-Source: AGHT+IF/C0OQlV8ZyPqOdTyhpWTDq9o6TiUTeUCYT/kE5D/sfj1TL9A/3b1KKv9HCTIR1Mn4wcBHzQ== X-Received: by 2002:a05:622a:1827:b0:43a:1d94:c573 with SMTP id d75a77b69052e-43f7a2c8979mr188665651cf.22.1716318163544; Tue, 21 May 2024 12:02:43 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-43f991bb19dsm5969611cf.63.2024.05.21.12.02.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 May 2024 12:02:42 -0700 (PDT)
Date: Tue, 21 May 2024 15:02:41 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 20/30] pseudo-merge: scaffolding for reads
Message-ID: <3cc5434e44e09e8ba5f73df602bc3801112fa8c0.1716318089.git.me@ttaylorr.com>
References:
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

Implement scaffolding within the new pseudo-merge compilation unit
necessary to use the pseudo-merge API from within the pack-bitmap.c
machinery.

The core of this scaffolding is two-fold:

  - The `pseudo_merge` structure itself, which represents an individual
    pseudo-merge bitmap. It has fields for both bitmaps, as well as
    metadata about its position within the memory-mapped region, and a
    few extra bits indicating whether or not it is satisfied, and which
    bitmap(s), if any, have been read, since they are initialized
    lazily.

  - The `pseudo_merge_map` structure, which holds an array of
    pseudo_merges, as well as a pointer to the memory-mapped region
    containing the pseudo-merge serialization from within a .bitmap
    file.

Note that the `bitmap_index` structure is defined statically within the
pack-bitmap.o compilation unit, so we can't take in a `struct
bitmap_index *`. Instead, wrap the primary components necessary to read
the pseudo-merges in this new structure to avoid exposing the
implementation details of the `bitmap_index` structure.
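The lazy-initialization scheme described above (entries that record only an offset up front, a flag bit per lazily parsed field, and a map that borrows the memory-mapped region without owning it) can be sketched as follows. This is a simplified illustration under stated assumptions, not the structures from the patch: `demo_entry`, `demo_map`, `demo_entry_payload()`, and `demo_map_free()` are hypothetical, and the "parse" step just copies one byte.

```c
#include <assert.h>
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical entry: knows where its data lives, parses it on demand. */
struct demo_entry {
	unsigned char *payload;  /* heap copy, filled in lazily */
	size_t at;               /* offset within the mapped region */
	unsigned loaded : 1;     /* has `payload` been read yet? */
};

/* Hypothetical map: owns the entry array, borrows the mapped region. */
struct demo_map {
	struct demo_entry *v;
	size_t nr;
	const unsigned char *map;  /* not freed by demo_map_free() */
	size_t map_size;
};

/* Load the entry's payload on first use; later calls reuse it. */
static const unsigned char *demo_entry_payload(const struct demo_map *m,
					       struct demo_entry *e)
{
	if (!e->loaded) {
		assert(e->at < m->map_size);
		e->payload = malloc(1);
		if (!e->payload)
			return NULL;
		e->payload[0] = m->map[e->at];
		e->loaded = 1;
	}
	return e->payload;
}

/* Release parsed payloads and the array, but never the mapped region. */
static void demo_map_free(struct demo_map *m)
{
	size_t i;

	for (i = 0; i < m->nr; i++)
		free(m->v[i].payload);  /* free(NULL) is a no-op */
	free(m->v);
	m->v = NULL;
	m->nr = 0;
}
```

Zero-initializing the array (as `CALLOC_ARRAY()` does in the real code) is what makes the "not yet loaded" state and the unconditional free in the cleanup function safe.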
Signed-off-by: Taylor Blau --- pseudo-merge.c | 10 ++++++++ pseudo-merge.h | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 75 insertions(+) diff --git a/pseudo-merge.c b/pseudo-merge.c index 4be730563eb..1aca70ecdfb 100644 --- a/pseudo-merge.c +++ b/pseudo-merge.c @@ -454,3 +454,13 @@ void select_pseudo_merges(struct bitmap_writer *writer, stop_progress(&progress); } + +void free_pseudo_merge_map(struct pseudo_merge_map *pm) +{ + uint32_t i; + for (i = 0; i < pm->nr; i++) { + ewah_pool_free(pm->v[i].commits); + ewah_pool_free(pm->v[i].bitmap); + } + free(pm->v); +} diff --git a/pseudo-merge.h b/pseudo-merge.h index cab54daf14b..a3f0243062c 100644 --- a/pseudo-merge.h +++ b/pseudo-merge.h @@ -97,4 +97,69 @@ struct pseudo_merge_commit_idx { void select_pseudo_merges(struct bitmap_writer *writer, struct commit **commits, size_t commits_nr); +/* + * Represents a serialized view of a file containing pseudo-merge(s) + * (see Documentation/technical/bitmap-format.txt for a specification + * of the format). + */ +struct pseudo_merge_map { + /* + * An array of pseudo-merge(s), lazily loaded from the .bitmap + * file. + */ + struct pseudo_merge *v; + size_t nr; + size_t commits_nr; + + /* + * Pointers into a memory-mapped view of the .bitmap file: + * + * - map: the beginning of the .bitmap file + * - commits: the beginning of the pseudo-merge commit index + * - map_size: the size of the .bitmap file + */ + const unsigned char *map; + const unsigned char *commits; + + size_t map_size; +}; + +/* + * An individual pseudo-merge, storing a pair of lazily-loaded + * bitmaps: + * + * - commits: the set of commit(s) that are part of the pseudo-merge + * - bitmap: the set of object(s) reachable from the above set of + * commits. + * + * The `at` and `bitmap_at` fields are used to store the locations of + * each of the above bitmaps in the .bitmap file. 
+ */ +struct pseudo_merge { + struct ewah_bitmap *commits; + struct ewah_bitmap *bitmap; + + off_t at; + off_t bitmap_at; + + /* + * `satisfied` indicates whether the given pseudo-merge has been + * used. + * + * `loaded_commits` and `loaded_bitmap` indicate whether the + * respective bitmaps have been loaded and read from the + * .bitmap file. + */ + unsigned satisfied : 1, + loaded_commits : 1, + loaded_bitmap : 1; +}; + +/* + * Frees the given pseudo-merge map, releasing any memory held by (a) + * parsed EWAH bitmaps, or (b) the array of pseudo-merges itself. Does + * not free the memory-mapped view of the .bitmap file. + */ +void free_pseudo_merge_map(struct pseudo_merge_map *pm); + #endif From patchwork Tue May 21 19:02:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13669674 Received: from mail-qt1-f174.google.com (mail-qt1-f174.google.com [209.85.160.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B5F0C149E1B for ; Tue, 21 May 2024 19:02:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318170; cv=none; b=Tw8IemFBjvtGa+jIncHiETUjYRH157Nlu31euG/ZEGlpZBHkQcx7PL3nXTsn3/pizc9VREnO2MoVXFyB6S8diQKG6IZTcqZn4fvjjf1mSXnSJPeXjtDBdWpnDkmt23cantmpIZQulshG/BCx6hLBpO2LTWoXUeD9A/oeQDn9n5g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318170; c=relaxed/simple; bh=zANdjcydiA+otR7WPTOiZbeC5lcepqIqDY1nRjlxtWw=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=cr9erbpihAX1E6fQeSaaCFQsD2YgxU2nq8QlnsyHEuhP4xHoc4DKtCKPVsZofM4uTvZq3ULCfDw2RSdKR0LwwIUIcv+t6HeTONLTD6vcvLLYBtIgtb+CHXm1iCgqocWjrqk04bh/YswkT4BXRQNkEdSm8LaVAKkWKWpRO1RHdVc= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=LnUt9okH; arc=none smtp.client-ip=209.85.160.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="LnUt9okH" Received: by mail-qt1-f174.google.com with SMTP id d75a77b69052e-43e1593d633so3200691cf.3 for ; Tue, 21 May 2024 12:02:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1716318167; x=1716922967; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=opLcp5O+zjuR/751L16t+v40EZdGi5KDRmF2283eYE0=; b=LnUt9okHw18wZwZjWDfoAYuqxlKd6UbOTJsHPH6KStkMVJN0vOgdrNhy9h9w9jh72l fEAJwOW+WAo0eRTbntRiku036mOB/432MgtN8k7HppC8S9c+XP7Emn8EhfJgG3EeueJ5 SPKcG19g41Liw1h/BWOCuUe3CxKan4qDKAC0IaJrb67JK4OpPx49KyGj/72rdtB+IQdV oiSsv0DrbycCfAqhaRl4JhhUQrFq2a6Qk3fArzihXVT3ams6gcF60RXjpBIwAB7tbD13 cdL4u3ImvTkMdBXhmxsKGOqfFxBTg6W/XB2pSqNyAlUHQsGZ/d5LQ6Z/oS0E8Wt0mLSp lpvg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716318167; x=1716922967; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=opLcp5O+zjuR/751L16t+v40EZdGi5KDRmF2283eYE0=; b=BG2OWc+kPiZJ/FYWBIif3C9SIoahKBTbpp/X7KdE3CcSluhGHc/kCDUNEwsdUlFLnt kzqEXys9ZE0Z4pNRgQ0Tv67UY0ZHgE6xYWyKp/EGYg6yNtau3Cu14fwap303BPRvJrMh 
aE9rA1jsYZBwgtwZew2RWJOrmExVnR5Rbvv76g99sDzKzuTFdNIPAZSdhP/xzRBchYcJ xBBwam0WCWExBO0rc6TPLFPKvVLQ5mX0t/obcovKwPdopkrlvuMYTTVWR/I8VPMFyct2 W1w9xuPqprjGCkUEz7AQOJ6Q5qscIYZ7ku8I0ELJrxk2JP1AxVTB6hrnLeLC32fXUIr5 2Qcg==
X-Gm-Message-State: AOJu0Yw1w0WfQ4Z4Xwx4Vp1pvyIP/6roWnszE8pHBvGw5Yb1K2UknokN cfxaAu8AG0oHZBFjoqrzJmNynacrYe4K6l1ExY+NodNfMtXD8jZuBgAJGHG/UyZ10I3IA/oX0vP 5
X-Google-Smtp-Source: AGHT+IGJAmc6gdcz3wEw7W/q/RmgwnycfGw6NENt9U9c4BHXc48KMCc6/EjWe3U80sH0vAxNR1Dnzw==
X-Received: by 2002:a05:622a:241:b0:43c:5d37:5a94 with SMTP id d75a77b69052e-43dfdd6a6b7mr347475161cf.62.1716318167238; Tue, 21 May 2024 12:02:47 -0700 (PDT)
Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-43e15b37e83sm124180351cf.1.2024.05.21.12.02.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 May 2024 12:02:46 -0700 (PDT)
Date: Tue, 21 May 2024 15:02:45 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 21/30] pack-bitmap.c: read pseudo-merge extension
Message-ID: <7664f5f964867f62dbd748351bda83007958a751.1716318089.git.me@ttaylorr.com>
References:
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

Now that the scaffolding for reading the pseudo-merge extension has been
laid, teach the pack-bitmap machinery to read the pseudo-merge extension
when present.

Note that pseudo-merges themselves are not yet used during traversal;
that step will be taken by a future commit. In the meantime, read the
table and initialize the pseudo_merge_map structure introduced by a
previous commit.

When the pseudo-merge extension is present, `load_bitmap_header()`
performs basic sanity checks to make sure that the table is well-formed.
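For readers who want to see the trailer layout that this patch parses from the end of the extension, here is a standalone sketch. The field order (extension size in the last 8 bytes, then the commit-table offset, the commit count, the pseudo-merge count, and before those one 64-bit offset per pseudo-merge) is taken from the reads in the diff below; everything else is an assumption for illustration: `demo_trailer`, `demo_parse_trailer()`, and the `demo_be32()`/`demo_be64()` helpers are hypothetical names, and the bounds checks are stricter than the ones in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable big-endian decoders (stand-ins for get_be32()/get_be64()). */
static uint32_t demo_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t demo_be64(const unsigned char *p)
{
	return ((uint64_t)demo_be32(p) << 32) | demo_be32(p + 4);
}

struct demo_trailer {
	uint64_t ext_size;     /* size of the whole extension */
	uint64_t commits_ofs;  /* commit table offset within the extension */
	uint32_t commits_nr;
	uint32_t merges_nr;
	const unsigned char *merge_ofs;  /* merges_nr big-endian u64 values */
};

#define DEMO_TRAILER_FIXED 24  /* 8 + 8 + 4 + 4 bytes */

/* Parse the fixed trailer found at the end of buf[0..len). */
static int demo_parse_trailer(const unsigned char *buf, size_t len,
			      struct demo_trailer *out)
{
	const unsigned char *end = buf + len;

	assert(buf && out);
	if (len < DEMO_TRAILER_FIXED)
		return -1;

	out->ext_size = demo_be64(end - 8);
	out->commits_ofs = demo_be64(end - 16);
	out->commits_nr = demo_be32(end - 20);
	out->merges_nr = demo_be32(end - 24);

	/* The extension must fit in the buffer and hold its own trailer. */
	if (out->ext_size > len || out->ext_size < DEMO_TRAILER_FIXED)
		return -1;
	/* The per-merge offset array must fit inside the extension. */
	if (out->merges_nr > (out->ext_size - DEMO_TRAILER_FIXED) / 8)
		return -1;

	out->merge_ofs = end - DEMO_TRAILER_FIXED -
			 (size_t)out->merges_nr * 8;
	return 0;
}
```

Reading the trailer from the end backwards is what lets a reader find the extension without knowing in advance how many bitmaps precede it.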
Signed-off-by: Taylor Blau --- pack-bitmap.c | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/pack-bitmap.c b/pack-bitmap.c index 3519edb896b..fc9c3e2fc43 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -20,6 +20,7 @@ #include "list-objects-filter-options.h" #include "midx.h" #include "config.h" +#include "pseudo-merge.h" /* * An entry on the bitmap index, representing the bitmap for a given @@ -86,6 +87,9 @@ struct bitmap_index { */ unsigned char *table_lookup; + /* This contains the pseudo-merge cache within 'map' (if found). */ + struct pseudo_merge_map pseudo_merges; + /* * Extended index. * @@ -205,6 +209,41 @@ static int load_bitmap_header(struct bitmap_index *index) index->table_lookup = (void *)(index_end - table_size); index_end -= table_size; } + + if (flags & BITMAP_OPT_PSEUDO_MERGES) { + unsigned char *pseudo_merge_ofs; + size_t table_size; + uint32_t i; + + if (sizeof(table_size) > index_end - index->map - header_size) + return error(_("corrupted bitmap index file (too short to fit pseudo-merge table header)")); + + table_size = get_be64(index_end - 8); + if (table_size > index_end - index->map - header_size) + return error(_("corrupted bitmap index file (too short to fit pseudo-merge table)")); + + if (git_env_bool("GIT_TEST_USE_PSEUDO_MERGES", 1)) { + const unsigned char *ext = (index_end - table_size); + + index->pseudo_merges.map = index->map; + index->pseudo_merges.map_size = index->map_size; + index->pseudo_merges.commits = ext + get_be64(index_end - 16); + index->pseudo_merges.commits_nr = get_be32(index_end - 20); + index->pseudo_merges.nr = get_be32(index_end - 24); + + CALLOC_ARRAY(index->pseudo_merges.v, + index->pseudo_merges.nr); + + pseudo_merge_ofs = index_end - 24 - + (index->pseudo_merges.nr * sizeof(uint64_t)); + for (i = 0; i < index->pseudo_merges.nr; i++) { + index->pseudo_merges.v[i].at = get_be64(pseudo_merge_ofs); + pseudo_merge_ofs += sizeof(uint64_t); + } + } + + index_end -= 
table_size; + } } index->entry_count = ntohl(header->entry_count); From patchwork Tue May 21 19:02:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13669675 Received: from mail-oa1-f51.google.com (mail-oa1-f51.google.com [209.85.160.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AE2B4149E1B for ; Tue, 21 May 2024 19:02:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318173; cv=none; b=e0polBI/ypM2zNI5DCQgi/+jRtQdgtV56Y892cEuMuNhPUbREmxz7U7dMJyhXCCKvY0lVlDeo9qm6VYWQcS3ZqGKF9GtGzwIbuTbIlwVScD/1ZgDUkStT24eT3OUkjR+IGv0TW5+eRvmgIzrYcP3DXWq75ODMRsrnRrAkeL3wWU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318173; c=relaxed/simple; bh=GS7ETYlZZ3CaQiQxjdUEfAOozKTCPoH/xp1LHtPAaH0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=H21XNoNnC02FWSHxfP8Gnmfh1cmSIlAHHEetkMPZvNm2YU2cs02hsk84fKKh4cMzaqkTJFuulaCb0LYLWYmi1oRHIGceloUHh9P6bIroul12V4ZYOZscsoCZEaXg4MkD47s10ZXKFYRntmz0MoCySyRyxNKqfchstyKwF9zLMho= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=lD9bTtOW; arc=none smtp.client-ip=209.85.160.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com 
header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="lD9bTtOW" Received: by mail-oa1-f51.google.com with SMTP id 586e51a60fabf-23ee34c33ceso1954164fac.3 for ; Tue, 21 May 2024 12:02:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1716318170; x=1716922970; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=rwNBCtxGkzIA0ESquA+d83YILY/XN3+X4GtvrWe7Zzk=; b=lD9bTtOWfxZVHj2X6gmu0urAIqUP7fv7fa68aWmjVWNFpmNQB0H4WMEgAWTfgr4XeG 8kXXf/3z0j+vu9p0DxbtDb2ZlW4BL3dcPxxlILNg6xhAkqZzM1B7pUymgOYeYdK6jHxk /XuuuaOSdwVlSHivXrBnZH3f8jxk7TdaRipeUqlCMKo/LPeZqJom8Kx2PM3zDOKaqAW/ qBZQRCln0TWgnaQQDvaW+DkanwrFIEHMvOoXZsUVBGlLUDEQpfH+q1ooUUHsLz/2x36g UqOIVquULnPPSLLzJK2NbuGWjLFj+8iW25vGXbeB1GXSt/fj5etkATWezPYSgmhyTnqg y+Sw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716318170; x=1716922970; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=rwNBCtxGkzIA0ESquA+d83YILY/XN3+X4GtvrWe7Zzk=; b=E9KocRj0PzVwVqvpyMTfBRNQ/11rItMVldjYro2cdHdYefcWso8ApHouGaaphxOy6y +xPdQlNDaAPYkU7ZnohTlNa9XHThb4DVx+lQdleQTiL1qB76KvxXn1uyJqzxuNWJLV+M x68p4xmtTZj2AcMlP+a6XGyk8RzglKOM4rGH1StQf6E1QTpeIp1NnWNK6LFTVtwT0cli 9h8QbSPrBZYGS/R3wnCk6dMYvkatft8NMqzKHnhj2P0mCaUj1GIpsGOXk4CZLLa/cLab tpabNYYE2nzvEmc+M/t/FaEPUc8cXmXX9Ots7tQ/HQrsa5xog3Hynep+tjYSbzbivBgK TbgQ== X-Gm-Message-State: AOJu0YxkN/gLn4tYUToZLIimaGkZDO7CopEHfs5qxDMzCgpEC5cauFuc MA5ilWuhywpUyWi0MItTRA7yiSuiBQhzyEN4Mhz5hEbWoqQj+f/WzxpEDWAMI3laxydXeKPur4R B X-Google-Smtp-Source: AGHT+IGz0hSnUodJAUoLeAH0VmxwlsP2qMjSzExztR6W8L7ohQcEJCUTRmXhP8ACL1+4pPoZNEmazA== X-Received: by 2002:a05:6870:b41d:b0:24c:58e4:976e with SMTP id 586e51a60fabf-24c68b3dddcmr11724fac.29.1716318170374; Tue, 21 May 2024 12:02:50 -0700 (PDT) Received: from 
localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6a15f1872f8sm126092546d6.55.2024.05.21.12.02.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 May 2024 12:02:49 -0700 (PDT)
Date: Tue, 21 May 2024 15:02:48 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 22/30] pseudo-merge: implement support for reading pseudo-merge commits
Message-ID: <8ba0a9c5402fb154bc316768a8fbb016e302a686.1716318089.git.me@ttaylorr.com>
References:
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

Implement the basic API for reading pseudo-merge bitmaps, which consists
of four basic functions:

  - pseudo_merge_bitmap()
  - use_pseudo_merge()
  - apply_pseudo_merges_for_commit()
  - cascade_pseudo_merges()

These functions are all documented in pseudo-merge.h, but their rough
descriptions are as follows:

  - pseudo_merge_bitmap() reads and inflates the objects EWAH bitmap for
    a given pseudo-merge

  - use_pseudo_merge() does the same as pseudo_merge_bitmap(), but on
    the commits EWAH bitmap, not the objects bitmap

  - apply_pseudo_merges_for_commit() applies all satisfied pseudo-merge
    commits for a given result set, and cascades any yet-unsatisfied
    pseudo-merges if any were applied in the previous step

  - cascade_pseudo_merges() applies all pseudo-merges which are
    satisfied but have not been previously applied, repeating this
    process until no more pseudo-merges can be applied

The core of the API is the latter two functions, which are responsible
for applying pseudo-merges during the object traversal implemented in
the pack-bitmap machinery.

The other two functions (pseudo_merge_bitmap() and use_pseudo_merge())
are low-level ways to interact with the pseudo-merge machinery, which
will be useful in future commits.
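The cascading behavior described above is a fixed-point loop: applying one pseudo-merge can add commits to the result, which may in turn satisfy another pseudo-merge. The sketch below models that idea with plain 64-bit masks instead of EWAH bitmaps. It is an illustration only: `demo_merge`, `demo_apply()`, and `demo_cascade()` are hypothetical names, and the sketch omits the optional `roots` bitmap that the real `cascade_pseudo_merges()` accepts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pseudo-merge: bit i of each mask stands for object i. */
struct demo_merge {
	uint64_t commits;  /* commits that make up this pseudo-merge */
	uint64_t objects;  /* everything reachable from those commits */
	unsigned satisfied : 1;
};

/*
 * Apply one pseudo-merge if all of its commits are already in *result
 * and it has not been applied before. Returns 1 if applied, else 0.
 */
static int demo_apply(struct demo_merge *m, uint64_t *result)
{
	if (m->satisfied)
		return 0;
	if ((m->commits & *result) != m->commits)
		return 0;  /* not a subset: some commit is missing */

	*result |= m->objects;
	m->satisfied = 1;
	return 1;
}

/* Repeat full passes until a pass applies nothing; return total applied. */
static int demo_cascade(struct demo_merge *v, size_t nr, uint64_t *result)
{
	int applied = 0;
	int any;

	assert(result);
	do {
		size_t i;

		any = 0;
		for (i = 0; i < nr; i++) {
			if (demo_apply(&v[i], result)) {
				any = 1;
				applied++;
			}
		}
	} while (any);

	return applied;
}
```

The loop terminates because each pseudo-merge can be applied at most once, so at most `nr + 1` passes are made.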
Signed-off-by: Taylor Blau --- pseudo-merge.c | 231 +++++++++++++++++++++++++++++++++++++++++++++++++ pseudo-merge.h | 44 ++++++++++ 2 files changed, 275 insertions(+) diff --git a/pseudo-merge.c b/pseudo-merge.c index 1aca70ecdfb..0f50ac6183e 100644 --- a/pseudo-merge.c +++ b/pseudo-merge.c @@ -10,6 +10,7 @@ #include "commit.h" #include "alloc.h" #include "progress.h" +#include "hex.h" #define DEFAULT_PSEUDO_MERGE_DECAY 1.0f #define DEFAULT_PSEUDO_MERGE_MAX_MERGES 64 @@ -464,3 +465,233 @@ void free_pseudo_merge_map(struct pseudo_merge_map *pm) } free(pm->v); } + +struct pseudo_merge_commit_ext { + uint32_t nr; + const unsigned char *ptr; +}; + +static int pseudo_merge_ext_at(const struct pseudo_merge_map *pm, + struct pseudo_merge_commit_ext *ext, size_t at) +{ + if (at >= pm->map_size) + return error(_("extended pseudo-merge read out-of-bounds " + "(%"PRIuMAX" >= %"PRIuMAX")"), + (uintmax_t)at, (uintmax_t)pm->map_size); + + ext->nr = get_be32(pm->map + at); + ext->ptr = pm->map + at + sizeof(uint32_t); + + return 0; +} + +struct ewah_bitmap *pseudo_merge_bitmap(const struct pseudo_merge_map *pm, + struct pseudo_merge *merge) +{ + if (!merge->loaded_commits) + BUG("cannot use unloaded pseudo-merge bitmap"); + + if (!merge->loaded_bitmap) { + size_t at = merge->bitmap_at; + + merge->bitmap = read_bitmap(pm->map, pm->map_size, &at); + merge->loaded_bitmap = 1; + } + + return merge->bitmap; +} + +struct pseudo_merge *use_pseudo_merge(const struct pseudo_merge_map *pm, + struct pseudo_merge *merge) +{ + if (!merge->loaded_commits) { + size_t pos = merge->at; + + merge->commits = read_bitmap(pm->map, pm->map_size, &pos); + merge->bitmap_at = pos; + merge->loaded_commits = 1; + } + return merge; +} + +static struct pseudo_merge *pseudo_merge_at(const struct pseudo_merge_map *pm, + struct object_id *oid, + size_t want) +{ + size_t lo = 0; + size_t hi = pm->nr; + + while (lo < hi) { + size_t mi = lo + (hi - lo) / 2; + size_t got = pm->v[mi].at; + + if (got == want) + 
return use_pseudo_merge(pm, &pm->v[mi]); + else if (got < want) + hi = mi; + else + lo = mi + 1; + } + + warning(_("could not find pseudo-merge for commit %s at offset %"PRIuMAX), + oid_to_hex(oid), (uintmax_t)want); + + return NULL; +} + +struct pseudo_merge_commit { + uint32_t commit_pos; + uint64_t pseudo_merge_ofs; +}; + +#define PSEUDO_MERGE_COMMIT_RAWSZ (sizeof(uint32_t)+sizeof(uint64_t)) + +static void read_pseudo_merge_commit_at(struct pseudo_merge_commit *merge, + const unsigned char *at) +{ + merge->commit_pos = get_be32(at); + merge->pseudo_merge_ofs = get_be64(at + sizeof(uint32_t)); +} + +static int nth_pseudo_merge_ext(const struct pseudo_merge_map *pm, + struct pseudo_merge_commit_ext *ext, + struct pseudo_merge_commit *merge, + uint32_t n) +{ + size_t ofs; + + if (n >= ext->nr) + return error(_("extended pseudo-merge lookup out-of-bounds " + "(%"PRIu32" >= %"PRIu32")"), n, ext->nr); + + ofs = get_be64(ext->ptr + st_mult(n, sizeof(uint64_t))); + if (ofs >= pm->map_size) + return error(_("out-of-bounds read: (%"PRIuMAX" >= %"PRIuMAX")"), + (uintmax_t)ofs, (uintmax_t)pm->map_size); + + read_pseudo_merge_commit_at(merge, pm->map + ofs); + + return 0; +} + +static unsigned apply_pseudo_merge(const struct pseudo_merge_map *pm, + struct pseudo_merge *merge, + struct bitmap *result, + struct bitmap *roots) +{ + if (merge->satisfied) + return 0; + + if (!ewah_bitmap_is_subset(merge->commits, roots ? 
roots : result)) + return 0; + + bitmap_or_ewah(result, pseudo_merge_bitmap(pm, merge)); + if (roots) + bitmap_or_ewah(roots, pseudo_merge_bitmap(pm, merge)); + merge->satisfied = 1; + + return 1; +} + +static int pseudo_merge_commit_cmp(const void *va, const void *vb) +{ + struct pseudo_merge_commit merge; + uint32_t key = *(uint32_t*)va; + + read_pseudo_merge_commit_at(&merge, vb); + + if (key < merge.commit_pos) + return -1; + if (key > merge.commit_pos) + return 1; + return 0; +} + +static struct pseudo_merge_commit *find_pseudo_merge(const struct pseudo_merge_map *pm, + uint32_t pos) +{ + if (!pm->commits_nr) + return NULL; + + return bsearch(&pos, pm->commits, pm->commits_nr, + PSEUDO_MERGE_COMMIT_RAWSZ, pseudo_merge_commit_cmp); +} + +int apply_pseudo_merges_for_commit(const struct pseudo_merge_map *pm, + struct bitmap *result, + struct commit *commit, uint32_t commit_pos) +{ + struct pseudo_merge *merge; + struct pseudo_merge_commit *merge_commit; + int ret = 0; + + merge_commit = find_pseudo_merge(pm, commit_pos); + if (!merge_commit) + return 0; + + if (merge_commit->pseudo_merge_ofs & ((uint64_t)1<<63)) { + struct pseudo_merge_commit_ext ext = { 0 }; + off_t ofs = merge_commit->pseudo_merge_ofs & ~((uint64_t)1<<63); + uint32_t i; + + if (pseudo_merge_ext_at(pm, &ext, ofs) < 0) { + warning(_("could not read extended pseudo-merge table " + "for commit %s"), + oid_to_hex(&commit->object.oid)); + return ret; + } + + for (i = 0; i < ext.nr; i++) { + if (nth_pseudo_merge_ext(pm, &ext, merge_commit, i) < 0) + return ret; + + merge = pseudo_merge_at(pm, &commit->object.oid, + merge_commit->pseudo_merge_ofs); + + if (!merge) + return ret; + + if (apply_pseudo_merge(pm, merge, result, NULL)) + ret++; + } + } else { + merge = pseudo_merge_at(pm, &commit->object.oid, + merge_commit->pseudo_merge_ofs); + + if (!merge) + return ret; + + if (apply_pseudo_merge(pm, merge, result, NULL)) + ret++; + } + + if (ret) + cascade_pseudo_merges(pm, result, NULL); + + return 
ret; +} + +int cascade_pseudo_merges(const struct pseudo_merge_map *pm, + struct bitmap *result, + struct bitmap *roots) +{ + unsigned any_satisfied; + int ret = 0; + + do { + struct pseudo_merge *merge; + uint32_t i; + + any_satisfied = 0; + + for (i = 0; i < pm->nr; i++) { + merge = use_pseudo_merge(pm, &pm->v[i]); + if (apply_pseudo_merge(pm, merge, result, roots)) { + any_satisfied |= 1; + ret++; + } + } + } while (any_satisfied); + + return ret; +} diff --git a/pseudo-merge.h b/pseudo-merge.h index a3f0243062c..c00b622be4b 100644 --- a/pseudo-merge.h +++ b/pseudo-merge.h @@ -162,4 +162,48 @@ struct pseudo_merge { */ void free_pseudo_merge_map(struct pseudo_merge_map *pm); +/* + * Loads the bitmap corresponding to the given pseudo-merge from the + * map, if it has not already been loaded. + */ +struct ewah_bitmap *pseudo_merge_bitmap(const struct pseudo_merge_map *pm, + struct pseudo_merge *merge); + +/* + * Loads the pseudo-merge and its commits bitmap from the given + * pseudo-merge map, if it has not already been loaded. + */ +struct pseudo_merge *use_pseudo_merge(const struct pseudo_merge_map *pm, + struct pseudo_merge *merge); + +/* + * Applies pseudo-merge(s) containing the given commit to the bitmap + * "result". + * + * If any pseudo-merge(s) were satisfied, returns the number + * satisfied, otherwise returns 0. If any were satisfied, the + * remaining unsatisfied pseudo-merges are cascaded (see below). + */ +int apply_pseudo_merges_for_commit(const struct pseudo_merge_map *pm, + struct bitmap *result, + struct commit *commit, uint32_t commit_pos); + +/* + * Applies pseudo-merge(s) which are satisfied according to the + * current bitmap in result (or roots, see below). If any + * pseudo-merges were satisfied, repeat the process over unsatisfied + * pseudo-merge commits until no more pseudo-merges are satisfied. + * + * Result is the bitmap to which the pseudo-merge(s) are applied. 
+ * Roots (if given) is a bitmap of the traversal tip(s) for either + * side of a reachability traversal. + * + * Roots may be given instead of a populated results bitmap at the + * beginning of a traversal on either side where the reachability + * closure over tips is not yet known. + */ +int cascade_pseudo_merges(const struct pseudo_merge_map *pm, + struct bitmap *result, + struct bitmap *roots); + #endif

From patchwork Tue May 21 19:02:52 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13669676
Date: Tue, 21 May 2024 15:02:52 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 23/30] ewah: implement `ewah_bitmap_popcount()`
Message-ID: <2c02f303b6f5629c08b92cbab06a827905c83f1d.1716318089.git.me@ttaylorr.com>

Some of the pseudo-merge test helpers (which will be introduced in the following commit) will want to indicate the total number of commits in or objects reachable from a pseudo-merge.

Implement a popcount() function that operates on EWAH bitmaps to quickly determine how many bits are set in each of the respective bitmaps.
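
As an illustration only (this is not the code added by the patch), the idea of summing a per-word popcount can be sketched over a plain array of 64-bit words. The names `popcount64()` and `popcount_words()` are hypothetical and exist only for this sketch; the real helper walks the decompressed EWAH words with `ewah_iterator_next()` and counts each one with `ewah_bit_popcount64()`.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper for this sketch: count the set bits in one
 * 64-bit word by repeatedly clearing the lowest set bit.
 */
static unsigned popcount64(uint64_t x)
{
	unsigned n = 0;

	while (x) {
		x &= x - 1; /* clears the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Hypothetical stand-in for the iterator loop: the total number of
 * set bits is the sum of the per-word counts.
 */
static size_t popcount_words(const uint64_t *words, size_t nr)
{
	size_t count = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		count += popcount64(words[i]);
	return count;
}
```

The sketch deliberately ignores EWAH's run-length encoding; it only shows why a single pass over the words is enough to report how many commits or objects a bitmap covers.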
Signed-off-by: Taylor Blau --- ewah/bitmap.c | 14 ++++++++++++++ ewah/ewok.h | 1 + 2 files changed, 15 insertions(+) diff --git a/ewah/bitmap.c b/ewah/bitmap.c index d352fec54ce..dc2ca190f12 100644 --- a/ewah/bitmap.c +++ b/ewah/bitmap.c @@ -212,6 +212,20 @@ size_t bitmap_popcount(struct bitmap *self) return count; } +size_t ewah_bitmap_popcount(struct ewah_bitmap *self) +{ + struct ewah_iterator it; + eword_t word; + size_t count = 0; + + ewah_iterator_init(&it, self); + + while (ewah_iterator_next(&word, &it)) + count += ewah_bit_popcount64(word); + + return count; +} + int bitmap_is_empty(struct bitmap *self) { size_t i; diff --git a/ewah/ewok.h b/ewah/ewok.h index 2b6c4ac499c..7074a6347b7 100644 --- a/ewah/ewok.h +++ b/ewah/ewok.h @@ -195,6 +195,7 @@ void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other); void bitmap_or(struct bitmap *self, const struct bitmap *other); size_t bitmap_popcount(struct bitmap *self); +size_t ewah_bitmap_popcount(struct ewah_bitmap *self); int bitmap_is_empty(struct bitmap *self); #endif

From patchwork Tue May 21 19:02:55 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13669677
Date: Tue, 21 May 2024 15:02:55 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 24/30] pack-bitmap: implement test helpers for pseudo-merge
Message-ID: <82cce72bf55a2543eb9a4045848a85d04cdb618e.1716318089.git.me@ttaylorr.com>

Implement three new sub-commands for the "bitmap" test-helper:

- t/helper test-tool bitmap dump-pseudo-merges
- t/helper test-tool bitmap dump-pseudo-merge-commits <n>
- t/helper test-tool bitmap dump-pseudo-merge-objects <n>

These three helpers dump the list of pseudo-merges, the "parents" of the nth pseudo-merge, and the set of objects reachable from those parents, respectively.

These helpers will be useful in subsequent patches when we add test coverage for pseudo-merge bitmaps.
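
The two object-dumping helpers work by visiting every set bit of a bitmap and translating its position into an object ID. As a minimal, standalone sketch of just the bit-walking part (the names `ctz64()` and `set_bit_positions()` are hypothetical and not part of this series; the patch itself uses `ewah_bit_ctz64()` and prints object IDs rather than collecting positions):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical portable count-trailing-zeros for this sketch.
 * The caller must guarantee that x is non-zero.
 */
static unsigned ctz64(uint64_t x)
{
	unsigned n = 0;

	while (!(x & 1)) {
		x >>= 1;
		n++;
	}
	return n;
}

/*
 * Store the position of every set bit in `word`, offset by `base`,
 * into `out` (which must have room for 64 entries). Returns the
 * number of positions written.
 */
static size_t set_bit_positions(uint64_t word, uint32_t base, uint32_t *out)
{
	size_t nr = 0;
	unsigned offset;

	for (offset = 0; offset < 64; offset++) {
		if (!(word >> offset))
			break; /* no set bits at or above this offset */

		/* jump straight to the next set bit */
		offset += ctz64(word >> offset);
		out[nr++] = base + offset;
	}
	return nr;
}
```

Skipping ahead with a count-trailing-zeros step means the loop does work proportional to the number of set bits rather than testing all 64 positions, which is presumably why the helper is structured this way.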
Signed-off-by: Taylor Blau --- pack-bitmap.c | 126 +++++++++++++++++++++++++++++++++++++++++ pack-bitmap.h | 3 + t/helper/test-bitmap.c | 34 ++++++++--- 3 files changed, 156 insertions(+), 7 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index fc9c3e2fc43..c13074673af 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -2443,6 +2443,132 @@ int test_bitmap_hashes(struct repository *r) return 0; } +static void bit_pos_to_object_id(struct bitmap_index *bitmap_git, + uint32_t bit_pos, + struct object_id *oid) +{ + uint32_t index_pos; + + if (bitmap_is_midx(bitmap_git)) + index_pos = pack_pos_to_midx(bitmap_git->midx, bit_pos); + else + index_pos = pack_pos_to_index(bitmap_git->pack, bit_pos); + + nth_bitmap_object_oid(bitmap_git, oid, index_pos); +} + +int test_bitmap_pseudo_merges(struct repository *r) +{ + struct bitmap_index *bitmap_git; + uint32_t i; + + bitmap_git = prepare_bitmap_git(r); + if (!bitmap_git || !bitmap_git->pseudo_merges.nr) + goto cleanup; + + for (i = 0; i < bitmap_git->pseudo_merges.nr; i++) { + struct pseudo_merge *merge; + struct ewah_bitmap *commits_bitmap, *merge_bitmap; + + merge = use_pseudo_merge(&bitmap_git->pseudo_merges, + &bitmap_git->pseudo_merges.v[i]); + commits_bitmap = merge->commits; + merge_bitmap = pseudo_merge_bitmap(&bitmap_git->pseudo_merges, + merge); + + printf("at=%"PRIuMAX", commits=%"PRIuMAX", objects=%"PRIuMAX"\n", + (uintmax_t)merge->at, + (uintmax_t)ewah_bitmap_popcount(commits_bitmap), + (uintmax_t)ewah_bitmap_popcount(merge_bitmap)); + } + +cleanup: + free_bitmap_index(bitmap_git); + return 0; +} + +static void dump_ewah_object_ids(struct bitmap_index *bitmap_git, + struct ewah_bitmap *bitmap) + +{ + struct ewah_iterator it; + eword_t word; + uint32_t pos = 0; + + ewah_iterator_init(&it, bitmap); + + while (ewah_iterator_next(&word, &it)) { + struct object_id oid; + uint32_t offset; + + for (offset = 0; offset < BITS_IN_EWORD; offset++) { + if (!(word >> offset)) + break; + + offset += ewah_bit_ctz64(word >> 
offset); + + bit_pos_to_object_id(bitmap_git, pos + offset, &oid); + printf("%s\n", oid_to_hex(&oid)); + } + pos += BITS_IN_EWORD; + } +} + +int test_bitmap_pseudo_merge_commits(struct repository *r, uint32_t n) +{ + struct bitmap_index *bitmap_git; + struct pseudo_merge *merge; + int ret = 0; + + bitmap_git = prepare_bitmap_git(r); + if (!bitmap_git || !bitmap_git->pseudo_merges.nr) + goto cleanup; + + if (n >= bitmap_git->pseudo_merges.nr) { + ret = error(_("pseudo-merge index out of range " + "(%"PRIu32" >= %"PRIuMAX")"), + n, (uintmax_t)bitmap_git->pseudo_merges.nr); + goto cleanup; + } + + merge = use_pseudo_merge(&bitmap_git->pseudo_merges, + &bitmap_git->pseudo_merges.v[n]); + dump_ewah_object_ids(bitmap_git, merge->commits); + +cleanup: + free_bitmap_index(bitmap_git); + return ret; +} + +int test_bitmap_pseudo_merge_objects(struct repository *r, uint32_t n) +{ + struct bitmap_index *bitmap_git; + struct pseudo_merge *merge; + int ret = 0; + + bitmap_git = prepare_bitmap_git(r); + if (!bitmap_git || !bitmap_git->pseudo_merges.nr) + goto cleanup; + + if (n >= bitmap_git->pseudo_merges.nr) { + ret = error(_("pseudo-merge index out of range " + "(%"PRIu32" >= %"PRIuMAX")"), + n, (uintmax_t)bitmap_git->pseudo_merges.nr); + goto cleanup; + } + + merge = use_pseudo_merge(&bitmap_git->pseudo_merges, + &bitmap_git->pseudo_merges.v[n]); + + dump_ewah_object_ids(bitmap_git, + pseudo_merge_bitmap(&bitmap_git->pseudo_merges, + merge)); + +cleanup: + free_bitmap_index(bitmap_git); + return ret; +} + int rebuild_bitmap(const uint32_t *reposition, struct ewah_bitmap *source, struct bitmap *dest) diff --git a/pack-bitmap.h b/pack-bitmap.h index 21aabf805ea..4466b5ad0fb 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -73,6 +73,9 @@ void traverse_bitmap_commit_list(struct bitmap_index *, void test_bitmap_walk(struct rev_info *revs); int test_bitmap_commits(struct repository *r); int test_bitmap_hashes(struct repository *r); +int test_bitmap_pseudo_merges(struct repository 
*r); +int test_bitmap_pseudo_merge_commits(struct repository *r, uint32_t n); +int test_bitmap_pseudo_merge_objects(struct repository *r, uint32_t n); #define GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL \ "GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL" diff --git a/t/helper/test-bitmap.c b/t/helper/test-bitmap.c index af43ee1cb5e..6af2b42678f 100644 --- a/t/helper/test-bitmap.c +++ b/t/helper/test-bitmap.c @@ -13,21 +13,41 @@ static int bitmap_dump_hashes(void) return test_bitmap_hashes(the_repository); } +static int bitmap_dump_pseudo_merges(void) +{ + return test_bitmap_pseudo_merges(the_repository); +} + +static int bitmap_dump_pseudo_merge_commits(uint32_t n) +{ + return test_bitmap_pseudo_merge_commits(the_repository, n); +} + +static int bitmap_dump_pseudo_merge_objects(uint32_t n) +{ + return test_bitmap_pseudo_merge_objects(the_repository, n); +} + int cmd__bitmap(int argc, const char **argv) { setup_git_directory(); - if (argc != 2) - goto usage; - - if (!strcmp(argv[1], "list-commits")) + if (argc == 2 && !strcmp(argv[1], "list-commits")) return bitmap_list_commits(); - if (!strcmp(argv[1], "dump-hashes")) + if (argc == 2 && !strcmp(argv[1], "dump-hashes")) return bitmap_dump_hashes(); + if (argc == 2 && !strcmp(argv[1], "dump-pseudo-merges")) + return bitmap_dump_pseudo_merges(); + if (argc == 3 && !strcmp(argv[1], "dump-pseudo-merge-commits")) + return bitmap_dump_pseudo_merge_commits(atoi(argv[2])); + if (argc == 3 && !strcmp(argv[1], "dump-pseudo-merge-objects")) + return bitmap_dump_pseudo_merge_objects(atoi(argv[2])); -usage: usage("\ttest-tool bitmap list-commits\n" - "\ttest-tool bitmap dump-hashes"); + "\ttest-tool bitmap dump-hashes\n" + "\ttest-tool bitmap dump-pseudo-merges\n" + "\ttest-tool bitmap dump-pseudo-merge-commits <n>\n" + "\ttest-tool bitmap dump-pseudo-merge-objects <n>"); return -1; }

From patchwork Tue May 21 19:02:59 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13669678
Date: Tue, 21 May 2024 15:02:59 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 25/30] t/test-lib-functions.sh: support `--date` in `test_commit_bulk()`
Message-ID: <890f6c4b9deb9e3bf02aa180c7ad4ced7f7b6a80.1716318089.git.me@ttaylorr.com>

One of the tests we'll want to add for pseudo-merge bitmaps needs to be able to generate a large number of commits at a specific date.

Support the `--date` option (with identical semantics to the `--date` option for `test_commit()`) within `test_commit_bulk` as a prerequisite for that.

Signed-off-by: Taylor Blau --- t/test-lib-functions.sh | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 862d80c9748..16fd585e34b 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -458,6 +458,7 @@ test_commit_bulk () { indir=. 
ref=HEAD n=1 + notick= message='commit %s' filename='%s.t' contents='content %s' @@ -488,6 +489,12 @@ test_commit_bulk () { filename="${1#--*=}-%s.t" contents="${1#--*=} %s" ;; + --date) + notick=yes + GIT_COMMITTER_DATE="$2" + GIT_AUTHOR_DATE="$2" + shift + ;; -*) BUG "invalid test_commit_bulk option: $1" ;; @@ -507,7 +514,10 @@ test_commit_bulk () { while test "$total" -gt 0 do - test_tick && + if test -z "$notick" + then + test_tick + fi && echo "commit $ref" printf 'author %s <%s> %s\n' \ "$GIT_AUTHOR_NAME" \

From patchwork Tue May 21 19:03:03 2024
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13669679
Date: Tue, 21 May 2024 15:03:03 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 26/30] pack-bitmap.c: use pseudo-merges during traversal
Message-ID: <41691824f78818f3c70cad6d02cd7f66d12c68c3.1716318089.git.me@ttaylorr.com>

Now that all of the groundwork has been laid to support reading and using pseudo-merges, make use of that work in this commit by teaching the pack-bitmap machinery to use pseudo-merge(s) when available during traversal.

The basic operation is as follows:

- When enumerating objects on either side of a reachability query, first see if any subset of the roots satisfies some pseudo-merge bitmap. If it does, apply that pseudo-merge bitmap.

- If any pseudo-merge bitmap(s) were applied in the previous step, OR them into the result[^1]. Then repeat the process over all pseudo-merge bitmaps (we'll refer to this as "cascading" pseudo-merges). Once this is done, OR in the resulting bitmap.

- If there is no fill-in traversal to be done, return the bitmap for that side of the reachability query. If there is fill-in traversal, then for each commit we encounter via show_commit(), check to see if any unsatisfied pseudo-merges containing that commit as one of its parents has been made satisfied by the presence of that commit. If so, OR in the object set from that pseudo-merge bitmap, and then cascade. If not, continue traversal. A similar implementation is present in the boundary-based bitmap traversal routines. [^1]: Importantly, we cannot OR in the entire set of roots along with the objects reachable from whatever pseudo-merge bitmaps were satisfied. This may leave some dangling bits corresponding to any unsatisfied root(s) getting OR'd into the resulting bitmap, tricking other parts of the traversal into thinking we already have a reachability closure over those commit(s) when we do not. Signed-off-by: Taylor Blau --- pack-bitmap.c | 112 ++++++++++- t/t5333-pseudo-merge-bitmaps.sh | 323 ++++++++++++++++++++++++++++++++ 2 files changed, 434 insertions(+), 1 deletion(-) create mode 100755 t/t5333-pseudo-merge-bitmaps.sh diff --git a/pack-bitmap.c b/pack-bitmap.c index c13074673af..e61058dada6 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -114,6 +114,9 @@ struct bitmap_index { unsigned int version; }; +static int pseudo_merges_satisfied_nr; +static int pseudo_merges_cascades_nr; + static struct ewah_bitmap *lookup_stored_bitmap(struct stored_bitmap *st) { struct ewah_bitmap *parent; @@ -1006,6 +1009,22 @@ static void show_commit(struct commit *commit UNUSED, { } +static unsigned apply_pseudo_merges_for_commit_1(struct bitmap_index *bitmap_git, + struct bitmap *result, + struct commit *commit, + uint32_t commit_pos) +{ + int ret; + + ret = apply_pseudo_merges_for_commit(&bitmap_git->pseudo_merges, + result, commit, commit_pos); + + if (ret) + pseudo_merges_satisfied_nr += ret; + + return ret; +} + static int add_to_include_set(struct bitmap_index 
*bitmap_git, struct include_data *data, struct commit *commit, @@ -1026,6 +1045,10 @@ static int add_to_include_set(struct bitmap_index *bitmap_git, } bitmap_set(data->base, bitmap_pos); + if (apply_pseudo_merges_for_commit_1(bitmap_git, data->base, commit, + bitmap_pos)) + return 0; + return 1; } @@ -1151,6 +1174,20 @@ static void show_boundary_object(struct object *object UNUSED, BUG("should not be called"); } +static unsigned cascade_pseudo_merges_1(struct bitmap_index *bitmap_git, + struct bitmap *result, + struct bitmap *roots) +{ + int ret = cascade_pseudo_merges(&bitmap_git->pseudo_merges, + result, roots); + if (ret) { + pseudo_merges_cascades_nr++; + pseudo_merges_satisfied_nr += ret; + } + + return ret; +} + static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git, struct rev_info *revs, struct object_list *roots) @@ -1160,6 +1197,7 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git, unsigned int i; unsigned int tmp_blobs, tmp_trees, tmp_tags; int any_missing = 0; + int existing_bitmaps = 0; cb.bitmap_git = bitmap_git; cb.base = bitmap_new(); @@ -1167,6 +1205,25 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git, revs->ignore_missing_links = 1; + if (bitmap_git->pseudo_merges.nr) { + struct bitmap *roots_bitmap = bitmap_new(); + struct object_list *objects = NULL; + + for (objects = roots; objects; objects = objects->next) { + struct object *object = objects->item; + int pos; + + pos = bitmap_position(bitmap_git, &object->oid); + if (pos < 0) + continue; + + bitmap_set(roots_bitmap, pos); + } + + if (!cascade_pseudo_merges_1(bitmap_git, cb.base, roots_bitmap)) + bitmap_free(roots_bitmap); + } + /* * OR in any existing reachability bitmaps among `roots` into * `cb.base`. 
@@ -1178,8 +1235,10 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git, continue; if (add_commit_to_bitmap(bitmap_git, &cb.base, - (struct commit *)object)) + (struct commit *)object)) { + existing_bitmaps = 1; continue; + } any_missing = 1; } @@ -1187,6 +1246,9 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git, if (!any_missing) goto cleanup; + if (existing_bitmaps) + cascade_pseudo_merges_1(bitmap_git, cb.base, NULL); + tmp_blobs = revs->blob_objects; tmp_trees = revs->tree_objects; tmp_tags = revs->blob_objects; @@ -1242,6 +1304,13 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git, return cb.base; } +static void unsatisfy_all_pseudo_merges(struct bitmap_index *bitmap_git) +{ + uint32_t i; + for (i = 0; i < bitmap_git->pseudo_merges.nr; i++) + bitmap_git->pseudo_merges.v[i].satisfied = 0; +} + static struct bitmap *find_objects(struct bitmap_index *bitmap_git, struct rev_info *revs, struct object_list *roots, @@ -1249,9 +1318,32 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git, { struct bitmap *base = NULL; int needs_walk = 0; + unsigned existing_bitmaps = 0; struct object_list *not_mapped = NULL; + unsatisfy_all_pseudo_merges(bitmap_git); + + if (bitmap_git->pseudo_merges.nr) { + struct bitmap *roots_bitmap = bitmap_new(); + struct object_list *objects = NULL; + + for (objects = roots; objects; objects = objects->next) { + struct object *object = objects->item; + int pos; + + pos = bitmap_position(bitmap_git, &object->oid); + if (pos < 0) + continue; + + bitmap_set(roots_bitmap, pos); + } + + base = bitmap_new(); + if (!cascade_pseudo_merges_1(bitmap_git, base, roots_bitmap)) + bitmap_free(roots_bitmap); + } + /* * Go through all the roots for the walk. 
The ones that have bitmaps * on the bitmap index will be `or`ed together to form an initial @@ -1262,11 +1354,21 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git, */ while (roots) { struct object *object = roots->item; + roots = roots->next; + if (base) { + int pos = bitmap_position(bitmap_git, &object->oid); + if (pos > 0 && bitmap_get(base, pos)) { + object->flags |= SEEN; + continue; + } + } + if (object->type == OBJ_COMMIT && add_commit_to_bitmap(bitmap_git, &base, (struct commit *)object)) { object->flags |= SEEN; + existing_bitmaps = 1; continue; } @@ -1282,6 +1384,9 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git, roots = not_mapped; + if (existing_bitmaps) + cascade_pseudo_merges_1(bitmap_git, base, NULL); + /* * Let's iterate through all the roots that don't have bitmaps to * check if we can determine them to be reachable from the existing @@ -1866,6 +1971,11 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, object_list_free(&wants); object_list_free(&haves); + trace2_data_intmax("bitmap", the_repository, "pseudo_merges_satisfied", + pseudo_merges_satisfied_nr); + trace2_data_intmax("bitmap", the_repository, "pseudo_merges_cascades", + pseudo_merges_cascades_nr); + return bitmap_git; cleanup: diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh new file mode 100755 index 00000000000..3a7dc7278a7 --- /dev/null +++ b/t/t5333-pseudo-merge-bitmaps.sh @@ -0,0 +1,323 @@ +#!/bin/sh + +test_description='pseudo-merge bitmaps' + +GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 + +. 
./test-lib.sh + +test_pseudo_merges () { + test-tool bitmap dump-pseudo-merges +} + +test_pseudo_merge_commits () { + test-tool bitmap dump-pseudo-merge-commits "$1" +} + +test_pseudo_merges_satisfied () { + test_trace2_data bitmap pseudo_merges_satisfied "$1" +} + +test_pseudo_merges_cascades () { + test_trace2_data bitmap pseudo_merges_cascades "$1" +} + +tag_everything () { + git rev-list --all --no-object-names >in && + perl -lne ' + print "create refs/tags/" . $. . " " . $1 if /([0-9a-f]+)/ + ' expect && + + : >trace2.txt && + GIT_TRACE2_EVENT=$PWD/trace2.txt \ + git rev-list --count --all --objects --use-bitmap-index >actual && + + test_pseudo_merges_satisfied 0 merges && + test_must_be_empty merges && + test_cmp expect actual +' + +test_expect_success 'pseudo-merges accurately represent their objects' ' + test_config bitmapPseudoMerge.test.pattern "refs/tags/" && + test_config bitmapPseudoMerge.test.maxMerges 8 && + test_config bitmapPseudoMerge.test.stableThreshold never && + + git repack -adb && + + test_pseudo_merges >merges && + test_line_count = 8 merges && + + for i in $(test_seq 0 $(($(wc -l commits && + + git rev-list --objects --no-object-names --stdin expect.raw && + test-tool bitmap dump-pseudo-merge-objects $i >actual.raw && + + sort -u expect && + sort -u actual && + + test_cmp expect actual || return 1 + done +' + +test_expect_success 'bitmap traversal with pseudo-merges' ' + : >trace2.txt && + GIT_TRACE2_EVENT=$PWD/trace2.txt \ + git rev-list --count --all --objects --use-bitmap-index >actual && + git rev-list --count --all --objects >expect && + + test_pseudo_merges_satisfied 8 trace2.txt && + GIT_TRACE2_EVENT=$PWD/trace2.txt \ + git rev-list --count --all --objects --use-bitmap-index >actual && + git rev-list --count --all --objects >expect && + + test_pseudo_merges_satisfied 8 merges && + test_line_count = 1 merges && + test_pseudo_merge_commits 0 >commits && + + test-tool bitmap list-commits >bitmaps && + bitmaps_nr="$(wc -l expect && + + 
test $(cat expect) -eq $(wc -l merges && + test_line_count = 1 merges && + + test_pseudo_merge_commits 0 >oids && + git cat-file --batch commits && + + test $(wc -l in && + git update-ref --stdin merges && + merges_nr="$(wc -l oids && + git cat-file --batch commits && + + expect="$(grep -c "^committer.*$old +0000$" commits)" && + actual="$(wc -l oids && + git cat-file --batch commits && + test $(wc -l err && + + cat >expect <<-EOF && + fatal: pseudo-merge group ${SQ}test${SQ} has unstable threshold before stable one + EOF + + test_cmp expect err +' + +test_expect_success 'pseudo-merge pattern with capture groups' ' + git init pseudo-merge-captures && + ( + cd pseudo-merge-captures && + + test_commit_bulk 128 && + tag_everything && + + for r in $(test_seq 8) + do + test_commit_bulk 16 && + + git rev-list HEAD~16.. >in && + + perl -lne "print \"create refs/remotes/$r/tags/\$. \$_\"" refs && + + test_pseudo_merges >merges && + for m in $(test_seq 0 $(($(wc -l oids && + grep -f oids refs | + perl -lne "print \$1 if /refs\/remotes\/([0-9]+)/" | + sort -u || return 1 + done >remotes && + + test $(wc -l merges && + test_line_count = 2 merges && + + test_pseudo_merge_commits 0 >commits-0.raw && + test_pseudo_merge_commits 1 >commits-1.raw && + + sort commits-0.raw >commits-0 && + sort commits-1.raw >commits-1 && + + comm -12 commits-0 commits-1 >overlap && + + test_line_count -gt 0 overlap + ) +' + +test_expect_success 'pseudo-merge overlap traversal' ' + ( + cd pseudo-merge-overlap && + + : >trace2.txt && + GIT_TRACE2_EVENT=$PWD/trace2.txt \ + git rev-list --count --all --objects --use-bitmap-index >actual && + git rev-list --count --all --objects >expect && + + test_pseudo_merges_satisfied 2 trace2.txt && + GIT_TRACE2_EVENT=$PWD/trace2.txt \ + git rev-list --count --all --objects --use-bitmap-index >actual && + git rev-list --count --all --objects >expect && + + test_pseudo_merges_satisfied 2 X-Patchwork-Id: 13669680 Received: from mail-qk1-f169.google.com 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-43df54d922esm162191431cf.22.2024.05.21.12.03.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 May 2024 12:03:08 -0700 (PDT) Date: Tue, 21 May 2024 15:03:06 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v3 27/30] pack-bitmap: extra trace2 information Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Add some extra trace2 lines to capture the number of bitmap lookups that are hits versus misses, as well as the number of reachability roots that have bitmap coverage (versus those that do not). Signed-off-by: Taylor Blau --- pack-bitmap.c | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index e61058dada6..1966b3b95f1 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -116,6 +116,10 @@ struct bitmap_index { static int pseudo_merges_satisfied_nr; static int pseudo_merges_cascades_nr; +static int existing_bitmaps_hits_nr; +static int existing_bitmaps_misses_nr; +static int roots_with_bitmaps_nr; +static int roots_without_bitmaps_nr; static struct ewah_bitmap *lookup_stored_bitmap(struct stored_bitmap *st) { @@ -1040,10 +1044,14 @@ static int add_to_include_set(struct bitmap_index *bitmap_git, partial = bitmap_for_commit(bitmap_git, commit); if (partial) { + existing_bitmaps_hits_nr++; + bitmap_or_ewah(data->base, partial); return 0; } + existing_bitmaps_misses_nr++; + bitmap_set(data->base, bitmap_pos); if (apply_pseudo_merges_for_commit_1(bitmap_git, data->base, commit, bitmap_pos)) @@ -1099,8 +1107,12 @@ static int add_commit_to_bitmap(struct bitmap_index *bitmap_git, { struct ewah_bitmap *or_with = bitmap_for_commit(bitmap_git, commit); - if (!or_with) + if (!or_with) { + 
existing_bitmaps_misses_nr++; return 0; + } + + existing_bitmaps_hits_nr++; if (!*base) *base = ewah_to_bitmap(or_with); @@ -1407,8 +1419,12 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git, object->flags &= ~UNINTERESTING; add_pending_object(revs, object, ""); needs_walk = 1; + + roots_without_bitmaps_nr++; } else { object->flags |= SEEN; + + roots_with_bitmaps_nr++; } } @@ -1975,6 +1991,14 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, pseudo_merges_satisfied_nr); trace2_data_intmax("bitmap", the_repository, "pseudo_merges_cascades", pseudo_merges_cascades_nr); + trace2_data_intmax("bitmap", the_repository, "bitmap/hits", + existing_bitmaps_hits_nr); + trace2_data_intmax("bitmap", the_repository, "bitmap/misses", + existing_bitmaps_misses_nr); + trace2_data_intmax("bitmap", the_repository, "bitmap/roots_with_bitmap", + roots_with_bitmaps_nr); + trace2_data_intmax("bitmap", the_repository, "bitmap/roots_without_bitmap", + roots_without_bitmaps_nr); return bitmap_git; From patchwork Tue May 21 19:03:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13669681 Received: from mail-qk1-f181.google.com (mail-qk1-f181.google.com [209.85.222.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7E33B149C66 for ; Tue, 21 May 2024 19:03:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318195; cv=none; b=VNy5m81QQKIZPS3oD6Kd/jVoqOhWq54S6AfMnqHELnfzYDOxR4DVbnfGugn6Vyj9lC4N3Vz6gm4gNAirBrP3aH2DGX1bunaL3l/F21MVtjiF5845kL6wP2vmhX+lzFa0Htq1ygqbeCf38Lrooa44v6C5R4Yw6H3MvDQGDHjCi5o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318195; c=relaxed/simple; 
h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=H+TApMOej23rAq8J7X8zVnMzg2eM8MoZahdjZuHHWMo=; b=lSgki6lYrdy4gPCBIdopJs/Hr4TErg7PUtN3brAVPRKUI54K1T4RfHrR50kyn0LHXA PB6c7Kyqsr9Qf2vh9IlCCk9bIRRQGbVXwJCBk1qPvCcCi5mNsvDHlKyTJ1oLDgByjcRI DjjzipOUGPQOfD0PXh5OLVtwKiicpXl83xHxQZ9bcZDVpNM7Y/wv3MgM5OIiSkWKPo6Y MaA0hDrnOWNZUv6/qMuzugVsaEy6XgN6tvjwBb27m4rvJq7OmRVZS8MARQLU99sk4vN/ Dd69X1ciDO8rzUafV88+o6UKGy7qYRU/mSMwAu/LOnBXd8LwHqDpQIsF/8Gi4+7i86wl T0jA== X-Gm-Message-State: AOJu0YwTUuVVl1yDyJ5Fxq8tpq3VxXsAFxDC7C9wj6qtgC7w3yaBknws FhaIx+UmOKAU8/R0TCq+Se4xOiTEKv7G2ah13Vn9fDFDtLLoMJVifhXz40BSa5tV7ZK3sVdEHkz y X-Google-Smtp-Source: AGHT+IF8xgtroXwyNLSCaNMok0dX3hXztlqkeTYjjc8XdY097T/ZRcJMLXz825iFSx1fq52YyIXUWg== X-Received: by 2002:a05:620a:a14:b0:792:c0af:dda1 with SMTP id af79cd13be357-792c75abf30mr3293911185a.32.1716318192574; Tue, 21 May 2024 12:03:12 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-792bf3106c2sm1312022185a.106.2024.05.21.12.03.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 May 2024 12:03:11 -0700 (PDT) Date: Tue, 21 May 2024 15:03:10 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v3 28/30] ewah: `bitmap_equals_ewah()` Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Prepare to reuse existing pseudo-merge bitmaps by implementing a `bitmap_equals_ewah()` helper. This helper will be used to see if a raw bitmap (containing the set of parents for some pseudo-merge) is equal to any existing pseudo-merge's commits bitmap (which are stored as EWAH-compressed bitmaps on disk). 
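To illustrate the comparison described above, here is a minimal sketch, not the code from this patch. It assumes the compressed side can be consumed one 64-bit word at a time; the `word_stream` type and `word_stream_next()` are hypothetical stand-ins for the real EWAH iterator, and `raw_equals_stream()` is an illustrative name. Words missing on either side are treated as zero, so trailing empty words do not affect equality.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for an EWAH iterator: the compressed bitmap is
 * modeled as an array of already-decoded 64-bit words.
 */
struct word_stream {
	const uint64_t *words;
	size_t nr;
	size_t pos;
};

/* Yield the next decoded word; return 0 once the stream is exhausted. */
static int word_stream_next(uint64_t *out, struct word_stream *it)
{
	if (it->pos >= it->nr)
		return 0;
	*out = it->words[it->pos++];
	return 1;
}

/*
 * Return 1 if the raw bitmap (words, word_alloc) has exactly the same
 * bits set as the stream, and 0 otherwise. The first differing word
 * ends the comparison early.
 */
static int raw_equals_stream(const uint64_t *words, size_t word_alloc,
			     struct word_stream *it)
{
	uint64_t word;
	size_t i = 0;

	/* Compare word by word; past the end of the raw bitmap, expect 0. */
	while (word_stream_next(&word, it))
		if (word != (i < word_alloc ? words[i++] : 0))
			return 0;

	/* Any raw words left over must not have bits set. */
	for (; i < word_alloc; i++)
		if (words[i])
			return 0;

	return 1;
}
```

A caller would fill a `word_stream` with the decoded words, set `pos` to 0, and pass the raw bitmap's word array and length; a stream can only be consumed once per comparison.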
Signed-off-by: Taylor Blau --- ewah/bitmap.c | 19 +++++++++++++++++++ ewah/ewok.h | 1 + 2 files changed, 20 insertions(+) diff --git a/ewah/bitmap.c b/ewah/bitmap.c index dc2ca190f12..55928dada86 100644 --- a/ewah/bitmap.c +++ b/ewah/bitmap.c @@ -261,6 +261,25 @@ int bitmap_equals(struct bitmap *self, struct bitmap *other) return 1; } +int bitmap_equals_ewah(struct bitmap *self, struct ewah_bitmap *other) +{ + struct ewah_iterator it; + eword_t word; + size_t i = 0; + + ewah_iterator_init(&it, other); + + while (ewah_iterator_next(&word, &it)) + if (word != (i < self->word_alloc ? self->words[i++] : 0)) + return 0; + + for (; i < self->word_alloc; i++) + if (self->words[i]) + return 0; + + return 1; +} + int bitmap_is_subset(struct bitmap *self, struct bitmap *other) { size_t common_size, i; diff --git a/ewah/ewok.h b/ewah/ewok.h index 7074a6347b7..5e357e24933 100644 --- a/ewah/ewok.h +++ b/ewah/ewok.h @@ -179,6 +179,7 @@ void bitmap_unset(struct bitmap *self, size_t pos); int bitmap_get(struct bitmap *self, size_t pos); void bitmap_free(struct bitmap *self); int bitmap_equals(struct bitmap *self, struct bitmap *other); +int bitmap_equals_ewah(struct bitmap *self, struct ewah_bitmap *other); /* * Both `bitmap_is_subset()` and `ewah_bitmap_is_subset()` return 1 if the set From patchwork Tue May 21 19:03:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13669682 Received: from mail-oi1-f180.google.com (mail-oi1-f180.google.com [209.85.167.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 143311494A8 for ; Tue, 21 May 2024 19:03:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716318199; cv=none; 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-79484010bcdsm231672785a.133.2024.05.21.12.03.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 May 2024 12:03:15 -0700 (PDT) Date: Tue, 21 May 2024 15:03:14 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v3 29/30] pseudo-merge: implement support for finding existing merges Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: This patch implements support for reusing existing pseudo-merge commits when writing bitmaps, provided that an existing pseudo-merge bitmap has exactly the same set of parents as one that we are about to write. Note that unstable pseudo-merges are likely to change between consecutive repacks, and so are generally poor candidates for reuse. However, stable pseudo-merges (see the configuration option 'bitmapPseudoMerge.<name>.stableThreshold') are by definition unlikely to change between runs (as they represent long-running branches). Because there is no index from a *set* of pseudo-merge parents to a matching pseudo-merge bitmap, we have to construct the bitmap corresponding to the set of parents for each pending pseudo-merge commit and see if a matching bitmap exists.
This is technically quadratic in the number of pseudo-merges, but is OK
in practice for a couple of reasons:

  - non-matching pseudo-merge bitmaps are rejected quickly as soon as
    they differ in a single bit

  - already-matched pseudo-merge bitmaps are discarded from subsequent
    rounds of search

  - the number of pseudo-merges is generally small, even for large
    repositories

In order to do this, implement (a) a function that finds a matching
pseudo-merge given some uncompressed bitset describing its parents, (b)
a function that computes the bitset of parents for a given pseudo-merge
commit, and (c) a call to that function before computing the set of
reachable objects for some pending pseudo-merge.

Signed-off-by: Taylor Blau
---
 pack-bitmap-write.c             | 15 ++++++--
 pack-bitmap.c                   | 32 +++++++++++++++++
 pack-bitmap.h                   |  2 ++
 pseudo-merge.c                  | 55 ++++++++++++++++++++++++++++
 pseudo-merge.h                  |  7 ++++
 t/t5333-pseudo-merge-bitmaps.sh | 64 +++++++++++++++++++++++++++++++++
 6 files changed, 173 insertions(+), 2 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 47250398aa2..6e8060f8a0b 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -19,6 +19,10 @@
 #include "tree-walk.h"
 #include "pseudo-merge.h"
 #include "oid-array.h"
+#include "config.h"
+#include "alloc.h"
+#include "refs.h"
+#include "strmap.h"
 
 struct bitmapped_commit {
 	struct commit *commit;
@@ -465,6 +469,7 @@ static int fill_bitmap_tree(struct bitmap_writer *writer,
 }
 
 static int reused_bitmaps_nr;
+static int reused_pseudo_merge_bitmaps_nr;
 
 static int fill_bitmap_commit(struct bitmap_writer *writer,
 			      struct bb_commit *ent,
@@ -490,7 +495,7 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 			struct bitmap *remapped = bitmap_new();
 
 			if (commit->object.flags & BITMAP_PSEUDO_MERGE)
-				old = NULL;
+				old = pseudo_merge_bitmap_for_commit(old_bitmap, c);
 			else
 				old = bitmap_for_commit(old_bitmap, c);
 			/*
@@ -501,7 +506,10 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
 			if (old && !rebuild_bitmap(mapping, old, remapped)) {
 				bitmap_or(ent->bitmap, remapped);
 				bitmap_free(remapped);
-				reused_bitmaps_nr++;
+				if (commit->object.flags & BITMAP_PSEUDO_MERGE)
+					reused_pseudo_merge_bitmaps_nr++;
+				else
+					reused_bitmaps_nr++;
 				continue;
 			}
 			bitmap_free(remapped);
@@ -631,6 +639,9 @@ int bitmap_writer_build(struct bitmap_writer *writer,
 			    the_repository);
 	trace2_data_intmax("pack-bitmap-write", the_repository,
 			   "building_bitmaps_reused", reused_bitmaps_nr);
+	trace2_data_intmax("pack-bitmap-write", the_repository,
+			   "building_bitmaps_pseudo_merge_reused",
+			   reused_pseudo_merge_bitmaps_nr);
 
 	stop_progress(&writer->progress);
 
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 1966b3b95f1..70230e26479 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1316,6 +1316,37 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git,
 	return cb.base;
 }
 
+struct ewah_bitmap *pseudo_merge_bitmap_for_commit(struct bitmap_index *bitmap_git,
+						   struct commit *commit)
+{
+	struct commit_list *p;
+	struct bitmap *parents;
+	struct pseudo_merge *match = NULL;
+
+	if (!bitmap_git->pseudo_merges.nr)
+		return NULL;
+
+	parents = bitmap_new();
+
+	for (p = commit->parents; p; p = p->next) {
+		int pos = bitmap_position(bitmap_git, &p->item->object.oid);
+		if (pos < 0 || pos >= bitmap_num_objects(bitmap_git))
+			goto done;
+
+		bitmap_set(parents, pos);
+	}
+
+	match = pseudo_merge_for_parents(&bitmap_git->pseudo_merges,
+					 parents);
+
+done:
+	bitmap_free(parents);
+	if (match)
+		return pseudo_merge_bitmap(&bitmap_git->pseudo_merges, match);
+
+	return NULL;
+}
+
 static void unsatisfy_all_pseudo_merges(struct bitmap_index *bitmap_git)
 {
 	uint32_t i;
@@ -2809,6 +2840,7 @@ void free_bitmap_index(struct bitmap_index *b)
 		 */
 		close_midx_revindex(b->midx);
 	}
+	free_pseudo_merge_map(&b->pseudo_merges);
 	free(b);
 }
 
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 4466b5ad0fb..1171e6d9893 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -142,6 +142,8 @@ int rebuild_bitmap(const uint32_t *reposition,
 		   struct bitmap *dest);
 struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
 				      struct commit *commit);
+struct ewah_bitmap *pseudo_merge_bitmap_for_commit(struct bitmap_index *bitmap_git,
+						   struct commit *commit);
 void bitmap_writer_select_commits(struct bitmap_writer *writer,
 				  struct commit **indexed_commits,
 				  unsigned int indexed_commits_nr);
diff --git a/pseudo-merge.c b/pseudo-merge.c
index 0f50ac6183e..36a617f64e6 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -695,3 +695,58 @@ int cascade_pseudo_merges(const struct pseudo_merge_map *pm,
 
 	return ret;
 }
+
+struct pseudo_merge *pseudo_merge_for_parents(const struct pseudo_merge_map *pm,
+					      struct bitmap *parents)
+{
+	struct pseudo_merge *match = NULL;
+	size_t i;
+
+	if (!pm->nr)
+		return NULL;
+
+	/*
+	 * NOTE: this loop is quadratic in the worst-case (where no
+	 * matching pseudo-merge bitmaps are found), but in practice
+	 * this is OK for a few reasons:
+	 *
+	 *   - Rejecting pseudo-merge bitmaps that do not match the
+	 *     given commit is done quickly (i.e. `bitmap_equals_ewah()`
+	 *     returns early when we know the two bitmaps aren't equal).
+	 *
+	 *   - Already matched pseudo-merge bitmaps (which we track with
+	 *     the `->satisfied` bit here) are skipped as potential
+	 *     candidates.
+	 *
+	 *   - The number of pseudo-merges should be small (in the
+	 *     hundreds for most repositories).
+	 *
+	 * If in the future this semi-quadratic behavior does become a
+	 * problem, another approach would be to keep track of which
+	 * pseudo-merges are still "viable" after enumerating the
+	 * pseudo-merge commit's parents:
+	 *
+	 *   - A pseudo-merge bitmap becomes non-viable when the bit(s)
+	 *     corresponding to one or more parent(s) of the given
+	 *     commit are not set in a candidate pseudo-merge's commits
+	 *     bitmap.
+	 *
+	 *   - After processing all bits, enumerate the remaining set of
+	 *     viable pseudo-merge bitmaps, and check that their
+	 *     popcount() matches the number of parents in the given
+	 *     commit.
+	 */
+	for (i = 0; i < pm->nr; i++) {
+		struct pseudo_merge *candidate = use_pseudo_merge(pm, &pm->v[i]);
+		if (!candidate || candidate->satisfied)
+			continue;
+		if (!bitmap_equals_ewah(parents, candidate->commits))
+			continue;
+
+		match = candidate;
+		match->satisfied = 1;
+		break;
+	}
+
+	return match;
+}
diff --git a/pseudo-merge.h b/pseudo-merge.h
index c00b622be4b..62fde979015 100644
--- a/pseudo-merge.h
+++ b/pseudo-merge.h
@@ -206,4 +206,11 @@ int cascade_pseudo_merges(const struct pseudo_merge_map *pm,
 			  struct bitmap *result,
 			  struct bitmap *roots);
 
+/*
+ * Returns a pseudo-merge which contains the exact set of commits
+ * listed in the "parents" bitmap, or NULL if none could be found.
+ */
+struct pseudo_merge *pseudo_merge_for_parents(const struct pseudo_merge_map *pm,
+					      struct bitmap *parents);
+
 #endif
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index 3a7dc7278a7..7ae4b7a35b7 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -22,6 +22,10 @@ test_pseudo_merges_cascades () {
 	test_trace2_data bitmap pseudo_merges_cascades "$1"
 }
 
+test_pseudo_merges_reused () {
+	test_trace2_data pack-bitmap-write building_bitmaps_pseudo_merge_reused "$1"
+}
+
 tag_everything () {
 	git rev-list --all --no-object-names >in &&
 	perl -lne '
@@ -320,4 +324,64 @@ test_expect_success 'pseudo-merge overlap stale traversal' '
 	)
 '
 
+test_expect_success 'pseudo-merge reuse' '
+	git init pseudo-merge-reuse &&
+	(
+		cd pseudo-merge-reuse &&
+
+		stable="1641013200" && # 2022-01-01
+		unstable="1672549200" && # 2023-01-01
+
+		for date in $stable $unstable
+		do
+			test_commit_bulk --date "$date +0000" 128 &&
+			test_tick || return 1
+		done &&
+
+		tag_everything &&
+
+		git \
+			-c bitmapPseudoMerge.test.pattern="refs/tags/" \
+			-c bitmapPseudoMerge.test.maxMerges=1 \
+			-c bitmapPseudoMerge.test.threshold=now \
+			-c bitmapPseudoMerge.test.stableThreshold=$(($unstable - 1)) \
+			-c bitmapPseudoMerge.test.stableSize=512 \
+			repack -adb &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 2 merges &&
+
+		test_pseudo_merge_commits 0 >stable-oids.before &&
+		test_pseudo_merge_commits 1 >unstable-oids.before &&
+
+		: >trace2.txt &&
+		GIT_TRACE2_EVENT=$PWD/trace2.txt git \
+			-c bitmapPseudoMerge.test.pattern="refs/tags/" \
+			-c bitmapPseudoMerge.test.maxMerges=2 \
+			-c bitmapPseudoMerge.test.threshold=now \
+			-c bitmapPseudoMerge.test.stableThreshold=$(($unstable - 1)) \
+			-c bitmapPseudoMerge.test.stableSize=512 \
+			repack -adb &&
+
+		test_pseudo_merges_reused 1 <trace2.txt &&
+
+		test_pseudo_merges >merges &&
+		test_line_count = 3 merges &&
+
+		test_pseudo_merge_commits 0 >stable-oids.after &&
+		for i in 1 2
+		do
+			test_pseudo_merge_commits $i || return 1
+		done >unstable-oids.after &&
+
+		sort -u <stable-oids.before >expect &&
+		sort -u <stable-oids.after >actual &&
+		test_cmp expect actual &&
+
+		sort -u <unstable-oids.before >expect &&
+		sort -u <unstable-oids.after >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done

From patchwork Tue May 21 19:03:17 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13669683
Date: Tue, 21 May 2024 15:03:17 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Patrick Steinhardt, Junio C Hamano
Subject: [PATCH v3 30/30] t/perf: implement performance tests for pseudo-merge bitmaps
Message-ID: <6a6d88fa512ba344543f5f0df33d5a61e406f3db.1716318089.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

Implement a straightforward performance test demonstrating the benefit
of pseudo-merge bitmaps by measuring how long it takes to count
reachable objects in a few different scenarios:

  - without bitmaps, to demonstrate a reasonable baseline
  - with bitmaps, but without pseudo-merges
  - with bitmaps and pseudo-merges

Results from running this test on git.git are as follows:

    Test                                                                this tree
    -----------------------------------------------------------------------------------
    5333.2: git rev-list --count --all --objects (no bitmaps)           3.46(3.37+0.09)
    5333.3: git rev-list --count --all --objects (no pseudo-merges)     0.13(0.11+0.01)
    5333.4: git rev-list --count --all --objects (with pseudo-merges)   0.12(0.11+0.01)

Signed-off-by: Taylor Blau
---
 t/perf/p5333-pseudo-merge-bitmaps.sh | 32 ++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100755 t/perf/p5333-pseudo-merge-bitmaps.sh

diff --git a/t/perf/p5333-pseudo-merge-bitmaps.sh b/t/perf/p5333-pseudo-merge-bitmaps.sh
new file mode 100755
index 00000000000..4bec409d10e
--- /dev/null
+++ b/t/perf/p5333-pseudo-merge-bitmaps.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description='pseudo-merge bitmaps'
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+test_expect_success 'setup' '
+	git \
+		-c bitmapPseudoMerge.all.pattern="refs/" \
+		-c bitmapPseudoMerge.all.threshold=now \
+		-c bitmapPseudoMerge.all.stableThreshold=never \
+		-c bitmapPseudoMerge.all.maxMerges=64 \
+		-c pack.writeBitmapLookupTable=true \
+		repack -adb
+'
+
+test_perf 'git rev-list --count --all --objects (no bitmaps)' '
+	git rev-list --objects --all
+'
+
+test_perf 'git rev-list --count --all --objects (no pseudo-merges)' '
+	GIT_TEST_USE_PSEDUO_MERGES=0 \
+		git rev-list --objects --all --use-bitmap-index
+'
+
+test_perf 'git rev-list --count --all --objects (with pseudo-merges)' '
+	GIT_TEST_USE_PSEDUO_MERGES=1 \
+		git rev-list --objects --all --use-bitmap-index
+'
+
+test_done
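
As a reading aid for the pseudo-merge reuse patch earlier in this
thread, here is a minimal, self-contained sketch of the matching loop
that its commit message describes: find an old pseudo-merge whose
commit set equals the new pseudo-merge's parent set exactly, and never
hand out the same old pseudo-merge twice. This is not git's code.
`toy_pseudo_merge`, `toy_set()`, `toy_match()`, `toy_demo()` and
`NWORDS` are hypothetical names, plain `uint64_t` words stand in for
git's `struct bitmap` and EWAH bitmaps, and `memcmp()` stands in for
`bitmap_equals_ewah()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NWORDS 2	/* 128 bit positions; an arbitrary size for this sketch */

/* Hypothetical stand-in for git's `struct pseudo_merge`. */
struct toy_pseudo_merge {
	uint64_t commits[NWORDS];	/* bit i set: object at position i is a parent */
	unsigned satisfied : 1;		/* already claimed by an earlier lookup */
};

/* Set the bit at position "pos" (must be below NWORDS * 64). */
static void toy_set(uint64_t *bits, unsigned pos)
{
	bits[pos / 64] |= (uint64_t)1 << (pos % 64);
}

/*
 * Return the first unclaimed candidate whose commit set equals
 * "parents" exactly and mark it as claimed, or NULL if there is none.
 * One lookup is linear in "nr"; one lookup per new pseudo-merge is
 * what makes the overall search quadratic in the worst case.
 */
static struct toy_pseudo_merge *toy_match(struct toy_pseudo_merge *v,
					  size_t nr, const uint64_t *parents)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (v[i].satisfied)
			continue;	/* matched in an earlier round */
		if (memcmp(v[i].commits, parents, sizeof(v[i].commits)))
			continue;	/* differs in at least one bit */
		v[i].satisfied = 1;
		return &v[i];
	}
	return NULL;
}

/* Exercise an exact match, a repeated lookup, and a superset. */
static void toy_demo(void)
{
	struct toy_pseudo_merge old[2];
	uint64_t parents[NWORDS] = { 0 };

	memset(old, 0, sizeof(old));
	toy_set(old[0].commits, 1);
	toy_set(old[0].commits, 70);
	toy_set(old[1].commits, 3);

	toy_set(parents, 1);
	toy_set(parents, 70);

	assert(toy_match(old, 2, parents) == &old[0]);	/* exact match */
	assert(toy_match(old, 2, parents) == NULL);	/* already claimed */

	toy_set(parents, 3);
	assert(toy_match(old, 2, parents) == NULL);	/* superset: no match */
}
```

Calling `toy_demo()` from a `main()` runs the three cases. The
`satisfied` flag is what keeps a second new pseudo-merge from reusing
an old bitmap that has already been matched, at the cost of requiring
the flags to be reset before the same array is searched again.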