From patchwork Mon Mar 25 17:24:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13602553
Date: Mon, 25 Mar 2024 13:24:18 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Junio C Hamano
Subject: [PATCH 01/11] midx-write: initial commit
X-Mailing-List: git@vger.kernel.org

Introduce a new empty midx-write.c source file. Similar to the
relationship between "pack-bitmap.c" and "pack-bitmap-write.c", this
source file will hold code that is specific to writing MIDX files as
opposed to reading them (the latter will remain in midx.c).

This is a preparatory step which will reduce the overall size of midx.c
and make it easier to read as we prepare for future changes to that file
(outside the immediate scope of this series).

Signed-off-by: Taylor Blau
---
 Makefile     | 1 +
 midx-write.c | 2 ++
 2 files changed, 3 insertions(+)
 create mode 100644 midx-write.c

diff --git a/Makefile b/Makefile
index 4e255c81f2..cf44a964c0 100644
--- a/Makefile
+++ b/Makefile
@@ -1072,6 +1072,7 @@ LIB_OBJS += merge-ort-wrappers.o
 LIB_OBJS += merge-recursive.o
 LIB_OBJS += merge.o
 LIB_OBJS += midx.o
+LIB_OBJS += midx-write.o
 LIB_OBJS += name-hash.o
 LIB_OBJS += negotiator/default.o
 LIB_OBJS += negotiator/noop.o
diff --git a/midx-write.c b/midx-write.c
new file mode 100644
index 0000000000..214179d308
--- /dev/null
+++ b/midx-write.c
@@ -0,0 +1,2 @@
+#include "git-compat-util.h"
+#include "midx.h"

From patchwork Mon Mar 25 17:24:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13602554
Date: Mon, 25 Mar 2024 13:24:22 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Junio C Hamano
Subject: [PATCH 02/11] midx: extern a pair of shared functions
X-Mailing-List: git@vger.kernel.org

The following commits will incrementally move code specific to writing
MIDXs from midx.c to their new home in midx-write.c.

Prepare for that by externing a pair of functions which:

  - will (temporarily) need to be called from both midx.c and
    midx-write.c, but

  - are implementation details that should not be exposed via the
    midx.h header.

Declare these functions as extern within midx-write.c, and introduce a
similar (non-extern) declaration within midx.c. This change will be
effectively reverted by the end of this series.

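For readers unfamiliar with the pattern described above, here is a
minimal single-file sketch of it. It is an illustration only and not
code from this series: `shared_helper()` and `caller()` are hypothetical
names standing in for the shared functions and their callers. The sketch
assumes nothing beyond standard C linkage rules: a file-scope `extern`
declaration, a plain prototype, and a non-static definition all name the
same externally linked function, so no header entry is needed.

```c
/*
 * Stand-in for midx-write.c: declare, without going through a header,
 * a function that is defined in another translation unit.
 * "shared_helper" is a hypothetical name used for illustration.
 */
extern int shared_helper(int x);

/* A caller that only sees the extern declaration above. */
int caller(int x)
{
	return shared_helper(x) + 1;
}

/*
 * Stand-in for midx.c: a plain (non-extern) prototype near the top of
 * the file, followed later by the definition with "static" dropped so
 * that the symbol has external linkage.
 */
int shared_helper(int x);

int shared_helper(int x)
{
	return x * 2;
}
```

In a real build the two halves would live in separate .c files and be
resolved by the linker. The cost of skipping the header is that the
compiler cannot check the two declarations against each other across
files, which is one reason to keep such an arrangement temporary.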
Signed-off-by: Taylor Blau
---
 midx-write.c | 10 ++++++++++
 midx.c       | 24 +++++++++++++++++-------
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/midx-write.c b/midx-write.c
index 214179d308..4aab273243 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -1,2 +1,12 @@
 #include "git-compat-util.h"
 #include "midx.h"
+
+extern int write_midx_internal(const char *object_dir,
+			       struct string_list *packs_to_include,
+			       struct string_list *packs_to_drop,
+			       const char *preferred_pack_name,
+			       const char *refs_snapshot,
+			       unsigned flags);
+
+extern struct multi_pack_index *lookup_multi_pack_index(struct repository *r,
+							const char *object_dir);
diff --git a/midx.c b/midx.c
index 85e1c2cd12..5f22f01716 100644
--- a/midx.c
+++ b/midx.c
@@ -23,6 +23,16 @@
 #include "list-objects.h"
 #include "pack-revindex.h"
 
+struct multi_pack_index *lookup_multi_pack_index(struct repository *r,
+						 const char *object_dir);
+
+int write_midx_internal(const char *object_dir,
+			struct string_list *packs_to_include,
+			struct string_list *packs_to_drop,
+			const char *preferred_pack_name,
+			const char *refs_snapshot,
+			unsigned flags);
+
 #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */
 #define MIDX_VERSION 1
 #define MIDX_BYTE_FILE_VERSION 4
@@ -1347,7 +1357,7 @@ static int write_midx_bitmap(const char *midx_name,
 	return ret;
 }
 
-static struct multi_pack_index *lookup_multi_pack_index(struct repository *r,
+struct multi_pack_index *lookup_multi_pack_index(struct repository *r,
 							const char *object_dir)
 {
 	struct multi_pack_index *result = NULL;
@@ -1372,12 +1382,12 @@ static struct multi_pack_index *lookup_multi_pack_index(struct repository *r,
 	return result;
 }
 
-static int write_midx_internal(const char *object_dir,
-			       struct string_list *packs_to_include,
-			       struct string_list *packs_to_drop,
-			       const char *preferred_pack_name,
-			       const char *refs_snapshot,
-			       unsigned flags)
+int write_midx_internal(const char *object_dir,
+			struct string_list *packs_to_include,
+			struct string_list *packs_to_drop,
+			const char *preferred_pack_name,
+			const char *refs_snapshot,
+			unsigned flags)
 {
 	struct strbuf midx_name = STRBUF_INIT;
 	unsigned char midx_hash[GIT_MAX_RAWSZ];

From patchwork Mon Mar 25 17:24:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13602555
Date: Mon, 25 Mar 2024 13:24:25 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Junio C Hamano
Subject: [PATCH 03/11] midx: move `midx_repack` (and related functions) to midx-write.c
Message-ID: <487a0ccda8c781a4e7cfdd14d32b0466a867ddff.1711387439.git.me@ttaylorr.com>
X-Mailing-List: git@vger.kernel.org

Move `midx_repack()`, the main function which implements the sub-command
'git multi-pack-index repack' into midx-write.c.

This patch does not introduce any behavioral changes and is best viewed
with `--color-moved`.

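As a reading aid for the moved code (not part of the patch): the batch
selection in `fill_included_packs_batch()` estimates the repacked size
of each pack by scaling its on-disk size by the fraction of its objects
that the MIDX still references. Packs whose estimate is at least the
batch size are skipped, and selection stops once the running total
reaches the batch size. A minimal sketch of that arithmetic follows. The
helper name `estimate_repacked_size()` is hypothetical, and the
saturating overflow check is an assumption standing in for Git's
`st_mult()`, which dies on overflow instead.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: estimate how many bytes of a pack would be
 * carried over into a repack, given how many of its objects are
 * referenced. Mirrors "pack_size * referenced_objects / num_objects".
 */
size_t estimate_repacked_size(size_t pack_size,
			      uint32_t referenced_objects,
			      uint32_t num_objects)
{
	/* The moved code skips packs with no objects; return 0 here. */
	if (!num_objects)
		return 0;

	/* Saturate on overflow (assumption; st_mult() would die()). */
	if (referenced_objects && pack_size > SIZE_MAX / referenced_objects)
		return SIZE_MAX;

	return pack_size * referenced_objects / num_objects;
}
```

For example, a 1000-byte pack with 25 of its 100 objects referenced is
estimated at 250 bytes, so it is a candidate for any batch size larger
than 250.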
Signed-off-by: Taylor Blau
---
 midx-write.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++
 midx.c       | 196 -------------------------------------------------
 2 files changed, 202 insertions(+), 196 deletions(-)

diff --git a/midx-write.c b/midx-write.c
index 4aab273243..6dd58be7e0 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -1,5 +1,11 @@
 #include "git-compat-util.h"
+#include "config.h"
+#include "hex.h"
+#include "packfile.h"
 #include "midx.h"
+#include "run-command.h"
+#include "pack-bitmap.h"
+#include "revision.h"
 
 extern int write_midx_internal(const char *object_dir,
 			       struct string_list *packs_to_include,
@@ -10,3 +16,199 @@ extern int write_midx_internal(const char *object_dir,
 
 extern struct multi_pack_index *lookup_multi_pack_index(struct repository *r,
 							const char *object_dir);
+
+struct repack_info {
+	timestamp_t mtime;
+	uint32_t referenced_objects;
+	uint32_t pack_int_id;
+};
+
+static int compare_by_mtime(const void *a_, const void *b_)
+{
+	const struct repack_info *a, *b;
+
+	a = (const struct repack_info *)a_;
+	b = (const struct repack_info *)b_;
+
+	if (a->mtime < b->mtime)
+		return -1;
+	if (a->mtime > b->mtime)
+		return 1;
+	return 0;
+}
+
+static int fill_included_packs_all(struct repository *r,
+				   struct multi_pack_index *m,
+				   unsigned char *include_pack)
+{
+	uint32_t i, count = 0;
+	int pack_kept_objects = 0;
+
+	repo_config_get_bool(r, "repack.packkeptobjects", &pack_kept_objects);
+
+	for (i = 0; i < m->num_packs; i++) {
+		if (prepare_midx_pack(r, m, i))
+			continue;
+		if (!pack_kept_objects && m->packs[i]->pack_keep)
+			continue;
+		if (m->packs[i]->is_cruft)
+			continue;
+
+		include_pack[i] = 1;
+		count++;
+	}
+
+	return count < 2;
+}
+
+static int fill_included_packs_batch(struct repository *r,
+				     struct multi_pack_index *m,
+				     unsigned char *include_pack,
+				     size_t batch_size)
+{
+	uint32_t i, packs_to_repack;
+	size_t total_size;
+	struct repack_info *pack_info;
+	int pack_kept_objects = 0;
+
+	CALLOC_ARRAY(pack_info, m->num_packs);
+
+	repo_config_get_bool(r, "repack.packkeptobjects", &pack_kept_objects);
+
+	for (i = 0; i < m->num_packs; i++) {
+		pack_info[i].pack_int_id = i;
+
+		if (prepare_midx_pack(r, m, i))
+			continue;
+
+		pack_info[i].mtime = m->packs[i]->mtime;
+	}
+
+	for (i = 0; i < m->num_objects; i++) {
+		uint32_t pack_int_id = nth_midxed_pack_int_id(m, i);
+		pack_info[pack_int_id].referenced_objects++;
+	}
+
+	QSORT(pack_info, m->num_packs, compare_by_mtime);
+
+	total_size = 0;
+	packs_to_repack = 0;
+	for (i = 0; total_size < batch_size && i < m->num_packs; i++) {
+		int pack_int_id = pack_info[i].pack_int_id;
+		struct packed_git *p = m->packs[pack_int_id];
+		size_t expected_size;
+
+		if (!p)
+			continue;
+		if (!pack_kept_objects && p->pack_keep)
+			continue;
+		if (p->is_cruft)
+			continue;
+		if (open_pack_index(p) || !p->num_objects)
+			continue;
+
+		expected_size = st_mult(p->pack_size,
+					pack_info[i].referenced_objects);
+		expected_size /= p->num_objects;
+
+		if (expected_size >= batch_size)
+			continue;
+
+		packs_to_repack++;
+		total_size += expected_size;
+		include_pack[pack_int_id] = 1;
+	}
+
+	free(pack_info);
+
+	if (packs_to_repack < 2)
+		return 1;
+
+	return 0;
+}
+
+int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags)
+{
+	int result = 0;
+	uint32_t i;
+	unsigned char *include_pack;
+	struct child_process cmd = CHILD_PROCESS_INIT;
+	FILE *cmd_in;
+	struct strbuf base_name = STRBUF_INIT;
+	struct multi_pack_index *m = lookup_multi_pack_index(r, object_dir);
+
+	/*
+	 * When updating the default for these configuration
+	 * variables in builtin/repack.c, these must be adjusted
+	 * to match.
+	 */
+	int delta_base_offset = 1;
+	int use_delta_islands = 0;
+
+	if (!m)
+		return 0;
+
+	CALLOC_ARRAY(include_pack, m->num_packs);
+
+	if (batch_size) {
+		if (fill_included_packs_batch(r, m, include_pack, batch_size))
+			goto cleanup;
+	} else if (fill_included_packs_all(r, m, include_pack))
+		goto cleanup;
+
+	repo_config_get_bool(r, "repack.usedeltabaseoffset", &delta_base_offset);
+	repo_config_get_bool(r, "repack.usedeltaislands", &use_delta_islands);
+
+	strvec_push(&cmd.args, "pack-objects");
+
+	strbuf_addstr(&base_name, object_dir);
+	strbuf_addstr(&base_name, "/pack/pack");
+	strvec_push(&cmd.args, base_name.buf);
+
+	if (delta_base_offset)
+		strvec_push(&cmd.args, "--delta-base-offset");
+	if (use_delta_islands)
+		strvec_push(&cmd.args, "--delta-islands");
+
+	if (flags & MIDX_PROGRESS)
+		strvec_push(&cmd.args, "--progress");
+	else
+		strvec_push(&cmd.args, "-q");
+
+	strbuf_release(&base_name);
+
+	cmd.git_cmd = 1;
+	cmd.in = cmd.out = -1;
+
+	if (start_command(&cmd)) {
+		error(_("could not start pack-objects"));
+		result = 1;
+		goto cleanup;
+	}
+
+	cmd_in = xfdopen(cmd.in, "w");
+
+	for (i = 0; i < m->num_objects; i++) {
+		struct object_id oid;
+		uint32_t pack_int_id = nth_midxed_pack_int_id(m, i);
+
+		if (!include_pack[pack_int_id])
+			continue;
+
+		nth_midxed_object_oid(&oid, m, i);
+		fprintf(cmd_in, "%s\n", oid_to_hex(&oid));
+	}
+	fclose(cmd_in);
+
+	if (finish_command(&cmd)) {
+		error(_("could not finish pack-objects"));
+		result = 1;
+		goto cleanup;
+	}
+
+	result = write_midx_internal(object_dir, NULL, NULL, NULL, NULL, flags);
+
+cleanup:
+	free(include_pack);
+	return result;
+}
diff --git a/midx.c b/midx.c
index 5f22f01716..3bd8c58642 100644
--- a/midx.c
+++ b/midx.c
@@ -2055,199 +2055,3 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla
 
 	return result;
 }
-
-struct repack_info {
-	timestamp_t mtime;
-	uint32_t referenced_objects;
-	uint32_t pack_int_id;
-};
-
-static int compare_by_mtime(const void *a_, const void *b_)
-{
-	const struct repack_info *a, *b;
-
-	a = (const struct repack_info *)a_;
-	b = (const struct repack_info *)b_;
-
-	if (a->mtime < b->mtime)
-		return -1;
-	if (a->mtime > b->mtime)
-		return 1;
-	return 0;
-}
-
-static int fill_included_packs_all(struct repository *r,
-				   struct multi_pack_index *m,
-				   unsigned char *include_pack)
-{
-	uint32_t i, count = 0;
-	int pack_kept_objects = 0;
-
-	repo_config_get_bool(r, "repack.packkeptobjects", &pack_kept_objects);
-
-	for (i = 0; i < m->num_packs; i++) {
-		if (prepare_midx_pack(r, m, i))
-			continue;
-		if (!pack_kept_objects && m->packs[i]->pack_keep)
-			continue;
-		if (m->packs[i]->is_cruft)
-			continue;
-
-		include_pack[i] = 1;
-		count++;
-	}
-
-	return count < 2;
-}
-
-static int fill_included_packs_batch(struct repository *r,
-				     struct multi_pack_index *m,
-				     unsigned char *include_pack,
-				     size_t batch_size)
-{
-	uint32_t i, packs_to_repack;
-	size_t total_size;
-	struct repack_info *pack_info;
-	int pack_kept_objects = 0;
-
-	CALLOC_ARRAY(pack_info, m->num_packs);
-
-	repo_config_get_bool(r, "repack.packkeptobjects", &pack_kept_objects);
-
-	for (i = 0; i < m->num_packs; i++) {
-		pack_info[i].pack_int_id = i;
-
-		if (prepare_midx_pack(r, m, i))
-			continue;
-
-		pack_info[i].mtime = m->packs[i]->mtime;
-	}
-
-	for (i = 0; i < m->num_objects; i++) {
-		uint32_t pack_int_id = nth_midxed_pack_int_id(m, i);
-		pack_info[pack_int_id].referenced_objects++;
-	}
-
-	QSORT(pack_info, m->num_packs, compare_by_mtime);
-
-	total_size = 0;
-	packs_to_repack = 0;
-	for (i = 0; total_size < batch_size && i < m->num_packs; i++) {
-		int pack_int_id = pack_info[i].pack_int_id;
-		struct packed_git *p = m->packs[pack_int_id];
-		size_t expected_size;
-
-		if (!p)
-			continue;
-		if (!pack_kept_objects && p->pack_keep)
-			continue;
-		if (p->is_cruft)
-			continue;
-		if (open_pack_index(p) || !p->num_objects)
-			continue;
-
-		expected_size = st_mult(p->pack_size,
-					pack_info[i].referenced_objects);
-		expected_size /= p->num_objects;
-
-		if (expected_size >= batch_size)
-			continue;
-
-		packs_to_repack++;
-		total_size += expected_size;
-		include_pack[pack_int_id] = 1;
-	}
-
-	free(pack_info);
-
-	if (packs_to_repack < 2)
-		return 1;
-
-	return 0;
-}
-
-int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags)
-{
-	int result = 0;
-	uint32_t i;
-	unsigned char *include_pack;
-	struct child_process cmd = CHILD_PROCESS_INIT;
-	FILE *cmd_in;
-	struct strbuf base_name = STRBUF_INIT;
-	struct multi_pack_index *m = lookup_multi_pack_index(r, object_dir);
-
-	/*
-	 * When updating the default for these configuration
-	 * variables in builtin/repack.c, these must be adjusted
-	 * to match.
-	 */
-	int delta_base_offset = 1;
-	int use_delta_islands = 0;
-
-	if (!m)
-		return 0;
-
-	CALLOC_ARRAY(include_pack, m->num_packs);
-
-	if (batch_size) {
-		if (fill_included_packs_batch(r, m, include_pack, batch_size))
-			goto cleanup;
-	} else if (fill_included_packs_all(r, m, include_pack))
-		goto cleanup;
-
-	repo_config_get_bool(r, "repack.usedeltabaseoffset", &delta_base_offset);
-	repo_config_get_bool(r, "repack.usedeltaislands", &use_delta_islands);
-
-	strvec_push(&cmd.args, "pack-objects");
-
-	strbuf_addstr(&base_name, object_dir);
-	strbuf_addstr(&base_name, "/pack/pack");
-	strvec_push(&cmd.args, base_name.buf);
-
-	if (delta_base_offset)
-		strvec_push(&cmd.args, "--delta-base-offset");
-	if (use_delta_islands)
-		strvec_push(&cmd.args, "--delta-islands");
-
-	if (flags & MIDX_PROGRESS)
-		strvec_push(&cmd.args, "--progress");
-	else
-		strvec_push(&cmd.args, "-q");
-
-	strbuf_release(&base_name);
-
-	cmd.git_cmd = 1;
-	cmd.in = cmd.out = -1;
-
-	if (start_command(&cmd)) {
-		error(_("could not start pack-objects"));
-		result = 1;
-		goto cleanup;
-	}
-
-	cmd_in = xfdopen(cmd.in, "w");
-
-	for (i = 0; i < m->num_objects; i++) {
-		struct object_id oid;
-		uint32_t pack_int_id = nth_midxed_pack_int_id(m, i);
-
-		if (!include_pack[pack_int_id])
-			continue;
-
-		nth_midxed_object_oid(&oid, m, i);
-		fprintf(cmd_in, "%s\n", oid_to_hex(&oid));
-	}
-	fclose(cmd_in);
-
-	if (finish_command(&cmd)) {
-		error(_("could not finish pack-objects"));
-		result = 1;
-		goto cleanup;
-	}
-
-	result = write_midx_internal(object_dir, NULL, NULL, NULL, NULL, flags);
-
-cleanup:
-	free(include_pack);
-	return result;
-}

From patchwork Mon Mar 25 17:24:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13602556
Date: Mon, 25 Mar 2024 13:24:29 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Junio C Hamano
Subject: [PATCH 04/11] midx: move `expire_midx_packs` to midx-write.c
X-Mailing-List: git@vger.kernel.org

Move `expire_midx_packs()`, the main function which implements the
sub-command 'git multi-pack-index expire' into midx-write.c.

Similar to the previous patch, this patch does not introduce any
behavioral changes and is best viewed with `--color-moved`.

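As a reading aid for the moved code (not part of the patch): the core of
`expire_midx_packs()` is a two-pass scan. It first counts how many MIDX
objects point at each pack, then treats every pack with a zero count as
droppable unless it is a kept or cruft pack. The sketch below isolates
that logic on plain arrays. All names in it (`find_unreferenced_packs`,
`object_pack_ids`, `keep`, `drop`) are hypothetical, and the single
`keep` flag is an assumption that folds together the `pack_keep` and
`is_cruft` checks.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: object_pack_ids[i] is the pack that object i
 * lives in; keep[p] is non-zero for packs that must never be dropped.
 * On return, drop[p] is 1 for each pack that can be expired. Returns
 * the number of such packs (0 if the counter array cannot be
 * allocated).
 */
size_t find_unreferenced_packs(const uint32_t *object_pack_ids,
			       size_t num_objects,
			       const unsigned char *keep,
			       size_t num_packs,
			       unsigned char *drop)
{
	uint32_t *count = calloc(num_packs, sizeof(*count));
	size_t i, dropped = 0;

	if (!count)
		return 0;

	/* Pass 1: count referenced objects per pack. */
	for (i = 0; i < num_objects; i++)
		count[object_pack_ids[i]]++;

	/* Pass 2: a pack is droppable if unreferenced and not kept. */
	for (i = 0; i < num_packs; i++) {
		drop[i] = !count[i] && !keep[i];
		dropped += drop[i];
	}

	free(count);
	return dropped;
}
```

In the moved function the droppable packs are additionally closed and
unlinked, and the MIDX is rewritten only when at least one pack was
dropped.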
Signed-off-by: Taylor Blau --- midx-write.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++ midx.c | 57 --------------------------------------------------- 2 files changed, 58 insertions(+), 57 deletions(-) diff --git a/midx-write.c b/midx-write.c index 6dd58be7e0..d679e0a131 100644 --- a/midx-write.c +++ b/midx-write.c @@ -3,6 +3,7 @@ #include "hex.h" #include "packfile.h" #include "midx.h" +#include "progress.h" #include "run-command.h" #include "pack-bitmap.h" #include "revision.h" @@ -17,6 +18,63 @@ extern int write_midx_internal(const char *object_dir, extern struct multi_pack_index *lookup_multi_pack_index(struct repository *r, const char *object_dir); +int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags) +{ + uint32_t i, *count, result = 0; + struct string_list packs_to_drop = STRING_LIST_INIT_DUP; + struct multi_pack_index *m = lookup_multi_pack_index(r, object_dir); + struct progress *progress = NULL; + + if (!m) + return 0; + + CALLOC_ARRAY(count, m->num_packs); + + if (flags & MIDX_PROGRESS) + progress = start_delayed_progress(_("Counting referenced objects"), + m->num_objects); + for (i = 0; i < m->num_objects; i++) { + int pack_int_id = nth_midxed_pack_int_id(m, i); + count[pack_int_id]++; + display_progress(progress, i + 1); + } + stop_progress(&progress); + + if (flags & MIDX_PROGRESS) + progress = start_delayed_progress(_("Finding and deleting unreferenced packfiles"), + m->num_packs); + for (i = 0; i < m->num_packs; i++) { + char *pack_name; + display_progress(progress, i + 1); + + if (count[i]) + continue; + + if (prepare_midx_pack(r, m, i)) + continue; + + if (m->packs[i]->pack_keep || m->packs[i]->is_cruft) + continue; + + pack_name = xstrdup(m->packs[i]->pack_name); + close_pack(m->packs[i]); + + string_list_insert(&packs_to_drop, m->pack_names[i]); + unlink_pack_path(pack_name, 0); + free(pack_name); + } + stop_progress(&progress); + + free(count); + + if (packs_to_drop.nr) + result = 
write_midx_internal(object_dir, NULL, &packs_to_drop, NULL, NULL, flags); + + string_list_clear(&packs_to_drop, 0); + + return result; +} + struct repack_info { timestamp_t mtime; uint32_t referenced_objects; diff --git a/midx.c b/midx.c index 3bd8c58642..5936bc5b9e 100644 --- a/midx.c +++ b/midx.c @@ -1998,60 +1998,3 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag return verify_midx_error; } - -int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags) -{ - uint32_t i, *count, result = 0; - struct string_list packs_to_drop = STRING_LIST_INIT_DUP; - struct multi_pack_index *m = lookup_multi_pack_index(r, object_dir); - struct progress *progress = NULL; - - if (!m) - return 0; - - CALLOC_ARRAY(count, m->num_packs); - - if (flags & MIDX_PROGRESS) - progress = start_delayed_progress(_("Counting referenced objects"), - m->num_objects); - for (i = 0; i < m->num_objects; i++) { - int pack_int_id = nth_midxed_pack_int_id(m, i); - count[pack_int_id]++; - display_progress(progress, i + 1); - } - stop_progress(&progress); - - if (flags & MIDX_PROGRESS) - progress = start_delayed_progress(_("Finding and deleting unreferenced packfiles"), - m->num_packs); - for (i = 0; i < m->num_packs; i++) { - char *pack_name; - display_progress(progress, i + 1); - - if (count[i]) - continue; - - if (prepare_midx_pack(r, m, i)) - continue; - - if (m->packs[i]->pack_keep || m->packs[i]->is_cruft) - continue; - - pack_name = xstrdup(m->packs[i]->pack_name); - close_pack(m->packs[i]); - - string_list_insert(&packs_to_drop, m->pack_names[i]); - unlink_pack_path(pack_name, 0); - free(pack_name); - } - stop_progress(&progress); - - free(count); - - if (packs_to_drop.nr) - result = write_midx_internal(object_dir, NULL, &packs_to_drop, NULL, NULL, flags); - - string_list_clear(&packs_to_drop, 0); - - return result; -} From patchwork Mon Mar 25 17:24:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13602557 Received: from mail-oi1-f176.google.com (mail-oi1-f176.google.com [209.85.167.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ADA5F1272AF for ; Mon, 25 Mar 2024 17:24:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387476; cv=none; b=FArLQl58eSeTLKqlKY/DFNDWTxpNK0Wc+U2izJN6yBBC5gDyCOORsqyIJwuURiTYMG3vABL+b6mtjLuNxTKJxczSpyliwOQgiRyRIQK+4tHazkcD6BssK1ok9daSlR+27N4llunr5RPUe2AtqKiQ8rR45vIzYXFDf4Yd6nMZKUM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387476; c=relaxed/simple; bh=TqjaVgNPJNpjRRicv16JJgia3X0dZv9Z009I814wat4=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=RiUVuPfGAd6l4PFRjVstxH8nIaE4/E8sqUIyfGiZMBb8exoxMfGNwxn6gXm9OCn0Pswz8mhbH83bSRLhxlTRHCu4SemdP4wzKd7uPEqYVi67CCwiaoXGDaw3WA6aLiFKbb5++uENlKQyVly2R3zXXs0XvGX2++c/dO9wCDfo7lM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=0PatccOL; arc=none smtp.client-ip=209.85.167.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="0PatccOL" Received: by mail-oi1-f176.google.com with SMTP id 5614622812f47-3bbbc6e51d0so2851580b6e.3 for ; Mon, 25 Mar 
2024 10:24:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1711387473; x=1711992273; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=GnGIBLKWbE5v3yKfheu+hdHEAHVfHxkWdQ7KR2ETDO0=; b=0PatccOLw5xKxqBplCUjxzg+JdksBLJUKS6hgRbPher3opfB3Hb14agM48uojTrgIF xiSwgHsaFIVrV7rnKVQXGDDxAJwEmAXtQfpetNu5BOFMDkQBh9gdYQfcB1bRLD5zJCmC bF7B3vAu6/O6jnK+yEb3N8vhEoWEWoEsrnWdbb2mDi0PQzwBaRpjeNBav7yVcUK/uFoR CtUwXqyntCrHLdDPwhmQx/uGSuYywMa9LFIGxmuLF6HmagOclB14EXW1pcoeC6FgFerG qUMpFGX50YgsWbvfSUARxdYJN4BB+4GLd6V4SEy1RhuTABKLYNIf0QKYwBeoAIW+Jg1e sUMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711387473; x=1711992273; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=GnGIBLKWbE5v3yKfheu+hdHEAHVfHxkWdQ7KR2ETDO0=; b=QWzLujGNIs7YE+D0tCJavRUs+KaggXL6xEz4mS9EyfuW4CJLi8aUJuELtM9uhnpnxY fL5TQqS8vGfmrg+DLw+OOHBMBBGSBrkDcnq7hN0bJs+BjZz2O3LQqM28rotxzbKU3K9R vqOzejJqK8sAGZXShJDR32jK2mRSJsYMq3ShXfbmLP1983uscKo6ENN5NZ58/OY9pvge jrFtPB9H+51VaaFQZf0oWOGWg9TciL02q/i7uQ6n5TUYmwwBxIbYdp98WdZzgJ9zwwQm Iug+JyaqJXyzLnJGUjiPuiyQR/UOpDYYa2SFpPQW6PPHDCS+GZ3npBnhdi20iu2oV0lG MvjA== X-Gm-Message-State: AOJu0YwADkWwbsuSukc8TV4h5cr8K6ycR56Cq96ZtsYCfwGsSpZZe4r7 CVQw5IsEe0EoKBZRPxaatrlACWyUET2fEqSQElhHQVe5oaAvfHhhQww/OykmBuppr/AvmQld3nD FHqE= X-Google-Smtp-Source: AGHT+IHVLEF5wMb+nRy2I12u5xhg4kLIek3Z5ZbZKBVqIR6WsuOXlcIZC+tqL8AUCsEwE9iloOa/Cg== X-Received: by 2002:a05:6808:179f:b0:3c3:be7d:3c6e with SMTP id bg31-20020a056808179f00b003c3be7d3c6emr10487651oib.41.1711387473502; Mon, 25 Mar 2024 10:24:33 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id o26-20020ac8555a000000b0042ebbc1196fsm2770151qtr.87.2024.03.25.10.24.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Mar 2024 10:24:33 -0700 (PDT) Date: Mon, 25 Mar 2024 13:24:32 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Junio C Hamano Subject: [PATCH 05/11] midx: move `write_midx_file_only` to midx-write.c Message-ID: <31d2e074fbeb81dc856a66fdaee455a7fa70e9c6.1711387439.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Prepare to move the last substantial function related to writing from midx.c to midx-write.c by first moving a thin wrapper around it. Like previous changes, this patch does not introduce any behavioral changes and is best viewed with `--color-moved`. Signed-off-by: Taylor Blau --- midx-write.c | 10 ++++++++++ midx.c | 10 ---------- 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/midx-write.c b/midx-write.c index d679e0a131..635e6af193 100644 --- a/midx-write.c +++ b/midx-write.c @@ -18,6 +18,16 @@ extern int write_midx_internal(const char *object_dir, extern struct multi_pack_index *lookup_multi_pack_index(struct repository *r, const char *object_dir); +int write_midx_file_only(const char *object_dir, + struct string_list *packs_to_include, + const char *preferred_pack_name, + const char *refs_snapshot, + unsigned flags) +{ + return write_midx_internal(object_dir, packs_to_include, NULL, + preferred_pack_name, refs_snapshot, flags); +} + int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags) { uint32_t i, *count, result = 0; diff --git a/midx.c b/midx.c index 5936bc5b9e..702eca805a 100644 --- a/midx.c +++ b/midx.c @@ -1764,16 +1764,6 @@ int write_midx_file(const char *object_dir, refs_snapshot, flags); } -int write_midx_file_only(const char *object_dir, - struct 
string_list *packs_to_include, - const char *preferred_pack_name, - const char *refs_snapshot, - unsigned flags) -{ - return write_midx_internal(object_dir, packs_to_include, NULL, - preferred_pack_name, refs_snapshot, flags); -} - struct clear_midx_data { char *keep; const char *ext; From patchwork Mon Mar 25 17:24:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13602558 Received: from mail-oi1-f177.google.com (mail-oi1-f177.google.com [209.85.167.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C69A112C80A for ; Mon, 25 Mar 2024 17:24:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.177 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387479; cv=none; b=NG0vEJRx0jc0XlbLwGxoLsFbxpkX5+XuNIGmKP+L7vH3/7oDm1X+9lwHls7I8io8hXccfUPzkoP0prbgZPp5cgbczSE5/ekvoyY9CSS9bBX3zyjfzmVKgIJwAJh+e1RK0V7AYyJVamL++1PTj5mGfk7+TRSWwsJ/WDY93/zx9Uo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387479; c=relaxed/simple; bh=NDPtDtw7T+0JatdBkWT2+BSE6I8rFk0sWdOtK9u8RYI=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=HLfT3t/BSipH6w28hL1ps2LyG9vI5XV/BmmmTrm+hN27bhw2p7Lfk36UgQ/70ap0OqqdBJgXFrk7Z2TkJK6VdsjhBT5vosjh0GSoehk9Q/N46H8fFV7Kj8Efw/j3/8gzb6dR+p9F1md9JPoP4eGi8+V5CgUmgE7XV0GsSC+XEXM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=t5m+hzzD; arc=none smtp.client-ip=209.85.167.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) 
header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="t5m+hzzD" Received: by mail-oi1-f177.google.com with SMTP id 5614622812f47-3c3b256ab5eso2387290b6e.0 for ; Mon, 25 Mar 2024 10:24:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1711387476; x=1711992276; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=GoV1jDIczJHRwL9dURd3Gzt7HmKY5m4NmbsSqobqSDw=; b=t5m+hzzD0cLvTbVVr7jBvZX3a2S2qadNm+5Hd3TxeiiGXcezj0fy5sNScYEq6qU3iJ L0+wZScd6sZetYylRH64z1hBUz2dMN+tDY2ItG8JzlnPfqpibCeJ0S8S1ttgkr3nIJO/ 8Z4eGr6faJy5kuGKujLyymoyD/6x4ghvtKbowNYINW5BDRnb6QkJCYrUYm3oGfCBQkp9 oJh3BWuQ6EXsxRsJSewp+ATCU/5SQAaDhYOb3ormUxU4RfJJUU3xqMMTEl18sg61AsLn jvCtUCxS1ai1pJ46ZLo53u3JuMOidZlY5xx/6AS8foEHv/13rolNsymAjtgnZZDowxpw lRLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711387476; x=1711992276; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=GoV1jDIczJHRwL9dURd3Gzt7HmKY5m4NmbsSqobqSDw=; b=C/ERDrVngl/q/IhbrA1BmC9pFODiW8zEIoYD/C+z9SXscvSYWWV/4I2GMSZ4dksRzw bQaoS1F9f5jiUwTj3GXUXIVHlzausG+O23dW992N2GWPP8dogK9M0jRMSoZ8kbkHZykK Wdh5bVXPyTYvw7KaGI9ssf2PZB4DlDRak1FKDH7vgtWeBcS9bUqUpVP9+yBt1MYbDbzE dWRSlzmf623LUEvYBPhmd6+51USWWV5WtR28kWBS5MKbz4zRwysjCLo4OUk2X6ah+UFO JleMJ7pqpzXH9CyIkquWQwnwwn4B4Gd7RalpIxdaak3au9oTQLiXPXqjjVQ27BbhIC1X +HVQ== X-Gm-Message-State: AOJu0YzUX5kLXZ/nJbWy7nFRobAP/X9LVhFIcnwPbCc2U3weaKRCnUhp JrZcnZfvntecwa8skg52eqHbk4KonMG2/+i/Ygu0DlH5Q2Cf8NEbFKQMVxTy/+WoQnbocXqQ0Xy DHkU= X-Google-Smtp-Source: 
AGHT+IHKqSQhzws6Hrc4mlVM8SYYKs3cuRo+1JxCOi7WFAOTGCquq8xta4d9JTnQpKVmjUb4KfxR2w== X-Received: by 2002:a05:6808:6492:b0:3c3:5852:263a with SMTP id fh18-20020a056808649200b003c35852263amr9929826oib.42.1711387476645; Mon, 25 Mar 2024 10:24:36 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id k20-20020ac84754000000b0042f30e63b1fsm2782347qtp.49.2024.03.25.10.24.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Mar 2024 10:24:36 -0700 (PDT) Date: Mon, 25 Mar 2024 13:24:35 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Junio C Hamano Subject: [PATCH 06/11] midx: move `write_midx_file` to midx-write.c Message-ID: <73977036d7a8cc8f84a5dbc54e462477d33f2c1e.1711387439.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Prepare to move the last substantial function related to writing from midx.c to midx-write.c by moving another thin wrapper around it. Like previous changes, this patch does not introduce any behavioral changes and is best viewed with `--color-moved`.
Signed-off-by: Taylor Blau --- midx-write.c | 9 +++++++++ midx.c | 9 --------- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/midx-write.c b/midx-write.c index 635e6af193..3d7697d8a2 100644 --- a/midx-write.c +++ b/midx-write.c @@ -18,6 +18,15 @@ extern int write_midx_internal(const char *object_dir, extern struct multi_pack_index *lookup_multi_pack_index(struct repository *r, const char *object_dir); +int write_midx_file(const char *object_dir, + const char *preferred_pack_name, + const char *refs_snapshot, + unsigned flags) +{ + return write_midx_internal(object_dir, NULL, NULL, preferred_pack_name, + refs_snapshot, flags); +} + int write_midx_file_only(const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, diff --git a/midx.c b/midx.c index 702eca805a..39b5c86736 100644 --- a/midx.c +++ b/midx.c @@ -1755,15 +1755,6 @@ int write_midx_internal(const char *object_dir, return result; } -int write_midx_file(const char *object_dir, - const char *preferred_pack_name, - const char *refs_snapshot, - unsigned flags) -{ - return write_midx_internal(object_dir, NULL, NULL, preferred_pack_name, - refs_snapshot, flags); -} - struct clear_midx_data { char *keep; const char *ext; From patchwork Mon Mar 25 17:24:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13602560 Received: from mail-ot1-f53.google.com (mail-ot1-f53.google.com [209.85.210.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C89293DABF0 for ; Mon, 25 Mar 2024 17:24:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.53 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387485; cv=none; 
b=ECLk0o9+bbgrmPAWTAHG/zwbxpv9YACT2HOhSRqu/4G9pIIHfwxl0IhQQq8i/rZBw5TiGjKhZMAIfY2Ee3lVpQjNb8cuOLlHG9AupcEiKblh941wXWDUXbkVPUrmivXKukl/4HbJj8oVXYvAHbCnRtNtXSM/h/x9X6uJG1TYg0E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387485; c=relaxed/simple; bh=TVLudebW/yvoLnEKhGjeIDFBnIunVc8Txwmnq/p9D3o=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=j0Wk+7rRaNlsTSnGTpTUKV8w4c+JuWBlTLyT7JDh6SOe3t0aswbGknE3/XeUmkAG3nCqr1EUNrSLqIrdSTZ840q4tf4tvftUoSFzLYpQchrK4Ssi5GS4126Dc6e7QtORqQAjyhFyrKy/HaCxX76s4NgU4G7UDK5lN8Y/bnftsEk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=MWCeQgZK; arc=none smtp.client-ip=209.85.210.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="MWCeQgZK" Received: by mail-ot1-f53.google.com with SMTP id 46e09a7af769-6e6d063e88bso926541a34.3 for ; Mon, 25 Mar 2024 10:24:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1711387480; x=1711992280; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=NOvdbCjBbK+KppNFvocMpjdwKencUevqhhU04xjFj6U=; b=MWCeQgZKI76W5UCxaewEoR5LnlL+dJXMAAcJFHSuVzpp4ch3VMP2a1Z+5hfYouRGuQ JlR74Vzn1kcaUoic4w1lofCKN4C8ieTbANg3iZOPXund2/aakEgu7FYiGQuagvKEmFMW 
bE9mO8xNTLqZN5T92i4Ei62bfr2AXNVj4GviCpN6XcLWziGClnnabR+59ppIij0dn6Wa IZJft1uFDIhgg6isRpgMBDc5yexrhXZCvZJ4gGq+dh5pZeZAO08Idh+1F5We6VoeF9AY yIh8MROm08JREIUOQsSP4vHAqcRefw3Lprhe8+YRFaL1s7TT0my9s1JhSL5fg6QdGRQ/ IaQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711387480; x=1711992280; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=NOvdbCjBbK+KppNFvocMpjdwKencUevqhhU04xjFj6U=; b=FgnW3iFW3FDGJnXQ5Gyo1kCX7fQp3W+a4gwKSuhzIAT6sHw3/ovwlGc9F71YVk5UDF nlklKge1ZlI4E1dym/DWvoX/UD2Mp2oI23cOWASj8C7p8ti0HKSTvI6Omtr4f2HISfRs MYPJmtcQC9MHzzRLxGHVmkZvtXYGDto1XVVMEjS90XKJlSvqx8KF1g43+w9RJbT9zeEk 9JVC8elamUKZHqUYKmDqM9oTtm9Chquf/IxyW/YtaaSkG24UiO3gFICDRq9y6WRykFcI I88W+re1UxK9eQlunyqgUIV9fcfjFasWvfyrO8AOClLtWrhYcjzYMUYVJ8jmjP7yUXi4 yG1Q== X-Gm-Message-State: AOJu0YxPnRLf9W3u+8ZjvTjidiyyy196fzcvF7mBMqipP7KTTnAkvj4D Cc4HUpOxzDnyjLla0qrThZ7quQLp0U5rNkv8l7mV8HhARxu/Agn6BKpV1RKtOkObV8kTv+RW6DV FZiQ= X-Google-Smtp-Source: AGHT+IGAqJn0xeJCtJQZ1kBCbZTdGnFyHzGdFFrj4xUepFbWmrWRb39ToxMv5E1kZR75g/R+XNdvSg== X-Received: by 2002:a9d:65c6:0:b0:6e6:8da6:a216 with SMTP id z6-20020a9d65c6000000b006e68da6a216mr334613oth.25.1711387479851; Mon, 25 Mar 2024 10:24:39 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id x15-20020a05620a0ecf00b007887d30dbb7sm2288559qkm.60.2024.03.25.10.24.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Mar 2024 10:24:39 -0700 (PDT) Date: Mon, 25 Mar 2024 13:24:38 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Junio C Hamano Subject: [PATCH 07/11] midx: move `write_midx_internal` (and related functions) to midx-write.c Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Move the last writing-related function from midx.c to midx-write.c. This patch moves `write_midx_internal()`, along with all of the functions used to implement it, into midx-write.c. Like the previous patches, this patch does not introduce any functional changes and is best viewed with `--color-moved`. Signed-off-by: Taylor Blau --- midx-write.c | 1243 ++++++++++++++++++++++++++++++++++++++++++++++- midx.c | 1296 +------------------------------------------------- midx.h | 19 + 3 files changed, 1272 insertions(+), 1286 deletions(-) diff --git a/midx-write.c b/midx-write.c index 3d7697d8a2..c812156cbd 100644 --- a/midx-write.c +++ b/midx-write.c @@ -1,22 +1,1257 @@ #include "git-compat-util.h" +#include "abspath.h" #include "config.h" #include "hex.h" +#include "lockfile.h" #include "packfile.h" +#include "object-file.h" +#include "hash-lookup.h" #include "midx.h" #include "progress.h" +#include "trace2.h" #include "run-command.h" +#include "chunk-format.h" #include "pack-bitmap.h" +#include "refs.h" #include "revision.h" +#include "list-objects.h" -extern int write_midx_internal(const char *object_dir, +#define PACK_EXPIRED UINT_MAX +#define BITMAP_POS_UNKNOWN (~((uint32_t)0)) +#define MIDX_CHUNK_FANOUT_SIZE (sizeof(uint32_t) * 256) +#define MIDX_CHUNK_LARGE_OFFSET_WIDTH (sizeof(uint64_t)) + +extern int midx_checksum_valid(struct multi_pack_index *m); +extern void 
clear_midx_files_ext(const char *object_dir, const char *ext, + unsigned char *keep_hash); +extern int cmp_idx_or_pack_name(const char *idx_or_pack_name, + const char *idx_name); + +static size_t write_midx_header(struct hashfile *f, + unsigned char num_chunks, + uint32_t num_packs) +{ + hashwrite_be32(f, MIDX_SIGNATURE); + hashwrite_u8(f, MIDX_VERSION); + hashwrite_u8(f, oid_version(the_hash_algo)); + hashwrite_u8(f, num_chunks); + hashwrite_u8(f, 0); /* unused */ + hashwrite_be32(f, num_packs); + + return MIDX_HEADER_SIZE; +} + +struct pack_info { + uint32_t orig_pack_int_id; + char *pack_name; + struct packed_git *p; + + uint32_t bitmap_pos; + uint32_t bitmap_nr; + + unsigned expired : 1; +}; + +static void fill_pack_info(struct pack_info *info, + struct packed_git *p, const char *pack_name, + uint32_t orig_pack_int_id) +{ + memset(info, 0, sizeof(struct pack_info)); + + info->orig_pack_int_id = orig_pack_int_id; + info->pack_name = xstrdup(pack_name); + info->p = p; + info->bitmap_pos = BITMAP_POS_UNKNOWN; +} + +static int pack_info_compare(const void *_a, const void *_b) +{ + struct pack_info *a = (struct pack_info *)_a; + struct pack_info *b = (struct pack_info *)_b; + return strcmp(a->pack_name, b->pack_name); +} + +static int idx_or_pack_name_cmp(const void *_va, const void *_vb) +{ + const char *pack_name = _va; + const struct pack_info *compar = _vb; + + return cmp_idx_or_pack_name(pack_name, compar->pack_name); +} + +struct write_midx_context { + struct pack_info *info; + size_t nr; + size_t alloc; + struct multi_pack_index *m; + struct progress *progress; + unsigned pack_paths_checked; + + struct pack_midx_entry *entries; + size_t entries_nr; + + uint32_t *pack_perm; + uint32_t *pack_order; + unsigned large_offsets_needed:1; + uint32_t num_large_offsets; + + int preferred_pack_idx; + + struct string_list *to_include; +}; + +static void add_pack_to_midx(const char *full_path, size_t full_path_len, + const char *file_name, void *data) +{ + struct 
write_midx_context *ctx = data; + struct packed_git *p; + + if (ends_with(file_name, ".idx")) { + display_progress(ctx->progress, ++ctx->pack_paths_checked); + /* + * Note that at most one of ctx->m and ctx->to_include are set, + * so we are testing midx_contains_pack() and + * string_list_has_string() independently (guarded by the + * appropriate NULL checks). + * + * We could support passing to_include while reusing an existing + * MIDX, but don't currently since the reuse process drags + * forward all packs from an existing MIDX (without checking + * whether or not they appear in the to_include list). + * + * If we added support for that, these next two conditional + * should be performed independently (likely checking + * to_include before the existing MIDX). + */ + if (ctx->m && midx_contains_pack(ctx->m, file_name)) + return; + else if (ctx->to_include && + !string_list_has_string(ctx->to_include, file_name)) + return; + + ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc); + + p = add_packed_git(full_path, full_path_len, 0); + if (!p) { + warning(_("failed to add packfile '%s'"), + full_path); + return; + } + + if (open_pack_index(p)) { + warning(_("failed to open pack-index '%s'"), + full_path); + close_pack(p); + free(p); + return; + } + + fill_pack_info(&ctx->info[ctx->nr], p, file_name, ctx->nr); + ctx->nr++; + } +} + +struct pack_midx_entry { + struct object_id oid; + uint32_t pack_int_id; + time_t pack_mtime; + uint64_t offset; + unsigned preferred : 1; +}; + +static int midx_oid_compare(const void *_a, const void *_b) +{ + const struct pack_midx_entry *a = (const struct pack_midx_entry *)_a; + const struct pack_midx_entry *b = (const struct pack_midx_entry *)_b; + int cmp = oidcmp(&a->oid, &b->oid); + + if (cmp) + return cmp; + + /* Sort objects in a preferred pack first when multiple copies exist. 
*/ + if (a->preferred > b->preferred) + return -1; + if (a->preferred < b->preferred) + return 1; + + if (a->pack_mtime > b->pack_mtime) + return -1; + else if (a->pack_mtime < b->pack_mtime) + return 1; + + return a->pack_int_id - b->pack_int_id; +} + +static int nth_midxed_pack_midx_entry(struct multi_pack_index *m, + struct pack_midx_entry *e, + uint32_t pos) +{ + if (pos >= m->num_objects) + return 1; + + nth_midxed_object_oid(&e->oid, m, pos); + e->pack_int_id = nth_midxed_pack_int_id(m, pos); + e->offset = nth_midxed_offset(m, pos); + + /* consider objects in midx to be from "old" packs */ + e->pack_mtime = 0; + return 0; +} + +static void fill_pack_entry(uint32_t pack_int_id, + struct packed_git *p, + uint32_t cur_object, + struct pack_midx_entry *entry, + int preferred) +{ + if (nth_packed_object_id(&entry->oid, p, cur_object) < 0) + die(_("failed to locate object %d in packfile"), cur_object); + + entry->pack_int_id = pack_int_id; + entry->pack_mtime = p->mtime; + + entry->offset = nth_packed_object_offset(p, cur_object); + entry->preferred = !!preferred; +} + +struct midx_fanout { + struct pack_midx_entry *entries; + size_t nr, alloc; +}; + +static void midx_fanout_grow(struct midx_fanout *fanout, size_t nr) +{ + if (nr < fanout->nr) + BUG("negative growth in midx_fanout_grow() (%"PRIuMAX" < %"PRIuMAX")", + (uintmax_t)nr, (uintmax_t)fanout->nr); + ALLOC_GROW(fanout->entries, nr, fanout->alloc); +} + +static void midx_fanout_sort(struct midx_fanout *fanout) +{ + QSORT(fanout->entries, fanout->nr, midx_oid_compare); +} + +static void midx_fanout_add_midx_fanout(struct midx_fanout *fanout, + struct multi_pack_index *m, + uint32_t cur_fanout, + int preferred_pack) +{ + uint32_t start = 0, end; + uint32_t cur_object; + + if (cur_fanout) + start = ntohl(m->chunk_oid_fanout[cur_fanout - 1]); + end = ntohl(m->chunk_oid_fanout[cur_fanout]); + + for (cur_object = start; cur_object < end; cur_object++) { + if ((preferred_pack > -1) && + (preferred_pack == 
nth_midxed_pack_int_id(m, cur_object))) { + /* + * Objects from preferred packs are added + * separately. + */ + continue; + } + + midx_fanout_grow(fanout, fanout->nr + 1); + nth_midxed_pack_midx_entry(m, + &fanout->entries[fanout->nr], + cur_object); + fanout->entries[fanout->nr].preferred = 0; + fanout->nr++; + } +} + +static void midx_fanout_add_pack_fanout(struct midx_fanout *fanout, + struct pack_info *info, + uint32_t cur_pack, + int preferred, + uint32_t cur_fanout) +{ + struct packed_git *pack = info[cur_pack].p; + uint32_t start = 0, end; + uint32_t cur_object; + + if (cur_fanout) + start = get_pack_fanout(pack, cur_fanout - 1); + end = get_pack_fanout(pack, cur_fanout); + + for (cur_object = start; cur_object < end; cur_object++) { + midx_fanout_grow(fanout, fanout->nr + 1); + fill_pack_entry(cur_pack, + info[cur_pack].p, + cur_object, + &fanout->entries[fanout->nr], + preferred); + fanout->nr++; + } +} + +/* + * It is possible to artificially get into a state where there are many + * duplicate copies of objects. That can create high memory pressure if + * we are to create a list of all objects before de-duplication. To reduce + * this memory pressure without a significant performance drop, automatically + * group objects by the first byte of their object id. Use the IDX fanout + * tables to group the data, copy to a local array, then sort. + * + * Copy only the de-duplicated entries (selected by most-recent modified time + * of a packfile containing the object). + */ +static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, + struct pack_info *info, + uint32_t nr_packs, + size_t *nr_objects, + int preferred_pack) +{ + uint32_t cur_fanout, cur_pack, cur_object; + size_t alloc_objects, total_objects = 0; + struct midx_fanout fanout = { 0 }; + struct pack_midx_entry *deduplicated_entries = NULL; + uint32_t start_pack = m ? 
m->num_packs : 0; + + for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) + total_objects = st_add(total_objects, + info[cur_pack].p->num_objects); + + /* + * As we de-duplicate by fanout value, we expect the fanout + * slices to be evenly distributed, with some noise. Hence, + * allocate slightly more than one 256th. + */ + alloc_objects = fanout.alloc = total_objects > 3200 ? total_objects / 200 : 16; + + ALLOC_ARRAY(fanout.entries, fanout.alloc); + ALLOC_ARRAY(deduplicated_entries, alloc_objects); + *nr_objects = 0; + + for (cur_fanout = 0; cur_fanout < 256; cur_fanout++) { + fanout.nr = 0; + + if (m) + midx_fanout_add_midx_fanout(&fanout, m, cur_fanout, + preferred_pack); + + for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) { + int preferred = cur_pack == preferred_pack; + midx_fanout_add_pack_fanout(&fanout, + info, cur_pack, + preferred, cur_fanout); + } + + if (-1 < preferred_pack && preferred_pack < start_pack) + midx_fanout_add_pack_fanout(&fanout, info, + preferred_pack, 1, + cur_fanout); + + midx_fanout_sort(&fanout); + + /* + * The batch is now sorted by OID and then mtime (descending). + * Take only the first duplicate. 
+ */ + for (cur_object = 0; cur_object < fanout.nr; cur_object++) { + if (cur_object && oideq(&fanout.entries[cur_object - 1].oid, + &fanout.entries[cur_object].oid)) + continue; + + ALLOC_GROW(deduplicated_entries, st_add(*nr_objects, 1), + alloc_objects); + memcpy(&deduplicated_entries[*nr_objects], + &fanout.entries[cur_object], + sizeof(struct pack_midx_entry)); + (*nr_objects)++; + } + } + + free(fanout.entries); + return deduplicated_entries; +} + +static int write_midx_pack_names(struct hashfile *f, void *data) +{ + struct write_midx_context *ctx = data; + uint32_t i; + unsigned char padding[MIDX_CHUNK_ALIGNMENT]; + size_t written = 0; + + for (i = 0; i < ctx->nr; i++) { + size_t writelen; + + if (ctx->info[i].expired) + continue; + + if (i && strcmp(ctx->info[i].pack_name, ctx->info[i - 1].pack_name) <= 0) + BUG("incorrect pack-file order: %s before %s", + ctx->info[i - 1].pack_name, + ctx->info[i].pack_name); + + writelen = strlen(ctx->info[i].pack_name) + 1; + hashwrite(f, ctx->info[i].pack_name, writelen); + written += writelen; + } + + /* add padding to be aligned */ + i = MIDX_CHUNK_ALIGNMENT - (written % MIDX_CHUNK_ALIGNMENT); + if (i < MIDX_CHUNK_ALIGNMENT) { + memset(padding, 0, sizeof(padding)); + hashwrite(f, padding, i); + } + + return 0; +} + +static int write_midx_bitmapped_packs(struct hashfile *f, void *data) +{ + struct write_midx_context *ctx = data; + size_t i; + + for (i = 0; i < ctx->nr; i++) { + struct pack_info *pack = &ctx->info[i]; + if (pack->expired) + continue; + + if (pack->bitmap_pos == BITMAP_POS_UNKNOWN && pack->bitmap_nr) + BUG("pack '%s' has no bitmap position, but has %d bitmapped object(s)", + pack->pack_name, pack->bitmap_nr); + + hashwrite_be32(f, pack->bitmap_pos); + hashwrite_be32(f, pack->bitmap_nr); + } + return 0; +} + +static int write_midx_oid_fanout(struct hashfile *f, + void *data) +{ + struct write_midx_context *ctx = data; + struct pack_midx_entry *list = ctx->entries; + struct pack_midx_entry *last = 
ctx->entries + ctx->entries_nr; + uint32_t count = 0; + uint32_t i; + + /* + * Write the first-level table (the list is sorted, + * but we use a 256-entry lookup to be able to avoid + * having to do eight extra binary search iterations). + */ + for (i = 0; i < 256; i++) { + struct pack_midx_entry *next = list; + + while (next < last && next->oid.hash[0] == i) { + count++; + next++; + } + + hashwrite_be32(f, count); + list = next; + } + + return 0; +} + +static int write_midx_oid_lookup(struct hashfile *f, + void *data) +{ + struct write_midx_context *ctx = data; + unsigned char hash_len = the_hash_algo->rawsz; + struct pack_midx_entry *list = ctx->entries; + uint32_t i; + + for (i = 0; i < ctx->entries_nr; i++) { + struct pack_midx_entry *obj = list++; + + if (i < ctx->entries_nr - 1) { + struct pack_midx_entry *next = list; + if (oidcmp(&obj->oid, &next->oid) >= 0) + BUG("OIDs not in order: %s >= %s", + oid_to_hex(&obj->oid), + oid_to_hex(&next->oid)); + } + + hashwrite(f, obj->oid.hash, (int)hash_len); + } + + return 0; +} + +static int write_midx_object_offsets(struct hashfile *f, + void *data) +{ + struct write_midx_context *ctx = data; + struct pack_midx_entry *list = ctx->entries; + uint32_t i, nr_large_offset = 0; + + for (i = 0; i < ctx->entries_nr; i++) { + struct pack_midx_entry *obj = list++; + + if (ctx->pack_perm[obj->pack_int_id] == PACK_EXPIRED) + BUG("object %s is in an expired pack with int-id %d", + oid_to_hex(&obj->oid), + obj->pack_int_id); + + hashwrite_be32(f, ctx->pack_perm[obj->pack_int_id]); + + if (ctx->large_offsets_needed && obj->offset >> 31) + hashwrite_be32(f, MIDX_LARGE_OFFSET_NEEDED | nr_large_offset++); + else if (!ctx->large_offsets_needed && obj->offset >> 32) + BUG("object %s requires a large offset (%"PRIx64") but the MIDX is not writing large offsets!", + oid_to_hex(&obj->oid), + obj->offset); + else + hashwrite_be32(f, (uint32_t)obj->offset); + } + + return 0; +} + +static int write_midx_large_offsets(struct hashfile *f, + 
void *data) +{ + struct write_midx_context *ctx = data; + struct pack_midx_entry *list = ctx->entries; + struct pack_midx_entry *end = ctx->entries + ctx->entries_nr; + uint32_t nr_large_offset = ctx->num_large_offsets; + + while (nr_large_offset) { + struct pack_midx_entry *obj; + uint64_t offset; + + if (list >= end) + BUG("too many large-offset objects"); + + obj = list++; + offset = obj->offset; + + if (!(offset >> 31)) + continue; + + hashwrite_be64(f, offset); + + nr_large_offset--; + } + + return 0; +} + +static int write_midx_revindex(struct hashfile *f, + void *data) +{ + struct write_midx_context *ctx = data; + uint32_t i; + + for (i = 0; i < ctx->entries_nr; i++) + hashwrite_be32(f, ctx->pack_order[i]); + + return 0; +} + +struct midx_pack_order_data { + uint32_t nr; + uint32_t pack; + off_t offset; +}; + +static int midx_pack_order_cmp(const void *va, const void *vb) +{ + const struct midx_pack_order_data *a = va, *b = vb; + if (a->pack < b->pack) + return -1; + else if (a->pack > b->pack) + return 1; + else if (a->offset < b->offset) + return -1; + else if (a->offset > b->offset) + return 1; + else + return 0; +} + +static uint32_t *midx_pack_order(struct write_midx_context *ctx) +{ + struct midx_pack_order_data *data; + uint32_t *pack_order; + uint32_t i; + + trace2_region_enter("midx", "midx_pack_order", the_repository); + + ALLOC_ARRAY(data, ctx->entries_nr); + for (i = 0; i < ctx->entries_nr; i++) { + struct pack_midx_entry *e = &ctx->entries[i]; + data[i].nr = i; + data[i].pack = ctx->pack_perm[e->pack_int_id]; + if (!e->preferred) + data[i].pack |= (1U << 31); + data[i].offset = e->offset; + } + + QSORT(data, ctx->entries_nr, midx_pack_order_cmp); + + ALLOC_ARRAY(pack_order, ctx->entries_nr); + for (i = 0; i < ctx->entries_nr; i++) { + struct pack_midx_entry *e = &ctx->entries[data[i].nr]; + struct pack_info *pack = &ctx->info[ctx->pack_perm[e->pack_int_id]]; + if (pack->bitmap_pos == BITMAP_POS_UNKNOWN) + pack->bitmap_pos = i; + 
pack->bitmap_nr++; + pack_order[i] = data[i].nr; + } + for (i = 0; i < ctx->nr; i++) { + struct pack_info *pack = &ctx->info[ctx->pack_perm[i]]; + if (pack->bitmap_pos == BITMAP_POS_UNKNOWN) + pack->bitmap_pos = 0; + } + free(data); + + trace2_region_leave("midx", "midx_pack_order", the_repository); + + return pack_order; +} + +static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash, + struct write_midx_context *ctx) +{ + struct strbuf buf = STRBUF_INIT; + const char *tmp_file; + + trace2_region_enter("midx", "write_midx_reverse_index", the_repository); + + strbuf_addf(&buf, "%s-%s.rev", midx_name, hash_to_hex(midx_hash)); + + tmp_file = write_rev_file_order(NULL, ctx->pack_order, ctx->entries_nr, + midx_hash, WRITE_REV); + + if (finalize_object_file(tmp_file, buf.buf)) + die(_("cannot store reverse index file")); + + strbuf_release(&buf); + + trace2_region_leave("midx", "write_midx_reverse_index", the_repository); +} + +static void prepare_midx_packing_data(struct packing_data *pdata, + struct write_midx_context *ctx) +{ + uint32_t i; + + trace2_region_enter("midx", "prepare_midx_packing_data", the_repository); + + memset(pdata, 0, sizeof(struct packing_data)); + prepare_packing_data(the_repository, pdata); + + for (i = 0; i < ctx->entries_nr; i++) { + struct pack_midx_entry *from = &ctx->entries[ctx->pack_order[i]]; + struct object_entry *to = packlist_alloc(pdata, &from->oid); + + oe_set_in_pack(pdata, to, + ctx->info[ctx->pack_perm[from->pack_int_id]].p); + } + + trace2_region_leave("midx", "prepare_midx_packing_data", the_repository); +} + +static int add_ref_to_pending(const char *refname, + const struct object_id *oid, + int flag, void *cb_data) +{ + struct rev_info *revs = (struct rev_info*)cb_data; + struct object_id peeled; + struct object *object; + + if ((flag & REF_ISSYMREF) && (flag & REF_ISBROKEN)) { + warning("symbolic ref is dangling: %s", refname); + return 0; + } + + if (!peel_iterated_oid(oid, &peeled)) + oid = &peeled; + 
+ object = parse_object_or_die(oid, refname); + if (object->type != OBJ_COMMIT) + return 0; + + add_pending_object(revs, object, ""); + if (bitmap_is_preferred_refname(revs->repo, refname)) + object->flags |= NEEDS_BITMAP; + return 0; +} + +struct bitmap_commit_cb { + struct commit **commits; + size_t commits_nr, commits_alloc; + + struct write_midx_context *ctx; +}; + +static const struct object_id *bitmap_oid_access(size_t index, + const void *_entries) +{ + const struct pack_midx_entry *entries = _entries; + return &entries[index].oid; +} + +static void bitmap_show_commit(struct commit *commit, void *_data) +{ + struct bitmap_commit_cb *data = _data; + int pos = oid_pos(&commit->object.oid, data->ctx->entries, + data->ctx->entries_nr, + bitmap_oid_access); + if (pos < 0) + return; + + ALLOC_GROW(data->commits, data->commits_nr + 1, data->commits_alloc); + data->commits[data->commits_nr++] = commit; +} + +static int read_refs_snapshot(const char *refs_snapshot, + struct rev_info *revs) +{ + struct strbuf buf = STRBUF_INIT; + struct object_id oid; + FILE *f = xfopen(refs_snapshot, "r"); + + while (strbuf_getline(&buf, f) != EOF) { + struct object *object; + int preferred = 0; + char *hex = buf.buf; + const char *end = NULL; + + if (buf.len && *buf.buf == '+') { + preferred = 1; + hex = &buf.buf[1]; + } + + if (parse_oid_hex(hex, &oid, &end) < 0) + die(_("could not parse line: %s"), buf.buf); + if (*end) + die(_("malformed line: %s"), buf.buf); + + object = parse_object_or_die(&oid, NULL); + if (preferred) + object->flags |= NEEDS_BITMAP; + + add_pending_object(revs, object, ""); + } + + fclose(f); + strbuf_release(&buf); + return 0; +} +static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr_p, + const char *refs_snapshot, + struct write_midx_context *ctx) +{ + struct rev_info revs; + struct bitmap_commit_cb cb = {0}; + + trace2_region_enter("midx", "find_commits_for_midx_bitmap", + the_repository); + + cb.ctx = ctx; + + 
repo_init_revisions(the_repository, &revs, NULL); + if (refs_snapshot) { + read_refs_snapshot(refs_snapshot, &revs); + } else { + setup_revisions(0, NULL, &revs, NULL); + for_each_ref(add_ref_to_pending, &revs); + } + + /* + * Skipping promisor objects here is intentional, since it only excludes + * them from the list of reachable commits that we want to select from + * when computing the selection of MIDX'd commits to receive bitmaps. + * + * Reachability bitmaps do require that their objects be closed under + * reachability, but fetching any objects missing from promisors at this + * point is too late. But, if one of those objects can be reached from + * an another object that is included in the bitmap, then we will + * complain later that we don't have reachability closure (and fail + * appropriately). + */ + fetch_if_missing = 0; + revs.exclude_promisor_objects = 1; + + if (prepare_revision_walk(&revs)) + die(_("revision walk setup failed")); + + traverse_commit_list(&revs, bitmap_show_commit, NULL, &cb); + if (indexed_commits_nr_p) + *indexed_commits_nr_p = cb.commits_nr; + + release_revisions(&revs); + + trace2_region_leave("midx", "find_commits_for_midx_bitmap", + the_repository); + + return cb.commits; +} + +static int write_midx_bitmap(const char *midx_name, + const unsigned char *midx_hash, + struct packing_data *pdata, + struct commit **commits, + uint32_t commits_nr, + uint32_t *pack_order, + unsigned flags) +{ + int ret, i; + uint16_t options = 0; + struct pack_idx_entry **index; + char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name, + hash_to_hex(midx_hash)); + + trace2_region_enter("midx", "write_midx_bitmap", the_repository); + + if (flags & MIDX_WRITE_BITMAP_HASH_CACHE) + options |= BITMAP_OPT_HASH_CACHE; + + if (flags & MIDX_WRITE_BITMAP_LOOKUP_TABLE) + options |= BITMAP_OPT_LOOKUP_TABLE; + + /* + * Build the MIDX-order index based on pdata.objects (which is already + * in MIDX order; c.f., 'midx_pack_order_cmp()' for the definition of + * this 
order). + */ + ALLOC_ARRAY(index, pdata->nr_objects); + for (i = 0; i < pdata->nr_objects; i++) + index[i] = &pdata->objects[i].idx; + + bitmap_writer_show_progress(flags & MIDX_PROGRESS); + bitmap_writer_build_type_index(pdata, index, pdata->nr_objects); + + /* + * bitmap_writer_finish expects objects in lex order, but pack_order + * gives us exactly that. use it directly instead of re-sorting the + * array. + * + * This changes the order of objects in 'index' between + * bitmap_writer_build_type_index and bitmap_writer_finish. + * + * The same re-ordering takes place in the single-pack bitmap code via + * write_idx_file(), which is called by finish_tmp_packfile(), which + * happens between bitmap_writer_build_type_index() and + * bitmap_writer_finish(). + */ + for (i = 0; i < pdata->nr_objects; i++) + index[pack_order[i]] = &pdata->objects[i].idx; + + bitmap_writer_select_commits(commits, commits_nr, -1); + ret = bitmap_writer_build(pdata); + if (ret < 0) + goto cleanup; + + bitmap_writer_set_checksum(midx_hash); + bitmap_writer_finish(index, pdata->nr_objects, bitmap_name, options); + +cleanup: + free(index); + free(bitmap_name); + + trace2_region_leave("midx", "write_midx_bitmap", the_repository); + + return ret; +} + +static struct multi_pack_index *lookup_multi_pack_index(struct repository *r, + const char *object_dir) +{ + struct multi_pack_index *result = NULL; + struct multi_pack_index *cur; + char *obj_dir_real = real_pathdup(object_dir, 1); + struct strbuf cur_path_real = STRBUF_INIT; + + /* Ensure the given object_dir is local, or a known alternate. 
*/ + find_odb(r, obj_dir_real); + + for (cur = get_multi_pack_index(r); cur; cur = cur->next) { + strbuf_realpath(&cur_path_real, cur->object_dir, 1); + if (!strcmp(obj_dir_real, cur_path_real.buf)) { + result = cur; + goto cleanup; + } + } + +cleanup: + free(obj_dir_real); + strbuf_release(&cur_path_real); + return result; +} + +static int write_midx_internal(const char *object_dir, struct string_list *packs_to_include, struct string_list *packs_to_drop, const char *preferred_pack_name, const char *refs_snapshot, - unsigned flags); + unsigned flags) +{ + struct strbuf midx_name = STRBUF_INIT; + unsigned char midx_hash[GIT_MAX_RAWSZ]; + uint32_t i; + struct hashfile *f = NULL; + struct lock_file lk; + struct write_midx_context ctx = { 0 }; + int bitmapped_packs_concat_len = 0; + int pack_name_concat_len = 0; + int dropped_packs = 0; + int result = 0; + struct chunkfile *cf; -extern struct multi_pack_index *lookup_multi_pack_index(struct repository *r, - const char *object_dir); + trace2_region_enter("midx", "write_midx_internal", the_repository); + + get_midx_filename(&midx_name, object_dir); + if (safe_create_leading_directories(midx_name.buf)) + die_errno(_("unable to create leading directories of %s"), + midx_name.buf); + + if (!packs_to_include) { + /* + * Only reference an existing MIDX when not filtering which + * packs to include, since all packs and objects are copied + * blindly from an existing MIDX if one is present. + */ + ctx.m = lookup_multi_pack_index(the_repository, object_dir); + } + + if (ctx.m && !midx_checksum_valid(ctx.m)) { + warning(_("ignoring existing multi-pack-index; checksum mismatch")); + ctx.m = NULL; + } + + ctx.nr = 0; + ctx.alloc = ctx.m ? 
ctx.m->num_packs : 16; + ctx.info = NULL; + ALLOC_ARRAY(ctx.info, ctx.alloc); + + if (ctx.m) { + for (i = 0; i < ctx.m->num_packs; i++) { + ALLOC_GROW(ctx.info, ctx.nr + 1, ctx.alloc); + + if (flags & MIDX_WRITE_REV_INDEX) { + /* + * If generating a reverse index, need to have + * packed_git's loaded to compare their + * mtimes and object count. + */ + if (prepare_midx_pack(the_repository, ctx.m, i)) { + error(_("could not load pack")); + result = 1; + goto cleanup; + } + + if (open_pack_index(ctx.m->packs[i])) + die(_("could not open index for %s"), + ctx.m->packs[i]->pack_name); + } + + fill_pack_info(&ctx.info[ctx.nr++], ctx.m->packs[i], + ctx.m->pack_names[i], i); + } + } + + ctx.pack_paths_checked = 0; + if (flags & MIDX_PROGRESS) + ctx.progress = start_delayed_progress(_("Adding packfiles to multi-pack-index"), 0); + else + ctx.progress = NULL; + + ctx.to_include = packs_to_include; + + for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &ctx); + stop_progress(&ctx.progress); + + if ((ctx.m && ctx.nr == ctx.m->num_packs) && + !(packs_to_include || packs_to_drop)) { + struct bitmap_index *bitmap_git; + int bitmap_exists; + int want_bitmap = flags & MIDX_WRITE_BITMAP; + + bitmap_git = prepare_midx_bitmap_git(ctx.m); + bitmap_exists = bitmap_git && bitmap_is_midx(bitmap_git); + free_bitmap_index(bitmap_git); + + if (bitmap_exists || !want_bitmap) { + /* + * The correct MIDX already exists, and so does a + * corresponding bitmap (or one wasn't requested). 
+ */ + if (!want_bitmap) + clear_midx_files_ext(object_dir, ".bitmap", + NULL); + goto cleanup; + } + } + + if (preferred_pack_name) { + ctx.preferred_pack_idx = -1; + + for (i = 0; i < ctx.nr; i++) { + if (!cmp_idx_or_pack_name(preferred_pack_name, + ctx.info[i].pack_name)) { + ctx.preferred_pack_idx = i; + break; + } + } + + if (ctx.preferred_pack_idx == -1) + warning(_("unknown preferred pack: '%s'"), + preferred_pack_name); + } else if (ctx.nr && + (flags & (MIDX_WRITE_REV_INDEX | MIDX_WRITE_BITMAP))) { + struct packed_git *oldest = ctx.info[ctx.preferred_pack_idx].p; + ctx.preferred_pack_idx = 0; + + if (packs_to_drop && packs_to_drop->nr) + BUG("cannot write a MIDX bitmap during expiration"); + + /* + * set a preferred pack when writing a bitmap to ensure that + * the pack from which the first object is selected in pseudo + * pack-order has all of its objects selected from that pack + * (and not another pack containing a duplicate) + */ + for (i = 1; i < ctx.nr; i++) { + struct packed_git *p = ctx.info[i].p; + + if (!oldest->num_objects || p->mtime < oldest->mtime) { + oldest = p; + ctx.preferred_pack_idx = i; + } + } + + if (!oldest->num_objects) { + /* + * If all packs are empty; unset the preferred index. + * This is acceptable since there will be no duplicate + * objects to resolve, so the preferred value doesn't + * matter. 
+ */ + ctx.preferred_pack_idx = -1; + } + } else { + /* + * otherwise don't mark any pack as preferred to avoid + * interfering with expiration logic below + */ + ctx.preferred_pack_idx = -1; + } + + if (ctx.preferred_pack_idx > -1) { + struct packed_git *preferred = ctx.info[ctx.preferred_pack_idx].p; + if (!preferred->num_objects) { + error(_("cannot select preferred pack %s with no objects"), + preferred->pack_name); + result = 1; + goto cleanup; + } + } + + ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr, + ctx.preferred_pack_idx); + + ctx.large_offsets_needed = 0; + for (i = 0; i < ctx.entries_nr; i++) { + if (ctx.entries[i].offset > 0x7fffffff) + ctx.num_large_offsets++; + if (ctx.entries[i].offset > 0xffffffff) + ctx.large_offsets_needed = 1; + } + + QSORT(ctx.info, ctx.nr, pack_info_compare); + + if (packs_to_drop && packs_to_drop->nr) { + int drop_index = 0; + int missing_drops = 0; + + for (i = 0; i < ctx.nr && drop_index < packs_to_drop->nr; i++) { + int cmp = strcmp(ctx.info[i].pack_name, + packs_to_drop->items[drop_index].string); + + if (!cmp) { + drop_index++; + ctx.info[i].expired = 1; + } else if (cmp > 0) { + error(_("did not see pack-file %s to drop"), + packs_to_drop->items[drop_index].string); + drop_index++; + missing_drops++; + i--; + } else { + ctx.info[i].expired = 0; + } + } + + if (missing_drops) { + result = 1; + goto cleanup; + } + } + + /* + * pack_perm stores a permutation between pack-int-ids from the + * previous multi-pack-index to the new one we are writing: + * + * pack_perm[old_id] = new_id + */ + ALLOC_ARRAY(ctx.pack_perm, ctx.nr); + for (i = 0; i < ctx.nr; i++) { + if (ctx.info[i].expired) { + dropped_packs++; + ctx.pack_perm[ctx.info[i].orig_pack_int_id] = PACK_EXPIRED; + } else { + ctx.pack_perm[ctx.info[i].orig_pack_int_id] = i - dropped_packs; + } + } + + for (i = 0; i < ctx.nr; i++) { + if (ctx.info[i].expired) + continue; + pack_name_concat_len += strlen(ctx.info[i].pack_name) + 1; + 
bitmapped_packs_concat_len += 2 * sizeof(uint32_t); + } + + /* Check that the preferred pack wasn't expired (if given). */ + if (preferred_pack_name) { + struct pack_info *preferred = bsearch(preferred_pack_name, + ctx.info, ctx.nr, + sizeof(*ctx.info), + idx_or_pack_name_cmp); + if (preferred) { + uint32_t perm = ctx.pack_perm[preferred->orig_pack_int_id]; + if (perm == PACK_EXPIRED) + warning(_("preferred pack '%s' is expired"), + preferred_pack_name); + } + } + + if (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT) + pack_name_concat_len += MIDX_CHUNK_ALIGNMENT - + (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT); + + hold_lock_file_for_update(&lk, midx_name.buf, LOCK_DIE_ON_ERROR); + f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk)); + + if (ctx.nr - dropped_packs == 0) { + error(_("no pack files to index.")); + result = 1; + goto cleanup; + } + + if (!ctx.entries_nr) { + if (flags & MIDX_WRITE_BITMAP) + warning(_("refusing to write multi-pack .bitmap without any objects")); + flags &= ~(MIDX_WRITE_REV_INDEX | MIDX_WRITE_BITMAP); + } + + cf = init_chunkfile(f); + + add_chunk(cf, MIDX_CHUNKID_PACKNAMES, pack_name_concat_len, + write_midx_pack_names); + add_chunk(cf, MIDX_CHUNKID_OIDFANOUT, MIDX_CHUNK_FANOUT_SIZE, + write_midx_oid_fanout); + add_chunk(cf, MIDX_CHUNKID_OIDLOOKUP, + st_mult(ctx.entries_nr, the_hash_algo->rawsz), + write_midx_oid_lookup); + add_chunk(cf, MIDX_CHUNKID_OBJECTOFFSETS, + st_mult(ctx.entries_nr, MIDX_CHUNK_OFFSET_WIDTH), + write_midx_object_offsets); + + if (ctx.large_offsets_needed) + add_chunk(cf, MIDX_CHUNKID_LARGEOFFSETS, + st_mult(ctx.num_large_offsets, + MIDX_CHUNK_LARGE_OFFSET_WIDTH), + write_midx_large_offsets); + + if (flags & (MIDX_WRITE_REV_INDEX | MIDX_WRITE_BITMAP)) { + ctx.pack_order = midx_pack_order(&ctx); + add_chunk(cf, MIDX_CHUNKID_REVINDEX, + st_mult(ctx.entries_nr, sizeof(uint32_t)), + write_midx_revindex); + add_chunk(cf, MIDX_CHUNKID_BITMAPPEDPACKS, + bitmapped_packs_concat_len, + write_midx_bitmapped_packs); + } 
+ + write_midx_header(f, get_num_chunks(cf), ctx.nr - dropped_packs); + write_chunkfile(cf, &ctx); + + finalize_hashfile(f, midx_hash, FSYNC_COMPONENT_PACK_METADATA, + CSUM_FSYNC | CSUM_HASH_IN_STREAM); + free_chunkfile(cf); + + if (flags & MIDX_WRITE_REV_INDEX && + git_env_bool("GIT_TEST_MIDX_WRITE_REV", 0)) + write_midx_reverse_index(midx_name.buf, midx_hash, &ctx); + + if (flags & MIDX_WRITE_BITMAP) { + struct packing_data pdata; + struct commit **commits; + uint32_t commits_nr; + + if (!ctx.entries_nr) + BUG("cannot write a bitmap without any objects"); + + prepare_midx_packing_data(&pdata, &ctx); + + commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, &ctx); + + /* + * The previous steps translated the information from + * 'entries' into information suitable for constructing + * bitmaps. We no longer need that array, so clear it to + * reduce memory pressure. + */ + FREE_AND_NULL(ctx.entries); + ctx.entries_nr = 0; + + if (write_midx_bitmap(midx_name.buf, midx_hash, &pdata, + commits, commits_nr, ctx.pack_order, + flags) < 0) { + error(_("could not write multi-pack bitmap")); + result = 1; + clear_packing_data(&pdata); + free(commits); + goto cleanup; + } + + clear_packing_data(&pdata); + free(commits); + } + /* + * NOTE: Do not use ctx.entries beyond this point, since it might + * have been freed in the previous if block. 
+ */ + + if (ctx.m) + close_object_store(the_repository->objects); + + if (commit_lock_file(&lk) < 0) + die_errno(_("could not write multi-pack-index")); + + clear_midx_files_ext(object_dir, ".bitmap", midx_hash); + clear_midx_files_ext(object_dir, ".rev", midx_hash); + +cleanup: + for (i = 0; i < ctx.nr; i++) { + if (ctx.info[i].p) { + close_pack(ctx.info[i].p); + free(ctx.info[i].p); + } + free(ctx.info[i].pack_name); + } + + free(ctx.info); + free(ctx.entries); + free(ctx.pack_perm); + free(ctx.pack_order); + strbuf_release(&midx_name); + + trace2_region_leave("midx", "write_midx_internal", the_repository); + + return result; +} int write_midx_file(const char *object_dir, const char *preferred_pack_name, diff --git a/midx.c b/midx.c index 39b5c86736..ae3b49166c 100644 --- a/midx.c +++ b/midx.c @@ -1,62 +1,22 @@ #include "git-compat-util.h" -#include "abspath.h" #include "config.h" -#include "csum-file.h" #include "dir.h" -#include "gettext.h" #include "hex.h" -#include "lockfile.h" #include "packfile.h" #include "object-file.h" -#include "object-store-ll.h" #include "hash-lookup.h" #include "midx.h" #include "progress.h" #include "trace2.h" -#include "run-command.h" -#include "repository.h" #include "chunk-format.h" -#include "pack.h" #include "pack-bitmap.h" -#include "refs.h" -#include "revision.h" -#include "list-objects.h" #include "pack-revindex.h" -struct multi_pack_index *lookup_multi_pack_index(struct repository *r, - const char *object_dir); - -int write_midx_internal(const char *object_dir, - struct string_list *packs_to_include, - struct string_list *packs_to_drop, - const char *preferred_pack_name, - const char *refs_snapshot, - unsigned flags); - -#define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ -#define MIDX_VERSION 1 -#define MIDX_BYTE_FILE_VERSION 4 -#define MIDX_BYTE_HASH_VERSION 5 -#define MIDX_BYTE_NUM_CHUNKS 6 -#define MIDX_BYTE_NUM_PACKS 8 -#define MIDX_HEADER_SIZE 12 -#define MIDX_MIN_SIZE (MIDX_HEADER_SIZE + the_hash_algo->rawsz) - -#define 
MIDX_CHUNK_ALIGNMENT 4 -#define MIDX_CHUNKID_PACKNAMES 0x504e414d /* "PNAM" */ -#define MIDX_CHUNKID_BITMAPPEDPACKS 0x42544d50 /* "BTMP" */ -#define MIDX_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */ -#define MIDX_CHUNKID_OIDLOOKUP 0x4f49444c /* "OIDL" */ -#define MIDX_CHUNKID_OBJECTOFFSETS 0x4f4f4646 /* "OOFF" */ -#define MIDX_CHUNKID_LARGEOFFSETS 0x4c4f4646 /* "LOFF" */ -#define MIDX_CHUNKID_REVINDEX 0x52494458 /* "RIDX" */ -#define MIDX_CHUNK_FANOUT_SIZE (sizeof(uint32_t) * 256) -#define MIDX_CHUNK_OFFSET_WIDTH (2 * sizeof(uint32_t)) -#define MIDX_CHUNK_LARGE_OFFSET_WIDTH (sizeof(uint64_t)) -#define MIDX_CHUNK_BITMAPPED_PACKS_WIDTH (2 * sizeof(uint32_t)) -#define MIDX_LARGE_OFFSET_NEEDED 0x80000000 - -#define PACK_EXPIRED UINT_MAX +int midx_checksum_valid(struct multi_pack_index *m); +void clear_midx_files_ext(const char *object_dir, const char *ext, + unsigned char *keep_hash); +int cmp_idx_or_pack_name(const char *idx_or_pack_name, + const char *idx_name); const unsigned char *get_midx_checksum(struct multi_pack_index *m) { @@ -125,6 +85,8 @@ static int midx_read_object_offsets(const unsigned char *chunk_start, return 0; } +#define MIDX_MIN_SIZE (MIDX_HEADER_SIZE + the_hash_algo->rawsz) + struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local) { struct multi_pack_index *m = NULL; @@ -304,6 +266,8 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t return 0; } +#define MIDX_CHUNK_BITMAPPED_PACKS_WIDTH (2 * sizeof(uint32_t)) + int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m, struct bitmapped_pack *bp, uint32_t pack_int_id) { @@ -410,8 +374,8 @@ int fill_midx_entry(struct repository *r, } /* Match "foo.idx" against either "foo.pack" _or_ "foo.idx". */ -static int cmp_idx_or_pack_name(const char *idx_or_pack_name, - const char *idx_name) +int cmp_idx_or_pack_name(const char *idx_or_pack_name, + const char *idx_name) { /* Skip past any initial matching prefix. 
*/ while (*idx_name && *idx_name == *idx_or_pack_name) { @@ -518,1243 +482,11 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i return 0; } -static size_t write_midx_header(struct hashfile *f, - unsigned char num_chunks, - uint32_t num_packs) -{ - hashwrite_be32(f, MIDX_SIGNATURE); - hashwrite_u8(f, MIDX_VERSION); - hashwrite_u8(f, oid_version(the_hash_algo)); - hashwrite_u8(f, num_chunks); - hashwrite_u8(f, 0); /* unused */ - hashwrite_be32(f, num_packs); - - return MIDX_HEADER_SIZE; -} - -#define BITMAP_POS_UNKNOWN (~((uint32_t)0)) - -struct pack_info { - uint32_t orig_pack_int_id; - char *pack_name; - struct packed_git *p; - - uint32_t bitmap_pos; - uint32_t bitmap_nr; - - unsigned expired : 1; -}; - -static void fill_pack_info(struct pack_info *info, - struct packed_git *p, const char *pack_name, - uint32_t orig_pack_int_id) -{ - memset(info, 0, sizeof(struct pack_info)); - - info->orig_pack_int_id = orig_pack_int_id; - info->pack_name = xstrdup(pack_name); - info->p = p; - info->bitmap_pos = BITMAP_POS_UNKNOWN; -} - -static int pack_info_compare(const void *_a, const void *_b) -{ - struct pack_info *a = (struct pack_info *)_a; - struct pack_info *b = (struct pack_info *)_b; - return strcmp(a->pack_name, b->pack_name); -} - -static int idx_or_pack_name_cmp(const void *_va, const void *_vb) -{ - const char *pack_name = _va; - const struct pack_info *compar = _vb; - - return cmp_idx_or_pack_name(pack_name, compar->pack_name); -} - -struct write_midx_context { - struct pack_info *info; - size_t nr; - size_t alloc; - struct multi_pack_index *m; - struct progress *progress; - unsigned pack_paths_checked; - - struct pack_midx_entry *entries; - size_t entries_nr; - - uint32_t *pack_perm; - uint32_t *pack_order; - unsigned large_offsets_needed:1; - uint32_t num_large_offsets; - - int preferred_pack_idx; - - struct string_list *to_include; -}; - -static void add_pack_to_midx(const char *full_path, size_t full_path_len, - const char 
*file_name, void *data) -{ - struct write_midx_context *ctx = data; - struct packed_git *p; - - if (ends_with(file_name, ".idx")) { - display_progress(ctx->progress, ++ctx->pack_paths_checked); - /* - * Note that at most one of ctx->m and ctx->to_include are set, - * so we are testing midx_contains_pack() and - * string_list_has_string() independently (guarded by the - * appropriate NULL checks). - * - * We could support passing to_include while reusing an existing - * MIDX, but don't currently since the reuse process drags - * forward all packs from an existing MIDX (without checking - * whether or not they appear in the to_include list). - * - * If we added support for that, these next two conditional - * should be performed independently (likely checking - * to_include before the existing MIDX). - */ - if (ctx->m && midx_contains_pack(ctx->m, file_name)) - return; - else if (ctx->to_include && - !string_list_has_string(ctx->to_include, file_name)) - return; - - ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc); - - p = add_packed_git(full_path, full_path_len, 0); - if (!p) { - warning(_("failed to add packfile '%s'"), - full_path); - return; - } - - if (open_pack_index(p)) { - warning(_("failed to open pack-index '%s'"), - full_path); - close_pack(p); - free(p); - return; - } - - fill_pack_info(&ctx->info[ctx->nr], p, file_name, ctx->nr); - ctx->nr++; - } -} - -struct pack_midx_entry { - struct object_id oid; - uint32_t pack_int_id; - time_t pack_mtime; - uint64_t offset; - unsigned preferred : 1; -}; - -static int midx_oid_compare(const void *_a, const void *_b) -{ - const struct pack_midx_entry *a = (const struct pack_midx_entry *)_a; - const struct pack_midx_entry *b = (const struct pack_midx_entry *)_b; - int cmp = oidcmp(&a->oid, &b->oid); - - if (cmp) - return cmp; - - /* Sort objects in a preferred pack first when multiple copies exist. 
*/ - if (a->preferred > b->preferred) - return -1; - if (a->preferred < b->preferred) - return 1; - - if (a->pack_mtime > b->pack_mtime) - return -1; - else if (a->pack_mtime < b->pack_mtime) - return 1; - - return a->pack_int_id - b->pack_int_id; -} - -static int nth_midxed_pack_midx_entry(struct multi_pack_index *m, - struct pack_midx_entry *e, - uint32_t pos) -{ - if (pos >= m->num_objects) - return 1; - - nth_midxed_object_oid(&e->oid, m, pos); - e->pack_int_id = nth_midxed_pack_int_id(m, pos); - e->offset = nth_midxed_offset(m, pos); - - /* consider objects in midx to be from "old" packs */ - e->pack_mtime = 0; - return 0; -} - -static void fill_pack_entry(uint32_t pack_int_id, - struct packed_git *p, - uint32_t cur_object, - struct pack_midx_entry *entry, - int preferred) -{ - if (nth_packed_object_id(&entry->oid, p, cur_object) < 0) - die(_("failed to locate object %d in packfile"), cur_object); - - entry->pack_int_id = pack_int_id; - entry->pack_mtime = p->mtime; - - entry->offset = nth_packed_object_offset(p, cur_object); - entry->preferred = !!preferred; -} - -struct midx_fanout { - struct pack_midx_entry *entries; - size_t nr, alloc; -}; - -static void midx_fanout_grow(struct midx_fanout *fanout, size_t nr) -{ - if (nr < fanout->nr) - BUG("negative growth in midx_fanout_grow() (%"PRIuMAX" < %"PRIuMAX")", - (uintmax_t)nr, (uintmax_t)fanout->nr); - ALLOC_GROW(fanout->entries, nr, fanout->alloc); -} - -static void midx_fanout_sort(struct midx_fanout *fanout) -{ - QSORT(fanout->entries, fanout->nr, midx_oid_compare); -} - -static void midx_fanout_add_midx_fanout(struct midx_fanout *fanout, - struct multi_pack_index *m, - uint32_t cur_fanout, - int preferred_pack) -{ - uint32_t start = 0, end; - uint32_t cur_object; - - if (cur_fanout) - start = ntohl(m->chunk_oid_fanout[cur_fanout - 1]); - end = ntohl(m->chunk_oid_fanout[cur_fanout]); - - for (cur_object = start; cur_object < end; cur_object++) { - if ((preferred_pack > -1) && - (preferred_pack == 
nth_midxed_pack_int_id(m, cur_object))) { - /* - * Objects from preferred packs are added - * separately. - */ - continue; - } - - midx_fanout_grow(fanout, fanout->nr + 1); - nth_midxed_pack_midx_entry(m, - &fanout->entries[fanout->nr], - cur_object); - fanout->entries[fanout->nr].preferred = 0; - fanout->nr++; - } -} - -static void midx_fanout_add_pack_fanout(struct midx_fanout *fanout, - struct pack_info *info, - uint32_t cur_pack, - int preferred, - uint32_t cur_fanout) -{ - struct packed_git *pack = info[cur_pack].p; - uint32_t start = 0, end; - uint32_t cur_object; - - if (cur_fanout) - start = get_pack_fanout(pack, cur_fanout - 1); - end = get_pack_fanout(pack, cur_fanout); - - for (cur_object = start; cur_object < end; cur_object++) { - midx_fanout_grow(fanout, fanout->nr + 1); - fill_pack_entry(cur_pack, - info[cur_pack].p, - cur_object, - &fanout->entries[fanout->nr], - preferred); - fanout->nr++; - } -} - -/* - * It is possible to artificially get into a state where there are many - * duplicate copies of objects. That can create high memory pressure if - * we are to create a list of all objects before de-duplication. To reduce - * this memory pressure without a significant performance drop, automatically - * group objects by the first byte of their object id. Use the IDX fanout - * tables to group the data, copy to a local array, then sort. - * - * Copy only the de-duplicated entries (selected by most-recent modified time - * of a packfile containing the object). - */ -static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, - struct pack_info *info, - uint32_t nr_packs, - size_t *nr_objects, - int preferred_pack) -{ - uint32_t cur_fanout, cur_pack, cur_object; - size_t alloc_objects, total_objects = 0; - struct midx_fanout fanout = { 0 }; - struct pack_midx_entry *deduplicated_entries = NULL; - uint32_t start_pack = m ? 
m->num_packs : 0; - - for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) - total_objects = st_add(total_objects, - info[cur_pack].p->num_objects); - - /* - * As we de-duplicate by fanout value, we expect the fanout - * slices to be evenly distributed, with some noise. Hence, - * allocate slightly more than one 256th. - */ - alloc_objects = fanout.alloc = total_objects > 3200 ? total_objects / 200 : 16; - - ALLOC_ARRAY(fanout.entries, fanout.alloc); - ALLOC_ARRAY(deduplicated_entries, alloc_objects); - *nr_objects = 0; - - for (cur_fanout = 0; cur_fanout < 256; cur_fanout++) { - fanout.nr = 0; - - if (m) - midx_fanout_add_midx_fanout(&fanout, m, cur_fanout, - preferred_pack); - - for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) { - int preferred = cur_pack == preferred_pack; - midx_fanout_add_pack_fanout(&fanout, - info, cur_pack, - preferred, cur_fanout); - } - - if (-1 < preferred_pack && preferred_pack < start_pack) - midx_fanout_add_pack_fanout(&fanout, info, - preferred_pack, 1, - cur_fanout); - - midx_fanout_sort(&fanout); - - /* - * The batch is now sorted by OID and then mtime (descending). - * Take only the first duplicate. 
- */ - for (cur_object = 0; cur_object < fanout.nr; cur_object++) { - if (cur_object && oideq(&fanout.entries[cur_object - 1].oid, - &fanout.entries[cur_object].oid)) - continue; - - ALLOC_GROW(deduplicated_entries, st_add(*nr_objects, 1), - alloc_objects); - memcpy(&deduplicated_entries[*nr_objects], - &fanout.entries[cur_object], - sizeof(struct pack_midx_entry)); - (*nr_objects)++; - } - } - - free(fanout.entries); - return deduplicated_entries; -} - -static int write_midx_pack_names(struct hashfile *f, void *data) -{ - struct write_midx_context *ctx = data; - uint32_t i; - unsigned char padding[MIDX_CHUNK_ALIGNMENT]; - size_t written = 0; - - for (i = 0; i < ctx->nr; i++) { - size_t writelen; - - if (ctx->info[i].expired) - continue; - - if (i && strcmp(ctx->info[i].pack_name, ctx->info[i - 1].pack_name) <= 0) - BUG("incorrect pack-file order: %s before %s", - ctx->info[i - 1].pack_name, - ctx->info[i].pack_name); - - writelen = strlen(ctx->info[i].pack_name) + 1; - hashwrite(f, ctx->info[i].pack_name, writelen); - written += writelen; - } - - /* add padding to be aligned */ - i = MIDX_CHUNK_ALIGNMENT - (written % MIDX_CHUNK_ALIGNMENT); - if (i < MIDX_CHUNK_ALIGNMENT) { - memset(padding, 0, sizeof(padding)); - hashwrite(f, padding, i); - } - - return 0; -} - -static int write_midx_bitmapped_packs(struct hashfile *f, void *data) -{ - struct write_midx_context *ctx = data; - size_t i; - - for (i = 0; i < ctx->nr; i++) { - struct pack_info *pack = &ctx->info[i]; - if (pack->expired) - continue; - - if (pack->bitmap_pos == BITMAP_POS_UNKNOWN && pack->bitmap_nr) - BUG("pack '%s' has no bitmap position, but has %d bitmapped object(s)", - pack->pack_name, pack->bitmap_nr); - - hashwrite_be32(f, pack->bitmap_pos); - hashwrite_be32(f, pack->bitmap_nr); - } - return 0; -} - -static int write_midx_oid_fanout(struct hashfile *f, - void *data) -{ - struct write_midx_context *ctx = data; - struct pack_midx_entry *list = ctx->entries; - struct pack_midx_entry *last = 
ctx->entries + ctx->entries_nr; - uint32_t count = 0; - uint32_t i; - - /* - * Write the first-level table (the list is sorted, - * but we use a 256-entry lookup to be able to avoid - * having to do eight extra binary search iterations). - */ - for (i = 0; i < 256; i++) { - struct pack_midx_entry *next = list; - - while (next < last && next->oid.hash[0] == i) { - count++; - next++; - } - - hashwrite_be32(f, count); - list = next; - } - - return 0; -} - -static int write_midx_oid_lookup(struct hashfile *f, - void *data) -{ - struct write_midx_context *ctx = data; - unsigned char hash_len = the_hash_algo->rawsz; - struct pack_midx_entry *list = ctx->entries; - uint32_t i; - - for (i = 0; i < ctx->entries_nr; i++) { - struct pack_midx_entry *obj = list++; - - if (i < ctx->entries_nr - 1) { - struct pack_midx_entry *next = list; - if (oidcmp(&obj->oid, &next->oid) >= 0) - BUG("OIDs not in order: %s >= %s", - oid_to_hex(&obj->oid), - oid_to_hex(&next->oid)); - } - - hashwrite(f, obj->oid.hash, (int)hash_len); - } - - return 0; -} - -static int write_midx_object_offsets(struct hashfile *f, - void *data) -{ - struct write_midx_context *ctx = data; - struct pack_midx_entry *list = ctx->entries; - uint32_t i, nr_large_offset = 0; - - for (i = 0; i < ctx->entries_nr; i++) { - struct pack_midx_entry *obj = list++; - - if (ctx->pack_perm[obj->pack_int_id] == PACK_EXPIRED) - BUG("object %s is in an expired pack with int-id %d", - oid_to_hex(&obj->oid), - obj->pack_int_id); - - hashwrite_be32(f, ctx->pack_perm[obj->pack_int_id]); - - if (ctx->large_offsets_needed && obj->offset >> 31) - hashwrite_be32(f, MIDX_LARGE_OFFSET_NEEDED | nr_large_offset++); - else if (!ctx->large_offsets_needed && obj->offset >> 32) - BUG("object %s requires a large offset (%"PRIx64") but the MIDX is not writing large offsets!", - oid_to_hex(&obj->oid), - obj->offset); - else - hashwrite_be32(f, (uint32_t)obj->offset); - } - - return 0; -} - -static int write_midx_large_offsets(struct hashfile *f, - 
void *data) -{ - struct write_midx_context *ctx = data; - struct pack_midx_entry *list = ctx->entries; - struct pack_midx_entry *end = ctx->entries + ctx->entries_nr; - uint32_t nr_large_offset = ctx->num_large_offsets; - - while (nr_large_offset) { - struct pack_midx_entry *obj; - uint64_t offset; - - if (list >= end) - BUG("too many large-offset objects"); - - obj = list++; - offset = obj->offset; - - if (!(offset >> 31)) - continue; - - hashwrite_be64(f, offset); - - nr_large_offset--; - } - - return 0; -} - -static int write_midx_revindex(struct hashfile *f, - void *data) -{ - struct write_midx_context *ctx = data; - uint32_t i; - - for (i = 0; i < ctx->entries_nr; i++) - hashwrite_be32(f, ctx->pack_order[i]); - - return 0; -} - -struct midx_pack_order_data { - uint32_t nr; - uint32_t pack; - off_t offset; -}; - -static int midx_pack_order_cmp(const void *va, const void *vb) -{ - const struct midx_pack_order_data *a = va, *b = vb; - if (a->pack < b->pack) - return -1; - else if (a->pack > b->pack) - return 1; - else if (a->offset < b->offset) - return -1; - else if (a->offset > b->offset) - return 1; - else - return 0; -} - -static uint32_t *midx_pack_order(struct write_midx_context *ctx) -{ - struct midx_pack_order_data *data; - uint32_t *pack_order; - uint32_t i; - - trace2_region_enter("midx", "midx_pack_order", the_repository); - - ALLOC_ARRAY(data, ctx->entries_nr); - for (i = 0; i < ctx->entries_nr; i++) { - struct pack_midx_entry *e = &ctx->entries[i]; - data[i].nr = i; - data[i].pack = ctx->pack_perm[e->pack_int_id]; - if (!e->preferred) - data[i].pack |= (1U << 31); - data[i].offset = e->offset; - } - - QSORT(data, ctx->entries_nr, midx_pack_order_cmp); - - ALLOC_ARRAY(pack_order, ctx->entries_nr); - for (i = 0; i < ctx->entries_nr; i++) { - struct pack_midx_entry *e = &ctx->entries[data[i].nr]; - struct pack_info *pack = &ctx->info[ctx->pack_perm[e->pack_int_id]]; - if (pack->bitmap_pos == BITMAP_POS_UNKNOWN) - pack->bitmap_pos = i; - 
pack->bitmap_nr++; - pack_order[i] = data[i].nr; - } - for (i = 0; i < ctx->nr; i++) { - struct pack_info *pack = &ctx->info[ctx->pack_perm[i]]; - if (pack->bitmap_pos == BITMAP_POS_UNKNOWN) - pack->bitmap_pos = 0; - } - free(data); - - trace2_region_leave("midx", "midx_pack_order", the_repository); - - return pack_order; -} - -static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash, - struct write_midx_context *ctx) -{ - struct strbuf buf = STRBUF_INIT; - const char *tmp_file; - - trace2_region_enter("midx", "write_midx_reverse_index", the_repository); - - strbuf_addf(&buf, "%s-%s.rev", midx_name, hash_to_hex(midx_hash)); - - tmp_file = write_rev_file_order(NULL, ctx->pack_order, ctx->entries_nr, - midx_hash, WRITE_REV); - - if (finalize_object_file(tmp_file, buf.buf)) - die(_("cannot store reverse index file")); - - strbuf_release(&buf); - - trace2_region_leave("midx", "write_midx_reverse_index", the_repository); -} - -static void clear_midx_files_ext(const char *object_dir, const char *ext, - unsigned char *keep_hash); - -static int midx_checksum_valid(struct multi_pack_index *m) +int midx_checksum_valid(struct multi_pack_index *m) { return hashfile_checksum_valid(m->data, m->data_len); } -static void prepare_midx_packing_data(struct packing_data *pdata, - struct write_midx_context *ctx) -{ - uint32_t i; - - trace2_region_enter("midx", "prepare_midx_packing_data", the_repository); - - memset(pdata, 0, sizeof(struct packing_data)); - prepare_packing_data(the_repository, pdata); - - for (i = 0; i < ctx->entries_nr; i++) { - struct pack_midx_entry *from = &ctx->entries[ctx->pack_order[i]]; - struct object_entry *to = packlist_alloc(pdata, &from->oid); - - oe_set_in_pack(pdata, to, - ctx->info[ctx->pack_perm[from->pack_int_id]].p); - } - - trace2_region_leave("midx", "prepare_midx_packing_data", the_repository); -} - -static int add_ref_to_pending(const char *refname, - const struct object_id *oid, - int flag, void *cb_data) -{ - struct 
rev_info *revs = (struct rev_info*)cb_data; - struct object_id peeled; - struct object *object; - - if ((flag & REF_ISSYMREF) && (flag & REF_ISBROKEN)) { - warning("symbolic ref is dangling: %s", refname); - return 0; - } - - if (!peel_iterated_oid(oid, &peeled)) - oid = &peeled; - - object = parse_object_or_die(oid, refname); - if (object->type != OBJ_COMMIT) - return 0; - - add_pending_object(revs, object, ""); - if (bitmap_is_preferred_refname(revs->repo, refname)) - object->flags |= NEEDS_BITMAP; - return 0; -} - -struct bitmap_commit_cb { - struct commit **commits; - size_t commits_nr, commits_alloc; - - struct write_midx_context *ctx; -}; - -static const struct object_id *bitmap_oid_access(size_t index, - const void *_entries) -{ - const struct pack_midx_entry *entries = _entries; - return &entries[index].oid; -} - -static void bitmap_show_commit(struct commit *commit, void *_data) -{ - struct bitmap_commit_cb *data = _data; - int pos = oid_pos(&commit->object.oid, data->ctx->entries, - data->ctx->entries_nr, - bitmap_oid_access); - if (pos < 0) - return; - - ALLOC_GROW(data->commits, data->commits_nr + 1, data->commits_alloc); - data->commits[data->commits_nr++] = commit; -} - -static int read_refs_snapshot(const char *refs_snapshot, - struct rev_info *revs) -{ - struct strbuf buf = STRBUF_INIT; - struct object_id oid; - FILE *f = xfopen(refs_snapshot, "r"); - - while (strbuf_getline(&buf, f) != EOF) { - struct object *object; - int preferred = 0; - char *hex = buf.buf; - const char *end = NULL; - - if (buf.len && *buf.buf == '+') { - preferred = 1; - hex = &buf.buf[1]; - } - - if (parse_oid_hex(hex, &oid, &end) < 0) - die(_("could not parse line: %s"), buf.buf); - if (*end) - die(_("malformed line: %s"), buf.buf); - - object = parse_object_or_die(&oid, NULL); - if (preferred) - object->flags |= NEEDS_BITMAP; - - add_pending_object(revs, object, ""); - } - - fclose(f); - strbuf_release(&buf); - return 0; -} - -static struct commit 
**find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr_p, - const char *refs_snapshot, - struct write_midx_context *ctx) -{ - struct rev_info revs; - struct bitmap_commit_cb cb = {0}; - - trace2_region_enter("midx", "find_commits_for_midx_bitmap", - the_repository); - - cb.ctx = ctx; - - repo_init_revisions(the_repository, &revs, NULL); - if (refs_snapshot) { - read_refs_snapshot(refs_snapshot, &revs); - } else { - setup_revisions(0, NULL, &revs, NULL); - for_each_ref(add_ref_to_pending, &revs); - } - - /* - * Skipping promisor objects here is intentional, since it only excludes - * them from the list of reachable commits that we want to select from - * when computing the selection of MIDX'd commits to receive bitmaps. - * - * Reachability bitmaps do require that their objects be closed under - * reachability, but fetching any objects missing from promisors at this - * point is too late. But, if one of those objects can be reached from - * an another object that is included in the bitmap, then we will - * complain later that we don't have reachability closure (and fail - * appropriately). 
- */ - fetch_if_missing = 0; - revs.exclude_promisor_objects = 1; - - if (prepare_revision_walk(&revs)) - die(_("revision walk setup failed")); - - traverse_commit_list(&revs, bitmap_show_commit, NULL, &cb); - if (indexed_commits_nr_p) - *indexed_commits_nr_p = cb.commits_nr; - - release_revisions(&revs); - - trace2_region_leave("midx", "find_commits_for_midx_bitmap", - the_repository); - - return cb.commits; -} - -static int write_midx_bitmap(const char *midx_name, - const unsigned char *midx_hash, - struct packing_data *pdata, - struct commit **commits, - uint32_t commits_nr, - uint32_t *pack_order, - unsigned flags) -{ - int ret, i; - uint16_t options = 0; - struct pack_idx_entry **index; - char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name, - hash_to_hex(midx_hash)); - - trace2_region_enter("midx", "write_midx_bitmap", the_repository); - - if (flags & MIDX_WRITE_BITMAP_HASH_CACHE) - options |= BITMAP_OPT_HASH_CACHE; - - if (flags & MIDX_WRITE_BITMAP_LOOKUP_TABLE) - options |= BITMAP_OPT_LOOKUP_TABLE; - - /* - * Build the MIDX-order index based on pdata.objects (which is already - * in MIDX order; c.f., 'midx_pack_order_cmp()' for the definition of - * this order). - */ - ALLOC_ARRAY(index, pdata->nr_objects); - for (i = 0; i < pdata->nr_objects; i++) - index[i] = &pdata->objects[i].idx; - - bitmap_writer_show_progress(flags & MIDX_PROGRESS); - bitmap_writer_build_type_index(pdata, index, pdata->nr_objects); - - /* - * bitmap_writer_finish expects objects in lex order, but pack_order - * gives us exactly that. use it directly instead of re-sorting the - * array. - * - * This changes the order of objects in 'index' between - * bitmap_writer_build_type_index and bitmap_writer_finish. - * - * The same re-ordering takes place in the single-pack bitmap code via - * write_idx_file(), which is called by finish_tmp_packfile(), which - * happens between bitmap_writer_build_type_index() and - * bitmap_writer_finish(). 
- */ - for (i = 0; i < pdata->nr_objects; i++) - index[pack_order[i]] = &pdata->objects[i].idx; - - bitmap_writer_select_commits(commits, commits_nr, -1); - ret = bitmap_writer_build(pdata); - if (ret < 0) - goto cleanup; - - bitmap_writer_set_checksum(midx_hash); - bitmap_writer_finish(index, pdata->nr_objects, bitmap_name, options); - -cleanup: - free(index); - free(bitmap_name); - - trace2_region_leave("midx", "write_midx_bitmap", the_repository); - - return ret; -} - -struct multi_pack_index *lookup_multi_pack_index(struct repository *r, - const char *object_dir) -{ - struct multi_pack_index *result = NULL; - struct multi_pack_index *cur; - char *obj_dir_real = real_pathdup(object_dir, 1); - struct strbuf cur_path_real = STRBUF_INIT; - - /* Ensure the given object_dir is local, or a known alternate. */ - find_odb(r, obj_dir_real); - - for (cur = get_multi_pack_index(r); cur; cur = cur->next) { - strbuf_realpath(&cur_path_real, cur->object_dir, 1); - if (!strcmp(obj_dir_real, cur_path_real.buf)) { - result = cur; - goto cleanup; - } - } - -cleanup: - free(obj_dir_real); - strbuf_release(&cur_path_real); - return result; -} - -int write_midx_internal(const char *object_dir, - struct string_list *packs_to_include, - struct string_list *packs_to_drop, - const char *preferred_pack_name, - const char *refs_snapshot, - unsigned flags) -{ - struct strbuf midx_name = STRBUF_INIT; - unsigned char midx_hash[GIT_MAX_RAWSZ]; - uint32_t i; - struct hashfile *f = NULL; - struct lock_file lk; - struct write_midx_context ctx = { 0 }; - int bitmapped_packs_concat_len = 0; - int pack_name_concat_len = 0; - int dropped_packs = 0; - int result = 0; - struct chunkfile *cf; - - trace2_region_enter("midx", "write_midx_internal", the_repository); - - get_midx_filename(&midx_name, object_dir); - if (safe_create_leading_directories(midx_name.buf)) - die_errno(_("unable to create leading directories of %s"), - midx_name.buf); - - if (!packs_to_include) { - /* - * Only reference an 
existing MIDX when not filtering which - * packs to include, since all packs and objects are copied - * blindly from an existing MIDX if one is present. - */ - ctx.m = lookup_multi_pack_index(the_repository, object_dir); - } - - if (ctx.m && !midx_checksum_valid(ctx.m)) { - warning(_("ignoring existing multi-pack-index; checksum mismatch")); - ctx.m = NULL; - } - - ctx.nr = 0; - ctx.alloc = ctx.m ? ctx.m->num_packs : 16; - ctx.info = NULL; - ALLOC_ARRAY(ctx.info, ctx.alloc); - - if (ctx.m) { - for (i = 0; i < ctx.m->num_packs; i++) { - ALLOC_GROW(ctx.info, ctx.nr + 1, ctx.alloc); - - if (flags & MIDX_WRITE_REV_INDEX) { - /* - * If generating a reverse index, need to have - * packed_git's loaded to compare their - * mtimes and object count. - */ - if (prepare_midx_pack(the_repository, ctx.m, i)) { - error(_("could not load pack")); - result = 1; - goto cleanup; - } - - if (open_pack_index(ctx.m->packs[i])) - die(_("could not open index for %s"), - ctx.m->packs[i]->pack_name); - } - - fill_pack_info(&ctx.info[ctx.nr++], ctx.m->packs[i], - ctx.m->pack_names[i], i); - } - } - - ctx.pack_paths_checked = 0; - if (flags & MIDX_PROGRESS) - ctx.progress = start_delayed_progress(_("Adding packfiles to multi-pack-index"), 0); - else - ctx.progress = NULL; - - ctx.to_include = packs_to_include; - - for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &ctx); - stop_progress(&ctx.progress); - - if ((ctx.m && ctx.nr == ctx.m->num_packs) && - !(packs_to_include || packs_to_drop)) { - struct bitmap_index *bitmap_git; - int bitmap_exists; - int want_bitmap = flags & MIDX_WRITE_BITMAP; - - bitmap_git = prepare_midx_bitmap_git(ctx.m); - bitmap_exists = bitmap_git && bitmap_is_midx(bitmap_git); - free_bitmap_index(bitmap_git); - - if (bitmap_exists || !want_bitmap) { - /* - * The correct MIDX already exists, and so does a - * corresponding bitmap (or one wasn't requested). 
- */ - if (!want_bitmap) - clear_midx_files_ext(object_dir, ".bitmap", - NULL); - goto cleanup; - } - } - - if (preferred_pack_name) { - ctx.preferred_pack_idx = -1; - - for (i = 0; i < ctx.nr; i++) { - if (!cmp_idx_or_pack_name(preferred_pack_name, - ctx.info[i].pack_name)) { - ctx.preferred_pack_idx = i; - break; - } - } - - if (ctx.preferred_pack_idx == -1) - warning(_("unknown preferred pack: '%s'"), - preferred_pack_name); - } else if (ctx.nr && - (flags & (MIDX_WRITE_REV_INDEX | MIDX_WRITE_BITMAP))) { - struct packed_git *oldest = ctx.info[ctx.preferred_pack_idx].p; - ctx.preferred_pack_idx = 0; - - if (packs_to_drop && packs_to_drop->nr) - BUG("cannot write a MIDX bitmap during expiration"); - - /* - * set a preferred pack when writing a bitmap to ensure that - * the pack from which the first object is selected in pseudo - * pack-order has all of its objects selected from that pack - * (and not another pack containing a duplicate) - */ - for (i = 1; i < ctx.nr; i++) { - struct packed_git *p = ctx.info[i].p; - - if (!oldest->num_objects || p->mtime < oldest->mtime) { - oldest = p; - ctx.preferred_pack_idx = i; - } - } - - if (!oldest->num_objects) { - /* - * If all packs are empty; unset the preferred index. - * This is acceptable since there will be no duplicate - * objects to resolve, so the preferred value doesn't - * matter. 
- */ - ctx.preferred_pack_idx = -1; - } - } else { - /* - * otherwise don't mark any pack as preferred to avoid - * interfering with expiration logic below - */ - ctx.preferred_pack_idx = -1; - } - - if (ctx.preferred_pack_idx > -1) { - struct packed_git *preferred = ctx.info[ctx.preferred_pack_idx].p; - if (!preferred->num_objects) { - error(_("cannot select preferred pack %s with no objects"), - preferred->pack_name); - result = 1; - goto cleanup; - } - } - - ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr, - ctx.preferred_pack_idx); - - ctx.large_offsets_needed = 0; - for (i = 0; i < ctx.entries_nr; i++) { - if (ctx.entries[i].offset > 0x7fffffff) - ctx.num_large_offsets++; - if (ctx.entries[i].offset > 0xffffffff) - ctx.large_offsets_needed = 1; - } - - QSORT(ctx.info, ctx.nr, pack_info_compare); - - if (packs_to_drop && packs_to_drop->nr) { - int drop_index = 0; - int missing_drops = 0; - - for (i = 0; i < ctx.nr && drop_index < packs_to_drop->nr; i++) { - int cmp = strcmp(ctx.info[i].pack_name, - packs_to_drop->items[drop_index].string); - - if (!cmp) { - drop_index++; - ctx.info[i].expired = 1; - } else if (cmp > 0) { - error(_("did not see pack-file %s to drop"), - packs_to_drop->items[drop_index].string); - drop_index++; - missing_drops++; - i--; - } else { - ctx.info[i].expired = 0; - } - } - - if (missing_drops) { - result = 1; - goto cleanup; - } - } - - /* - * pack_perm stores a permutation between pack-int-ids from the - * previous multi-pack-index to the new one we are writing: - * - * pack_perm[old_id] = new_id - */ - ALLOC_ARRAY(ctx.pack_perm, ctx.nr); - for (i = 0; i < ctx.nr; i++) { - if (ctx.info[i].expired) { - dropped_packs++; - ctx.pack_perm[ctx.info[i].orig_pack_int_id] = PACK_EXPIRED; - } else { - ctx.pack_perm[ctx.info[i].orig_pack_int_id] = i - dropped_packs; - } - } - - for (i = 0; i < ctx.nr; i++) { - if (ctx.info[i].expired) - continue; - pack_name_concat_len += strlen(ctx.info[i].pack_name) + 1; - 
bitmapped_packs_concat_len += 2 * sizeof(uint32_t); - } - - /* Check that the preferred pack wasn't expired (if given). */ - if (preferred_pack_name) { - struct pack_info *preferred = bsearch(preferred_pack_name, - ctx.info, ctx.nr, - sizeof(*ctx.info), - idx_or_pack_name_cmp); - if (preferred) { - uint32_t perm = ctx.pack_perm[preferred->orig_pack_int_id]; - if (perm == PACK_EXPIRED) - warning(_("preferred pack '%s' is expired"), - preferred_pack_name); - } - } - - if (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT) - pack_name_concat_len += MIDX_CHUNK_ALIGNMENT - - (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT); - - hold_lock_file_for_update(&lk, midx_name.buf, LOCK_DIE_ON_ERROR); - f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk)); - - if (ctx.nr - dropped_packs == 0) { - error(_("no pack files to index.")); - result = 1; - goto cleanup; - } - - if (!ctx.entries_nr) { - if (flags & MIDX_WRITE_BITMAP) - warning(_("refusing to write multi-pack .bitmap without any objects")); - flags &= ~(MIDX_WRITE_REV_INDEX | MIDX_WRITE_BITMAP); - } - - cf = init_chunkfile(f); - - add_chunk(cf, MIDX_CHUNKID_PACKNAMES, pack_name_concat_len, - write_midx_pack_names); - add_chunk(cf, MIDX_CHUNKID_OIDFANOUT, MIDX_CHUNK_FANOUT_SIZE, - write_midx_oid_fanout); - add_chunk(cf, MIDX_CHUNKID_OIDLOOKUP, - st_mult(ctx.entries_nr, the_hash_algo->rawsz), - write_midx_oid_lookup); - add_chunk(cf, MIDX_CHUNKID_OBJECTOFFSETS, - st_mult(ctx.entries_nr, MIDX_CHUNK_OFFSET_WIDTH), - write_midx_object_offsets); - - if (ctx.large_offsets_needed) - add_chunk(cf, MIDX_CHUNKID_LARGEOFFSETS, - st_mult(ctx.num_large_offsets, - MIDX_CHUNK_LARGE_OFFSET_WIDTH), - write_midx_large_offsets); - - if (flags & (MIDX_WRITE_REV_INDEX | MIDX_WRITE_BITMAP)) { - ctx.pack_order = midx_pack_order(&ctx); - add_chunk(cf, MIDX_CHUNKID_REVINDEX, - st_mult(ctx.entries_nr, sizeof(uint32_t)), - write_midx_revindex); - add_chunk(cf, MIDX_CHUNKID_BITMAPPEDPACKS, - bitmapped_packs_concat_len, - write_midx_bitmapped_packs); - } 
- - write_midx_header(f, get_num_chunks(cf), ctx.nr - dropped_packs); - write_chunkfile(cf, &ctx); - - finalize_hashfile(f, midx_hash, FSYNC_COMPONENT_PACK_METADATA, - CSUM_FSYNC | CSUM_HASH_IN_STREAM); - free_chunkfile(cf); - - if (flags & MIDX_WRITE_REV_INDEX && - git_env_bool("GIT_TEST_MIDX_WRITE_REV", 0)) - write_midx_reverse_index(midx_name.buf, midx_hash, &ctx); - - if (flags & MIDX_WRITE_BITMAP) { - struct packing_data pdata; - struct commit **commits; - uint32_t commits_nr; - - if (!ctx.entries_nr) - BUG("cannot write a bitmap without any objects"); - - prepare_midx_packing_data(&pdata, &ctx); - - commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, &ctx); - - /* - * The previous steps translated the information from - * 'entries' into information suitable for constructing - * bitmaps. We no longer need that array, so clear it to - * reduce memory pressure. - */ - FREE_AND_NULL(ctx.entries); - ctx.entries_nr = 0; - - if (write_midx_bitmap(midx_name.buf, midx_hash, &pdata, - commits, commits_nr, ctx.pack_order, - flags) < 0) { - error(_("could not write multi-pack bitmap")); - result = 1; - clear_packing_data(&pdata); - free(commits); - goto cleanup; - } - - clear_packing_data(&pdata); - free(commits); - } - /* - * NOTE: Do not use ctx.entries beyond this point, since it might - * have been freed in the previous if block. 
- */ - - if (ctx.m) - close_object_store(the_repository->objects); - - if (commit_lock_file(&lk) < 0) - die_errno(_("could not write multi-pack-index")); - - clear_midx_files_ext(object_dir, ".bitmap", midx_hash); - clear_midx_files_ext(object_dir, ".rev", midx_hash); - -cleanup: - for (i = 0; i < ctx.nr; i++) { - if (ctx.info[i].p) { - close_pack(ctx.info[i].p); - free(ctx.info[i].p); - } - free(ctx.info[i].pack_name); - } - - free(ctx.info); - free(ctx.entries); - free(ctx.pack_perm); - free(ctx.pack_order); - strbuf_release(&midx_name); - - trace2_region_leave("midx", "write_midx_internal", the_repository); - - return result; -} - struct clear_midx_data { char *keep; const char *ext; @@ -1775,8 +507,8 @@ static void clear_midx_file_ext(const char *full_path, size_t full_path_len UNUS die_errno(_("failed to remove %s"), full_path); } -static void clear_midx_files_ext(const char *object_dir, const char *ext, - unsigned char *keep_hash) +void clear_midx_files_ext(const char *object_dir, const char *ext, + unsigned char *keep_hash) { struct clear_midx_data data; memset(&data, 0, sizeof(struct clear_midx_data)); diff --git a/midx.h b/midx.h index b374a7afaf..dc477dff44 100644 --- a/midx.h +++ b/midx.h @@ -8,6 +8,25 @@ struct pack_entry; struct repository; struct bitmapped_pack; +#define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ +#define MIDX_VERSION 1 +#define MIDX_BYTE_FILE_VERSION 4 +#define MIDX_BYTE_HASH_VERSION 5 +#define MIDX_BYTE_NUM_CHUNKS 6 +#define MIDX_BYTE_NUM_PACKS 8 +#define MIDX_HEADER_SIZE 12 + +#define MIDX_CHUNK_ALIGNMENT 4 +#define MIDX_CHUNKID_PACKNAMES 0x504e414d /* "PNAM" */ +#define MIDX_CHUNKID_BITMAPPEDPACKS 0x42544d50 /* "BTMP" */ +#define MIDX_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */ +#define MIDX_CHUNKID_OIDLOOKUP 0x4f49444c /* "OIDL" */ +#define MIDX_CHUNKID_OBJECTOFFSETS 0x4f4f4646 /* "OOFF" */ +#define MIDX_CHUNKID_LARGEOFFSETS 0x4c4f4646 /* "LOFF" */ +#define MIDX_CHUNKID_REVINDEX 0x52494458 /* "RIDX" */ +#define MIDX_CHUNK_OFFSET_WIDTH 
(2 * sizeof(uint32_t)) +#define MIDX_LARGE_OFFSET_NEEDED 0x80000000 + #define GIT_TEST_MULTI_PACK_INDEX "GIT_TEST_MULTI_PACK_INDEX" #define GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP \ "GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP" From patchwork Mon Mar 25 17:24:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13602559 Received: from mail-qk1-f178.google.com (mail-qk1-f178.google.com [209.85.222.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F185A4AEE5 for ; Mon, 25 Mar 2024 17:24:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.178 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387485; cv=none; b=aT6vei4/N8xKFlIXzWLBoGCPq4YTRJ3IfeO+llKj9WtGx0/a/VzrxXt/0uGvKfGD7BnAIAxB96u+qSiZjDkLHcTFgPwiyK9+XeyQr9uPUe3QI9l9uNYWdwwbK+myAVGLLcaFoz0ZGS02BfztbtE0lvzs4CwSLBCA7h0JlVQjlIc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387485; c=relaxed/simple; bh=XtWq6NqD52/Hjkup2Pe4JxxY18W1gaSiSaCHCRauUfU=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=EHoEKRStM7HVXloSzF4ENNBB6mHfSw6as3OWMO3nPJP7BvhZXxd6382x9x43Ql4xSZP74vECNgXo6OyNQksCZZ1+od83nP4n7EEFxEQFDxoOWlT22R94grfSF6fZxx9xj4DbVoT5ZkcqGP0tROaWxhs8x8RAD1YIOX9UffO51r8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=NwRa5hgt; arc=none smtp.client-ip=209.85.222.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none 
smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="NwRa5hgt" Received: by mail-qk1-f178.google.com with SMTP id af79cd13be357-78a2093cd44so382116285a.0 for ; Mon, 25 Mar 2024 10:24:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1711387483; x=1711992283; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=z/pBLXp08H8YK1vw5fo90npRZRIsLzp8EivT+tlHGKs=; b=NwRa5hgt8cTIrEnS7AzmwaZbQBytcuhTYcWSyX0MUstQntMToUzd5ILgWatgCUupWO l5WQBX1xnO9sEdfDIX66r/P3mB1roMR27LBVXDkXMPWA9G7JgWbQ63ib/ThylueYqMnD bxswM2oUMOmvOfLwM07HiFTPNEdDy2/Oj8+uBO4fBGJ0tcDmiprich2JlrAju/oU9ZVl C8pP8qTMs/ByCaWRfj8PWcP9AtxwvpP95iP52QgpnVnPt98Ggi1z7xdjgqGWlJT7iXQe 9SrHes3tmkqitKiqqlD16VoE8/1/EtiTz5hgoWmV69kshIAZ7wATFGsrmmgXS8ZRZSeB cBPw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711387483; x=1711992283; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=z/pBLXp08H8YK1vw5fo90npRZRIsLzp8EivT+tlHGKs=; b=Dw4+irr6wQopEo5F2hX9o3NiM//QDsHGIN+yeNOgG2HJvyWPa69Ui1fdOi+X69ntwy QBAZxIKfMvSjFWjmQLmcqceUALEqyyMteFvDt6dKLyKUG9EqRTjbqIcIaoeHls2kNrqy 5hlsn+X/j6qduHfW8F/cfhbyFT4jL9h2Sbw/KkOsyR7/fuKyY0/6uMxdV5pTDQ33JLeP Io1j8Jfz/QUKBsvuZphPoWHS6z2IUPZuBq1rQ9Ekwa6basq086ZYWCxo34CN3CHt45HP Xj8/RxgozEedfbd0/bm0UwfAuKE5gzYsyPZmwnm02R7OdYNN6hwueQkO8qRz21AoOS9s GiVw== X-Gm-Message-State: AOJu0Ywv0qxgBOSMymIALP2ukKp9LnBraAlm1/zZliM2cbI8+GlHYw8i zE39P9FHQ6TxqYTaOJvFIDjoTWzBIJAeZddGAV2PMlfX9CFtxVS387dT974dGbECKbVVXPOf3d4 7UJE= X-Google-Smtp-Source: AGHT+IHj4mqin96IcRX/rEnhA4aoYwUiuS6DGbWyoUSgsrGr50VDuVYwZTPTsxJerrSRh52M3yYdFw== X-Received: by 
2002:a05:6214:20c1:b0:691:3b87:1382 with SMTP id 1-20020a05621420c100b006913b871382mr9166258qve.26.1711387482831; Mon, 25 Mar 2024 10:24:42 -0700 (PDT)
Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id r14-20020a0cf80e000000b0069677500d0bsm2990581qvn.29.2024.03.25.10.24.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Mar 2024 10:24:42 -0700 (PDT)
Date: Mon, 25 Mar 2024 13:24:41 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Junio C Hamano
Subject: [PATCH 08/11] midx-write.c: avoid directly managed temporary strbuf
Message-ID: <8e32755c492d20eec02c81351d249ce34cc6d7b9.1711387439.git.me@ttaylorr.com>
References:
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

In midx-write.c::midx_repack(), we construct the command-line arguments
for a pack-objects invocation which will combine objects from the packs
below our `--batch-size` option.

To construct the base name of the output pack, we use a temporary
strbuf, and then push the result of that onto the strvec which holds the
command-line arguments, after which point we release the strbuf.

We could replace this by doing something like:

    struct strbuf buf = STRBUF_INIT;
    strbuf_addf(&buf, "%s/pack/pack", object_dir);
    strvec_push_nodup(&cmd.args, strbuf_detach(&buf));

(combining the two separate `strbuf_addstr()` calls into a single
`strbuf_addf()`). But that is more or less an open-coded version of
strvec_pushf(), which we could use directly instead.

(Note that at the time this code was written back in ce1e4a105b4 (midx:
implement midx_repack(), 2019-06-10), strvec did not yet exist, so the
above example would have replaced the last line with:

    argv_array_push_nodup(&cmd.args, strbuf_detach(&buf));

, but the code is otherwise unchanged).

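For illustration only, and not part of this patch: the sketch below shows why a printf-style push helper removes the need for a caller-managed temporary buffer. `struct arg_list`, `arg_list_push_nodup()`, `arg_list_pushf()` and `arg_list_clear()` are hypothetical stand-ins invented for this example. They are not Git's strvec API, and nothing here claims to mirror how Git implements `strvec_pushf()`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an argument vector; this is NOT Git's strvec. */
struct arg_list {
	char **v;
	size_t nr, alloc;
};

/* Append a heap-allocated string, taking ownership of it. */
void arg_list_push_nodup(struct arg_list *a, char *s)
{
	assert(a && s);
	if (a->nr + 2 > a->alloc) { /* room for the element and the NULL */
		a->alloc = a->alloc ? a->alloc * 2 : 8;
		a->v = realloc(a->v, a->alloc * sizeof(*a->v));
		if (!a->v)
			abort();
	}
	a->v[a->nr++] = s;
	a->v[a->nr] = NULL; /* keep the vector NULL-terminated, argv-style */
}

/*
 * Format straight into a new element. The temporary string lives only
 * inside this helper, so callers have nothing to declare or release.
 */
const char *arg_list_pushf(struct arg_list *a, const char *fmt, ...)
{
	va_list ap, ap_copy;
	int len;
	char *s;

	assert(a && fmt);
	va_start(ap, fmt);
	va_copy(ap_copy, ap);
	len = vsnprintf(NULL, 0, fmt, ap_copy); /* measure first */
	va_end(ap_copy);
	if (len < 0)
		abort();

	s = malloc((size_t)len + 1);
	if (!s)
		abort();
	vsnprintf(s, (size_t)len + 1, fmt, ap); /* then format for real */
	va_end(ap);

	arg_list_push_nodup(a, s);
	return s;
}

/* Free every element and reset the list to its empty state. */
void arg_list_clear(struct arg_list *a)
{
	size_t i;

	for (i = 0; i < a->nr; i++)
		free(a->v[i]);
	free(a->v);
	a->v = NULL;
	a->nr = a->alloc = 0;
}
```

With a helper shaped like this, a call site collapses to a single statement such as `arg_list_pushf(&args, "%s/pack/pack", object_dir);`, which is the same simplification the patch obtains from `strvec_pushf()`.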
Avoid directly managing the temporary strbuf used to construct the base name for pack-objects' command-line arguments, and use the purpose-built `strvec_pushf()` instead.

Signed-off-by: Taylor Blau --- midx-write.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/midx-write.c b/midx-write.c index c812156cbd..89e325d08e 100644 --- a/midx-write.c +++ b/midx-write.c @@ -1446,7 +1446,6 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned char *include_pack; struct child_process cmd = CHILD_PROCESS_INIT; FILE *cmd_in; - struct strbuf base_name = STRBUF_INIT; struct multi_pack_index *m = lookup_multi_pack_index(r, object_dir); /* @@ -1473,10 +1472,6 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, strvec_push(&cmd.args, "pack-objects"); - strbuf_addstr(&base_name, object_dir); - strbuf_addstr(&base_name, "/pack/pack"); - strvec_push(&cmd.args, base_name.buf); - if (delta_base_offset) strvec_push(&cmd.args, "--delta-base-offset"); if (use_delta_islands) @@ -1487,7 +1482,7 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, else strvec_push(&cmd.args, "-q"); - strbuf_release(&base_name); + strvec_pushf(&cmd.args, "%s/pack/pack", object_dir); cmd.git_cmd = 1; cmd.in = cmd.out = -1; From patchwork Mon Mar 25 17:24:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13602561 Received: from mail-qv1-f49.google.com (mail-qv1-f49.google.com [209.85.219.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 21A7D4CB37 for ; Mon, 25 Mar 2024 17:24:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387488; cv=none; 
b=hWhi/ZoJsQfVFcnX6eU4Qv8aoyRXVsAEavVOfkyLkl0tF3Kmp58chwR84p5haTuP1CLeDn+8iqarhq1Ez40NocF9WXZZQc5iQclB238k/PMfOFzyIwDMDlt8iAH7JadugW+m7Qqe/Mcse40GFoIHkztn428eIaVMQTv5QVMy7D0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387488; c=relaxed/simple; bh=PMU5iJgBD7Yr6K5/+Me/Ylp9xR1hVgRjQc02+LFjTDQ=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=X4n1QRG/8OV3Lmk5cAILXyN06rJ3L8WVn0qCY/O9AyzTBq+mxiPaahnIJd84BOUttf623ho/mVuEeTSpNWOufzGNPEmTPHb2C/6kd6yDogyid/zDCEZSOZXxN+2WrjnopMf7P3fNTDtnBs0iBUNTX8ctqmbX631m7Q7qYvSSzX8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=MCq9aNTH; arc=none smtp.client-ip=209.85.219.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="MCq9aNTH" Received: by mail-qv1-f49.google.com with SMTP id 6a1803df08f44-6918781a913so39627016d6.3 for ; Mon, 25 Mar 2024 10:24:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1711387486; x=1711992286; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=1KemhrkDHYlcMEokb2iN8+B7Ei5uaoqtFQJBk96hTns=; b=MCq9aNTHIyQ9U0UH2B1uGwzzSleAKLBtXyg0OOlZ6Yddvi0q9mg86OP4iUQ8YCYMjM C+8ML9MWdyvyv2DvEzzAllELMXhpY6ahniIYt+plFJOX8EPRVZoYFovldwRdZctpatUV 
onj7kCaPX4BeEi+zfPu4rv7BvrifRb3Sf9fEWatmq3SilEBZOHVwqFpECadSWlgl/dqa de3Ax2GHKGldIeQhGqWgkFlwz+6CcUjoYQEbtu4Ul2KSCqUy7PQeAwv1deRrFMsgLHJJ 9StxCgArYuJhGqLLulN4+MRd4QqOZRRpLr06Xpn7LV/68nhpc1xzl2Jd/VJ2V8V+FLzw NRJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711387486; x=1711992286; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=1KemhrkDHYlcMEokb2iN8+B7Ei5uaoqtFQJBk96hTns=; b=ZrI4GoiL9p14Gz/hauy88M9ZV+7DuW0bUAsPT3twtPGQewuRru8ReOnJqkVTR0XpLu IfmKCzhWQA24FimC765j769dV/jktaul3N0FfHfn2Z2gQs8NBKFUsCxVqwB+fk93t7lh ffpqw819OskWtiyf6tnBwSQa4ksLkZtCflLomXNvkR+3R1CvDV+cYk3skRZWRv/yuTic N8pC+k3ArDdq6AeaLg1gvcs+CME6seGPE3/MQ4kqSrz0WvyDdBRFTpKVIytd8DEFbpAd 58Ckv4u1n8RdiqOzKDVfNlM9FAda97RZDkKZE7pXXzxwljVlCNf/EFI6DXjHU+Va0lfN XtUw== X-Gm-Message-State: AOJu0Yxu8FPjAcrbGEhWNi55Jt0aF/V8CTj+crCU+m1RAAmFB+SZklht IysRYpB/azk7sKkgMZQHDnd2tp3QbshR3TfxUKxBo5p87VuIunAemfWMBvl6GlWcZwjVKQlieJW 9bdo= X-Google-Smtp-Source: AGHT+IHWTPDkgNz74uNjCH1a36FZVJpyp+k4PQY+Sb6LFZrJ+YS8IX8CaOdI1NPtAw8x4wLnY3EvmA== X-Received: by 2002:a05:6214:1cc2:b0:690:b9a7:b4fb with SMTP id g2-20020a0562141cc200b00690b9a7b4fbmr7297355qvd.60.1711387485868; Mon, 25 Mar 2024 10:24:45 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id q7-20020ad45ca7000000b006961c9a2ed8sm4286551qvh.47.2024.03.25.10.24.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Mar 2024 10:24:45 -0700 (PDT) Date: Mon, 25 Mar 2024 13:24:44 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Junio C Hamano Subject: [PATCH 09/11] midx-write.c: factor out common want_included_pack() routine Message-ID: <5475b09a7afc4d55a8e1a1a72f20fa9109447cec.1711387439.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To:

When performing a 'git multi-pack-index repack', the MIDX machinery tries to aggregate MIDX'd packs together either to (a) fill the given `--batch-size` argument, or (b) combine all packs together.

In either case (using the `midx-write.c::fill_included_packs_batch()` or `midx-write.c::fill_included_packs_all()` function, respectively), we evaluate whether or not we want to repack each MIDX'd pack, according to whether or not it is loadable, kept, cruft, or non-empty.

Both `fill_included_packs_` functions care about the same conditions, except that `fill_included_packs_batch()` also requires the pack to be non-empty.

We could extract two functions (say, `want_included_pack()` and a `_nonempty()` variant), but this is not necessary. For the case in `fill_included_packs_all()` which does not check the pack size, we add all of the pack's objects assuming that the pack meets all other criteria. But if the pack is empty in the first place, we add all of its zero objects, so whether or not we "accept" or "reject" it in the first place is irrelevant.

This change improves the readability of both `fill_included_packs_` functions.
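The argument that the non-empty check is harmless in the "all packs" case can be checked with a small model. The sketch below is self-contained and deliberately does not use Git's `struct packed_git`; `struct toy_pack`, `toy_want_pack()` and `toy_selected_objects()` are hypothetical names, and the `require_nonempty` switch exists only to compare the two variants side by side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fields the check reads. */
struct toy_pack {
	int loadable;          /* pack and its index could be opened */
	int kept;              /* pack is marked as kept */
	int cruft;             /* pack is a cruft pack */
	uint32_t num_objects;  /* number of objects stored in the pack */
};

/* Returns 1 if the pack should take part in the repack, 0 otherwise. */
static int toy_want_pack(const struct toy_pack *p, int pack_kept_objects,
			 int require_nonempty)
{
	assert(p);
	if (!p->loadable)
		return 0;
	if (!pack_kept_objects && p->kept)
		return 0;
	if (p->cruft)
		return 0;
	if (require_nonempty && !p->num_objects)
		return 0;
	return 1;
}

/* Total number of objects contributed by all accepted packs. */
static uint64_t toy_selected_objects(const struct toy_pack *packs, size_t nr,
				     int pack_kept_objects,
				     int require_nonempty)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		if (toy_want_pack(&packs[i], pack_kept_objects,
				  require_nonempty))
			total += packs[i].num_objects;
	return total;
}
```

In this model an empty pack contributes zero objects whether it is accepted or rejected, so `toy_selected_objects()` returns the same total with and without `require_nonempty`. Only the number of accepted packs can differ between the two variants.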
Signed-off-by: Taylor Blau --- midx-write.c | 32 ++++++++++++++++++++------------ 1 file changed, 20 insertions(+), 12 deletions(-) diff --git a/midx-write.c b/midx-write.c index 89e325d08e..2f0f5d133f 100644 --- a/midx-write.c +++ b/midx-write.c @@ -1349,6 +1349,24 @@ static int compare_by_mtime(const void *a_, const void *b_) return 0; } +static int want_included_pack(struct repository *r, + struct multi_pack_index *m, + int pack_kept_objects, + uint32_t pack_int_id) +{ + struct packed_git *p; + if (prepare_midx_pack(r, m, pack_int_id)) + return 0; + p = m->packs[pack_int_id]; + if (!pack_kept_objects && p->pack_keep) + return 0; + if (p->is_cruft) + return 0; + if (open_pack_index(p) || !p->num_objects) + return 0; + return 1; +} + static int fill_included_packs_all(struct repository *r, struct multi_pack_index *m, unsigned char *include_pack) @@ -1359,11 +1377,7 @@ static int fill_included_packs_all(struct repository *r, repo_config_get_bool(r, "repack.packkeptobjects", &pack_kept_objects); for (i = 0; i < m->num_packs; i++) { - if (prepare_midx_pack(r, m, i)) - continue; - if (!pack_kept_objects && m->packs[i]->pack_keep) - continue; - if (m->packs[i]->is_cruft) + if (!want_included_pack(r, m, pack_kept_objects, i)) continue; include_pack[i] = 1; @@ -1410,13 +1424,7 @@ static int fill_included_packs_batch(struct repository *r, struct packed_git *p = m->packs[pack_int_id]; size_t expected_size; - if (!p) - continue; - if (!pack_kept_objects && p->pack_keep) - continue; - if (p->is_cruft) - continue; - if (open_pack_index(p) || !p->num_objects) + if (!want_included_pack(r, m, pack_kept_objects, pack_int_id)) continue; expected_size = st_mult(p->pack_size, From patchwork Mon Mar 25 17:24:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13602562 Received: from mail-qv1-f45.google.com (mail-qv1-f45.google.com [209.85.219.45]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0A5964CB37 for ; Mon, 25 Mar 2024 17:24:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.45 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387491; cv=none; b=KE24spA4XudEPMmmXxBFzUEfXEa+Mm8FIbOftIX6jdU50o2zxLRNrxAeS1AFhJfrsOERJL8Iw7FjoICjTwXy/pFkh4pOaKqMoM41X/cT0HKTGt1XedPX16iESYo254WqUs06k1vT++y9wiPAI2Ngc3FHtXKNuG0R7QIc0CeRByE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387491; c=relaxed/simple; bh=xg0PruKaEaEqkQjmTcDFrpBPTXLAiwvV+oS5nTC5gOM=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=WTmPxGFW+RjdN37lZfEFQxWZ/TiywabO8kBJVieX0jz2EIx6v9M5HPFMvlhaPBQRO6ZUPFiqRhBXtR3PjNqylJVswJQuHhEz7banb1RjcO6dvhrk6leVxdkBobzK4LD5Jp+r3yAYZbU7q3/A7OOjh9JPN3CeooLvc9wTYgKKzFk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=gt/LMilv; arc=none smtp.client-ip=209.85.219.45 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="gt/LMilv" Received: by mail-qv1-f45.google.com with SMTP id 6a1803df08f44-6918781a913so39627216d6.3 for ; Mon, 25 Mar 2024 10:24:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1711387489; x=1711992289; darn=vger.kernel.org; 
h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=S0zHbQ26JjkQbqKgeGn4MZMpeIOwh8ponXQG3Zeqq/Y=; b=gt/LMilv82Jv+ee2xTp8jKYSnLDiEppAxFnO9uI1Gpc3mIt5pnixJ12zQmNIEoitEP 0NlbRhLfbH+lOkzhFF70ss9HIRW/pklIrJDQv54WK+V7B647EScklUGbLL0h5WbyuybE HphbE5sw54jlcW/LStOOqc403Aq9aa7QepNZjFwPF5h4mWmjsybGYyGo4DFGwJd46YJT cHO8RmvT9H6OAuCBYoQJ+b1rFD2H+xvN4SR+3hYS5oi6IhFEsaBlOusPkQwuNU3ehmNs 8a6k0lVRnaJYAAJBtGDuykrFhgd3r10RSe3q/4szSyERkyXQIcH8ypdfXIsuBhYHULCC h7sg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711387489; x=1711992289; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=S0zHbQ26JjkQbqKgeGn4MZMpeIOwh8ponXQG3Zeqq/Y=; b=PdSfvRTgaPqPsu4h7Ij2fGR1eavgH5CzjyrdQ12Ah7v3Kx/e1dFu5bEQH8qonSFPm7 Nq2IcVVXUVZa72xrbUZ3+hXK5dVCYt3sI0ZezaN9QAccXfe9NfFMFBGzNSqg5VDu0h1P P8tuaYJLtxr9X3ZwOMQo3FTiJoBn2RRezdG5v9WrloZX7iAkx3o27yQknntMZ8rS+XJG 3mB/KATpLMwDnew6+wwUfGSylKuVMxXWE5kO4y5Dxnd8wXmJUTCgziwaiE9mdXjBI7z8 OlStnKOz+b1Ri7q1XINsgo/GkvMz6OtZOhhAmncoJV6xbTEzcCbP3um0emg6iWJ8hgmn pmQA== X-Gm-Message-State: AOJu0YxfkD70G5n6zH8QCVoHxooIhqGYHn1X/i8zcuUaglVoVjcfTT4Z Rv1QeNkBgNs/zxNjROyrKLbGzfYozxUAkpBMEL+GN0kO3dWw/U+RckItYsg2oIi7N6WX3o4ZC5d xkTA= X-Google-Smtp-Source: AGHT+IG8nRqVmTICnyaAA/UJK0IJWHmsbAEoQQQmlPKwxooFxY8wFidcdAnCOl1WrpLuXOzw7kTZzg== X-Received: by 2002:a05:6214:400a:b0:696:8b32:63f with SMTP id kd10-20020a056214400a00b006968b32063fmr4985058qvb.37.1711387488831; Mon, 25 Mar 2024 10:24:48 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id jl7-20020ad45e87000000b0068f35e9e9a2sm4303331qvb.8.2024.03.25.10.24.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Mar 2024 10:24:48 -0700 (PDT) Date: Mon, 25 Mar 2024 13:24:47 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Junio C Hamano Subject: [PATCH 10/11] midx-write.c: check count of packs to repack after grouping Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To:

In both fill_included_packs_all() and fill_included_packs_batch(), we accumulate a list of packs whose contents we want to repack together, and then use that information to feed a list of objects as input to pack-objects.

In both cases, the `fill_included_packs_` functions keep track of how many packs they want to repack together, and pack-objects is only executed if there are at least two packs that need repacking.

Having both of these functions keep track of this information themselves is not strictly necessary, since they also record which packs to repack in the `include_pack` array. We can simply count the non-zero entries in that array after either function is done executing, reducing the overall amount of code necessary.
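The counting step described above can be sketched in isolation as follows. This is a minimal, self-contained illustration rather than the code in midx-write.c; `count_included()` and `worth_repacking()` are hypothetical helper names, and `include_pack` is modeled as a plain byte array indexed by pack ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the packs marked for inclusion (non-zero entries). */
static uint32_t count_included(const unsigned char *include_pack,
			       uint32_t num_packs)
{
	uint32_t i, count = 0;

	assert(include_pack || !num_packs);
	for (i = 0; i < num_packs; i++)
		if (include_pack[i])
			count++;
	return count;
}

/* Repacking only makes sense when at least two packs would be combined. */
static int worth_repacking(const unsigned char *include_pack,
			   uint32_t num_packs)
{
	return count_included(include_pack, num_packs) >= 2;
}
```

Because the array is the single source of truth, the selection functions no longer need to return a status, and the threshold check lives in one place in the caller.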
Signed-off-by: Taylor Blau --- midx-write.c | 44 ++++++++++++++++++++------------------------ 1 file changed, 20 insertions(+), 24 deletions(-) diff --git a/midx-write.c b/midx-write.c index 2f0f5d133f..4f1d649aa6 100644 --- a/midx-write.c +++ b/midx-write.c @@ -1367,11 +1367,11 @@ static int want_included_pack(struct repository *r, return 1; } -static int fill_included_packs_all(struct repository *r, - struct multi_pack_index *m, - unsigned char *include_pack) +static void fill_included_packs_all(struct repository *r, + struct multi_pack_index *m, + unsigned char *include_pack) { - uint32_t i, count = 0; + uint32_t i; int pack_kept_objects = 0; repo_config_get_bool(r, "repack.packkeptobjects", &pack_kept_objects); @@ -1381,18 +1381,15 @@ static int fill_included_packs_all(struct repository *r, continue; include_pack[i] = 1; - count++; } - - return count < 2; } -static int fill_included_packs_batch(struct repository *r, - struct multi_pack_index *m, - unsigned char *include_pack, - size_t batch_size) +static void fill_included_packs_batch(struct repository *r, + struct multi_pack_index *m, + unsigned char *include_pack, + size_t batch_size) { - uint32_t i, packs_to_repack; + uint32_t i; size_t total_size; struct repack_info *pack_info; int pack_kept_objects = 0; @@ -1418,7 +1415,6 @@ static int fill_included_packs_batch(struct repository *r, QSORT(pack_info, m->num_packs, compare_by_mtime); total_size = 0; - packs_to_repack = 0; for (i = 0; total_size < batch_size && i < m->num_packs; i++) { int pack_int_id = pack_info[i].pack_int_id; struct packed_git *p = m->packs[pack_int_id]; @@ -1434,23 +1430,17 @@ static int fill_included_packs_batch(struct repository *r, if (expected_size >= batch_size) continue; - packs_to_repack++; total_size += expected_size; include_pack[pack_int_id] = 1; } free(pack_info); - - if (packs_to_repack < 2) - return 1; - - return 0; } int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags) { int result 
= 0; - uint32_t i; + uint32_t i, packs_to_repack = 0; unsigned char *include_pack; struct child_process cmd = CHILD_PROCESS_INIT; FILE *cmd_in; @@ -1469,10 +1459,16 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, CALLOC_ARRAY(include_pack, m->num_packs); - if (batch_size) { - if (fill_included_packs_batch(r, m, include_pack, batch_size)) - goto cleanup; - } else if (fill_included_packs_all(r, m, include_pack)) + if (batch_size) + fill_included_packs_batch(r, m, include_pack, batch_size); + else + fill_included_packs_all(r, m, include_pack); + + for (i = 0; i < m->num_packs; i++) { + if (include_pack[i]) + packs_to_repack++; + } + if (packs_to_repack <= 1) goto cleanup; repo_config_get_bool(r, "repack.usedeltabaseoffset", &delta_base_offset); From patchwork Mon Mar 25 17:24:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13602563 Received: from mail-qk1-f181.google.com (mail-qk1-f181.google.com [209.85.222.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D8A4A1448CB for ; Mon, 25 Mar 2024 17:24:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387494; cv=none; b=RS826Us78v93fLyt8EG0Gk0bV+kXwTwMwa97Vj3U1c68zsuy6BNfvfRGFIvjPx2R6uHPVQlnoUY4A7vVV1KrSo+gB3NDfvJS8DYiMskwwyGW7jND+OOe9Bu7jE2BZR96uVy63I+FADonhioFNV/o9C5hAYlYENub987m2Abz8Hs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711387494; c=relaxed/simple; bh=KM3C5PDylPZro2XYnIjfoxNQGrNjai2tiJPhElUxg+Q=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; 
b=gEBlu75yTLhjvHmhFCaIrjvWBM4piT2W6VVt+WhqKpGGK40gyJRMSp6lOyu3JT249k5BWKf6ZI6dz8Wt3MPYoYz5oMfNGRlI8KvI5XccGDhHumwd2K+z2WTl5RiBh4X7DK5tifYf6sFDGMo/urhuMtLZ+eHkYwuRwGrGjc+wAvQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=Mf/oc+v1; arc=none smtp.client-ip=209.85.222.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="Mf/oc+v1" Received: by mail-qk1-f181.google.com with SMTP id af79cd13be357-789f1b59a28so302298985a.3 for ; Mon, 25 Mar 2024 10:24:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1711387492; x=1711992292; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=KZoe6v6JANoWwVQsnBR8xjoHp7tsRd/vEy+SzRgfSNE=; b=Mf/oc+v1ZCTsG8hvDAG1iXYFCHf/uKtP+i3JaIzppsValYIVWVElp5nEXrhhCmgBWU Ljz8Dlm8DR7tfT6296BxDTUmwbii0NNCD0fb20zVu6dqnYQOVq0TZ90qBgdXQjicYQW9 vz4Z5IGoGzxHUO/ErDLzHi3Q/KNl0B7q4SYP7col5H5KdzwTDctr2JpuGgQarhbQCRBJ +f5y2c7l524ijhB6UgIjW5UsM5HBUOYhVXqosUXFpo0cOXya/4ymtoSa6i8x0G9s47Tw JjKZNZHuvs9w+8jSW/KCu/W3I1yPY+HeS2NIqJAl+rDgoanX/YH1971+XrNU+txD56Ja On3A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711387492; x=1711992292; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; 
bh=KZoe6v6JANoWwVQsnBR8xjoHp7tsRd/vEy+SzRgfSNE=; b=UxWiHp08YiiBKrK4EJdEXBlB2obSF9cHAU9Zgq3bcjifcgYmi0WPvMRPyYZLZoOfNw vDxtj5zRIh7mNroNXrTS5c/LMd6nXG+EAc71iaOLeJjUScQ5ZMSpw1NvHUVg4OTh5xDt dQqhcWFQoBJLM8dCaN+JggR4D9W9Lgow0r3IwuSzFbWbthFwLlVe3NMdTnhQ7S86T2Cz 2xbGYP6KXW6E70xZ7w75CqvGPSTkzOFYuGCbZR/K/M47zeuZZGREecCk1U9ZjgUB7hCi WljVVCZ0Iu5BqmQUBDmIK8RKwA835lq/Yh7yvVkFMOIkkcT2r7w8DzUJTVwx8+f4Myr5 RRLA== X-Gm-Message-State: AOJu0Yw3ki4hvjDnk5wYyG3eWvPfYjRFdb9xRcK67E97+z9f3Kq8YWet Y3Mccf22nIWd1OHlh0OwZ7nvBClyoh3imr7/lD8zpvL3vcXRy/NccHDTjKV7KorcRiOjkayZupf 4tWo= X-Google-Smtp-Source: AGHT+IGkV3qVUnZSLRIawD6t1DpVou/iqa1QWCcYVJ+AKhw/ZUvW1k5CQzsewDPxf9f0VeOFR9lBMw== X-Received: by 2002:a05:620a:125b:b0:78a:3509:ccdf with SMTP id a27-20020a05620a125b00b0078a3509ccdfmr7723622qkl.65.1711387491738; Mon, 25 Mar 2024 10:24:51 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id b17-20020a05620a04f100b00789f3c50914sm2300025qkh.33.2024.03.25.10.24.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Mar 2024 10:24:51 -0700 (PDT) Date: Mon, 25 Mar 2024 13:24:50 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Junio C Hamano Subject: [PATCH 11/11] midx-write.c: use `--stdin-packs` when repacking Message-ID: <736be63234baf7fc6df8259d9bb7298858b2bc74.1711387439.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When constructing a new pack `git multi-pack-index repack` provides a list of objects which is the union of objects in all MIDX'd packs which were "included" in the repack. Though correct, this typically yields a poorly structured pack, since providing the objects list over stdin does not give pack-objects a chance to discover the namehash values for each object, leading to sub-optimal delta selection. 
We can use `--stdin-packs` instead, which has a couple of benefits:

  - it does a supplemental walk over objects in the supplied list of packs to discover their namehash, leading to higher-quality delta selection

  - it requires us to list far less data over stdin; instead of listing each object in the resulting pack, we need only list the constituent packs from which those objects were selected in the MIDX

Of course, this comes at a slight cost: though we save time on listing packs versus objects over stdin[^1] (roughly 650 milliseconds), we add a non-trivial amount of time walking over the given objects in order to find better deltas.

In general, this is likely to more closely match the user's expectations (i.e. that packs generated via `git multi-pack-index repack` are written with high-quality deltas). But if not, we can always introduce a new option in pack-objects to disable the supplemental object walk, which would yield a pure CPU-time savings, at the cost of a larger on-disk size for the resulting pack.

[^1]: In a patched version of Git that doesn't perform the supplemental object walk in `pack-objects --stdin-packs`, we save roughly 650ms (from 5.968 to 5.325 seconds) when running `git multi-pack-index repack --batch-size=0` on git.git with all objects packed, and all packs in a MIDX.
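To make the size difference concrete, the input handed to `pack-objects --stdin-packs` is one line per pack: the pack's basename for an included pack, and the same name prefixed with `^` for an excluded one. The sketch below formats such a list into a buffer. It is self-contained and does not call into Git; `format_stdin_packs()` is a hypothetical helper, and the pack names used with it are made-up placeholders.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write one line per pack into buf: "NAME\n" for an included pack and
 * "^NAME\n" for an excluded one. Entries whose name is NULL are skipped.
 * Returns the number of bytes written, or 0 if buf was too small.
 */
static size_t format_stdin_packs(char *buf, size_t size,
				 const char *const *names,
				 const unsigned char *include_pack,
				 size_t nr)
{
	size_t i, used = 0;

	assert(buf && size);
	buf[0] = '\0';
	for (i = 0; i < nr; i++) {
		int n;

		if (!names[i])
			continue;
		n = snprintf(buf + used, size - used, "%s%s\n",
			     include_pack[i] ? "" : "^", names[i]);
		if (n < 0 || (size_t)n >= size - used)
			return 0; /* output did not fit */
		used += (size_t)n;
	}
	return used;
}
```

The output grows with the number of packs in the MIDX rather than with the number of objects, which is where the saving on stdin traffic comes from.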
Signed-off-by: Taylor Blau --- midx-write.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/midx-write.c b/midx-write.c index 4f1d649aa6..d341b9c628 100644 --- a/midx-write.c +++ b/midx-write.c @@ -1474,7 +1474,8 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, repo_config_get_bool(r, "repack.usedeltabaseoffset", &delta_base_offset); repo_config_get_bool(r, "repack.usedeltaislands", &use_delta_islands); - strvec_push(&cmd.args, "pack-objects"); + strvec_pushl(&cmd.args, "pack-objects", "--stdin-packs", "--non-empty", + NULL); if (delta_base_offset) strvec_push(&cmd.args, "--delta-base-offset"); @@ -1498,16 +1499,15 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, } cmd_in = xfdopen(cmd.in, "w"); - - for (i = 0; i < m->num_objects; i++) { - struct object_id oid; - uint32_t pack_int_id = nth_midxed_pack_int_id(m, i); - - if (!include_pack[pack_int_id]) + for (i = 0; i < m->num_packs; i++) { + struct packed_git *p = m->packs[i]; + if (!p) continue; - nth_midxed_object_oid(&oid, m, i); - fprintf(cmd_in, "%s\n", oid_to_hex(&oid)); + if (include_pack[i]) + fprintf(cmd_in, "%s\n", pack_basename(p)); + else + fprintf(cmd_in, "^%s\n", pack_basename(p)); } fclose(cmd_in);