From patchwork Tue Jun 11 17:28:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13694041 Received: from mail-ot1-f53.google.com (mail-ot1-f53.google.com [209.85.210.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A4A334CB55 for ; Tue, 11 Jun 2024 17:28:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.53 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718126902; cv=none; b=TVlP+BguLIEXPFTHdQzXld5NIzFkaJEWXfyorEENFZypKREMGLLh9nsK91RMGgA93wyZhXGPwv+vOx6wpZ9+LlmYpMVyRNBWhXoVgl6s9hp6QRnPlDEzVyHb+aSjz3mRLDg19rjvMpnAiR9XxKA0WAe4QHKFT8e7PfvJzjP4L1g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718126902; c=relaxed/simple; bh=2KRhPwawgmlftB/ce6/nKHSqtvEfkQqoef/QWxwQCcI=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=a+p/UcIiUnqzLYZTYIoQMM4chwAB/hCfATdvRJqn8Am77Z2I2BDnPx+UUC6oGMpp6PdFglapg4Ft8ue+XPsgg1ZiSw/8wC43m2pNMPh4Tkb6rA5ov/H27trzCQK4QiVp+z8CqdcL7uWqGJo4EU1Ek/fCge2suqzBQb6Zdvk4FF8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=wdSI/Md5; arc=none smtp.client-ip=209.85.210.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="wdSI/Md5" Received: by mail-ot1-f53.google.com with SMTP id 46e09a7af769-6fa11ac8695so89235a34.3 for ; Tue, 11 Jun 2024 10:28:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1718126899; x=1718731699; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=t9hIeBsMNyWNxTNy4LnKQXiMVwos9XS2fWmbam8J6f0=; b=wdSI/Md5v5SPfZuq2ugYkXy01nYixYFwRM/tQgNquvrMVz1AIjEjJUDwRU3+tkeRhT ZIccn6wQA9+Lh8UfS6pXBV3QaY2ptu+5osa3oqvq6XaK3uP4luJFVcfyaF7GYCIwPnR3 vMBzeBMaaHSm+xZ0SA9natCu76Uth+veaN6ntYMAELetsKsY/696An0xdraysaDdWQid Wlu0NAcgH/U/z9ABDso7ZJh1NKL3S0sIWdYoZ7Ef4MOpxZqMWDFOJ/7Twf6iUIJ7yv7Q rnkW2n0vIsGi2EQq9dARoBIF4jRXHTSQQtCWj122OZ/f6AvxxLh91Z1Kg192/WSgaTmP ahAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718126899; x=1718731699; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=t9hIeBsMNyWNxTNy4LnKQXiMVwos9XS2fWmbam8J6f0=; b=IBrIfipAWbUpujN9uSLZhyBqiAAeW/g7lfzn4pEnUeH9NRf0ra7yn8hOqbWvFQKzTZ z+4TkOVHzswUZbF65IoBCmKTChfUn8mKstG0lFZGwICPOo0XJQNOK9tzQqRlStp/DjwK B90pj3PO4rxjW7Mu3+/LsNT3kMxIgdO8TRdUH13VOrS/Mbxctmn3BJUZUvb4msWixKhx Onhjo1KglU8JMWBVpCYH6J4g2hYtzKf68wKJRra2+KJnKaBEAfO9ye284ajz+00C8O2l XCHGJPyfKyRUS10Hx8PIGrmohK6Ct6CgqnvSC0wRTyeySufcjv1kP34NHpQtG1vPLpa3 hgIw== X-Gm-Message-State: AOJu0YwQSy3jD/ythnb9c6SPA519qSceS+f/Exuvkpy3Ilbtd17oo4zY gtXUdYJTJsc7lEg+L1vLRHSiBRjn27t6aJX1aDrag3Jw+EA8U2Zg+uOwR1Jk2OmPd7ZA0dthzU3 LXhk= X-Google-Smtp-Source: AGHT+IErwF9dRdW1PJTFQuA3BQFAJ2a7/yJgKCC0V4iA1iW7iDeAmaxBIBon9sBvWm0BcOtCOVcFkg== X-Received: by 2002:a9d:7383:0:b0:6f9:5bfc:81f0 with SMTP id 46e09a7af769-6f95bfc9347mr11220832a34.31.1718126899228; Tue, 11 Jun 2024 10:28:19 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-795332e1cddsm527937985a.132.2024.06.11.10.28.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jun 2024 10:28:18 -0700 (PDT) Date: Tue, 11 Jun 2024 13:28:17 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Junio C Hamano , Kyle Lippincott , Patrick Steinhardt Subject: [PATCH v2 1/3] midx-write.c: do not read existing MIDX with `packs_to_include` Message-ID: <0bdf925366daea7ed229883272da79dbcf7023b3.1718126886.git.me@ttaylorr.com> References: <4aceb9233ed24fb1e1a324a77b665eea2cf22b39.1717946847.git.me@ttaylorr.com> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Commit d6a8c58675 (midx-write.c: support reading an existing MIDX with `packs_to_include`, 2024-05-29) changed the MIDX generation machinery to support reading from an existing MIDX when writing a new one. Unfortunately, the rest of the MIDX generation machinery is not prepared to deal with such a change. For instance, the function responsible for adding to the object ID fanout table from a MIDX source (midx_fanout_add_midx_fanout()) will gladly add objects from an existing MIDX for some fanout level regardless of whether or not those objects came from packs that are to be included in the subsequent MIDX write. This results in broken pseudo-pack object order (leading to incorrect object traversal results) and segmentation faults, like so (generated by running the added test prior to the changes in midx-write.c): #0 0x000055ee31393f47 in midx_pack_order (ctx=0x7ffdde205c70) at midx-write.c:590 #1 0x000055ee31395a69 in write_midx_internal (object_dir=0x55ee32570440 ".git/objects", packs_to_include=0x7ffdde205e20, packs_to_drop=0x0, preferred_pack_name=0x0, refs_snapshot=0x0, flags=15) at midx-write.c:1171 #2 0x000055ee31395f38 in write_midx_file_only (object_dir=0x55ee32570440 ".git/objects", packs_to_include=0x7ffdde205e20, preferred_pack_name=0x0, refs_snapshot=0x0, flags=15) at midx-write.c:1274 [...] In stack frame #0, the code on midx-write.c:590 is using the new pack ID corresponding to some object which was added from the existing MIDX. Importantly, the pack from which that object was selected in the existing MIDX does not appear in the new MIDX as it was excluded via `--stdin-packs`. In this instance, the pack in question had pack ID "1" in the existing MIDX, but since it was excluded from the new MIDX, we never filled in that entry in the pack_perm table, resulting in: (gdb) p *ctx->pack_perm@2 $1 = {0, 1515870810} Which is what causes the segfault above when we try and read: struct pack_info *pack = &ctx->info[ctx->pack_perm[i]]; if (pack->bitmap_pos == BITMAP_POS_UNKNOWN) pack->bitmap_pos = 0; Fundamentally, we should be able to read information from an existing MIDX when generating a new one. But in practice the midx-write.c code assumes that we won't run into issues like the above with incongruent pack IDs, and often makes those assumptions in extremely subtle and fragile ways. Instead, let's avoid reading from an existing MIDX altogether, and stick with the pre-d6a8c58675 implementation. Harden against any regressions in this area by adding a test which demonstrates these issues. Signed-off-by: Taylor Blau --- midx-write.c | 42 ++++++++++++++++++++++++++--------- t/t5326-multi-pack-bitmaps.sh | 30 +++++++++++++++++++++++++ 2 files changed, 61 insertions(+), 11 deletions(-) diff --git a/midx-write.c b/midx-write.c index 55a6b63bac..0125aa4dc9 100644 --- a/midx-write.c +++ b/midx-write.c @@ -101,13 +101,27 @@ struct write_midx_context { }; static int should_include_pack(const struct write_midx_context *ctx, - const char *file_name, - int exclude_from_midx) + const char *file_name) { - if (exclude_from_midx && ctx->m && midx_contains_pack(ctx->m, file_name)) + /* + * Note that at most one of ctx->m and ctx->to_include are set, + * so we are testing midx_contains_pack() and + * string_list_has_string() independently (guarded by the + * appropriate NULL checks). + * + * We could support passing to_include while reusing an existing + * MIDX, but don't currently since the reuse process drags + * forward all packs from an existing MIDX (without checking + * whether or not they appear in the to_include list). + * + * If we added support for that, these next two conditional + * should be performed independently (likely checking + * to_include before the existing MIDX). + */ + if (ctx->m && midx_contains_pack(ctx->m, file_name)) return 0; - if (ctx->to_include && !string_list_has_string(ctx->to_include, - file_name)) + else if (ctx->to_include && + !string_list_has_string(ctx->to_include, file_name)) return 0; return 1; } @@ -121,7 +135,7 @@ static void add_pack_to_midx(const char *full_path, size_t full_path_len, if (ends_with(file_name, ".idx")) { display_progress(ctx->progress, ++ctx->pack_paths_checked); - if (!should_include_pack(ctx, file_name, 1)) + if (!should_include_pack(ctx, file_name)) return; ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc); @@ -880,9 +894,6 @@ static int fill_packs_from_midx(struct write_midx_context *ctx, uint32_t i; for (i = 0; i < ctx->m->num_packs; i++) { - if (!should_include_pack(ctx, ctx->m->pack_names[i], 0)) - continue; - ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc); if (flags & MIDX_WRITE_REV_INDEX || preferred_pack_name) { @@ -937,7 +948,15 @@ static int write_midx_internal(const char *object_dir, die_errno(_("unable to create leading directories of %s"), midx_name.buf); - ctx.m = lookup_multi_pack_index(the_repository, object_dir); + if (!packs_to_include) { + /* + * Only reference an existing MIDX when not filtering which + * packs to include, since all packs and objects are copied + * blindly from an existing MIDX if one is present. + */ + ctx.m = lookup_multi_pack_index(the_repository, object_dir); + } + if (ctx.m && !midx_checksum_valid(ctx.m)) { warning(_("ignoring existing multi-pack-index; checksum mismatch")); ctx.m = NULL; @@ -946,7 +965,6 @@ static int write_midx_internal(const char *object_dir, ctx.nr = 0; ctx.alloc = ctx.m ? ctx.m->num_packs : 16; ctx.info = NULL; - ctx.to_include = packs_to_include; ALLOC_ARRAY(ctx.info, ctx.alloc); if (ctx.m && fill_packs_from_midx(&ctx, preferred_pack_name, @@ -963,6 +981,8 @@ static int write_midx_internal(const char *object_dir, else ctx.progress = NULL; + ctx.to_include = packs_to_include; + for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &ctx); stop_progress(&ctx.progress); diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh index cc7220b6c0..916da389b6 100755 --- a/t/t5326-multi-pack-bitmaps.sh +++ b/t/t5326-multi-pack-bitmaps.sh @@ -551,4 +551,34 @@ do ' done +test_expect_success 'remove one packfile between MIDX bitmap writes' ' + git init remove-pack-between-writes && + ( + cd remove-pack-between-writes && + + test_commit A && + test_commit B && + test_commit C && + + # Create packs with the prefix "pack-A", "pack-B", + # "pack-C" to impose a lexicographic order on these + # packs so the pack being removed is always from the + # middle. + packdir=.git/objects/pack && + A="$(echo A | git pack-objects $packdir/pack-A --revs)" && + B="$(echo B | git pack-objects $packdir/pack-B --revs)" && + C="$(echo C | git pack-objects $packdir/pack-C --revs)" && + + git multi-pack-index write --bitmap && + + cat >in <<-EOF && + pack-A-$A.idx + pack-C-$C.idx + EOF + git multi-pack-index write --bitmap --stdin-packs X-Patchwork-Id: 13694042 Received: from mail-oi1-f180.google.com (mail-oi1-f180.google.com [209.85.167.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E0D350288 for ; Tue, 11 Jun 2024 17:28:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718126905; cv=none; b=oAyPZHLXlkBzb8VQFA5TXWTzKzKVrfyqpMQTJ+QuPfYv5piDJiqsRV7cspeHLt4Sm+ei+ZumUAN8R50OIZFYHtBepanWbzZLzYZRQy3H70FB6UYLDeKAegsY63STIaX56m3/nUjT7qNyi1bpc/rpFodRKVEBL+97hX7NocGUXmw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718126905; c=relaxed/simple; bh=PhmIDP5ZiYDYtXxjJf3gpifdPLPgLxwKlFBfQEaedJ4=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=H0VTBNoGa41bj5Ht1MA21V3Jcu7I482XFWCJsOftmmSJHh+PbvAVSo3+gQuk3rcnakSUOLum/NmY7Saw7ZOQ07HIUkk6khS+dd3eU2GILstvPkJcZbcTjby7/D60rpOHEdYwfEyxmViH+aUBZbWghPfB2XVyyJ+rDJO/dC3VNRs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=sF5qefpw; arc=none smtp.client-ip=209.85.167.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="sF5qefpw" Received: by mail-oi1-f180.google.com with SMTP id 5614622812f47-3c9d70d93dbso803464b6e.3 for ; Tue, 11 Jun 2024 10:28:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1718126902; x=1718731702; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=8YcLbpeKjhu5qvhoyBdbNqko79CkOovb0dIE/2i/IVE=; b=sF5qefpw7sBHQcxvX3Izi40EK7zXg8nvmTAnZU2LjGNHG+KiOYycvYpyj11uV1j1kX xhIoi7KCtp5gFZRUjLKtQ3epSG3YYysjpj9e5IthwDSjwZas08K+yhHrOXyT+q2X1EGK G5vLCbsddaD+xag335KE4u1MlCIMS8wcBLp5iwbLOBsKp928KCpu/jer20DVrfzN6WzV SIl21/kkZZLKvwLjnujcU0u/qkn2qeDErsy27QmEckqxp+AadP0JFslg0hzsa14gHmBJ 9C3VzOLxITZJCRnZW+O0pEcd9OnJ3UGjl909zF7OSsMhS3jM42yLxEgttWVIGSQDQ4uR NXDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718126902; x=1718731702; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=8YcLbpeKjhu5qvhoyBdbNqko79CkOovb0dIE/2i/IVE=; b=lz4DbB7YuOxxG9n8l4Thg8mufDqUoKThuVQz9Xc6hqhR/SkkcKGT2K2kKoRHmqCOHJ oVWPYBF9qMCkxp/lqfCnWzKTFZHpbJMAOydlV8//s+fDafaSD78IxS26tAfjUiAdzIOH lvp0XuCdDDUSNh73ER0WR2yXkM416UzUfReKEnwmKHjCL/iwRVucShJMxdLJFPX4C5tO xj7gz3AEdo4rFfE6ljiTVTb4EWoJAieF/7CKFr7v3EyB5k7D70xL9fEp/09iFe44INuf 5o4zV2Mgs8gjBbC/QN6TCNBzVn8o6L4moZwmj96vjt8uxiEW/Lbcw0ZKX9Af0ZYe0WhS rCTQ== X-Gm-Message-State: AOJu0Yy8pOLU74RZC8EUHqxKwQ7l01YES/8q0yWwJYNBH/sCx0I7GALi fih9Gg+/UjxIhmbf8uwwla6EaqK78LZ0lZqG8xwLuY9bma2vbwr9Mof3iLctzc2I1GnDqmnNsHa L+P0= X-Google-Smtp-Source: AGHT+IEDT9G33o+fwWfpjUQQPN5o4bLwgqCck5iL/WVMhYbjFffzISzTpsHNqcQ3xTSk0aDU4gG0Zw== X-Received: by 2002:a05:6808:1307:b0:3d2:1ddf:862c with SMTP id 5614622812f47-3d21ddf86admr11428562b6e.2.1718126902404; Tue, 11 Jun 2024 10:28:22 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-7955e3451a6sm256195985a.128.2024.06.11.10.28.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jun 2024 10:28:22 -0700 (PDT) Date: Tue, 11 Jun 2024 13:28:20 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Junio C Hamano , Kyle Lippincott , Patrick Steinhardt Subject: [PATCH v2 2/3] pack-bitmap.c: avoid uninitialized `pack_int_id` during reuse Message-ID: <4b006f44a53fb8e0a27a89b78e43a5ef225cebb8.1718126886.git.me@ttaylorr.com> References: <4aceb9233ed24fb1e1a324a77b665eea2cf22b39.1717946847.git.me@ttaylorr.com> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When performing multi-pack reuse, reuse_partial_packfile_from_bitmap() is responsible for generating an array of bitmapped_pack structs from which to perform reuse. In the multi-pack case, we loop over the MIDXs packs and copy the result of calling `nth_bitmapped_pack()` to construct the list of reusable paths. But we may also want to do pack-reuse over a single pack, either because we only had one pack to perform reuse over (in the case of single-pack bitmaps), or because we explicitly asked to do single pack reuse even with a MIDX[^1]. When this is the case, the array we generate of reusable packs contains only a single element, which is either (a) the pack attached to the single-pack bitmap, or (b) the MIDX's preferred pack. In 795006fff4 (pack-bitmap: gracefully handle missing BTMP chunks, 2024-04-15), we refactored the reuse_partial_packfile_from_bitmap() function and stopped assigning the pack_int_id field when reusing only the MIDX's preferred pack. This results in an uninitialized read down in try_partial_reuse() like so: ==7474==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x55c5cd191dde in try_partial_reuse pack-bitmap.c:1887:8 #1 0x55c5cd191dde in reuse_partial_packfile_from_bitmap_1 pack-bitmap.c:2001:8 #2 0x55c5cd191dde in reuse_partial_packfile_from_bitmap pack-bitmap.c:2105:3 #3 0x55c5cce0bd0e in get_object_list_from_bitmap builtin/pack-objects.c:4043:3 #4 0x55c5cce0bd0e in get_object_list builtin/pack-objects.c:4156:27 #5 0x55c5cce0bd0e in cmd_pack_objects builtin/pack-objects.c:4596:3 #6 0x55c5ccc8fac8 in run_builtin git.c:474:11 which happens when try_partial_reuse() tries to call midx_pair_to_pack_pos() when it tries to reject cross-pack deltas. Avoid the uninitialized read by ensuring that the pack_int_id field is set in the single-pack reuse case by setting it to either the MIDX preferred pack's pack_int_id, or '-1', in the case of single-pack bitmaps. In the latter case, we never read the pack_int_id field, so the choice of '-1' is intentional as a "garbage in, garbage out" measure. Guard against further regressions in this area by adding a test which ensures that we do not throw out deltas from the preferred pack as "cross-pack" due to an uninitialized pack_int_id. [^1]: This can happen for a couple of reasons, either because the repository is configured with 'pack.allowPackReuse=(true|single)', or because the MIDX was generated prior to the introduction of the BTMP chunk, which contains information necessary to perform multi-pack reuse. Reported-by: Kyle Lippincott Signed-off-by: Taylor Blau --- pack-bitmap.c | 13 +++++++++++++ t/t5332-multi-pack-reuse.sh | 26 ++++++++++++++++++++++++++ 2 files changed, 39 insertions(+) diff --git a/pack-bitmap.c b/pack-bitmap.c index fe8e8a51d3..1d6b7f2826 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -2073,6 +2073,7 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, QSORT(packs, packs_nr, bitmapped_pack_cmp); } else { struct packed_git *pack; + uint32_t pack_int_id; if (bitmap_is_midx(bitmap_git)) { uint32_t preferred_pack_pos; @@ -2083,12 +2084,24 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, } pack = bitmap_git->midx->packs[preferred_pack_pos]; + pack_int_id = preferred_pack_pos; } else { pack = bitmap_git->pack; + /* + * Any value for 'pack_int_id' will do here. When we + * process the pack via try_partial_reuse(), we won't + * use the `pack_int_id` field since we have a non-MIDX + * bitmap. + * + * Use '-1' as a sentinel value to make it clear + * that we do not expect to read this field. + */ + pack_int_id = -1; } ALLOC_GROW(packs, packs_nr + 1, packs_alloc); packs[packs_nr].p = pack; + packs[packs_nr].pack_int_id = pack_int_id; packs[packs_nr].bitmap_nr = pack->num_objects; packs[packs_nr].bitmap_pos = 0; diff --git a/t/t5332-multi-pack-reuse.sh b/t/t5332-multi-pack-reuse.sh index 3c20738bce..ed823f37bc 100755 --- a/t/t5332-multi-pack-reuse.sh +++ b/t/t5332-multi-pack-reuse.sh @@ -204,4 +204,30 @@ test_expect_success 'omit delta from uninteresting base (cross pack)' ' test_pack_objects_reused_all $(($objects_nr - 1)) $packs_nr ' +test_expect_success 'non-omitted delta in MIDX preferred pack' ' + test_config pack.allowPackReuse single && + + cat >p1.objects <<-EOF && + $(git rev-parse $base) + ^$(git rev-parse $delta^) + EOF + cat >p2.objects <<-EOF && + $(git rev-parse F) + EOF + + p1="$(git pack-objects --revs $packdir/pack in <<-EOF && + pack-$p1.idx + pack-$p2.idx + EOF + git multi-pack-index write --bitmap --stdin-packs \ + --preferred-pack=pack-$p1.pack expect && + + test_pack_objects_reused_all $(wc -l X-Patchwork-Id: 13694043 Received: from mail-oa1-f49.google.com (mail-oa1-f49.google.com [209.85.160.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 92DCD47F5D for ; Tue, 11 Jun 2024 17:28:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718126907; cv=none; b=RrCkSME9nGdxZS3MZNtQj1Aon0il09I9Vfv7ecRRUYZZ9tCA1XDYsQWpKV2RYZxsXimZFkJQHdolYWTgvwnYiaSP26WhCQ9V7pFijmNGjEKg+C0FS/Q3oTT7Q5yV1W810jW205lMmM1aDiWtpIQjiDHXBAjBxejZSXwGxn6Zw5g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718126907; c=relaxed/simple; bh=XLmBgdYvzO2opiSP7Lfa58zo+DqYe9LA1lHWzLhzq4M=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=EpBxyWI1pjR8eLaiwsN8zIXpk1x4cKV3veHc0/Zd4RlccnbMayQpXyA1WlGFo3GkkOaECxp5G5HvFrV5Knc6kjkAnm9U+PEJ3BWvTujrYLuzKvOBtEiooyH3t3q0gNaWAEURxl9vd9a3FbI1t791uet34ReyUcqOsWRtkpNGnXo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=j5fjTQOh; arc=none smtp.client-ip=209.85.160.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="j5fjTQOh" Received: by mail-oa1-f49.google.com with SMTP id 586e51a60fabf-24c9f297524so3319735fac.0 for ; Tue, 11 Jun 2024 10:28:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1718126905; x=1718731705; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=7xjSmD23sLlXv3mrqs0SrsHiTZego6HQFibXfnZ+W2Y=; b=j5fjTQOhlgA0qK43f8c31/qqtPsWsxhcJcnGMUSYZTDkdyzsAi9O9AWC1vr1iiYy5g IjW0xNzgZqzNIy+gFrxml1bw128No2q+yVJkytJSXVuyuuvJFzNeHT0QYfa0EcXwifqr Ow4wmpMHBeP7wRbQtjZ24tNviNkgvbsqW0fohQ4oIfZaqP8tXs3O01VI6ttAsCTCB4zz S3iNvInp7exPYwP5pq3kx4G4Ahu9TU3P9q9KJt+KEtkx395f6HabKDAKIiKP/VFTveEm H8aaJVXMJWuwSFXb1hT6ee1pOtoVCTjwr17uUp8YEqKIV30ZUPIVBVgIEaNNhMPmss9B QQBg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718126905; x=1718731705; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=7xjSmD23sLlXv3mrqs0SrsHiTZego6HQFibXfnZ+W2Y=; b=WO48fSF04SZLXF7chEeQSclpjQHoar5NEH6CyRBjWYrdmJ7w+qpRnCYMo0prRHMYR8 p9DrNvT9uUAhnhNzQyGdADV/9ld4zpIrBJMmbjOLCRlnTrTOt2BQ+PR+qUhDVNdjB43c FQ3DLMqx8GtzsoQbE13y8msaR4YX0PJkL7X1eyJ+UVIdoLFSUz7XAi8LkoUjhAXfldTd 7lY2b0CEyVtni9b/+FFvpp4V0ESk0C5JqSgkA0WhMyb+dEliB3qPMHN5MAhs+xpWLkkr q8TG7HNu8TS/1aOHTPyC6eK7lTn6O70QkSz6qSvEJLfqd/Sj0hgU7tN1aiXOlkoVaqAc jKYg== X-Gm-Message-State: AOJu0Yw21KslrmB+bZ4wP9vgl+aTphtVEsz/Jy65YF4hM1oa6jfG4i25 hr+gjtGquTFbgCWAvVopHeE3sBs95/X+r+OebF+Z2dmNfhS6T+e8TFwC5cj/Ja8macM5rVyv+AQ Q/S0= X-Google-Smtp-Source: AGHT+IHdXyb1ZgQgFMNcKd5+3YzFPnZNyj3yIcayFZMbs/dAlXud1gD34NodZfrE/DEPNCW0qPAc1A== X-Received: by 2002:a05:6871:3313:b0:254:8666:34d9 with SMTP id 586e51a60fabf-25486663d0dmr12765818fac.16.1718126905454; Tue, 11 Jun 2024 10:28:25 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-7955d6e91e2sm265589285a.27.2024.06.11.10.28.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jun 2024 10:28:25 -0700 (PDT) Date: Tue, 11 Jun 2024 13:28:24 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Junio C Hamano , Kyle Lippincott , Patrick Steinhardt Subject: [PATCH v2 3/3] pack-revindex.c: guard against out-of-bounds pack lookups Message-ID: <06de4005f15a970628cef1aeb7f159ca05b8e34d.1718126886.git.me@ttaylorr.com> References: <4aceb9233ed24fb1e1a324a77b665eea2cf22b39.1717946847.git.me@ttaylorr.com> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The function midx_key_to_pack_pos() is a helper function used by midx_to_pack_pos() and midx_pair_to_pack_pos() to translate a (pack, offset) tuple into a position into the MIDX pseudo-pack order. Ensure that the pack ID given to midx_pair_to_pack_pos() is bounded by the number of packs within the MIDX to prevent, for instance, uninitialized memory from being used as a pack ID. Signed-off-by: Taylor Blau --- pack-revindex.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/pack-revindex.c b/pack-revindex.c index fc63aa76a2..93ffca7731 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -527,6 +527,9 @@ static int midx_key_to_pack_pos(struct multi_pack_index *m, { uint32_t *found; + if (key->pack >= m->num_packs) + BUG("MIDX pack lookup out of bounds (%"PRIu32" >= %"PRIu32")", + key->pack, m->num_packs); /* * The preferred pack sorts first, so determine its identifier by * looking at the first object in pseudo-pack order.