From patchwork Wed Apr 3 06:03:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13614978 Received: from fhigh8-smtp.messagingengine.com (fhigh8-smtp.messagingengine.com [103.168.172.159]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 44DFF4D9FD for ; Wed, 3 Apr 2024 06:03:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.159 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124241; cv=none; b=kNZoxzJCZqaV5fo11sqD1skla0adWUwUDn8Gn6U3dItVfWyYPfheEat585khiBNnq9VDDmR9AGb01cNvDF3BFOf4/aMipK2J6aX23yoF/0KdH2u6lmJ0Q3VUqhtYmI+laP1IjahOyeZ2YLanbPIQjgwynhfkmFBRICM5HW8VYrM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124241; c=relaxed/simple; bh=my8HVEZdECQUEtxPXi9PEoAKZXu8I44tcfXsbF+U2u0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=HVr/T0lK8CDqCm5qdCEDuw3lUNFFHardKe1/QPSFV5v4lnI9E4anmDqanXUGZMcu2LbdHeI+KMyfF2yyG5cL+Hvj+wG6mXjV7RZ48VpRq08huSsJRAM+FBxZzm9+HNJLFOgEPjW6fwQIFK62SlRgl3D8RlYNV7MfOUce1x8jnss= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=qDScJtJY; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=RofWnYgT; arc=none smtp.client-ip=103.168.172.159 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="qDScJtJY"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="RofWnYgT" Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 46BBC1140101; Wed, 3 Apr 2024 02:03:59 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute6.internal (MEProxy); Wed, 03 Apr 2024 02:03:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712124239; x=1712210639; bh=aARQCZP6RE SZ6uNqwqOLS7r7BLUITFVoAqD0a9DOMO4=; b=qDScJtJYQxBnMHDXaM+HOj11UQ VkOJkaSgVHwgRGhut1yPmIfrj0uxyQvejwC3oHCrfbsEFdGIScdzAS9Mum4isLpo SE81Fiu6YgYYTr5K97h6IZgbiAt1F+7eNy/fc2M7SNwzZeQpJDaN2BxhJ5BrJXrK G1NRGAM1lyd89VvsYBgKPWYUSfk/3h09Q/Y2rqvqBon8YrAxtbcY09LXl8aIRgDg M2zcNcvzRLfqhPZzBRSnaijYwoaG7qOy1/OYhf8Jk50H7yB6OGPzRpXE0cCdjiuP EItLNQnK5wn9ZhGW/7xv1wWjQVMwPkkneIkaHMxLeJjtnlfDdIaBBpmHukXw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712124239; x=1712210639; bh=aARQCZP6RESZ6uNqwqOLS7r7BLUI TFVoAqD0a9DOMO4=; b=RofWnYgThw8NVh54jm4AN1FBDqtqYcaURb2iApz+ihuc j/wNhUQvi4Nh0wlzM0haElS8MfBqW/SyorAz3SUJR4Vdfh0Qjc8urSMt0IjoKMIJ xi0jV90H37Crf2fczxZpUhN/mGaLlqyRVho/VXxdk3hIBf7KMI9hvbJrMtWopgiK UbwMAUs8ESokDAQcxPEj6Pgllgd55JwSg8wTVvuXS81RHDm07qBreeeQk3rNdidJ nj29OQRHpnZ/BuVGtYh6/hZt4X4TyeIQ034d2soYZYupQXjzSTtgAaiwyPJlMt0f saPi57zM7jAAyHNRk1c429fUCYorkjR41HVWsQdtVQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudeffedguddtudcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpeffhffvvefukfhfgggtuggjsehgtderredttddvnecuhfhrohhmpefrrght rhhitghkucfuthgvihhnhhgrrhguthcuoehpshesphhkshdrihhmqeenucggtffrrghtth gvrhhnpeeukedtvedtffevleejtefgheehieegkeeluddvfeefgeehgfeltddtheejleff teenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehpsh esphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Apr 2024 02:03:58 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id d968b735 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Apr 2024 06:03:49 +0000 (UTC) Date: Wed, 3 Apr 2024 08:03:56 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Justin Tobler Subject: [PATCH v4 1/7] reftable/basics: fix return type of `binsearch()` to be `size_t` Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The `binsearch()` function can be used to find the first element for which a callback functions returns a truish value. But while the array size is of type `size_t`, the function in fact returns an `int` that is supposed to index into that array. Fix the function signature to return a `size_t`. This conversion does not change any semantics given that the function would only ever return a value in the range `[0, sz]` anyway. Signed-off-by: Patrick Steinhardt --- reftable/basics.c | 2 +- reftable/basics.h | 2 +- reftable/basics_test.c | 6 +++--- reftable/block.c | 3 ++- reftable/refname.c | 17 +++++++---------- 5 files changed, 14 insertions(+), 16 deletions(-) diff --git a/reftable/basics.c b/reftable/basics.c index 0785aff941..2c5f34b39e 100644 --- a/reftable/basics.c +++ b/reftable/basics.c @@ -27,7 +27,7 @@ void put_be16(uint8_t *out, uint16_t i) out[1] = (uint8_t)(i & 0xff); } -int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args) +size_t binsearch(size_t sz, int (*f)(size_t k, void *args), void *args) { size_t lo = 0; size_t hi = sz; diff --git a/reftable/basics.h b/reftable/basics.h index 91f3533efe..2672520e76 100644 --- a/reftable/basics.h +++ b/reftable/basics.h @@ -28,7 +28,7 @@ void put_be16(uint8_t *out, uint16_t i); * Contrary to bsearch(3), this returns something useful if the argument is not * found. */ -int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args); +size_t binsearch(size_t sz, int (*f)(size_t k, void *args), void *args); /* * Frees a NULL terminated array of malloced strings. The array itself is also diff --git a/reftable/basics_test.c b/reftable/basics_test.c index 1fcd229725..dc1c87c5df 100644 --- a/reftable/basics_test.c +++ b/reftable/basics_test.c @@ -34,15 +34,15 @@ static void test_binsearch(void) int i = 0; for (i = 1; i < 11; i++) { - int res; + size_t res; + args.key = i; res = binsearch(sz, &binsearch_func, &args); if (res < sz) { EXPECT(args.key < arr[res]); - if (res > 0) { + if (res > 0) EXPECT(args.key >= arr[res - 1]); - } } else { EXPECT(args.key == 10 || args.key == 11); } diff --git a/reftable/block.c b/reftable/block.c index e2a2cee58d..422885bddb 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -382,7 +382,8 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it, }; struct block_iter next = BLOCK_ITER_INIT; struct reftable_record rec; - int err = 0, i; + int err = 0; + size_t i; if (args.error) { err = REFTABLE_FORMAT_ERROR; diff --git a/reftable/refname.c b/reftable/refname.c index 7570e4acf9..64eba1b886 100644 --- a/reftable/refname.c +++ b/reftable/refname.c @@ -33,10 +33,9 @@ static int modification_has_ref(struct modification *mod, const char *name) .names = mod->add, .want = name, }; - int idx = binsearch(mod->add_len, find_name, &arg); - if (idx < mod->add_len && !strcmp(mod->add[idx], name)) { + size_t idx = binsearch(mod->add_len, find_name, &arg); + if (idx < mod->add_len && !strcmp(mod->add[idx], name)) return 0; - } } if (mod->del_len > 0) { @@ -44,10 +43,9 @@ static int modification_has_ref(struct modification *mod, const char *name) .names = mod->del, .want = name, }; - int idx = binsearch(mod->del_len, find_name, &arg); - if (idx < mod->del_len && !strcmp(mod->del[idx], name)) { + size_t idx = binsearch(mod->del_len, find_name, &arg); + if (idx < mod->del_len && !strcmp(mod->del[idx], name)) return 1; - } } err = reftable_table_read_ref(&mod->tab, name, &ref); @@ -77,7 +75,7 @@ static int modification_has_ref_with_prefix(struct modification *mod, .names = mod->add, .want = prefix, }; - int idx = binsearch(mod->add_len, find_name, &arg); + size_t idx = binsearch(mod->add_len, find_name, &arg); if (idx < mod->add_len && !strncmp(prefix, mod->add[idx], strlen(prefix))) goto done; @@ -96,11 +94,10 @@ static int modification_has_ref_with_prefix(struct modification *mod, .names = mod->del, .want = ref.refname, }; - int idx = binsearch(mod->del_len, find_name, &arg); + size_t idx = binsearch(mod->del_len, find_name, &arg); if (idx < mod->del_len && - !strcmp(ref.refname, mod->del[idx])) { + !strcmp(ref.refname, mod->del[idx])) continue; - } } if (strncmp(ref.refname, prefix, strlen(prefix))) { From patchwork Wed Apr 3 06:04:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13614979 Received: from fout5-smtp.messagingengine.com (fout5-smtp.messagingengine.com [103.168.172.148]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 48C764D117 for ; Wed, 3 Apr 2024 06:04:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.148 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124245; cv=none; b=GqsRy4H0MJQcSokeURunSr0OhyKg6BgScNhDcqE/LBEDSyS25xZiXXyXk9Io5MZuMc94z5t5TtEsGkHmeoP2jz2b/y1Wh0U4hJsgPhCTHM0RwFv+TDNJXo8/y8nXStjBeUCNKj+z6di7q9oD8FGpqRTiNk4PBLwg8bvY/z6QQus= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124245; c=relaxed/simple; bh=u8PO3PqngVNTYIAmfToY1ZVoeewDtfjmRTXmculP1m8=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=o0UGwFgMeCr8H4v99a0g9IWrn+sNxCxWUPiQLtzGuEZmLtoIXDSSulsTUF75v0gEAMKaw99BFYzOxTc1CPtIILPd9VAU5p4+WwWrvwM4+qS+/Z830v/xqFnL4JxSuWfk9tuqq5s5SdEeMmqdjAksI7G77XLoDTcVwbFHX6Y+klc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=pZ8J+5BC; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=aYsaqNRF; arc=none smtp.client-ip=103.168.172.148 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="pZ8J+5BC"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="aYsaqNRF" Received: from compute7.internal (compute7.nyi.internal [10.202.2.48]) by mailfout.nyi.internal (Postfix) with ESMTP id 631C81380085; Wed, 3 Apr 2024 02:04:03 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute7.internal (MEProxy); Wed, 03 Apr 2024 02:04:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712124243; x=1712210643; bh=Q491D6R/Vw AKsCegaY+LqJDAQ0XW09YcF44JNgl0HLI=; b=pZ8J+5BCEs02Jzkxujh3hlKvqn Cul1PGHlADjKffEM/19MTVhXOf8KkivpNXbLuJO6h3F32LtkndD//y5e2iQN90Qk pmSEappy+hTNp6MrWpsm+ld1rz8TLQFKurE4oF5LF+VaTf3SpzDH7lboh0LmBgC6 IucJE4p8stOaN88RqMZlN9yqm5+sI8ROfBwAILz5a0SiraxcTBN9n1259pmxbFk9 5ADPqtorB86zAtalmHJNC/p8Uh+jcem6gjZwjRKUDy70bjO3Cx6NuWNBg8ec2eeu jmwdVypiCUBH2vVGMyEKJLlyOEm3TSRX1xU4CavrC0k6owTdDTOgvGXU511w== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712124243; x=1712210643; bh=Q491D6R/VwAKsCegaY+LqJDAQ0XW 09YcF44JNgl0HLI=; b=aYsaqNRFFoQ4gX49BQKpY/Vb/CyaHml1hfCYVkiQtO2M kouXDOeCa+srGvC5D6ulbUcAq3hf1FPPnOU9mJWZjDEO9vHcC7RDAUlLHVBtd1lN HESaZJQk4VNc5T2eiAxvuK+LoQxpisYsJ326aHvn4kCeFNJWdqiOpaAP+bV0V0gY Bt72jZvA3eZZ1SyLJwRE+bEtwcsnLxPN8nto5i9LbjJbiCuFLS1OHy5NF4GXGTBr nFpPSEO1lSfmqyOARaGuq3rjRQS7DEfld7Kr4EPPuXMUZUcNesx0NYZKa8f+logw zLc2vmovVC6yA0aMrJfM+ttWyuzU/OuVlx2QzRshFQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudeffedguddtudcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpeffhffvvefukfhfgggtuggjsehgtderredttddvnecuhfhrohhmpefrrght rhhitghkucfuthgvihhnhhgrrhguthcuoehpshesphhkshdrihhmqeenucggtffrrghtth gvrhhnpeeukedtvedtffevleejtefgheehieegkeeluddvfeefgeehgfeltddtheejleff teenucevlhhushhtvghrufhiiigvpedunecurfgrrhgrmhepmhgrihhlfhhrohhmpehpsh esphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Apr 2024 02:04:02 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id d04e4b06 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Apr 2024 06:03:53 +0000 (UTC) Date: Wed, 3 Apr 2024 08:04:00 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Justin Tobler Subject: [PATCH v4 2/7] reftable/basics: improve `binsearch()` test Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The `binsearch()` test is somewhat weird in that it doesn't explicitly spell out its expectations. Instead it does so in a rather ad-hoc way with some hard-to-understand computations. Refactor the test to spell out the needle as well as expected index for all testcases. This refactoring highlights that the `binsearch_func()` is written somewhat weirdly to find the first integer smaller than the needle, not smaller or equal to it. Adjust the function accordingly. While at it, rename the callback function to better convey its meaning. Signed-off-by: Patrick Steinhardt --- reftable/basics_test.c | 55 ++++++++++++++++++++++++------------------ 1 file changed, 31 insertions(+), 24 deletions(-) diff --git a/reftable/basics_test.c b/reftable/basics_test.c index dc1c87c5df..997c4d9e01 100644 --- a/reftable/basics_test.c +++ b/reftable/basics_test.c @@ -12,40 +12,47 @@ license that can be found in the LICENSE file or at #include "test_framework.h" #include "reftable-tests.h" -struct binsearch_args { - int key; - int *arr; +struct integer_needle_lesseq_args { + int needle; + int *haystack; }; -static int binsearch_func(size_t i, void *void_args) +static int integer_needle_lesseq(size_t i, void *_args) { - struct binsearch_args *args = void_args; - - return args->key < args->arr[i]; + struct integer_needle_lesseq_args *args = _args; + return args->needle <= args->haystack[i]; } static void test_binsearch(void) { - int arr[] = { 2, 4, 6, 8, 10 }; - size_t sz = ARRAY_SIZE(arr); - struct binsearch_args args = { - .arr = arr, + int haystack[] = { 2, 4, 6, 8, 10 }; + struct { + int needle; + size_t expected_idx; + } testcases[] = { + {-9000, 0}, + {-1, 0}, + {0, 0}, + {2, 0}, + {3, 1}, + {4, 1}, + {7, 3}, + {9, 4}, + {10, 4}, + {11, 5}, + {9000, 5}, }; + size_t i = 0; - int i = 0; - for (i = 1; i < 11; i++) { - size_t res; - - args.key = i; - res = binsearch(sz, &binsearch_func, &args); + for (i = 0; i < ARRAY_SIZE(testcases); i++) { + struct integer_needle_lesseq_args args = { + .haystack = haystack, + .needle = testcases[i].needle, + }; + size_t idx; - if (res < sz) { - EXPECT(args.key < arr[res]); - if (res > 0) - EXPECT(args.key >= arr[res - 1]); - } else { - EXPECT(args.key == 10 || args.key == 11); - } + idx = binsearch(ARRAY_SIZE(haystack), &integer_needle_lesseq, &args); + EXPECT(idx == testcases[i].expected_idx); } } From patchwork Wed Apr 3 06:04:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13614980 Received: from fout5-smtp.messagingengine.com (fout5-smtp.messagingengine.com [103.168.172.148]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7035E4CB4E for ; Wed, 3 Apr 2024 06:04:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.148 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124251; cv=none; b=symNsHjn7mIhGW2iawU6s1tCMVtPA7c1/j7ifM95WXJtJ3lwOX5hzwqb2O4wgBNScNfwjRXI6IndqcTxyeOI2Cfy62vIvls8xT9pPvH0Dd5JHlDqzDaVvbufPLVxjCS84Kh+q8AWtCG+UUC0g9hcoScDUK0hQAkzgFHoETS4CYk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124251; c=relaxed/simple; bh=5ZU7/Ko/ReUk52oiPuCnYiaFeWGXNIb3xmS0kh4H0Wc=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=jEth1M5LfNmDhcuFtE9xy8pWlXVrHMygKpCxDlxkK9TILZcPYN6Krek++P7kezvpf74inq1Paaiy6nF1SB7NOlPtva5oL1+98Tbyp1vKZTy/oJvhlMiLBjD3gf1mjAlbwVkVbchqzqkCiKOHUO0nIZgFXtjdZ0C7PmC6HtQnAoE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=bExGUeNd; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=fBwSeYhV; arc=none smtp.client-ip=103.168.172.148 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="bExGUeNd"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="fBwSeYhV" Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailfout.nyi.internal (Postfix) with ESMTP id 75B8D13800F3; Wed, 3 Apr 2024 02:04:08 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute6.internal (MEProxy); Wed, 03 Apr 2024 02:04:08 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712124248; x=1712210648; bh=yy0ugzm3sC 8NlpEsaYgfnwZMnXQewmQGcGTUMi6wiNU=; b=bExGUeNdojaXMOPHw1gNRqS9X3 apBnHW9u5sVANIebwaFogoOsv4wiksNwo7cDG9tIud7lWAlp7JeHjZY16uix1T0U zke2ingxj6PQhbKQ0miLgK15RiQ9h5VUHOJsR138DLZaypgtn/i7vWRYVZdwNURS /MPL+vXAqJf5bb2JY8ueLuflTKXrvxwjmanzeJEK2hk0QMTwHUJthFfne0U7HnmN AW/dqyDsi1I5819qJpKmRKCQ6ka9bG+1nydQ16F8U6Y5xYmY60d5KQYbcX3ImfyX gpeTN4lRwSBa7RoeTKQIjPmG2YRcBg9kk7BuoA+3B8+yfvxMTt4UisJ83/PQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712124248; x=1712210648; bh=yy0ugzm3sC8NlpEsaYgfnwZMnXQe wmQGcGTUMi6wiNU=; b=fBwSeYhVSkqy01SfV2x1/ZzpEUnl4Ud5N0HCQ5OqtGOJ pBDA18RCWfQDv9m0L/5PXfnt5AudRLobcAZuwrk6DS1gnRgc825nJcGm8E/dPmkK /iCqshTZiytJs2yXoVcrc3S4buWCjv5Z9IIJNEK++W/AudxbRUQkdfc0BWoUbd8H WOz96kzvPH4LSfEzk4Zp8b2Dy31fCPMy3S93gv6+w7bJjXu7jWT2HCeRbCu+UqY0 /HBmiMT0EEAvXn9w739c0niBPERA5d9LMJrrgHQIAnk761QDx8h2ZGBmiP0z1FH+ kQilKcmxlYKA0H04nsxPFj+6sg17e4i3VxK2yGXqRQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudeffedguddtudcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpeffhffvvefukfhfgggtuggjsehgtderredttddvnecuhfhrohhmpefrrght rhhitghkucfuthgvihhnhhgrrhguthcuoehpshesphhkshdrihhmqeenucggtffrrghtth gvrhhnpeeukedtvedtffevleejtefgheehieegkeeluddvfeefgeehgfeltddtheejleff teenucevlhhushhtvghrufhiiigvpedunecurfgrrhgrmhepmhgrihhlfhhrohhmpehpsh esphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Apr 2024 02:04:07 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 7cabdcb0 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Apr 2024 06:03:57 +0000 (UTC) Date: Wed, 3 Apr 2024 08:04:05 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Justin Tobler Subject: [PATCH v4 3/7] reftable/refname: refactor binary search over refnames Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: It is comparatively hard to understand how exactly the binary search over refnames works given that the function and variable names are not exactly easy to grasp. Rename them to make this more obvious. This should not result in any change in behaviour. Signed-off-by: Patrick Steinhardt --- reftable/refname.c | 44 ++++++++++++++++++++++---------------------- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/reftable/refname.c b/reftable/refname.c index 64eba1b886..bbfde15754 100644 --- a/reftable/refname.c +++ b/reftable/refname.c @@ -12,15 +12,15 @@ #include "refname.h" #include "reftable-iterator.h" -struct find_arg { - char **names; - const char *want; +struct refname_needle_lesseq_args { + char **haystack; + const char *needle; }; -static int find_name(size_t k, void *arg) +static int refname_needle_lesseq(size_t k, void *_args) { - struct find_arg *f_arg = arg; - return strcmp(f_arg->names[k], f_arg->want) >= 0; + struct refname_needle_lesseq_args *args = _args; + return strcmp(args->needle, args->haystack[k]) <= 0; } static int modification_has_ref(struct modification *mod, const char *name) @@ -29,21 +29,21 @@ static int modification_has_ref(struct modification *mod, const char *name) int err = 0; if (mod->add_len > 0) { - struct find_arg arg = { - .names = mod->add, - .want = name, + struct refname_needle_lesseq_args args = { + .haystack = mod->add, + .needle = name, }; - size_t idx = binsearch(mod->add_len, find_name, &arg); + size_t idx = binsearch(mod->add_len, refname_needle_lesseq, &args); if (idx < mod->add_len && !strcmp(mod->add[idx], name)) return 0; } if (mod->del_len > 0) { - struct find_arg arg = { - .names = mod->del, - .want = name, + struct refname_needle_lesseq_args args = { + .haystack = mod->del, + .needle = name, }; - size_t idx = binsearch(mod->del_len, find_name, &arg); + size_t idx = binsearch(mod->del_len, refname_needle_lesseq, &args); if (idx < mod->del_len && !strcmp(mod->del[idx], name)) return 1; } @@ -71,11 +71,11 @@ static int modification_has_ref_with_prefix(struct modification *mod, int err = 0; if (mod->add_len > 0) { - struct find_arg arg = { - .names = mod->add, - .want = prefix, + struct refname_needle_lesseq_args args = { + .haystack = mod->add, + .needle = prefix, }; - size_t idx = binsearch(mod->add_len, find_name, &arg); + size_t idx = binsearch(mod->add_len, refname_needle_lesseq, &args); if (idx < mod->add_len && !strncmp(prefix, mod->add[idx], strlen(prefix))) goto done; @@ -90,11 +90,11 @@ static int modification_has_ref_with_prefix(struct modification *mod, goto done; if (mod->del_len > 0) { - struct find_arg arg = { - .names = mod->del, - .want = ref.refname, + struct refname_needle_lesseq_args args = { + .haystack = mod->del, + .needle = ref.refname, }; - size_t idx = binsearch(mod->del_len, find_name, &arg); + size_t idx = binsearch(mod->del_len, refname_needle_lesseq, &args); if (idx < mod->del_len && !strcmp(ref.refname, mod->del[idx])) continue; From patchwork Wed Apr 3 06:04:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13614981 Received: from fhigh8-smtp.messagingengine.com (fhigh8-smtp.messagingengine.com [103.168.172.159]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7AE324CDF9 for ; Wed, 3 Apr 2024 06:04:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.159 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124259; cv=none; b=diKLLz9EFXnOao+VnRi2h38ew22wJH7z+g2Knru3bWE8nC061YLs/KEUHxf+LmkKdHXSE3VkqENsCNEKsFhNneoeB5ajTgNHU2+XN4vJQA+cJ9WWm77QJ6wkxj+ln11MONTJgOX8gA4dbnE/Ou8iWsEwvDZW6gxwxVGk/rvpPB4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124259; c=relaxed/simple; bh=3UtKotODABPblYgwSHUgqcSnTML6ASa1GFNwjRSle6M=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=QBKx20KPlWojmNxNaUKgmtT/jM6AWZwfZ35S9UzKdpgk4l1S1/97udvqGgPtE8ra7CRUpYXqguareKsMzyZAPGUA6TUI6eTXSbT5E70mX0OOM+9A7Y361MqpTFczyWAVBR1YOXgLRiSClHpvTYFEV7YjaWuOLGZHf3RVnFMZA8o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=kXcmozT8; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=NzI6zTSv; arc=none smtp.client-ip=103.168.172.159 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="kXcmozT8"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="NzI6zTSv" Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 8D77D1140101; Wed, 3 Apr 2024 02:04:16 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute5.internal (MEProxy); Wed, 03 Apr 2024 02:04:16 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712124256; x=1712210656; bh=vjEflN+rEy hf+FQQLxxVuM1+9uEF3dgx5yt4odH0rYE=; b=kXcmozT8j405cEM2tI7qRISo2s AJlLYfF4pWZjHpOD8tx+biWVSceXWF8N5eGRpSN5SMeODKX1posJ5wXEHWBHUqGU TjCeFoY1r41u0Pm+pGTMoyubU20WPB91TLo8wl4XVAeNuD54OYBc/R7mCTz5l9Gl 2+LGu3p1N7cBw73HAaH6Eb3cPCqh6560RQMiQ69+sjtQ936byQfXIjXIakj8QAcD 1ijClZCjEVLA/OIHUHQtjDodR4EGDQz9ftF8vU1OEmBlM7Osg/otJJNLcUu2EFZH GZCKcfM6NzcASVmbvW8yY6e+nSzEV0IZjsnAigWefWAnEXecPTUj8BIZUKkg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712124256; x=1712210656; bh=vjEflN+rEyhf+FQQLxxVuM1+9uEF 3dgx5yt4odH0rYE=; b=NzI6zTSvgUD6XKTMc/LS/QAOrrUWV99HBBt2g3QX075L pJOwPfcqEfPJx1++OaJmlcjNl7UmGuS2xmK2dWK4rB+OIkfS6888aaZsHFfrvc0k b+YrUeNT8s2+Uz2ZLM6VuYeIgYIhJMHwlxVo51PP318mMRn7N+pVSBSEEJFgGKLN K5qfNuiWCWYi0oapCBJmrQNIV8cdjjZCUOVU4n8bHe4HtivYolgTKCLX0zftnv7o J7041dHU8+85s8v6SjipzGmiczj07FXjRR0L6vOyqdkMnQh+NVpt8SFo/RgQ4C81 3isIGTHOPuOXr7nihRPdpQc3dulO/F9oHyWHzW7MHg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudeffedguddtvdcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpeffhffvvefukfhfgggtuggjsehgtderredttddvnecuhfhrohhmpefrrght rhhitghkucfuthgvihhnhhgrrhguthcuoehpshesphhkshdrihhmqeenucggtffrrghtth gvrhhnpeeukedtvedtffevleejtefgheehieegkeeluddvfeefgeehgfeltddtheejleff teenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehpsh esphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Apr 2024 02:04:15 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 61be2e93 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Apr 2024 06:04:06 +0000 (UTC) Date: Wed, 3 Apr 2024 08:04:09 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Justin Tobler Subject: [PATCH v4 4/7] reftable/block: refactor binary search over restart points Message-ID: <6d4a79f3e26f24e8eeec9abcad1d7fdaa3d6e252.1712123093.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When seeking a record in our block reader we perform a binary search over the block's restart points so that we don't have to do a linear scan over the whole block. The logic to do so is quite intricate though, which makes it hard to understand. Improve documentation and rename some of the functions and variables so that the code becomes easier to understand overall. This refactoring should not result in any change in behaviour. Signed-off-by: Patrick Steinhardt --- reftable/block.c | 100 ++++++++++++++++++++++++++++++++++------------- 1 file changed, 73 insertions(+), 27 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 422885bddb..57e3dbf658 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -273,34 +273,36 @@ void block_reader_start(struct block_reader *br, struct block_iter *it) it->next_off = br->header_off + 4; } -struct restart_find_args { +struct restart_needle_less_args { int error; - struct strbuf key; - struct block_reader *r; + struct strbuf needle; + struct block_reader *reader; }; -static int restart_key_less(size_t idx, void *args) +static int restart_needle_less(size_t idx, void *_args) { - struct restart_find_args *a = args; - uint32_t off = block_reader_restart_offset(a->r, idx); + struct restart_needle_less_args *args = _args; + uint32_t off = block_reader_restart_offset(args->reader, idx); struct string_view in = { - .buf = a->r->block.data + off, - .len = a->r->block_len - off, + .buf = args->reader->block.data + off, + .len = args->reader->block_len - off, }; - - /* the restart key is verbatim in the block, so this could avoid the - alloc for decoding the key */ - struct strbuf rkey = STRBUF_INIT; + struct strbuf kth_restart_key = STRBUF_INIT; uint8_t unused_extra; - int n = reftable_decode_key(&rkey, &unused_extra, in); - int result; + int result, n; + + /* + * TODO: The restart key is verbatim in the block, so we can in theory + * avoid decoding the key and thus save some allocations. + */ + n = reftable_decode_key(&kth_restart_key, &unused_extra, in); if (n < 0) { - a->error = 1; + args->error = 1; return -1; } - result = strbuf_cmp(&a->key, &rkey); - strbuf_release(&rkey); + result = strbuf_cmp(&args->needle, &kth_restart_key); + strbuf_release(&kth_restart_key); return result < 0; } @@ -376,9 +378,9 @@ void block_iter_close(struct block_iter *it) int block_reader_seek(struct block_reader *br, struct block_iter *it, struct strbuf *want) { - struct restart_find_args args = { - .key = *want, - .r = br, + struct restart_needle_less_args args = { + .needle = *want, + .reader = br, }; struct block_iter next = BLOCK_ITER_INIT; struct reftable_record rec; @@ -390,7 +392,38 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it, goto done; } - i = binsearch(br->restart_count, &restart_key_less, &args); + /* + * Perform a binary search over the block's restart points, which + * avoids doing a linear scan over the whole block. Like this, we + * identify the section of the block that should contain our key. + * + * Note that we explicitly search for the first restart point _greater_ + * than the sought-after record, not _greater or equal_ to it. In case + * the sought-after record is located directly at the restart point we + * would otherwise start doing the linear search at the preceding + * restart point. While that works alright, we would end up scanning + * too many record. + */ + i = binsearch(br->restart_count, &restart_needle_less, &args); + + /* + * Now there are multiple cases: + * + * - `i == 0`: The wanted record is smaller than the record found at + * the first restart point. As the first restart point is the first + * record in the block, our wanted record cannot be located in this + * block at all. We still need to position the iterator so that the + * next call to `block_iter_next()` will yield an end-of-iterator + * signal. + * + * - `i == restart_count`: The wanted record was not found at any of + * the restart points. As there is no restart point at the end of + * the section the record may thus be contained in the last block. + * + * - `i > 0`: The wanted record must be contained in the section + * before the found restart point. We thus do a linear search + * starting from the preceding restart point. + */ if (i > 0) it->next_off = block_reader_restart_offset(br, i - 1); else @@ -399,21 +432,34 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it, reftable_record_init(&rec, block_reader_type(br)); - /* We're looking for the last entry less/equal than the wanted key, so - we have to go one entry too far and then back up. - */ + /* + * We're looking for the last entry less than the wanted key so that + * the next call to `block_reader_next()` would yield the wanted + * record. We thus don't want to position our reader at the sought + * after record, but one before. To do so, we have to go one entry too + * far and then back up. + */ while (1) { block_iter_copy_from(&next, it); err = block_iter_next(&next, &rec); if (err < 0) goto done; - - reftable_record_key(&rec, &it->last_key); - if (err > 0 || strbuf_cmp(&it->last_key, want) >= 0) { + if (err > 0) { err = 0; goto done; } + /* + * Check whether the current key is greater or equal to the + * sought-after key. In case it is greater we know that the + * record does not exist in the block and can thus abort early. + * In case it is equal to the sought-after key we have found + * the desired record. + */ + reftable_record_key(&rec, &it->last_key); + if (strbuf_cmp(&it->last_key, want) >= 0) + goto done; + block_iter_copy_from(it, &next); } From patchwork Wed Apr 3 06:04:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13614982 Received: from fout5-smtp.messagingengine.com (fout5-smtp.messagingengine.com [103.168.172.148]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 817954D117 for ; Wed, 3 Apr 2024 06:04:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.148 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124264; cv=none; b=UC7tuMhBKeztFSLlR/WmUE29cw9I6A2Nqesr5vfHLP1jP4APUrnfR6LVjp8JLjDn2xbtcnl4RB2ywLYPt9Qne3ZpKY7kOYoRNGkiLfvC28LmJXwUuXtqERrv1ENnaVmoTjiwqKV1kZzMnlY+4ROmJWzny52w3RJf7BEJukQNNvA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124264; c=relaxed/simple; bh=vLB890cYTGdb9NgmSfBkmdTHg9m3d/oRVSp6C4v8dIM=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=SkY7r3Ssx1S1XVjxLiz3kL6yJSWXxy3780P++KNr4Nc0z982vVnYeNnprf0gUWi2Trmzhw3DLeoYuGw5UTSsBi3pOG3VUmY6z2IZYI/zUCW8tfjMN5sR3NqW2qkhCANsgJR1VMuPdb/IcSCnhv4YZNjcs7bKJPzBAsDVAu3wzXE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=TiK/+uO0; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=qO2v8OVb; arc=none smtp.client-ip=103.168.172.148 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="TiK/+uO0"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="qO2v8OVb" Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailfout.nyi.internal (Postfix) with ESMTP id A705C13800F3; Wed, 3 Apr 2024 02:04:20 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Wed, 03 Apr 2024 02:04:20 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712124260; x=1712210660; bh=AwrZ06wBOj DUJE3SlS+DuxcfVVG2qsxOWXKJvgKIEQE=; b=TiK/+uO0rmI/4CtOrmU9ga99nR X6AKGfkvL5lTZKujK6yEWjuyDUduJBUqriwLEA14uG+ZFwvKcBl24R6daErQ0N5c 6fc5LdxP7yYOfc73OB2rY8A1RhYUyGwaHkK8qCvj2etDgwPzl7nXdcb42j+IdF8S Wvik5IdAQO8cFGa02Sd0ptrFTdnzuokxj+rXbhpZLv7NlqTZfVSm1aReRzzch2eB qQBJ9VoW8MQnIhSvcoyfbzOA9U2rJzT00g+PLs0NUoC0LkxVBiKdkwsqLR1Nferb PZd9pXNAg9sKPzPfO2ad6sjEi8ptXXN/B9vaJK6lU2h0SvkTQ/PY6TtSP0AA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712124260; x=1712210660; bh=AwrZ06wBOjDUJE3SlS+DuxcfVVG2 qsxOWXKJvgKIEQE=; b=qO2v8OVbZjZMQvqZ25RZFzBcw0tPEZgLmiSAY1AhqLOU znbG6rHLctTtVoYnP0BufCwd+fII8dYPcHCYuwJY2VmfuseLLdw+9iYiMlA2uoOj 0cIRGqGZTwEqw/G0q45uFje+Kkwq4ti1ZNKxI8LXYA2ctEKGr65oVyYsfUpwHjpK qigpsiCwnPfxmw3gXGqRws3IObbKYi1OlzwLltnoL4HJq1VJ3aHpiPHcdtT2vjTX KK/WOvn8rk3AO1oIghA8x2SNpnA2X+3dZeh2ik2ClQ4YtUPbj3kQL2cp/lKz/BXC MUsS2GuDHnUswGa5bjPbbD3uxYASVBvCbvCS2vNPqA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudeffedguddtvdcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpeffhffvvefukfhfgggtuggjsehgtderredttddvnecuhfhrohhmpefrrght rhhitghkucfuthgvihhnhhgrrhguthcuoehpshesphhkshdrihhmqeenucggtffrrghtth gvrhhnpeeukedtvedtffevleejtefgheehieegkeeluddvfeefgeehgfeltddtheejleff teenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehpsh esphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Apr 2024 02:04:19 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 1b4ba772 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Apr 2024 06:04:10 +0000 (UTC) Date: Wed, 3 Apr 2024 08:04:18 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Justin Tobler Subject: [PATCH v4 5/7] reftable/block: fix error handling when searching restart points Message-ID: <52c238ee9fdd3257333931f938be38712d3532d5.1712123093.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When doing the binary search over restart points in a block we need to decode the record keys. This decoding step can result in an error when the block is corrupted, which we indicate to the caller of the binary search by setting `args.error = 1`. But the only caller that exists mishandles this because it in fact performs the error check before calling `binsearch()`. Fix this bug by checking for errors at the right point in time. Furthermore, refactor `binsearch()` so that it aborts the search in case the callback function returns a negative value so that we don't needlessly continue to search the block. Signed-off-by: Patrick Steinhardt --- reftable/basics.c | 5 ++++- reftable/basics.h | 5 +++-- reftable/block.c | 9 ++++----- 3 files changed, 11 insertions(+), 8 deletions(-) diff --git a/reftable/basics.c b/reftable/basics.c index 2c5f34b39e..fea711db7e 100644 --- a/reftable/basics.c +++ b/reftable/basics.c @@ -39,8 +39,11 @@ size_t binsearch(size_t sz, int (*f)(size_t k, void *args), void *args) */ while (hi - lo > 1) { size_t mid = lo + (hi - lo) / 2; + int ret = f(mid, args); + if (ret < 0) + return sz; - if (f(mid, args)) + if (ret > 0) hi = mid; else lo = mid; diff --git a/reftable/basics.h b/reftable/basics.h index 2672520e76..523ecd5307 100644 --- a/reftable/basics.h +++ b/reftable/basics.h @@ -22,8 +22,9 @@ uint32_t get_be24(uint8_t *in); void put_be16(uint8_t *out, uint16_t i); /* - * find smallest index i in [0, sz) at which f(i) is true, assuming - * that f is ascending. Return sz if f(i) is false for all indices. + * find smallest index i in [0, sz) at which `f(i) > 0`, assuming that f is + * ascending. Return sz if `f(i) == 0` for all indices. The search is aborted + * and `sz` is returned in case `f(i) < 0`. * * Contrary to bsearch(3), this returns something useful if the argument is not * found. diff --git a/reftable/block.c b/reftable/block.c index 57e3dbf658..ca217636ae 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -387,11 +387,6 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it, int err = 0; size_t i; - if (args.error) { - err = REFTABLE_FORMAT_ERROR; - goto done; - } - /* * Perform a binary search over the block's restart points, which * avoids doing a linear scan over the whole block. Like this, we @@ -405,6 +400,10 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it, * too many record. */ i = binsearch(br->restart_count, &restart_needle_less, &args); + if (args.error) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } /* * Now there are multiple cases: From patchwork Wed Apr 3 06:04:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13614983 Received: from out2-smtp.messagingengine.com (out2-smtp.messagingengine.com [66.111.4.26]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A2C4E4F211 for ; Wed, 3 Apr 2024 06:04:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=66.111.4.26 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124267; cv=none; b=mne6tsRw5D/qoJECwdC0Iidh8LI940aC2R9FR1zdDAB89COccsYegA3DiscauS+pJ34BYuS43m2e4kdTa9TBTT/NEvAMT55Xx3NHSwxIi2BHPObu8m1DCG19vOSs0Mgb2UsOThWdULf/8XJDyV7juaq71+8HpkNlm7f4c0fDlXQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124267; c=relaxed/simple; bh=fL8RgjOEf5feptyP6Pqcqe+Nkr2YC6uaTmV/oSINrUQ=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=kQQKszCBsrfjTzuT5qnaHkxtgFGufr1eRrtfmNjBmKSyApWzDiV3eCP7cMAX33jaYwr7WOux2fVLTYwW8Z1f/+FSNNeewsXI14p941nLbK2M2taEuSc8jAodj2DY/YEBr/iqszrBy/7PrIsU3zJsSAmjlWecef1nyUNbZ48Yd5w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=i2ogyHZO; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=KR9+MK2P; arc=none smtp.client-ip=66.111.4.26 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="i2ogyHZO"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="KR9+MK2P" Received: from compute7.internal (compute7.nyi.internal [10.202.2.48]) by mailout.nyi.internal (Postfix) with ESMTP id C7A3D5C0058; Wed, 3 Apr 2024 02:04:24 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute7.internal (MEProxy); Wed, 03 Apr 2024 02:04:24 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712124264; x=1712210664; bh=MYoYDCyRX0 1gsLudcoSFhsEoS7u2hokagnZQDcuRM3A=; b=i2ogyHZOA6u4jGLrYbodfYmMWy 3sWe4Ln56Xc8y6muEU/pr3dToizzj8MpPM3BLQi+zDO3YLwBvxQkQErMRkxByCRt mFslonIsbAqkVOiTPAX+YvhwoEUyGI4fin5j+bPBu5vekRZO17fSV8PaJSsg9aTC zsDd5AhQ1Lca4xWOIdSXVddZ7a4Zyh072DOq0nh0onFEh5FuPugQASne3qQerkLz VGUc1F2za+lGRZ3xYpwg8eHl+QWLaWBkwfygaJlQ1N+Ntm/sk68SWtcKMtp50Ldu bwKiKBr3uAD2w4ecpYnB2NzR5sX/7C9ToR7sxlVMUMX3b1L/CFZUEVwgfAEg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712124264; x=1712210664; bh=MYoYDCyRX01gsLudcoSFhsEoS7u2 hokagnZQDcuRM3A=; b=KR9+MK2PqiHCMvzi+JEsBTFW5aH90/+O2JV0kgxNxg+p jDoO2s2pv9z8wtE8CkDfwc+CyQBWpwElFey7cq0wusITp36yEu29VKpx+z22zu5B xH+4znSQKX9jDL9swsg8ZqL6Ya7dgfcJnE69lpDE6U2rvGqQM98zfX0/AKJATXF+ 5qUvjGgH2hXdMqfpGpY/19lRhoVRcnvPXRGLEKo8AJSoDN9IQFjtTFo7ml44dYIv QkNd8BHoJi9OTO+p2JxYVEb2jvQXJWuKGmNrEHxLap8Wq4y/ATcMlUE4Kbi/Svev IjMHZYXlWnRE3eEKjHV/tGnc56pd29p0eOXeflaOFQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudeffedguddtudcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpeffhffvvefukfhfgggtuggjsehgtderredttddvnecuhfhrohhmpefrrght rhhitghkucfuthgvihhnhhgrrhguthcuoehpshesphhkshdrihhmqeenucggtffrrghtth gvrhhnpeeukedtvedtffevleejtefgheehieegkeeluddvfeefgeehgfeltddtheejleff teenucevlhhushhtvghrufhiiigvpedvnecurfgrrhgrmhepmhgrihhlfhhrohhmpehpsh esphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Apr 2024 02:04:23 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 4b6ec755 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Apr 2024 06:04:14 +0000 (UTC) Date: Wed, 3 Apr 2024 08:04:22 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Justin Tobler Subject: [PATCH v4 6/7] reftable/record: extract function to decode key lengths Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: We're about to refactor the binary search over restart points so that it does not need to fully decode the record keys anymore. To do so we will need to decode the record key lengths, which is non-trivial logic. Extract the logic to decode these lengths from `refatble_decode_key()` so that we can reuse it. Signed-off-by: Patrick Steinhardt --- reftable/record.c | 34 +++++++++++++++++++++++++--------- reftable/record.h | 6 ++++++ 2 files changed, 31 insertions(+), 9 deletions(-) diff --git a/reftable/record.c b/reftable/record.c index 23b497adab..5506f3e913 100644 --- a/reftable/record.c +++ b/reftable/record.c @@ -159,26 +159,42 @@ int reftable_encode_key(int *restart, struct string_view dest, return start.len - dest.len; } -int reftable_decode_key(struct strbuf *last_key, uint8_t *extra, - struct string_view in) +int reftable_decode_keylen(struct string_view in, + uint64_t *prefix_len, + uint64_t *suffix_len, + uint8_t *extra) { - int start_len = in.len; - uint64_t prefix_len = 0; - uint64_t suffix_len = 0; + size_t start_len = in.len; int n; - n = get_var_int(&prefix_len, &in); + n = get_var_int(prefix_len, &in); if (n < 0) return -1; string_view_consume(&in, n); - n = get_var_int(&suffix_len, &in); + n = get_var_int(suffix_len, &in); if (n <= 0) return -1; string_view_consume(&in, n); - *extra = (uint8_t)(suffix_len & 0x7); - suffix_len >>= 3; + *extra = (uint8_t)(*suffix_len & 0x7); + *suffix_len >>= 3; + + return start_len - in.len; +} + +int reftable_decode_key(struct strbuf *last_key, uint8_t *extra, + struct string_view in) +{ + int start_len = in.len; + uint64_t prefix_len = 0; + uint64_t suffix_len = 0; + int n; + + n = reftable_decode_keylen(in, &prefix_len, &suffix_len, extra); + if (n < 0) + return -1; + string_view_consume(&in, n); if (in.len < suffix_len || prefix_len > last_key->len) diff --git a/reftable/record.h b/reftable/record.h index 826ee1c55c..d778133e6e 100644 --- a/reftable/record.h +++ b/reftable/record.h @@ -86,6 +86,12 @@ int reftable_encode_key(int *is_restart, struct string_view dest, struct strbuf prev_key, struct strbuf key, uint8_t extra); +/* Decode a record's key lengths. */ +int reftable_decode_keylen(struct string_view in, + uint64_t *prefix_len, + uint64_t *suffix_len, + uint8_t *extra); + /* * Decode into `last_key` and `extra` from `in`. `last_key` is expected to * contain the decoded key of the preceding record, if any. From patchwork Wed Apr 3 06:04:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13614984 Received: from fout5-smtp.messagingengine.com (fout5-smtp.messagingengine.com [103.168.172.148]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8A47A4D133 for ; Wed, 3 Apr 2024 06:04:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.148 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124275; cv=none; b=BVglzPDRd2YAVVEvH5vhmCSzsSjBHHxPtHBvrJGJI17ZtqIoAjcOSwb4b3W+dHzkMxmDrEs7fIazBELsDFSPboJxy5Me0dSw29tWjD3lpk9NQXdUzU1Ku1+N6wctUuths/E0+VPOlS1B2JpDTAefCgFHS6hz+593n27qdyr2diU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712124275; c=relaxed/simple; bh=OstjbyC2SKVuKoIrbC3GLyKYiMPilvwFIdeyWaYUtvY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=bvPZ6DApivusYE0cZOcXp8aByLMttp8DYjf/yM6Wf0yfzuawaFyPIso8oyiiKSmTyCQGTOqygUDhE15sM5Nq9C9G2nkQFri1fdzlsAXvS0NRymXuVhgrWiXLRwQHhsItcoyRODB518r4nxiCTKxIPrbszBrRlmyS94Mid6/h3hg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=NQtJE09O; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=F6fQVGr0; arc=none smtp.client-ip=103.168.172.148 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="NQtJE09O"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="F6fQVGr0" Received: from compute7.internal (compute7.nyi.internal [10.202.2.48]) by mailfout.nyi.internal (Postfix) with ESMTP id DCD9113800F4; Wed, 3 Apr 2024 02:04:32 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute7.internal (MEProxy); Wed, 03 Apr 2024 02:04:32 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712124272; x=1712210672; bh=kn4ev0CKDc 8v6+HGzzmOini7k6ZVXrPCJ1+NHJ3Ah+M=; b=NQtJE09O2O9ORxvmfI8wCEy3RJ GeDrDmck3LlL/KxQfYfUyInxGYbg9YEAwmxngcu/Yatv7RX6NJMQYR8Nnp9wx92/ r7tNvynHqHn6rzKXz9OLOHPOIgIy1O26heLEaSFZCzd5owfZ64d+W36E1X0/kxQV +QF9zPg9Z6XscSB3ySCLCZRoGJEA7eWxZEta594iCKqJQR4ToWdftDXlxQsl0QbO qPqaXBGz5blwOKfx8FXXMutfMACu17ZYpAEpZfJwYAI8PKgnpMHXzhSz1Q9BCqZG 5a6mowWP9gqvrRAaMjmbbRepLG8qMG/aQAiXXltZ65MzR76VT9r4ZWWpvLmw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712124272; x=1712210672; bh=kn4ev0CKDc8v6+HGzzmOini7k6ZV XrPCJ1+NHJ3Ah+M=; b=F6fQVGr0kAu+cBR24osnV9rz/XIW1LP8FyRNrSIxqDIn NAwiO12ILlwinHWXSGHVK+v6t5a8TNJaHyMyl+AtjNQ1ou7FxI3haCL4AR/o8sLn 0uRiJmYL6rJpFv1+z8IJG6+0rZXHej4qtIPhrQPjSqpHAGmH6nwz+CBPk7tHyJrl 5zKU0MxWQ4uLFMix5Scka8+b87+YyFWAp0TTe2TiSnk57Zc7rBp1bBhnaudiHeVd pXJya9aUgdeV5ko3ES7nNiF6axBpoyRw1pPpeHvSe+S7NhYGycF0unoSZ2klkCml Pwo5IDwXIH0ZWOV2LBL/OkWPWU6OJ2LBATzkKdSdYg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudeffedguddtudcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpeffhffvvefukfhfgggtuggjsehgtderredttddvnecuhfhrohhmpefrrght rhhitghkucfuthgvihhnhhgrrhguthcuoehpshesphhkshdrihhmqeenucggtffrrghtth gvrhhnpeeukedtvedtffevleejtefgheehieegkeeluddvfeefgeehgfeltddtheejleff teenucevlhhushhtvghrufhiiigvpeefnecurfgrrhgrmhepmhgrihhlfhhrohhmpehpsh esphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Apr 2024 02:04:31 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 62d74181 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Apr 2024 06:04:22 +0000 (UTC) Date: Wed, 3 Apr 2024 08:04:29 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Justin Tobler Subject: [PATCH v4 7/7] reftable/block: avoid decoding keys when searching restart points Message-ID: <632f725dde7b7f2eda83d285af90c7d989f294dc.1712123093.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When searching over restart points in a block we decode the key of each of the records, which results in a memory allocation. This is quite pointless though given that records it restart points will never use prefix compression and thus store their keys verbatim in the block. Refactor the code so that we can avoid decoding the keys, which saves us some allocations. Signed-off-by: Patrick Steinhardt --- reftable/block.c | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index ca217636ae..298e8c56b9 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -287,23 +287,32 @@ static int restart_needle_less(size_t idx, void *_args) .buf = args->reader->block.data + off, .len = args->reader->block_len - off, }; - struct strbuf kth_restart_key = STRBUF_INIT; - uint8_t unused_extra; - int result, n; + uint64_t prefix_len, suffix_len; + uint8_t extra; + int n; /* - * TODO: The restart key is verbatim in the block, so we can in theory - * avoid decoding the key and thus save some allocations. + * Records at restart points are stored without prefix compression, so + * there is no need to fully decode the record key here. This removes + * the need for allocating memory. */ - n = reftable_decode_key(&kth_restart_key, &unused_extra, in); - if (n < 0) { + n = reftable_decode_keylen(in, &prefix_len, &suffix_len, &extra); + if (n < 0 || prefix_len) { args->error = 1; return -1; } - result = strbuf_cmp(&args->needle, &kth_restart_key); - strbuf_release(&kth_restart_key); - return result < 0; + string_view_consume(&in, n); + if (suffix_len > in.len) { + args->error = 1; + return -1; + } + + n = memcmp(args->needle.buf, in.buf, + args->needle.len < suffix_len ? args->needle.len : suffix_len); + if (n) + return n < 0; + return args->needle.len < suffix_len; } void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)