From patchwork Mon Apr 8 12:16:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621038 Received: from fout2-smtp.messagingengine.com (fout2-smtp.messagingengine.com [103.168.172.145]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 696B86CDBC for ; Mon, 8 Apr 2024 12:16:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.145 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578593; cv=none; b=u2Pn/NCSNSX2hfK6MaALaS40MsInTKXATuVA36c0QvNryPTNK473pAt3ecATPOdsYJUcHNpXsnBWjnWlxe7QvUNPVio084qXkXW0J9oEf460YmV/8b1/EEvZoMysXRLgOiTTCCRTMeBoAAbwFD3Z0Q4Far8vtcH8SVRdQ9OKk/w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578593; c=relaxed/simple; bh=cRwBXdp+TTnkxuWtqOrfl4PG3MNnA/Ma2F79eIrIxyQ=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=S/QzUgL3w9E1ghLTbvoINzfz4lk4xjvvk3kyVL3SjZyyvAiRwEBR63CKnJnAMsUgrkVDjUf7B3nS9BUxuQIYbFfYrC6JEJh5KZkVV4UcryPvcA340YruAh7j3rwYk9G6a75HuGBsPAnBLJ/J8w/nybRxySJoBLwT3M6hgUWsFcc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=G92pJ48n; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=nha6WSkr; arc=none smtp.client-ip=103.168.172.145 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="G92pJ48n"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="nha6WSkr" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailfout.nyi.internal (Postfix) with ESMTP id 7889213800BC; Mon, 8 Apr 2024 08:16:30 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute2.internal (MEProxy); Mon, 08 Apr 2024 08:16:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712578590; x=1712664990; bh=rYl9npi/Kk ASR8TCsOw8QkvKpQAQq1W79+txgBN3EwU=; b=G92pJ48niqZuE2bOHu+JnIvy4q c0uO8wCHrPG8GYj11zdejZdRziIcmnp+IeXcDlSW1H4fXXvQjcEQj7A1TMsKMUkP U/mHpnuV5dmcP2CLTsFOmRZ6G5gym7rg1Nj3nmM575Qa7Ld4FBAHLIiUe58hUQ5i 7+Q/1vx1E0LAkudRKMQR3/MSzAMCgTfPMc+s6KNFKeJEN+Hv40q1KnH9X//KjUom jQuGVE+K8HJ7C/WK+PG77qPxqxfUhthk4ZsgpK4ZnS7VVuc7PCnodISuDwWTouHj WWY5qtTqsnx+RAwbUc9dTTvklfBa7qADe24jh4J3FwQ510zppRqRPoHlTFAQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712578590; x=1712664990; bh=rYl9npi/KkASR8TCsOw8QkvKpQAQ q1W79+txgBN3EwU=; b=nha6WSkrRxWeD7cp3a5C9kleD/0RRTalmSjYOfFknUli J0d7ONnCWUMDTeyVZJswLwsWsamFARWG1rT8t5Q4Kk56vGz6wQsN3aAdQ8c4TyYm NNMTSRRm5O5keuwPJ8XKqqjvgYOypI4Ag5hh5EewxAni246Bc2UE76CvnvDiB6xh TVliae1gfJ4SqOhu8/tKHEesw66s1GzecPBRd8RP8R79aLl66+8vwpuZgR8azeho W+zw2KgTaBYCz7iZVgGFyjTf0g+xrARgRmx+x5nKTECFejeH6gmdImcldPQ7/LVZ TIU/IgUHeh6m+uxU+Rc/JyucFXRFNc9uUx8ykziFHw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedghedtucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgepudenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:16:29 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 24934707 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:16:23 +0000 (UTC) Date: Mon, 8 Apr 2024 14:16:26 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Karthik Nayak , Justin Tobler Subject: [PATCH v2 01/10] reftable/block: rename `block_reader_start()` Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The function `block_reader_start()` does not really apply to the block reader, but to the block iterator. It's name is thus somewhat confusing. Rename it to `block_iter_seek_start()` to clarify. We will rename `block_reader_seek()` in similar spirit in the next commit. Signed-off-by: Patrick Steinhardt --- reftable/block.c | 2 +- reftable/block.h | 2 +- reftable/block_test.c | 2 +- reftable/iter.c | 2 +- reftable/reader.c | 4 ++-- 5 files changed, 6 insertions(+), 6 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 298e8c56b9..53ea92d04e 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -266,7 +266,7 @@ static uint32_t block_reader_restart_offset(struct block_reader *br, int i) return get_be24(br->restart_bytes + 3 * i); } -void block_reader_start(struct block_reader *br, struct block_iter *it) +void block_iter_seek_start(struct block_iter *it, struct block_reader *br) { it->br = br; strbuf_reset(&it->last_key); diff --git a/reftable/block.h b/reftable/block.h index 47acc62c0a..44a9a8de7d 100644 --- a/reftable/block.h +++ b/reftable/block.h @@ -98,7 +98,7 @@ int block_reader_init(struct block_reader *br, struct reftable_block *bl, int hash_size); /* Position `it` at start of the block */ -void block_reader_start(struct block_reader *br, struct block_iter *it); +void block_iter_seek_start(struct block_iter *it, struct block_reader *br); /* Position `it` to the `want` key in the block */ int block_reader_seek(struct block_reader *br, struct block_iter *it, diff --git a/reftable/block_test.c b/reftable/block_test.c index e162c6e33f..a719be7c50 100644 --- a/reftable/block_test.c +++ b/reftable/block_test.c @@ -69,7 +69,7 @@ static void test_block_read_write(void) block_reader_init(&br, &block, header_off, block_size, GIT_SHA1_RAWSZ); - block_reader_start(&br, &it); + block_iter_seek_start(&it, &br); while (1) { int r = block_iter_next(&it, &rec); diff --git a/reftable/iter.c b/reftable/iter.c index 7aa30c4a51..aa9ac199b1 100644 --- a/reftable/iter.c +++ b/reftable/iter.c @@ -115,7 +115,7 @@ static int indexed_table_ref_iter_next_block(struct indexed_table_ref_iter *it) /* indexed block does not exist. */ return REFTABLE_FORMAT_ERROR; } - block_reader_start(&it->block_reader, &it->cur); + block_iter_seek_start(&it->cur, &it->block_reader); return 0; } diff --git a/reftable/reader.c b/reftable/reader.c index b113daab77..d7d47e8640 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -345,7 +345,7 @@ static int table_iter_next_block(struct table_iter *dest, *brp = br; dest->is_finished = 0; - block_reader_start(brp, &dest->bi); + block_iter_seek_start(&dest->bi, brp); } return 0; } @@ -429,7 +429,7 @@ static int reader_table_iter_at(struct reftable_reader *r, ti->r = r; ti->typ = block_reader_type(brp); ti->block_off = off; - block_reader_start(brp, &ti->bi); + block_iter_seek_start(&ti->bi, brp); return 0; } From patchwork Mon Apr 8 12:16:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621039 Received: from fhigh6-smtp.messagingengine.com (fhigh6-smtp.messagingengine.com [103.168.172.157]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DB17A6E61B for ; Mon, 8 Apr 2024 12:16:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.157 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578597; cv=none; b=mAXxrWgsHe4woBgeJab901GuJkoZ9LcBa1nXWv+zrCRPa0igpvh10gWVCY4YRexTi+qWxEvaP/flYo0HQ9abpd54muuIedhS36SIRbR53BZvVzwaCTT1tGitDxxi3HxmnXTXPzAtQIZGRKXoRD/MD6T+AxTBimVv7Lq273M5dOc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578597; c=relaxed/simple; bh=5WjwbG8avMh+0Gv4XE7sqUqBLXTY2IUuc6aU/pQYonE=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=SDdYLxW2qvXKNoO5dTVvGlnZL8BWTag5kCXvnj25Lj9Zi+rBS5Zm0QIAa+E2sugYKa02eNBQnnuMcc3JAOw261N5mQrS34WX3GGQQiEojbHPI/p6ZWa4Li+XeeXUzXyXtgEXiCzneZvl4gZZ08Hn5NE0JsaRTL4v2BhiG+NUJy0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=j+7aTqdn; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=ZeBr2zNH; arc=none smtp.client-ip=103.168.172.157 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="j+7aTqdn"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="ZeBr2zNH" Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailfhigh.nyi.internal (Postfix) with ESMTP id BCA851140095; Mon, 8 Apr 2024 08:16:34 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Mon, 08 Apr 2024 08:16:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712578594; x=1712664994; bh=Tzgy1xmN9Z 42XI6TaI6KZxBSCDjcNLTNSVtKcZT0f3A=; b=j+7aTqdngUbUi/7Q1XJ/XgW4OW qbY1J+z3yMaccKds18LaEjOni/I5MAdZ1EPNadt3r5JTG3v5fJDq6CWbU0HvGDqs mPAgQP/3knf2+qqv+XFX8BTsltaSw8a+BLFRhH7rdUZ5iMo7bCKCsjhTtvGWUres kM11hFuq/R/yaMtDO+k07gsBopSqFa0DcQQ+XZnPI8BYYNtAIS9RvAwzc09KN4Mz 0CoEZDSZMcek7rgT/DnzpD+Lv2tjLsLvgl4Tuu1l73X/bTJP3X3e6U59tsA2v9C3 OGp1Su8yFx5OfJHvakSodGl6xGSYLLebGp6jdcb2RO/6pxxmmDGYFO8x5Htg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712578594; x=1712664994; bh=Tzgy1xmN9Z42XI6TaI6KZxBSCDjc NLTNSVtKcZT0f3A=; b=ZeBr2zNHna4KK0mIedqHKsYHMQ3+oiufivl/KUWGEe57 a+FyCTixGQSHauHOf5FNXVHftHWr0axQjc47tzlYnmOCZizEEBwxOTs8XaM516P0 6MYD85JmosquWz8KvgPyVMUa2vSeRg/JxolulN7CoVh5hmaup4pF8QwMYiwE+D3t 1otJ44JzccyVJ9nfV1FQ70naQ0GSzgbI2n9OfvbYp2stDOlqiEA2qinnIlhh1jsk drBaMFXyPKVmW7viHDU5S9oV7fYMogaVoBVS6bP8d21WYU8hbG14SjDWGtzgDtnR Ke7wolBljNRn1xFUg5gjeJmeBVKuwhUSop5qQpbX1g== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedghedtucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:16:33 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 925ad9ec (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:16:27 +0000 (UTC) Date: Mon, 8 Apr 2024 14:16:31 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Karthik Nayak , Justin Tobler Subject: [PATCH v2 02/10] reftable/block: merge `block_iter_seek()` and `block_reader_seek()` Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The function `block_iter_seek()` is merely a simple wrapper around `block_reader_seek()`. Merge those two functions into a new function `block_iter_seek_key()` that more clearly says what it is actually doing. Signed-off-by: Patrick Steinhardt --- reftable/block.c | 9 ++------- reftable/block.h | 7 ++----- reftable/block_test.c | 4 ++-- reftable/reader.c | 4 ++-- 4 files changed, 8 insertions(+), 16 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 53ea92d04e..1015f9c04c 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -373,19 +373,14 @@ int block_reader_first_key(struct block_reader *br, struct strbuf *key) return 0; } -int block_iter_seek(struct block_iter *it, struct strbuf *want) -{ - return block_reader_seek(it->br, it, want); -} - void block_iter_close(struct block_iter *it) { strbuf_release(&it->last_key); strbuf_release(&it->scratch); } -int block_reader_seek(struct block_reader *br, struct block_iter *it, - struct strbuf *want) +int block_iter_seek_key(struct block_iter *it, struct block_reader *br, + struct strbuf *want) { struct restart_needle_less_args args = { .needle = *want, diff --git a/reftable/block.h b/reftable/block.h index 44a9a8de7d..1734bee917 100644 --- a/reftable/block.h +++ b/reftable/block.h @@ -101,8 +101,8 @@ int block_reader_init(struct block_reader *br, struct reftable_block *bl, void block_iter_seek_start(struct block_iter *it, struct block_reader *br); /* Position `it` to the `want` key in the block */ -int block_reader_seek(struct block_reader *br, struct block_iter *it, - struct strbuf *want); +int block_iter_seek_key(struct block_iter *it, struct block_reader *br, + struct strbuf *want); /* Returns the block type (eg. 'r' for refs) */ uint8_t block_reader_type(struct block_reader *r); @@ -115,9 +115,6 @@ void block_iter_copy_from(struct block_iter *dest, struct block_iter *src); /* return < 0 for error, 0 for OK, > 0 for EOF. */ int block_iter_next(struct block_iter *it, struct reftable_record *rec); -/* Seek to `want` with in the block pointed to by `it` */ -int block_iter_seek(struct block_iter *it, struct strbuf *want); - /* deallocate memory for `it`. The block reader and its block is left intact. */ void block_iter_close(struct block_iter *it); diff --git a/reftable/block_test.c b/reftable/block_test.c index a719be7c50..26a9cfbc83 100644 --- a/reftable/block_test.c +++ b/reftable/block_test.c @@ -89,7 +89,7 @@ static void test_block_read_write(void) strbuf_reset(&want); strbuf_addstr(&want, names[i]); - n = block_reader_seek(&br, &it, &want); + n = block_iter_seek_key(&it, &br, &want); EXPECT(n == 0); n = block_iter_next(&it, &rec); @@ -98,7 +98,7 @@ static void test_block_read_write(void) EXPECT_STREQ(names[i], rec.u.ref.refname); want.len--; - n = block_reader_seek(&br, &it, &want); + n = block_iter_seek_key(&it, &br, &want); EXPECT(n == 0); n = block_iter_next(&it, &rec); diff --git a/reftable/reader.c b/reftable/reader.c index d7d47e8640..f70efa2b7c 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -483,7 +483,7 @@ static int reader_seek_linear(struct table_iter *ti, table_iter_copy_from(ti, &next); } - err = block_iter_seek(&ti->bi, &want_key); + err = block_iter_seek_key(&ti->bi, ti->bi.br, &want_key); if (err < 0) goto done; err = 0; @@ -558,7 +558,7 @@ static int reader_seek_indexed(struct reftable_reader *r, if (err != 0) goto done; - err = block_iter_seek(&next.bi, &want_index.u.idx.last_key); + err = block_iter_seek_key(&next.bi, next.bi.br, &want_index.u.idx.last_key); if (err < 0) goto done; From patchwork Mon Apr 8 12:16:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621040 Received: from fhigh6-smtp.messagingengine.com (fhigh6-smtp.messagingengine.com [103.168.172.157]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B3B426F06F for ; Mon, 8 Apr 2024 12:16:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.157 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578609; cv=none; b=gk+ywFA1AIIitNwuIvrKSYdNUTmJOiU4JfVrRuZOX0PC2FfxZe4ZGxQkhzrHAf/YPqpd+Rrdtbl7YUQFHPFZVIUM0Vsq+zOJLdHzMPIcLpbhekmjPZtILhhVhTDRhpdvAaiem1wpTJW4DF1wTdS0hIwzZf608quMKcl4ws6SEJM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578609; c=relaxed/simple; bh=OWzlA5T/ToejWvSLsC9ULIRA8ZeqqrMQzUuJbyP4Rlg=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=dsRvQXKAQYfdQFdJoiA3QJJ+eq6/55zOTR0yVL+0fBYWG4eLDxPEttFLTsSTBcLpXXUgzBhdwM/ojLC0tc5pHuNgvs8zhuu4GWlk5FvLHCswyoOcyj+QrVqC6iXpQd7pNS4Vx8qx33oIQzbnTrG7HrS0ls09wPxHem/8w+JaiS0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=m5T+QlEx; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=rjaz6Bja; arc=none smtp.client-ip=103.168.172.157 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="m5T+QlEx"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="rjaz6Bja" Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailfhigh.nyi.internal (Postfix) with ESMTP id EFB3411400FA; Mon, 8 Apr 2024 08:16:39 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute1.internal (MEProxy); Mon, 08 Apr 2024 08:16:39 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712578599; x=1712664999; bh=J6tQzUs5Ts 7q0/HAs/mPjLVTDGd9EKAfWFLz9em0AQY=; b=m5T+QlExpYZylArJfDuWheu0RP DM3M332HqX00fqVjwAu8wgTor0v3UO9TxBQXn9wgWkav8Aqx9V8wDHgt30U5mxRQ bETg3b+BfoQAdawXom5oLFmOTYmPZzY54oGFGledAVqiUS3uTGeNqkereOvplkvK n2vhGIysKZBAKZMgiYDs39BoOQsQVYc+/EXGraqVeONxXXqKwrpE41KDkkGGI4Oz zrWbTxE2I4itVVdw7pGfBmvb1Eg+XJJ7t/WVlcXezJu8LLTXZfcvBS8ZNOksDpyY uqLm0IPEiw2e+R8OSdBg5RHd/k+KMs+Sx09ivDVVBHt2KXzqlXpzrr08RvAA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712578599; x=1712664999; bh=J6tQzUs5Ts7q0/HAs/mPjLVTDGd9 EKAfWFLz9em0AQY=; b=rjaz6Bjad0L/tfXanUZ2G+x5ocpkH8HGNr/rc40vCUFF acAVjsXXX/6ERpxt3S1DHIqMckfiUqsYvHnp1MQf8Cwuynx+x+8h0a7K2+n3bgiC eTENehu/r/S4l0u9vQL6JPf44OB0pXqOiwYRp9WRFQVhSP/IuZg5FeDdwxeOub8x +gE5WTHb0kDW1LxkWordyDIiRp4cDgpiT4kjBkUfkMARsNCrDLPvZd1Nhi9sk0Sw ORkZWNw2ZEv9TAlVvDKOGylaP/fKPdj/Ne2Ocdtmj5F44T73WTqsgwJ801B0E5f+ y/D5HdHrQtBVCENtXDPNa4v+FRuFgmBxD1zuaEvOPQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedggeelucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:16:38 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id a2ee439a (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:16:32 +0000 (UTC) Date: Mon, 8 Apr 2024 14:16:36 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Karthik Nayak , Justin Tobler Subject: [PATCH v2 03/10] reftable/block: better grouping of functions Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Function definitions and declaration of `struct block_reader` and `struct block_iter` are somewhat mixed up, making it hard to see which functions belong together. Rearrange them. Signed-off-by: Patrick Steinhardt --- reftable/block.c | 50 ++++++++++++++++++++++++------------------------ reftable/block.h | 22 ++++++++++----------- 2 files changed, 36 insertions(+), 36 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 1015f9c04c..e65453e11b 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -175,11 +175,6 @@ int block_writer_finish(struct block_writer *w) return w->next; } -uint8_t block_reader_type(struct block_reader *r) -{ - return r->block.data[r->header_off]; -} - int block_reader_init(struct block_reader *br, struct reftable_block *block, uint32_t header_off, uint32_t table_block_size, int hash_size) @@ -261,6 +256,31 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, return err; } +uint8_t block_reader_type(struct block_reader *r) +{ + return r->block.data[r->header_off]; +} + +int block_reader_first_key(struct block_reader *br, struct strbuf *key) +{ + int off = br->header_off + 4, n; + struct string_view in = { + .buf = br->block.data + off, + .len = br->block_len - off, + }; + uint8_t extra = 0; + + strbuf_reset(key); + + n = reftable_decode_key(key, &extra, in); + if (n < 0) + return n; + if (!key->len) + return REFTABLE_FORMAT_ERROR; + + return 0; +} + static uint32_t block_reader_restart_offset(struct block_reader *br, int i) { return get_be24(br->restart_bytes + 3 * i); @@ -353,26 +373,6 @@ int block_iter_next(struct block_iter *it, struct reftable_record *rec) return 0; } -int block_reader_first_key(struct block_reader *br, struct strbuf *key) -{ - int off = br->header_off + 4, n; - struct string_view in = { - .buf = br->block.data + off, - .len = br->block_len - off, - }; - uint8_t extra = 0; - - strbuf_reset(key); - - n = reftable_decode_key(key, &extra, in); - if (n < 0) - return n; - if (!key->len) - return REFTABLE_FORMAT_ERROR; - - return 0; -} - void block_iter_close(struct block_iter *it) { strbuf_release(&it->last_key); diff --git a/reftable/block.h b/reftable/block.h index 1734bee917..d73ed73549 100644 --- a/reftable/block.h +++ b/reftable/block.h @@ -76,6 +76,17 @@ struct block_reader { uint32_t full_block_size; }; +/* initializes a block reader. */ +int block_reader_init(struct block_reader *br, struct reftable_block *bl, + uint32_t header_off, uint32_t table_block_size, + int hash_size); + +/* Returns the block type (eg. 'r' for refs) */ +uint8_t block_reader_type(struct block_reader *r); + +/* Decodes the first key in the block */ +int block_reader_first_key(struct block_reader *br, struct strbuf *key); + /* Iterate over entries in a block */ struct block_iter { /* offset within the block of the next entry to read. */ @@ -92,11 +103,6 @@ struct block_iter { .scratch = STRBUF_INIT, \ } -/* initializes a block reader. */ -int block_reader_init(struct block_reader *br, struct reftable_block *bl, - uint32_t header_off, uint32_t table_block_size, - int hash_size); - /* Position `it` at start of the block */ void block_iter_seek_start(struct block_iter *it, struct block_reader *br); @@ -104,12 +110,6 @@ void block_iter_seek_start(struct block_iter *it, struct block_reader *br); int block_iter_seek_key(struct block_iter *it, struct block_reader *br, struct strbuf *want); -/* Returns the block type (eg. 'r' for refs) */ -uint8_t block_reader_type(struct block_reader *r); - -/* Decodes the first key in the block */ -int block_reader_first_key(struct block_reader *br, struct strbuf *key); - void block_iter_copy_from(struct block_iter *dest, struct block_iter *src); /* return < 0 for error, 0 for OK, > 0 for EOF. */ From patchwork Mon Apr 8 12:16:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621041 Received: from fhigh6-smtp.messagingengine.com (fhigh6-smtp.messagingengine.com [103.168.172.157]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E57EC6A8A3 for ; Mon, 8 Apr 2024 12:16:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.157 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578619; cv=none; b=qu7q8rUjbRgebXhfGK5mHhUrCXkz9KldZPvIDb4mZFLDsmd3KRGECkjtuEWJSxOjeWEfIMKypbHDgIJnT+JvrwlUovVmg3N8t9DRy1Yko/EMXD5dN85hEnYN4laQ6yAGLVLbi6Seh1dunCkVInaEXGzkUiPIZQ/0OF/hpZOfrP4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578619; c=relaxed/simple; bh=HzRr0J6eFAjXeVqNXsGRR6aGFJP55W0WvuXfjTZCejE=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=JCiZIPas5l3TTm4zWae0kMz4jbGA4XVJ8/sTKupYt1q5NOYgTnltbaAwh6+UjEvpSPZ0B3zYMMBsN6q1rXA7h1QEhce3RijWBjA62cgfQVgjI34aoJdHKIzX0MqmKAwB+bEWDq0yy0D3a6F0RvbVFnhmCbx/QmwJM2/DiOCXxWQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=cIff7wFi; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=GZpL6gL2; arc=none smtp.client-ip=103.168.172.157 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="cIff7wFi"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="GZpL6gL2" Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 3CB671140077; Mon, 8 Apr 2024 08:16:44 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Mon, 08 Apr 2024 08:16:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712578604; x=1712665004; bh=MpVXaBDT+9 GHdv574Gar3EYI1kJvGsH4TGuUo3QnhzA=; b=cIff7wFiZSv+IzQ8MxLOvoaxvl WZorVgvnJyBnA1hwGt7s1N1DCTJH3/jv33Qqew1JGLo5IqaK5kgntaF3D++Fmj7h l/9v4T8E55DlFCTF7RNtXWMo5IFXqA8IPCLjUC2D9A+jpNz7BlE3saK0VLYZWlt6 lzeEZPWSlMZOjNw2F3JRJ7gcNhxyu/omdZmtJH7KlyKapX0h7KEQt+8JCH7M/08d 1ZUnkipCKjv+Yt2DE4TcJJJwZJ065bMcGoe/pmE8s8ucIejmw8mWHzUT/k/Bv9iS +XNgANf9SNbjuHzEFczkhc8cgYz2IzRj3l/hLdUYplWf3biiDc7BQU67EgnQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712578604; x=1712665004; bh=MpVXaBDT+9GHdv574Gar3EYI1kJv GsH4TGuUo3QnhzA=; b=GZpL6gL2vlCvCJpQWFWC0WjRH9OOHpM3NpmErFTV6u30 zvGF4f9e49FNssxuBRF+tK9hJyPEawXHJkaAV0zl2bg35kt66b3LVSstGlRoEEtf gT9HmxszRVBEe1mAFY/kj1wVrrdVfv5226IFfNdIHZjAVhteVIGGPsTd7KdXH1sZ +mNc2yrhxF00YhVvkm2QI090AReBqpuClYaaG1vijwKOlemQv1OpITRVP+YOust0 F3836ZVQH2bGnTpQ7EZoIWLIDu9BTxChqOkYMZdprlbcFD2bakfhk6RZ5759FgUS 8/OGAN8s6IbK53QHQKpB+yeRI+3+7w4+URirDsH7gw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedghedtucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgepudenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:16:43 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 9d347e97 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:16:37 +0000 (UTC) Date: Mon, 8 Apr 2024 14:16:40 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Karthik Nayak , Justin Tobler Subject: [PATCH v2 04/10] reftable/block: introduce `block_reader_release()` Message-ID: <35f1bf50729efcee7b9162ffe79d40142418c0cf.1712578376.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Introduce a new function `block_reader_release()` that releases resources acquired by the block reader. This function will be extended in a subsequent commit. Signed-off-by: Patrick Steinhardt --- reftable/block.c | 5 +++++ reftable/block.h | 2 ++ reftable/reader.c | 2 +- 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/reftable/block.c b/reftable/block.c index e65453e11b..fe836c21e5 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -256,6 +256,11 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, return err; } +void block_reader_release(struct block_reader *br) +{ + reftable_block_done(&br->block); +} + uint8_t block_reader_type(struct block_reader *r) { return r->block.data[r->header_off]; diff --git a/reftable/block.h b/reftable/block.h index d73ed73549..601a1e0e89 100644 --- a/reftable/block.h +++ b/reftable/block.h @@ -81,6 +81,8 @@ int block_reader_init(struct block_reader *br, struct reftable_block *bl, uint32_t header_off, uint32_t table_block_size, int hash_size); +void block_reader_release(struct block_reader *br); + /* Returns the block type (eg. 'r' for refs) */ uint8_t block_reader_type(struct block_reader *r); diff --git a/reftable/reader.c b/reftable/reader.c index f70efa2b7c..f925570bf3 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -253,7 +253,7 @@ static void table_iter_block_done(struct table_iter *ti) if (!ti->bi.br) { return; } - reftable_block_done(&ti->bi.br->block); + block_reader_release(ti->bi.br); FREE_AND_NULL(ti->bi.br); ti->bi.last_key.len = 0; From patchwork Mon Apr 8 12:16:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621042 Received: from fout2-smtp.messagingengine.com (fout2-smtp.messagingengine.com [103.168.172.145]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AC9096BFC8 for ; Mon, 8 Apr 2024 12:16:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.145 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578619; cv=none; b=uBZ88haNaI3N/wlITFy21spY/So5sh9sZOqa68eNYJEbFsRaIpIu9mGwwcVBztht3y8AGixhb4VJ0+KBtfQgFK3p+JgfBobhDtTopRFoi2NvYe2HuYUjeVkmsbGXbWQAl3PUKfznR14cvVbi3P3lwNXA5lkirpuumIBrhEKb4ms= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578619; c=relaxed/simple; bh=Y9AwtJaAaJbBxyvdIfLEQTnLgRaanNPhferN9uIbtr8=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=eTLA9mOjASW5APmxqpfwOw3Zg5PYZU+BO4dqypxGTbRVNJiU/X4SuSlnxgUA0zva4tx7iTFAzd8vUOEU822BfuuvwXfpxCHM0SOm0e9z63ihkC0Kkt84BKQ16Nq9A0s/HUXWRpDckxrQ0Z1AXdAAjXH3QufUA39TcTqlVJVd72g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=DAjvR6y6; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=YwcpIO7T; arc=none smtp.client-ip=103.168.172.145 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="DAjvR6y6"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="YwcpIO7T" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailfout.nyi.internal (Postfix) with ESMTP id 795E713800D2; Mon, 8 Apr 2024 08:16:49 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute2.internal (MEProxy); Mon, 08 Apr 2024 08:16:49 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712578609; x=1712665009; bh=GD+ap4Zx0O E9K9ZF9Tf3KGoCkp0crN8L/HV7LQ/0528=; b=DAjvR6y6JnQ91CdJivFx45OCHO y/N/faqgVlAIt0P/Biisz5kFXkh77SiNKKKsNOK6vPAcfgk+i2kT6qbANcGKpkzC RWSmil6Mlwnmpf2R6rTC2QTHhrn7OhD5PG7AAN3RMZ6/lM7jM9czY1InKEt5uQDz dKrvMN6fB2KPM7g01zNVdQ6jUtlGysDIe3LyR4wivhetgmTUEgkSoFbrjmhg6+Q9 Q0JeNLG1rX7Gm2r48/1syEKKF7PSrujK4Xpu7uTSOeOSRVCXWtRquGr57HHVeSvn fQHBKp9aB3fVj6fSM2Fr9n73O3P1uauz0OC3iHsijOAGeZlKDU80S5TxO/Og== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712578609; x=1712665009; bh=GD+ap4Zx0OE9K9ZF9Tf3KGoCkp0c rN8L/HV7LQ/0528=; b=YwcpIO7TqNQoKgYBisOP99Bnt+1ovR5fl3C7I2cuuQT/ y1SOln4nVBzPuUo2DMsTB44sJY7qdeaV+G0jyZFhZLtnblMpFB7jDcfJr/kiRB3Q A8pud5QuNDKc5oK4zrtKLReERsf1otYPFju09lMiGnUheaRKy/p2af+GR73rhZNK vc67xFbE4UWHcfhGrt1WaFoIKi9XFIRsYnHB368BU5GpCZhX3JU4/NSNIfS/wKqY wUO5n2M/Kx3yhdXV/98TMpsrqDFIrtKg6Y6pfpSRqoEuMo36fmfVtds7PDld2p3b 5OFMo7UW1sq2GPdLgD1a5Ib1NBYoteqW89glJ+B5TQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedghedtucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgepvdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:16:48 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 1cca5479 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:16:41 +0000 (UTC) Date: Mon, 8 Apr 2024 14:16:45 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Karthik Nayak , Justin Tobler Subject: [PATCH v2 05/10] reftable/block: move ownership of block reader into `struct table_iter` Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The table iterator allows the caller to iterate through all records in a reftable table. To do so it iterates through all blocks of the desired type one by one, where for each block it creates a new block iterator and yields all its entries. One of the things that is somewhat confusing in this context is who owns the block reader that is being used to read the blocks and pass them to the block iterator. Intuitively, as the table iterator is responsible for iterating through the blocks, one would assume that this iterator is also responsible for managing the lifecycle of the reader. And while it somewhat is, the block reader is ultimately stored inside of the block iterator. Refactor the code such that the block reader is instead fully managed by the table iterator. Instead of passing the reader to the block iterator, we now only end up passing the block data to it. Despite clearing up the lifecycle of the reader, it will also allow for better reuse of the reader in subsequent patches. The following benchmark prints a single matching ref out of 1 million refs. Before: HEAP SUMMARY: in use at exit: 13,603 bytes in 125 blocks total heap usage: 6,607 allocs, 6,482 frees, 509,635 bytes allocated After: HEAP SUMMARY: in use at exit: 13,603 bytes in 125 blocks total heap usage: 7,235 allocs, 7,110 frees, 301,481 bytes allocated Note that while there are more allocation and free calls now, the overall number of bytes allocated is significantly lower. The number of allocations will be reduced significantly by the next patch though. Signed-off-by: Patrick Steinhardt --- reftable/block.c | 43 ++++++++++------ reftable/block.h | 17 ++++--- reftable/reader.c | 123 ++++++++++++++++++++++------------------------ 3 files changed, 100 insertions(+), 83 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index fe836c21e5..2d8d0668b3 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -261,12 +261,12 @@ void block_reader_release(struct block_reader *br) reftable_block_done(&br->block); } -uint8_t block_reader_type(struct block_reader *r) +uint8_t block_reader_type(const struct block_reader *r) { return r->block.data[r->header_off]; } -int block_reader_first_key(struct block_reader *br, struct strbuf *key) +int block_reader_first_key(const struct block_reader *br, struct strbuf *key) { int off = br->header_off + 4, n; struct string_view in = { @@ -286,14 +286,16 @@ int block_reader_first_key(struct block_reader *br, struct strbuf *key) return 0; } -static uint32_t block_reader_restart_offset(struct block_reader *br, int i) +static uint32_t block_reader_restart_offset(const struct block_reader *br, int i) { return get_be24(br->restart_bytes + 3 * i); } -void block_iter_seek_start(struct block_iter *it, struct block_reader *br) +void block_iter_seek_start(struct block_iter *it, const struct block_reader *br) { - it->br = br; + it->block = br->block.data; + it->block_len = br->block_len; + it->hash_size = br->hash_size; strbuf_reset(&it->last_key); it->next_off = br->header_off + 4; } @@ -301,7 +303,7 @@ void block_iter_seek_start(struct block_iter *it, struct block_reader *br) struct restart_needle_less_args { int error; struct strbuf needle; - struct block_reader *reader; + const struct block_reader *reader; }; static int restart_needle_less(size_t idx, void *_args) @@ -340,9 +342,11 @@ static int restart_needle_less(size_t idx, void *_args) return args->needle.len < suffix_len; } -void block_iter_copy_from(struct block_iter *dest, struct block_iter *src) +void block_iter_copy_from(struct block_iter *dest, const struct block_iter *src) { - dest->br = src->br; + dest->block = src->block; + dest->block_len = src->block_len; + dest->hash_size = src->hash_size; dest->next_off = src->next_off; strbuf_reset(&dest->last_key); strbuf_addbuf(&dest->last_key, &src->last_key); @@ -351,14 +355,14 @@ void block_iter_copy_from(struct block_iter *dest, struct block_iter *src) int block_iter_next(struct block_iter *it, struct reftable_record *rec) { struct string_view in = { - .buf = it->br->block.data + it->next_off, - .len = it->br->block_len - it->next_off, + .buf = (unsigned char *) it->block + it->next_off, + .len = it->block_len - it->next_off, }; struct string_view start = in; uint8_t extra = 0; int n = 0; - if (it->next_off >= it->br->block_len) + if (it->next_off >= it->block_len) return 1; n = reftable_decode_key(&it->last_key, &extra, in); @@ -368,7 +372,7 @@ int block_iter_next(struct block_iter *it, struct reftable_record *rec) return REFTABLE_FORMAT_ERROR; string_view_consume(&in, n); - n = reftable_record_decode(rec, it->last_key, extra, in, it->br->hash_size, + n = reftable_record_decode(rec, it->last_key, extra, in, it->hash_size, &it->scratch); if (n < 0) return -1; @@ -378,13 +382,22 @@ int block_iter_next(struct block_iter *it, struct reftable_record *rec) return 0; } +void block_iter_reset(struct block_iter *it) +{ + strbuf_reset(&it->last_key); + it->next_off = 0; + it->block = NULL; + it->block_len = 0; + it->hash_size = 0; +} + void block_iter_close(struct block_iter *it) { strbuf_release(&it->last_key); strbuf_release(&it->scratch); } -int block_iter_seek_key(struct block_iter *it, struct block_reader *br, +int block_iter_seek_key(struct block_iter *it, const struct block_reader *br, struct strbuf *want) { struct restart_needle_less_args args = { @@ -436,7 +449,9 @@ int block_iter_seek_key(struct block_iter *it, struct block_reader *br, it->next_off = block_reader_restart_offset(br, i - 1); else it->next_off = br->header_off + 4; - it->br = br; + it->block = br->block.data; + it->block_len = br->block_len; + it->hash_size = br->hash_size; reftable_record_init(&rec, block_reader_type(br)); diff --git a/reftable/block.h b/reftable/block.h index 601a1e0e89..d733d45ee0 100644 --- a/reftable/block.h +++ b/reftable/block.h @@ -84,16 +84,18 @@ int block_reader_init(struct block_reader *br, struct reftable_block *bl, void block_reader_release(struct block_reader *br); /* Returns the block type (eg. 'r' for refs) */ -uint8_t block_reader_type(struct block_reader *r); +uint8_t block_reader_type(const struct block_reader *r); /* Decodes the first key in the block */ -int block_reader_first_key(struct block_reader *br, struct strbuf *key); +int block_reader_first_key(const struct block_reader *br, struct strbuf *key); /* Iterate over entries in a block */ struct block_iter { /* offset within the block of the next entry to read. */ uint32_t next_off; - struct block_reader *br; + const unsigned char *block; + size_t block_len; + int hash_size; /* key for last entry we read. */ struct strbuf last_key; @@ -106,17 +108,20 @@ struct block_iter { } /* Position `it` at start of the block */ -void block_iter_seek_start(struct block_iter *it, struct block_reader *br); +void block_iter_seek_start(struct block_iter *it, const struct block_reader *br); /* Position `it` to the `want` key in the block */ -int block_iter_seek_key(struct block_iter *it, struct block_reader *br, +int block_iter_seek_key(struct block_iter *it, const struct block_reader *br, struct strbuf *want); -void block_iter_copy_from(struct block_iter *dest, struct block_iter *src); +void block_iter_copy_from(struct block_iter *dest, const struct block_iter *src); /* return < 0 for error, 0 for OK, > 0 for EOF. */ int block_iter_next(struct block_iter *it, struct reftable_record *rec); +/* Reset the block iterator to pristine state without releasing its memory. */ +void block_iter_reset(struct block_iter *it); + /* deallocate memory for `it`. The block reader and its block is left intact. */ void block_iter_close(struct block_iter *it); diff --git a/reftable/reader.c b/reftable/reader.c index f925570bf3..b77b639751 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -220,6 +220,7 @@ struct table_iter { struct reftable_reader *r; uint8_t typ; uint64_t block_off; + struct block_reader br; struct block_iter bi; int is_finished; }; @@ -227,16 +228,6 @@ struct table_iter { .bi = BLOCK_ITER_INIT \ } -static void table_iter_copy_from(struct table_iter *dest, - struct table_iter *src) -{ - dest->r = src->r; - dest->typ = src->typ; - dest->block_off = src->block_off; - dest->is_finished = src->is_finished; - block_iter_copy_from(&dest->bi, &src->bi); -} - static int table_iter_next_in_block(struct table_iter *ti, struct reftable_record *rec) { @@ -250,14 +241,8 @@ static int table_iter_next_in_block(struct table_iter *ti, static void table_iter_block_done(struct table_iter *ti) { - if (!ti->bi.br) { - return; - } - block_reader_release(ti->bi.br); - FREE_AND_NULL(ti->bi.br); - - ti->bi.last_key.len = 0; - ti->bi.next_off = 0; + block_reader_release(&ti->br); + block_iter_reset(&ti->bi); } static int32_t extract_block_size(uint8_t *data, uint8_t *typ, uint64_t off, @@ -321,32 +306,33 @@ int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br, return err; } +static void table_iter_close(struct table_iter *ti) +{ + table_iter_block_done(ti); + block_iter_close(&ti->bi); +} + static int table_iter_next_block(struct table_iter *dest, struct table_iter *src) { - uint64_t next_block_off = src->block_off + src->bi.br->full_block_size; - struct block_reader br = { 0 }; - int err = 0; + uint64_t next_block_off = src->block_off + src->br.full_block_size; + int err; dest->r = src->r; dest->typ = src->typ; dest->block_off = next_block_off; - err = reader_init_block_reader(src->r, &br, next_block_off, src->typ); - if (err > 0) { + err = reader_init_block_reader(src->r, &dest->br, next_block_off, src->typ); + if (err > 0) dest->is_finished = 1; - return 1; - } - if (err != 0) + if (err) { + table_iter_block_done(dest); return err; - else { - struct block_reader *brp = - reftable_malloc(sizeof(struct block_reader)); - *brp = br; - - dest->is_finished = 0; - block_iter_seek_start(&dest->bi, brp); } + + dest->is_finished = 0; + block_iter_seek_start(&dest->bi, &dest->br); + return 0; } @@ -377,14 +363,13 @@ static int table_iter_next(struct table_iter *ti, struct reftable_record *rec) * iterator is drained. */ err = table_iter_next_block(&next, ti); - table_iter_block_done(ti); if (err) { ti->is_finished = 1; return err; } - table_iter_copy_from(ti, &next); - block_iter_close(&next.bi); + table_iter_close(ti); + *ti = next; } } @@ -393,16 +378,14 @@ static int table_iter_next_void(void *ti, struct reftable_record *rec) return table_iter_next(ti, rec); } -static void table_iter_close(void *p) +static void table_iter_close_void(void *ti) { - struct table_iter *ti = p; - table_iter_block_done(ti); - block_iter_close(&ti->bi); + table_iter_close(ti); } static struct reftable_iterator_vtable table_iter_vtable = { .next = &table_iter_next_void, - .close = &table_iter_close, + .close = &table_iter_close_void, }; static void iterator_from_table_iter(struct reftable_iterator *it, @@ -417,19 +400,16 @@ static int reader_table_iter_at(struct reftable_reader *r, struct table_iter *ti, uint64_t off, uint8_t typ) { - struct block_reader br = { 0 }; - struct block_reader *brp = NULL; + int err; - int err = reader_init_block_reader(r, &br, off, typ); + err = reader_init_block_reader(r, &ti->br, off, typ); if (err != 0) return err; - brp = reftable_malloc(sizeof(struct block_reader)); - *brp = br; ti->r = r; - ti->typ = block_reader_type(brp); + ti->typ = block_reader_type(&ti->br); ti->block_off = off; - block_iter_seek_start(&ti->bi, brp); + block_iter_seek_start(&ti->bi, &ti->br); return 0; } @@ -454,23 +434,34 @@ static int reader_seek_linear(struct table_iter *ti, { struct strbuf want_key = STRBUF_INIT; struct strbuf got_key = STRBUF_INIT; - struct table_iter next = TABLE_ITER_INIT; struct reftable_record rec; int err = -1; reftable_record_init(&rec, reftable_record_type(want)); reftable_record_key(want, &want_key); + /* + * First we need to locate the block that must contain our record. To + * do so we scan through blocks linearly until we find the first block + * whose first key is bigger than our wanted key. Once we have found + * that block we know that the key must be contained in the preceding + * block. + * + * This algorithm is somewhat unfortunate because it means that we + * always have to seek one block too far and then back up. But as we + * can only decode the _first_ key of a block but not its _last_ key we + * have no other way to do this. + */ while (1) { + struct table_iter next = TABLE_ITER_INIT; + err = table_iter_next_block(&next, ti); if (err < 0) goto done; - - if (err > 0) { + if (err > 0) break; - } - err = block_reader_first_key(next.bi.br, &got_key); + err = block_reader_first_key(&next.br, &got_key); if (err < 0) goto done; @@ -480,16 +471,20 @@ static int reader_seek_linear(struct table_iter *ti, } table_iter_block_done(ti); - table_iter_copy_from(ti, &next); + *ti = next; } - err = block_iter_seek_key(&ti->bi, ti->bi.br, &want_key); + /* + * We have located the block that must contain our record, so we seek + * the wanted key inside of it. If the block does not contain our key + * we know that the corresponding record does not exist. + */ + err = block_iter_seek_key(&ti->bi, &ti->br, &want_key); if (err < 0) goto done; err = 0; done: - block_iter_close(&next.bi); reftable_record_release(&rec); strbuf_release(&want_key); strbuf_release(&got_key); @@ -508,6 +503,7 @@ static int reader_seek_indexed(struct reftable_reader *r, .u.idx = { .last_key = STRBUF_INIT }, }; struct table_iter index_iter = TABLE_ITER_INIT; + struct table_iter empty = TABLE_ITER_INIT; struct table_iter next = TABLE_ITER_INIT; int err = 0; @@ -549,7 +545,6 @@ static int reader_seek_indexed(struct reftable_reader *r, * not exist. */ err = table_iter_next(&index_iter, &index_result); - table_iter_block_done(&index_iter); if (err != 0) goto done; @@ -558,7 +553,7 @@ static int reader_seek_indexed(struct reftable_reader *r, if (err != 0) goto done; - err = block_iter_seek_key(&next.bi, next.bi.br, &want_index.u.idx.last_key); + err = block_iter_seek_key(&next.bi, &next.br, &want_index.u.idx.last_key); if (err < 0) goto done; @@ -572,18 +567,20 @@ static int reader_seek_indexed(struct reftable_reader *r, break; } - table_iter_copy_from(&index_iter, &next); + table_iter_close(&index_iter); + index_iter = next; + next = empty; } if (err == 0) { - struct table_iter empty = TABLE_ITER_INIT; struct table_iter *malloced = reftable_calloc(1, sizeof(*malloced)); - *malloced = empty; - table_iter_copy_from(malloced, &next); + *malloced = next; + next = empty; iterator_from_table_iter(it, malloced); } + done: - block_iter_close(&next.bi); + table_iter_close(&next); table_iter_close(&index_iter); reftable_record_release(&want_index); reftable_record_release(&index_result); From patchwork Mon Apr 8 12:16:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621043 Received: from fout2-smtp.messagingengine.com (fout2-smtp.messagingengine.com [103.168.172.145]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CBBF669E07 for ; Mon, 8 Apr 2024 12:16:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.145 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578625; cv=none; b=EArcfOBGU9dCALH7CHjo6KGuCf2hbekos52v7ysNK8z4O23k8NqYoPknBe8sraXtJ2qDsH45avwX6CTtxvF5c5zR+yS435Mu2SXkCXov+yHpr9JyeK8l9S55R4VpE9VP6z23nJhrcbG8wt8OKhQuVu2RIN0l22utnLgeGAmudMg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578625; c=relaxed/simple; bh=h9Zu2MfSCa9Ira5lzqVQb93eQXrCsJf+u+6L/+AcZ+0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=o7ehJeC3j0Y/hk3kNOwovSzgOfWmQOkfNqkG4ykS8HxDejc9Eui2c3kPfVRJ4YPAPj+RHtkdUtcvrNmPXx3iND5ISDQh6uQQg9QVo7KrJ7DJyG5vXmLFlqirWa+IG3Wt8+r9jxQtS3FJW3HG+E+tbzta6tw/YVDcdqRVo+4xODc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=gWJEVc1t; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=f1VqdeUR; arc=none smtp.client-ip=103.168.172.145 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="gWJEVc1t"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="f1VqdeUR" Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailfout.nyi.internal (Postfix) with ESMTP id B7C8A13800BE; Mon, 8 Apr 2024 08:16:53 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Mon, 08 Apr 2024 08:16:53 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712578613; x=1712665013; bh=BFVtMKp35k 760b6c1cdfhLBgYF4lXpHCRKE9A8LAmVI=; b=gWJEVc1tnvGVS51mP4S7aCxYH8 QpstrEVI/HLRkTCW+SWfoXDLKYg0q5lbPzppMlyFqt/WYCUdU6Gtq7RhPz38Mzgu 27EXUZxemq7Sqn71gR7AwzfLzZliELthveuuGUgxsdCL5RF+UCS+ESb+Jq3kTHvH QIMHCcfNIig/iUF9nQBi8CqXo5URdrdcRFqG90+pnTkiZL42qxEXOGGZyIpyWp6X Ds+7qDGDT0597U9SqkL22OxGcCLUGNgpzUaVMOPr1aNgHs8hypykBljf3JLuY2jv dEhiphxc1haG592I6tEo2XPkTwPz09vUiI4HdNTcUpPjwE8nkkVDyhc/JGgQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712578613; x=1712665013; bh=BFVtMKp35k760b6c1cdfhLBgYF4l XpHCRKE9A8LAmVI=; b=f1VqdeUR4Q2be+jT8Uk9PzWFBqdhoiUOhbYleNT7pRHs JR2DpHI6MkhtLQG7MvRkURcURk9tJ2hIiHyyHef48SE5qJNWeCBxjSe2+p9uNTq7 BcXARFW5LFgC5yLUZnWnHE8I6j6nhEHdB6tw7rJYoCtUwZh95iKnMrDlMOA34eFA ZUG1HWIw8rbZYWMWxBchhEwMepAo/8jEpi20IZ3JCpM5xdTIxkAH0maG0bvLv9wC ArumNN7PyOqAc4LzRTZ0fhcuYIKS+wrnqqoHXResYpNqjTXlO8DGIE9EbFOzVqkW WHnmnWjqZO4irz6VIFcsRcoVwaSmT1Kt4YgekHYDYQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedghedtucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgepvdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:16:52 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 7733ce13 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:16:46 +0000 (UTC) Date: Mon, 8 Apr 2024 14:16:50 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Karthik Nayak , Justin Tobler Subject: [PATCH v2 06/10] reftable/reader: iterate to next block in place Message-ID: <685f0a40bc5ac6a8c3eaab78bac2050c0cad8de4.1712578376.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The table iterator has to iterate towards the next block once it has yielded all records of the current block. This is done by creating a new table iterator, initializing it to the next block, releasing the old iterator and then copying over the data. Refactor the code to instead advance the table iterator in place. This is simpler and unlocks some optimizations in subsequent patches. Also, it allows us to avoid some allocations. The following measurements show a single matching ref out of 1 million refs. Before this change: HEAP SUMMARY: in use at exit: 13,603 bytes in 125 blocks total heap usage: 7,235 allocs, 7,110 frees, 301,481 bytes allocated After: HEAP SUMMARY: in use at exit: 13,603 bytes in 125 blocks total heap usage: 315 allocs, 190 frees, 107,027 bytes allocated Signed-off-by: Patrick Steinhardt --- reftable/block.c | 2 ++ reftable/reader.c | 47 ++++++++++++++++++++++++++--------------------- 2 files changed, 28 insertions(+), 21 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 2d8d0668b3..0c4e71eae3 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -188,6 +188,8 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, uint8_t *restart_bytes = NULL; uint8_t *uncompressed = NULL; + reftable_block_done(&br->block); + if (!reftable_is_block_type(typ)) { err = REFTABLE_FORMAT_ERROR; goto done; diff --git a/reftable/reader.c b/reftable/reader.c index b77b639751..dd4de294a1 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -312,26 +312,20 @@ static void table_iter_close(struct table_iter *ti) block_iter_close(&ti->bi); } -static int table_iter_next_block(struct table_iter *dest, - struct table_iter *src) +static int table_iter_next_block(struct table_iter *ti) { - uint64_t next_block_off = src->block_off + src->br.full_block_size; + uint64_t next_block_off = ti->block_off + ti->br.full_block_size; int err; - dest->r = src->r; - dest->typ = src->typ; - dest->block_off = next_block_off; - - err = reader_init_block_reader(src->r, &dest->br, next_block_off, src->typ); + err = reader_init_block_reader(ti->r, &ti->br, next_block_off, ti->typ); if (err > 0) - dest->is_finished = 1; - if (err) { - table_iter_block_done(dest); + ti->is_finished = 1; + if (err) return err; - } - dest->is_finished = 0; - block_iter_seek_start(&dest->bi, &dest->br); + ti->block_off = next_block_off; + ti->is_finished = 0; + block_iter_seek_start(&ti->bi, &ti->br); return 0; } @@ -342,7 +336,6 @@ static int table_iter_next(struct table_iter *ti, struct reftable_record *rec) return REFTABLE_API_ERROR; while (1) { - struct table_iter next = TABLE_ITER_INIT; int err; if (ti->is_finished) @@ -362,14 +355,11 @@ static int table_iter_next(struct table_iter *ti, struct reftable_record *rec) * table and retry. If there are no more blocks then the * iterator is drained. */ - err = table_iter_next_block(&next, ti); + err = table_iter_next_block(ti); if (err) { ti->is_finished = 1; return err; } - - table_iter_close(ti); - *ti = next; } } @@ -453,9 +443,24 @@ static int reader_seek_linear(struct table_iter *ti, * have no other way to do this. */ while (1) { - struct table_iter next = TABLE_ITER_INIT; + struct table_iter next = *ti; + + /* + * We must be careful to not modify underlying data of `ti` + * because we may find that `next` does not contain our desired + * block, but that `ti` does. In that case, we would discard + * `next` and continue with `ti`. + * + * This also means that we cannot reuse allocated memory for + * `next` here. While it would be great if we could, it should + * in practice not be too bad given that we should only ever + * end up doing linear seeks with at most three blocks. As soon + * as we have more than three blocks we would have an index, so + * we would not do a linear search there anymore. + */ + memset(&next.br.block, 0, sizeof(next.br.block)); - err = table_iter_next_block(&next, ti); + err = table_iter_next_block(&next); if (err < 0) goto done; if (err > 0) From patchwork Mon Apr 8 12:16:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621044 Received: from fout2-smtp.messagingengine.com (fout2-smtp.messagingengine.com [103.168.172.145]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1B68B1EB39 for ; Mon, 8 Apr 2024 12:17:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.145 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578629; cv=none; b=Vua8o001u0rk+xGhVF6KIm1Ss/xhnpDTNATFaGvD8fZenyUPbZOwg7EUH3ni9Avy6O0O7pz5Ey60nVX5MRFL7ngfVQsXUFTc67NCy9VTYX22WihlQpw45fT+EXDiI2SiqLTAJlpplqKpLBfnnY2ubD6ae13CyqYQGTYoQ4z/Qsc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578629; c=relaxed/simple; bh=r2uKbPDbZBw6e13hvfMb2qXCcMsdh/ftFIkxDUrQUy4=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=WW9c2Y+I14QLWSFvgo7NDs7Lp9/NwaUGS/fS1PnIG+Wa8Ua8gSgDlSA6PoU+mPPy0y+EYGRl+IiF3uephIPf90gG1NdIdSy1/n9gI1vnwhDN5VrljUNhRD3csMGdMMSvy0/FZGeRRWEyIPoLx3CvNlU1oaKylpXlL2O1DoMeH3w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=A0OKx6Hi; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=njG0pJ4p; arc=none smtp.client-ip=103.168.172.145 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="A0OKx6Hi"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="njG0pJ4p" Received: from compute7.internal (compute7.nyi.internal [10.202.2.48]) by mailfout.nyi.internal (Postfix) with ESMTP id 0B65F13800BC; Mon, 8 Apr 2024 08:16:58 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute7.internal (MEProxy); Mon, 08 Apr 2024 08:16:58 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712578618; x=1712665018; bh=/Q08PZGlQu Znk3wJICWdNzvnFyAcIJu196rXv+KL7F0=; b=A0OKx6Hi6RjqJE97vaKXK8NSaD isD8Yp2wf2pGhRV5ObBgkQBQ0mxdb8EpYfZuoGZRaglRWh1mCMWZDHNSx6dZp6fu kQI2NnDLxKNEe81Xnod1psOixq01LrBwVt2LrDDPu1L4KTbKZ4ZFz5tpopue+y2t MATmxpBtbLll6G6kXS/i8+WlAo8uDGC0vueW5Ec0L6yZ5TZfXfYqB2SXIs9N4fXI RGulZcDm1VRiwKBpd/MQT8gbHTj1sHxnbfY/bqCFPp77wWg3CrKI1zKFtyeH1671 Ukat0rA6n2VApU7HoEiNkXVIp+JzWUsiXmCQ9Oq83a0zWYTs0/Ym5CB8OzJQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712578618; x=1712665018; bh=/Q08PZGlQuZnk3wJICWdNzvnFyAc IJu196rXv+KL7F0=; b=njG0pJ4pt68nzhFCqPJtv7EjwZQb4j0H9CWMfekctVQb If5oyYNcwtFAYHhbP6Ofe5GjADSbKZ5PKLXseC6n+SrAzrnF9ak7NWBlqejFtvj+ m8zKPvubTPs2DORmdQJaECwqBwL0ZW7Oh5qzczTLvL8p6epuZwIWFNeAOPQBMTv0 dhpb/2dpDFmJXbxoPeLLH+Jo4FiltAQMzyuh7iJfGmw4K+oYvKZ5H8XEUVJ9Juex ZkubkjQLp/C5/KG8TKtuXWZqeuoqnj29ah0Q4k+1CuGMK1fleHaDujl+xa9YhveI Bjs+rLQicIj4KnQ69kyT94miokqnGTZL3z+H5ykQSw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedghedtucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:16:56 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id faeb18ad (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:16:51 +0000 (UTC) Date: Mon, 8 Apr 2024 14:16:54 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Karthik Nayak , Justin Tobler Subject: [PATCH v2 07/10] reftable/block: reuse uncompressed blocks Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The reftable backend stores reflog entries in a compressed format and thus needs to uncompress blocks before one can read records from it. For each reflog block we thus have to allocate an array that we can decompress the block contents into. This block is being discarded whenever the table iterator moves to the next block. Consequently, we reallocate a new array on every block, which is quite wasteful. Refactor the code to reuse the uncompressed block data when moving the block reader to a new block. This significantly reduces the number of allocations when iterating through many compressed blocks. The following measurements are done with `git reflog list` when listing 100k reflogs. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 45,755 allocs, 45,633 frees, 254,779,456 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 23,028 allocs, 22,906 frees, 162,813,547 bytes allocated Signed-off-by: Patrick Steinhardt --- reftable/block.c | 14 ++++++-------- reftable/block.h | 4 ++++ reftable/reader.c | 27 ++++++++++++++++----------- 3 files changed, 26 insertions(+), 19 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 0c4e71eae3..9460273290 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -186,7 +186,6 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, uint16_t restart_count = 0; uint32_t restart_start = 0; uint8_t *restart_bytes = NULL; - uint8_t *uncompressed = NULL; reftable_block_done(&br->block); @@ -202,14 +201,15 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, uLongf src_len = block->len - block_header_skip; /* Log blocks specify the *uncompressed* size in their header. */ - REFTABLE_ALLOC_ARRAY(uncompressed, sz); + REFTABLE_ALLOC_GROW(br->uncompressed_data, sz, + br->uncompressed_cap); /* Copy over the block header verbatim. It's not compressed. */ - memcpy(uncompressed, block->data, block_header_skip); + memcpy(br->uncompressed_data, block->data, block_header_skip); /* Uncompress */ if (Z_OK != - uncompress2(uncompressed + block_header_skip, &dst_len, + uncompress2(br->uncompressed_data + block_header_skip, &dst_len, block->data + block_header_skip, &src_len)) { err = REFTABLE_ZLIB_ERROR; goto done; @@ -222,10 +222,8 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, /* We're done with the input data. */ reftable_block_done(block); - block->data = uncompressed; - uncompressed = NULL; + block->data = br->uncompressed_data; block->len = sz; - block->source = malloc_block_source(); full_block_size = src_len + block_header_skip; } else if (full_block_size == 0) { full_block_size = sz; @@ -254,12 +252,12 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, br->restart_bytes = restart_bytes; done: - reftable_free(uncompressed); return err; } void block_reader_release(struct block_reader *br) { + reftable_free(br->uncompressed_data); reftable_block_done(&br->block); } diff --git a/reftable/block.h b/reftable/block.h index d733d45ee0..12414eb642 100644 --- a/reftable/block.h +++ b/reftable/block.h @@ -66,6 +66,10 @@ struct block_reader { struct reftable_block block; int hash_size; + /* Uncompressed data for log entries. */ + unsigned char *uncompressed_data; + size_t uncompressed_cap; + /* size of the data, excluding restart data. */ uint32_t block_len; uint8_t *restart_bytes; diff --git a/reftable/reader.c b/reftable/reader.c index dd4de294a1..aacd5f1337 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -459,6 +459,8 @@ static int reader_seek_linear(struct table_iter *ti, * we would not do a linear search there anymore. */ memset(&next.br.block, 0, sizeof(next.br.block)); + next.br.uncompressed_data = NULL; + next.br.uncompressed_cap = 0; err = table_iter_next_block(&next); if (err < 0) @@ -599,25 +601,28 @@ static int reader_seek_internal(struct reftable_reader *r, struct reftable_reader_offsets *offs = reader_offsets_for(r, reftable_record_type(rec)); uint64_t idx = offs->index_offset; - struct table_iter ti = TABLE_ITER_INIT; - int err = 0; + struct table_iter ti = TABLE_ITER_INIT, *p; + int err; + if (idx > 0) return reader_seek_indexed(r, it, rec); err = reader_start(r, &ti, reftable_record_type(rec), 0); if (err < 0) - return err; + goto out; + err = reader_seek_linear(&ti, rec); if (err < 0) - return err; - else { - struct table_iter *p = - reftable_malloc(sizeof(struct table_iter)); - *p = ti; - iterator_from_table_iter(it, p); - } + goto out; - return 0; + REFTABLE_ALLOC_ARRAY(p, 1); + *p = ti; + iterator_from_table_iter(it, p); + +out: + if (err) + table_iter_close(&ti); + return err; } static int reader_seek(struct reftable_reader *r, struct reftable_iterator *it, From patchwork Mon Apr 8 12:16:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621046 Received: from fhigh6-smtp.messagingengine.com (fhigh6-smtp.messagingengine.com [103.168.172.157]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AEF486AFB9 for ; Mon, 8 Apr 2024 12:17:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.157 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578639; cv=none; b=g7awlxaYNEUcuP+w0pYmg+c64FqnbFnptxpBM10mWDzBn0UQ0yQfeY5Jm52HMFP1BwGnBpT1U+V89FZQ8j4Vb7yAZC/Dd5k3FNU/yHz5Z4O+cuNDgUmwNjBDpE26imqLZYcI01Itw6N6yuRKZ77IuhYFaU72n5iAHxwGPFHdRk4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578639; c=relaxed/simple; bh=1UEgWNZnr9G4511/3cJBBdqCDUCMT9TgB+6HvJtmbgA=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Cm4MgaPTx2BG6hn27IVjSey/rzXcLHX+ptlUc1rPjPB8UwHP8lJChjj25WkSCY2CgfPHf0upgV3dTaAAxD1tVsYs97wE3u1lufprxdJX11+PvVS9GwDyKlykA4eVz0Hx3LOaJU5YcLUsVMKyzbQ5AhPL3exMbCind6Jtu04fa8Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=XKC7H7WR; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=l1/7/fk3; arc=none smtp.client-ip=103.168.172.157 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="XKC7H7WR"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="l1/7/fk3" Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 65E161140102; Mon, 8 Apr 2024 08:17:03 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute1.internal (MEProxy); Mon, 08 Apr 2024 08:17:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712578623; x=1712665023; bh=wEKJIAfU57 wpqnBkP3PTUTptULU4fSXfGYx2Oz2y7J4=; b=XKC7H7WREtkTVffpLFbCWZgD3R Rx4pnRwXQDtgVfG3M4fqba8vRQ8wmO8naGUPk1yZJniTVBfr3DY8JI7R3QPE4uNz kKsFl7zTiCKy6YktfIz/1oi2ZcwXcJ6U3pp8WVFM5NrgmC4y5qfkbDftR/RL3U0Y tNdGatDVGZpir13akWTNiQsoE/BLFPm+r8k2ioDpOc3b9i3llDs/9cY0+uov+Rh4 TVa8Oi10mJvmLOi8nVH/leIilNy1qenkYKCz1KkzbeaBHkVK+B6A0BeGeRePJXOf ye4WsvM2tk1ZPm2m/5su9y+ptPjWEza0BzNgU5+56hCGPRVOBVHWQMpF6huQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712578623; x=1712665023; bh=wEKJIAfU57wpqnBkP3PTUTptULU4 fSXfGYx2Oz2y7J4=; b=l1/7/fk3ieFJanyX3sCK6oXEgxjORBFuBztxONzyeed7 5eRmfqGdf8yNrhFLhawmd8BPTp6M2MlyrPLP0IpfJx3ke/h+SEtQyxECb2CpXPET zV3RQjPeCj8d7O3PnNcWeUOmPf5OEJagLGX886sqTZXaEIrldeQOmGfbAIJfex1A B1ngu5wnw53z4KlNXhHjTJJPnZ5nQn5H3cQCB4aPAwHvS+QjIpOVTi0fi6k5/4ww fx5qBpP1X15yexLVf9M1u4xeDVWCWGH5vG0/HLAC1QNEYQibmEkq0PKCXFWim+/G HEhdnwbgTT8xDqshLX1iqGHaVHuvyow7ysNOUew1hw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedggeelucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgepudenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:17:02 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id af7e23ec (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:16:55 +0000 (UTC) Date: Mon, 8 Apr 2024 14:16:59 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Karthik Nayak , Justin Tobler Subject: [PATCH v2 08/10] reftable/block: open-code call to `uncompress2()` Message-ID: <6635c7b986559e0684a3c5d374bb3460aeaa0084.1712578376.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The reftable format stores log blocks in a compressed format. Thus, whenever we want to read such a block we first need to decompress it. This is done by calling the convenience function `uncompress2()` of the zlib library, which is a simple wrapper that manages the lifecycle of the `zstream` structure for us. While nice for one-off inflation of data, when iterating through reflogs we will likely end up inflating many such log blocks. This requires us to reallocate the state of the `zstream` every single time, which adds up over time. It would thus be great to reuse the `zstream` instead of discarding it after every inflation. Open-code the call to `uncompress2()` such that we can start reusing the `zstream` in the subsequent commit. Note that our open-coded variant is different from `uncompress2()` in two ways: - We do not loop around `inflate()` until we have processed all input. As our input is limited by the maximum block size, which is 16MB, we should not hit limits of `inflate()`. - We use `Z_FINISH` instead of `Z_NO_FLUSH`. Quoting the `inflate()` documentation: "inflate() should normally be called until it returns Z_STREAM_END or an error. However if all decompression is to be performed in a single step (a single call of inflate), the parameter flush should be set to Z_FINISH." Furthermore, "Z_FINISH also informs inflate to not maintain a sliding window if the stream completes, which reduces inflate's memory footprint." Other than that this commit is expected to be functionally equivalent and does not yet reuse the `zstream`. Signed-off-by: Patrick Steinhardt --- reftable/block.c | 38 ++++++++++++++++++++++++++++---------- 1 file changed, 28 insertions(+), 10 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 9460273290..435922b569 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -195,10 +195,10 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, } if (typ == BLOCK_TYPE_LOG) { - int block_header_skip = 4 + header_off; - uLongf dst_len = sz - block_header_skip; /* total size of dest - buffer. */ - uLongf src_len = block->len - block_header_skip; + uint32_t block_header_skip = 4 + header_off; + uLong dst_len = sz - block_header_skip; + uLong src_len = block->len - block_header_skip; + z_stream stream = {0}; /* Log blocks specify the *uncompressed* size in their header. */ REFTABLE_ALLOC_GROW(br->uncompressed_data, sz, @@ -207,15 +207,33 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, /* Copy over the block header verbatim. It's not compressed. */ memcpy(br->uncompressed_data, block->data, block_header_skip); - /* Uncompress */ - if (Z_OK != - uncompress2(br->uncompressed_data + block_header_skip, &dst_len, - block->data + block_header_skip, &src_len)) { + err = inflateInit(&stream); + if (err != Z_OK) { err = REFTABLE_ZLIB_ERROR; goto done; } - if (dst_len + block_header_skip != sz) { + stream.next_in = block->data + block_header_skip; + stream.avail_in = src_len; + stream.next_out = br->uncompressed_data + block_header_skip; + stream.avail_out = dst_len; + + /* + * We know both input as well as output size, and we know that + * the sizes should never be bigger than `uInt_MAX` because + * blocks can at most be 16MB large. We can thus use `Z_FINISH` + * here to instruct zlib to inflate the data in one go, which + * is more efficient than using `Z_NO_FLUSH`. + */ + err = inflate(&stream, Z_FINISH); + inflateEnd(&stream); + if (err != Z_STREAM_END) { + err = REFTABLE_ZLIB_ERROR; + goto done; + } + err = 0; + + if (stream.total_out + block_header_skip != sz) { err = REFTABLE_FORMAT_ERROR; goto done; } @@ -224,7 +242,7 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, reftable_block_done(block); block->data = br->uncompressed_data; block->len = sz; - full_block_size = src_len + block_header_skip; + full_block_size = src_len + block_header_skip - stream.avail_in; } else if (full_block_size == 0) { full_block_size = sz; } else if (sz < full_block_size && sz < block->len && From patchwork Mon Apr 8 12:17:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621047 Received: from fout2-smtp.messagingengine.com (fout2-smtp.messagingengine.com [103.168.172.145]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3914A6E61E for ; Mon, 8 Apr 2024 12:17:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.145 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578638; cv=none; b=rgDgCtf1vF/kehwB2LJWZAEwJaFgZ1hKXS/7v6GHQ2gC7xpejhyTeHQ+3vQmCbCH3vIfScxnHnoL81O9oeDBpz8Kv5npiaz9pItHvKV5ekPG3WvGtRAJmChbbC2XGUca1FjB2CHQUT4kZoBk/aNwQT+yDziE1EqMzmoVdN+uVBM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578638; c=relaxed/simple; bh=ImNBMPFLQf8y3OfW+NCLr717VJ2AFGbDeNQQpwFiSnA=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=ooTd1mNZ+5s8coOxcNmW6PgS7ZTicAQjIa1tdv6m9pUfH7s3XISpVxKZ68U+RXVSpiTMkWA/6ZCqmxF66gcPyTGJba0qLhXBsXeLp8PpeALmx+TKi5orcBoOnHWPP141v6zGCt+bjTsZe1Tggn2LUjo+54+Afrvj47ZKPJhDpwE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=KNx5sHmf; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=rcvnEKRP; arc=none smtp.client-ip=103.168.172.145 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="KNx5sHmf"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="rcvnEKRP" Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailfout.nyi.internal (Postfix) with ESMTP id A692413800FB; Mon, 8 Apr 2024 08:17:07 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Mon, 08 Apr 2024 08:17:07 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712578627; x=1712665027; bh=1Vi5Jga9Pn UJOTCViUQmQae+eorRDZowKA43a11mwhY=; b=KNx5sHmfz5qgERdehOpDd9bNG3 Eq+e/Tq5K9EpRsQUQUynpPn8yQuINSXxuAQcBZmamLSplDddMZNJ0vCRf4GxfFlf lvcfXkzNimxWUt18Ni2nScg6UfkYmXJlEJuto8+aZ32AIvFrhENCW+snbzYZFRd9 xqsJtDAbu4OskLcpqDbyFItv/eSAM9owmFDVNmIbYzpvSHizx0khXjPul5kcrRJ7 Ov0VNZ0Hzaw2zLyLe64yUv94PYC8WDpBNHszhEFk3rxbBqkoLkjVl1fephtQWux2 854x9pJcOR6xw1iqFmUrXe2wTCffvqZwpQbm7uL0dFAIRdhvNoc7ncq8bHbg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712578627; x=1712665027; bh=1Vi5Jga9PnUJOTCViUQmQae+eorR DZowKA43a11mwhY=; b=rcvnEKRPRrErb7jriUr1WhuNndB1AzXELWWlAZ1/eEnn z6dccY/h1L0gwspc/gYfu7PT2bySOrffXLwi4OOeqRtyEX/G7KjFk+DhMs5JxXJ2 1y5FPXOg2tpe2g1LESIeeJwKABtXOjlySP9hOyCF6jSPht09k4DA73QL0d029OPe 26dqdgWnHitwWQkhF9yozXENs2P7GFCI4FSmOpA4LK/iTG89gL4fPduEzy9IV6xf X4mn3WOmDYg5iq/+zUipXJ6p23qOP4eE9iQS5RcNpWfWJx8SUPFmECCxajBcFS9I e4NeaPSj0tLmQFQzgITDv5sGMFBW/nIrb+lyRGu6yw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedghedtucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgepfeenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:17:06 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 790c45c3 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:17:00 +0000 (UTC) Date: Mon, 8 Apr 2024 14:17:04 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Karthik Nayak , Justin Tobler Subject: [PATCH v2 09/10] reftable/block: reuse `zstream` state on inflation Message-ID: <587b5601c0c51a8f041258a27e693a2590468eed.1712578376.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When calling `inflateInit()` and `inflate()`, the zlib library will allocate several data structures for the underlying `zstream` to keep track of various information. Thus, when inflating repeatedly, it is possible to optimize memory allocation patterns by reusing the `zstream` and then calling `inflateReset()` on it to prepare it for the next chunk of data to inflate. This is exactly what the reftable code is doing: when iterating through reflogs we need to potentially inflate many log blocks, but we discard the `zstream` every single time. Instead, as we reuse the `block_reader` for each of the blocks anyway, we can initialize the `zstream` once and then reuse it for subsequent inflations. Refactor the code to do so, which leads to a significant reduction in the number of allocations. The following measurements were done when iterating through 1 million reflog entries. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 23,028 allocs, 22,906 frees, 162,813,552 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 302 allocs, 180 frees, 88,352 bytes allocated Signed-off-by: Patrick Steinhardt --- reftable/block.c | 25 +++++++++++++++---------- reftable/block.h | 3 +++ reftable/reader.c | 1 + 3 files changed, 19 insertions(+), 10 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 435922b569..c6c4a68ea1 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -198,7 +198,6 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, uint32_t block_header_skip = 4 + header_off; uLong dst_len = sz - block_header_skip; uLong src_len = block->len - block_header_skip; - z_stream stream = {0}; /* Log blocks specify the *uncompressed* size in their header. */ REFTABLE_ALLOC_GROW(br->uncompressed_data, sz, @@ -207,16 +206,21 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, /* Copy over the block header verbatim. It's not compressed. */ memcpy(br->uncompressed_data, block->data, block_header_skip); - err = inflateInit(&stream); + if (!br->zstream) { + REFTABLE_CALLOC_ARRAY(br->zstream, 1); + err = inflateInit(br->zstream); + } else { + err = inflateReset(br->zstream); + } if (err != Z_OK) { err = REFTABLE_ZLIB_ERROR; goto done; } - stream.next_in = block->data + block_header_skip; - stream.avail_in = src_len; - stream.next_out = br->uncompressed_data + block_header_skip; - stream.avail_out = dst_len; + br->zstream->next_in = block->data + block_header_skip; + br->zstream->avail_in = src_len; + br->zstream->next_out = br->uncompressed_data + block_header_skip; + br->zstream->avail_out = dst_len; /* * We know both input as well as output size, and we know that @@ -225,15 +229,14 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, * here to instruct zlib to inflate the data in one go, which * is more efficient than using `Z_NO_FLUSH`. */ - err = inflate(&stream, Z_FINISH); - inflateEnd(&stream); + err = inflate(br->zstream, Z_FINISH); if (err != Z_STREAM_END) { err = REFTABLE_ZLIB_ERROR; goto done; } err = 0; - if (stream.total_out + block_header_skip != sz) { + if (br->zstream->total_out + block_header_skip != sz) { err = REFTABLE_FORMAT_ERROR; goto done; } @@ -242,7 +245,7 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, reftable_block_done(block); block->data = br->uncompressed_data; block->len = sz; - full_block_size = src_len + block_header_skip - stream.avail_in; + full_block_size = src_len + block_header_skip - br->zstream->avail_in; } else if (full_block_size == 0) { full_block_size = sz; } else if (sz < full_block_size && sz < block->len && @@ -275,6 +278,8 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, void block_reader_release(struct block_reader *br) { + inflateEnd(br->zstream); + reftable_free(br->zstream); reftable_free(br->uncompressed_data); reftable_block_done(&br->block); } diff --git a/reftable/block.h b/reftable/block.h index 12414eb642..c1bd1892cb 100644 --- a/reftable/block.h +++ b/reftable/block.h @@ -56,6 +56,8 @@ int block_writer_finish(struct block_writer *w); /* clears out internally allocated block_writer members. */ void block_writer_release(struct block_writer *bw); +struct z_stream; + /* Read a block. */ struct block_reader { /* offset of the block header; nonzero for the first block in a @@ -67,6 +69,7 @@ struct block_reader { int hash_size; /* Uncompressed data for log entries. */ + z_stream *zstream; unsigned char *uncompressed_data; size_t uncompressed_cap; diff --git a/reftable/reader.c b/reftable/reader.c index aacd5f1337..481dff10d4 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -459,6 +459,7 @@ static int reader_seek_linear(struct table_iter *ti, * we would not do a linear search there anymore. */ memset(&next.br.block, 0, sizeof(next.br.block)); + next.br.zstream = NULL; next.br.uncompressed_data = NULL; next.br.uncompressed_cap = 0; From patchwork Mon Apr 8 12:17:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621045 Received: from fhigh6-smtp.messagingengine.com (fhigh6-smtp.messagingengine.com [103.168.172.157]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5298F2AF0F for ; Mon, 8 Apr 2024 12:17:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.157 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578639; cv=none; b=avHzoMzNiPsyzw8WhoIMiyvAKrHIVCtipZCjPLU8BIUuOmFgRkOnEaihkzvp+TmDZwMBRmpwpvGF8lLik+yQxIWDyRa7HkdZDWtf8rVYdSW/fsOZOMCj9/irlJvmAgybbPzLtVwQBmDoNhA2WAouYOhJZRUXJ7k6jlYIU76U3FA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712578639; c=relaxed/simple; bh=NUCfYJNFSAPiD5SdeKaNwEP6IPrYN+qdv3eAnbCtz3Y=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=BPVOlraQWZycHnyW12Bcsftxazq3KvsKoCGwwOA4Sw0R8gN38OLSbecqQjJPUy6KVDjGNojF3vGzhRBmMQIBFz++CUtmwI6mhRjQUM/YMd4XJ4ehaorYHxXIP+RSkko6PxETUims6Gpg2D0+XhvcfWt/fRoJOVQWJHs3bjye/ds= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=DNxJjwLK; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=ljBcfuUF; arc=none smtp.client-ip=103.168.172.157 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="DNxJjwLK"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="ljBcfuUF" Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailfhigh.nyi.internal (Postfix) with ESMTP id CB96311400F9; Mon, 8 Apr 2024 08:17:11 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute5.internal (MEProxy); Mon, 08 Apr 2024 08:17:11 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712578631; x=1712665031; bh=jma2GOiNkp rpT2NaL0euVdJd//RSQxhROsTGTdurSkA=; b=DNxJjwLKPfKkmVo115nzsPkbfd tMpFqZtS+CoLAInyFlKZsRkWJcupn8Bd4kA2o6c+8JiCI3MiQIVrY+Of7H+4EQ8H d9Z5JLvbt3e04ov9zY9gdCf0jUisRpAtuHciRUIyyWOiTKkCA6Qpn5kAWLWHEbhf y8ZFhXjfMVNsxTDduwUTcGCgZvsSQ3AkY5+XeE+TYcBANHkF0GD1LVlBFS/fGA6G ZBjCcJsBdl0MJ/PsRDT6X3AqBzSyRTFaWBd8vFsN9c6LU8B9U1vPERK88h3naZp7 Xz9URnuAWvHcCvw2PZQlZ/ukoc1/hugCs5pDUOGMOMYHjo5OVBaNeTJlGQ/g== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712578631; x=1712665031; bh=jma2GOiNkprpT2NaL0euVdJd//RS QxhROsTGTdurSkA=; b=ljBcfuUFRBz0ruWdI7wjOTAoD0dJDwg77VV076QwHQl0 +QXTCKxc3GyoKunOyfl0nqsSRetoiomqfjbSVrrowqC/srvvKa4JQtBcqz47rEs3 cduZMjSgjuZC+FDvsbKxAG5ry3Z+EJcn22ah/DiL7l/SwlDFmt/47s5tUxiX7wnd LIu8Ni/iHenYlUcaoOG3Y559h02ND9Yo8rXwJ95Nj9u7g1DS85e4ocQQqdExhd/M uKO9lFOhXk1iPlNhcM9kg3wWO0lRH14/mxhOg5Aja1mbkefLizLs8FNtMxOANuAR S87bBG8xYVg22Pu8zs4bpfbFTO66+1Z1MaqYc24c2g== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedghedtucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtjeenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepteeuvefhhfdufedvgeeiueeileegtdfhgeeftdeuveejjedtgfejhedujeeutddu necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:17:10 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 168ff877 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:17:04 +0000 (UTC) Date: Mon, 8 Apr 2024 14:17:08 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Karthik Nayak , Justin Tobler Subject: [PATCH v2 10/10] reftable/block: avoid copying block iterators on seek Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When seeking a reftable record in a block we need to position the iterator _before_ the sought-after record so that the next call to `block_iter_next()` would yield that record. To achieve this, the loop that performs the linear needs to restore the previous position once it has found the record. This is done by advancing two `block_iter`s: one to check whether the next record is our sought-after record, and one that we update after every iteration. This of course involves quite a lot of copying and also leads to needless memory allocations. Refactor the code to get rid of the `next` iterator and the copying this involves. Instead, we can restore the previous offset such that the call to `next` will return the correct record. Next to being simpler conceptually this also leads to a nice speedup. The following benchmark parser 10k refs out of 100k existing refs via `git-rev-list --no-walk`: Benchmark 1: rev-list: print many refs (HEAD~) Time (mean ± σ): 170.2 ms ± 1.7 ms [User: 86.1 ms, System: 83.6 ms] Range (min … max): 166.4 ms … 180.3 ms 500 runs Benchmark 2: rev-list: print many refs (HEAD~) Time (mean ± σ): 161.6 ms ± 1.6 ms [User: 78.1 ms, System: 83.0 ms] Range (min … max): 158.4 ms … 172.3 ms 500 runs Summary rev-list: print many refs (HEAD) ran 1.05 ± 0.01 times faster than rev-list: print many refs (HEAD~) Signed-off-by: Patrick Steinhardt --- reftable/block.c | 32 ++++++++++++++------------------ reftable/block.h | 2 -- 2 files changed, 14 insertions(+), 20 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index c6c4a68ea1..3e87460cba 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -365,16 +365,6 @@ static int restart_needle_less(size_t idx, void *_args) return args->needle.len < suffix_len; } -void block_iter_copy_from(struct block_iter *dest, const struct block_iter *src) -{ - dest->block = src->block; - dest->block_len = src->block_len; - dest->hash_size = src->hash_size; - dest->next_off = src->next_off; - strbuf_reset(&dest->last_key); - strbuf_addbuf(&dest->last_key, &src->last_key); -} - int block_iter_next(struct block_iter *it, struct reftable_record *rec) { struct string_view in = { @@ -427,7 +417,6 @@ int block_iter_seek_key(struct block_iter *it, const struct block_reader *br, .needle = *want, .reader = br, }; - struct block_iter next = BLOCK_ITER_INIT; struct reftable_record rec; int err = 0; size_t i; @@ -486,11 +475,13 @@ int block_iter_seek_key(struct block_iter *it, const struct block_reader *br, * far and then back up. */ while (1) { - block_iter_copy_from(&next, it); - err = block_iter_next(&next, &rec); + size_t prev_off = it->next_off; + + err = block_iter_next(it, &rec); if (err < 0) goto done; if (err > 0) { + it->next_off = prev_off; err = 0; goto done; } @@ -501,18 +492,23 @@ int block_iter_seek_key(struct block_iter *it, const struct block_reader *br, * record does not exist in the block and can thus abort early. * In case it is equal to the sought-after key we have found * the desired record. + * + * Note that we store the next record's key record directly in + * `last_key` without restoring the key of the preceding record + * in case we need to go one record back. This is safe to do as + * `block_iter_next()` would return the ref whose key is equal + * to `last_key` now, and naturally all keys share a prefix + * with themselves. */ reftable_record_key(&rec, &it->last_key); - if (strbuf_cmp(&it->last_key, want) >= 0) + if (strbuf_cmp(&it->last_key, want) >= 0) { + it->next_off = prev_off; goto done; - - block_iter_copy_from(it, &next); + } } done: - block_iter_close(&next); reftable_record_release(&rec); - return err; } diff --git a/reftable/block.h b/reftable/block.h index c1bd1892cb..ea4384a7e2 100644 --- a/reftable/block.h +++ b/reftable/block.h @@ -121,8 +121,6 @@ void block_iter_seek_start(struct block_iter *it, const struct block_reader *br) int block_iter_seek_key(struct block_iter *it, const struct block_reader *br, struct strbuf *want); -void block_iter_copy_from(struct block_iter *dest, const struct block_iter *src); - /* return < 0 for error, 0 for OK, > 0 for EOF. */ int block_iter_next(struct block_iter *it, struct reftable_record *rec);