From patchwork Mon Aug 30 14:57:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465391 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BE65AC4320A for ; Mon, 30 Aug 2021 14:58:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A251B60462 for ; Mon, 30 Aug 2021 14:58:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237294AbhH3O64 (ORCPT ); Mon, 30 Aug 2021 10:58:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55518 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237162AbhH3O6x (ORCPT ); Mon, 30 Aug 2021 10:58:53 -0400 Received: from mail-wm1-x332.google.com (mail-wm1-x332.google.com [IPv6:2a00:1450:4864:20::332]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 80591C061760 for ; Mon, 30 Aug 2021 07:57:59 -0700 (PDT) Received: by mail-wm1-x332.google.com with SMTP id g135so9000803wme.5 for ; Mon, 30 Aug 2021 07:57:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=6nw2gGbgLgD/ItZe7r3+xcBp3zrwfuUcf5x285TXfC0=; b=d33BB/6bmnWGE4hii88fmG8iub9xQVyXwIwIwDBFOMuKiAmAh2JJ2ZVQKlylF8HUAo agTSBDtV+TR1EOtqnysDsdHOyqM4wuTb2bQbb+08w7/assRAK/LItIAKyFsLmsALgB76 fr6Lbf0+gx7br3sTV+BGJ7PKUTy5LyKP1V1/cxLm5W8qZY0gf7zeef0REIsvKsH1RhcN 6FmvOK9TZnmTmsQAD9hXHnJTRPOCGyuIDlif1CfgnAUUKmrd+Rbqv881W3JfGCOcCw9y /7l54YLxAjuhrrG3SGjADztBaQ0A630NRlWztOpE9iVNmkkyDVsQdvFdJotRa48/XkjR LHMg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=6nw2gGbgLgD/ItZe7r3+xcBp3zrwfuUcf5x285TXfC0=; b=E9BYKDY9OyRPk6yMN3SoJksTsKBYtO6ipTKyS1qTvbVoVMYAJfdmw3fPFQeFsgXpc+ yCzSSpcKM5iUi7/ed3GcoowBWTXS+5T0SsbRwdrnzLZsARX9CVIIARyRuA4ESoefMk+l JgDOeCjiGMmrQ7HQvXzvWZ/rvO8kSWXQntU8V1y09CpOZ/A9k5gutPWFX6E7TGrSsPUz bEE87VwmORL6gnatacj9CNnnTJlR8rFVqxZZSeNrA4z39eIUOenvBGqwaaG2S4lQJDGa xx5piDIdRVE+dX62oA+PUAa2z3Y+zHYgxKgcSOVuGfL9EyDX1TnKj2Ee5HFZSIZR6jtr dPgQ== X-Gm-Message-State: AOAM530XcnlRixLvUghq/AQJJaG95Q6kox8a8Tix0ShgJCaPxuR26VrA FhjtYF2eMX7lb2Tq+OEuPbsMVMT8HNI= X-Google-Smtp-Source: ABdhPJxdjFwM0JnPG0AgDqwT3DcUmP/vHgLWeZX79bHLPRLjGx5GDcwy/sjDLmi8EAHVxNTYkBsMjQ== X-Received: by 2002:a1c:7d06:: with SMTP id y6mr22449793wmc.7.1630335478160; Mon, 30 Aug 2021 07:57:58 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id z126sm14506135wmc.11.2021.08.30.07.57.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:57:57 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:37 +0000 Subject: [PATCH 01/19] hash.h: provide constants for the hash IDs Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This will simplify referencing them from code that is not deeply integrated with Git, in particular, the reftable library. Signed-off-by: Han-Wen Nienhuys --- hash.h | 6 ++++++ object-file.c | 7 ++----- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/hash.h b/hash.h index 9e25c40e9ac..5d40368f18a 100644 --- a/hash.h +++ b/hash.h @@ -95,12 +95,18 @@ static inline void git_SHA256_Clone(git_SHA256_CTX *dst, const git_SHA256_CTX *s /* Number of algorithms supported (including unknown). */ #define GIT_HASH_NALGOS (GIT_HASH_SHA256 + 1) +/* "sha1", big-endian */ +#define GIT_SHA1_FORMAT_ID 0x73686131 + /* The length in bytes and in hex digits of an object name (SHA-1 value). */ #define GIT_SHA1_RAWSZ 20 #define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ) /* The block size of SHA-1. */ #define GIT_SHA1_BLKSZ 64 +/* "s256", big-endian */ +#define GIT_SHA256_FORMAT_ID 0x73323536 + /* The length in bytes and in hex digits of an object name (SHA-256 value). */ #define GIT_SHA256_RAWSZ 32 #define GIT_SHA256_HEXSZ (2 * GIT_SHA256_RAWSZ) diff --git a/object-file.c b/object-file.c index a8be8994814..7bfd5e6e2e9 100644 --- a/object-file.c +++ b/object-file.c @@ -164,7 +164,6 @@ static void git_hash_unknown_final_oid(struct object_id *oid, git_hash_ctx *ctx) BUG("trying to finalize unknown hash"); } - const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { { NULL, @@ -183,8 +182,7 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { }, { "sha1", - /* "sha1", big-endian */ - 0x73686131, + GIT_SHA1_FORMAT_ID, GIT_SHA1_RAWSZ, GIT_SHA1_HEXSZ, GIT_SHA1_BLKSZ, @@ -199,8 +197,7 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { }, { "sha256", - /* "s256", big-endian */ - 0x73323536, + GIT_SHA256_FORMAT_ID, GIT_SHA256_RAWSZ, GIT_SHA256_HEXSZ, GIT_SHA256_BLKSZ, From patchwork Mon Aug 30 14:57:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465395 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21EADC43214 for ; Mon, 30 Aug 2021 14:58:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id ECC8A60E77 for ; Mon, 30 Aug 2021 14:58:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237321AbhH3O7A (ORCPT ); Mon, 30 Aug 2021 10:59:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55522 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237218AbhH3O6x (ORCPT ); Mon, 30 Aug 2021 10:58:53 -0400 Received: from mail-wm1-x336.google.com (mail-wm1-x336.google.com [IPv6:2a00:1450:4864:20::336]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D66DC061575 for ; Mon, 30 Aug 2021 07:58:00 -0700 (PDT) Received: by mail-wm1-x336.google.com with SMTP id x2-20020a1c7c02000000b002e6f1f69a1eso14943091wmc.5 for ; Mon, 30 Aug 2021 07:57:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=yj9tx1HJN1gjVb7BbFdzwGzB8sZO/EgnY23zy/aVzOc=; b=if0SGIjbkJbImrL+ij2IAjiVq1ZOE6F0/C/yf15ZIcS1Pf4kYZOZGZAmUF1G8WLRWR 9aHPLCbdHQQ+RkTpeQI1Q9dMBWq9OHKZLULdiWvAx+rlR71gJMEMl6HMS1QizKv77Yas cKCydOpX/RuabGAeg3Do05VztLHhjfkCwss70CjGY2xfkdlyCOhT7kiiRnPoax5vwD7f D0KtD6SR3SAUWQpgoE8LPj+RnwfSOP7+kkU/1yqLynLs7zWrp74lGLaYq3ZbdM2vfOXT umPwmnDQciexoIYOYTmrfO4UZSvkh5ikhoxDdVCDsWhBaCVsvS1ofSwDC4xGBoC/e0th w4mA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=yj9tx1HJN1gjVb7BbFdzwGzB8sZO/EgnY23zy/aVzOc=; b=AsXVhb6f2JwAUZLDvLg/Y5BNWqQWTUN6EnHQG+iSD2UEHkaLrG4G7nUkmE6W5Gc4C2 Js8oajDvCGf+Iq1lu53Jq4DavUV7Rc4o5RRbyQeG6u8BgdeFHvcSBOBLsy+Tq3ZXBeGu q2/KpoAY4yeUz6buOQ8rYDI55H1TLhpMwpheC/+brWUjRhAVGSYaQ0C42CzMPT4kaQkv vmd2JD4vGZpp6jPbs2vZGql4Q6FojQs2yn0kHxLwDkxqrqrpFuOAPjiqgm2pQSdOxdsN 4jBSflOcbbVONA5oHwer8zW/WUt3jZbETdivuRHAceyQWssMy+BG657zeLNIMt7gqW6c aGPg== X-Gm-Message-State: AOAM531b+9Q/qLdqFV436KzDz1gAmasGTM9EykJCs5CA/KNxAc9q3N+v 5/y73kyhWpoPeOYzN2QbYWVAqWeqthQ= X-Google-Smtp-Source: ABdhPJxrNJj5ObhTYx7YIn801xfIQVykKhTa8xNN2yJz/4KD92t8dBUQISs1R8J2rrcBZGFXjR1l2A== X-Received: by 2002:a05:600c:a08:: with SMTP id z8mr34401030wmp.165.1630335478734; Mon, 30 Aug 2021 07:57:58 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id i20sm4295263wml.37.2021.08.30.07.57.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:57:58 -0700 (PDT) Message-Id: <2d31c783fb5624a8dcbd4fedda9b7687ed784690.1630335476.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:38 +0000 Subject: [PATCH 02/19] reftable: RFC: add LICENSE Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The objective of this code is to be usable as a C library, so it can be reused in libgit2. This is currently using a BSD license as it is the liberal license I could find, but this could be changed to whatever fits the stated goal above. This code is currently imported from github.com/hanwen/reftable. Once this code lands in git.git, the C code will be removed from github.com/hanwen/reftable, and the git.git code will be the source of truth. Signed-off-by: Han-Wen Nienhuys --- reftable/LICENSE | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) create mode 100644 reftable/LICENSE diff --git a/reftable/LICENSE b/reftable/LICENSE new file mode 100644 index 00000000000..402e0f9356b --- /dev/null +++ b/reftable/LICENSE @@ -0,0 +1,31 @@ +BSD License + +Copyright (c) 2020, Google LLC +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +* Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +* Neither the name of Google LLC nor the names of its contributors may +be used to endorse or promote products derived from this software +without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. From patchwork Mon Aug 30 14:57:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465393 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C538C4320E for ; Mon, 30 Aug 2021 14:58:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E743D60E90 for ; Mon, 30 Aug 2021 14:58:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237312AbhH3O67 (ORCPT ); Mon, 30 Aug 2021 10:58:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55528 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237279AbhH3O6y (ORCPT ); Mon, 30 Aug 2021 10:58:54 -0400 Received: from mail-wm1-x332.google.com (mail-wm1-x332.google.com [IPv6:2a00:1450:4864:20::332]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A1035C06175F for ; Mon, 30 Aug 2021 07:58:00 -0700 (PDT) Received: by mail-wm1-x332.google.com with SMTP id k20-20020a05600c0b5400b002e87ad6956eso199927wmr.1 for ; Mon, 30 Aug 2021 07:58:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=/Izbl9wexLtHbO918dhEysjpUsFaAvsDqZEji3I0p88=; b=aKMBDDwNpRJmqGi8Nnl0kSYQT/UXma7l13YH7gNV2SRL2cS+poUM6qJdmV66yoicKH A96hhjs2OY/voSLEV3FxtbruiPvxga4OAEe9npLQaUsPgUwM530xkYEoIs0hxPf+aS98 uvDuFCBoSb5L1c3wPVya2YplI1s/kWDqOmeAdhRMGqjSTQVlAUvzv7tGjhuaZsBm3F7u UgbcAUDjATW/ir+4xDbb/A3fxUhHoY3qpqnoTT/S2KjYTNDd8C53oXnN5wsfhjyn+a82 7ffLVWKVVJGNSaVe85hc9iw6CBiI0qtJyOTKydAXhaxIecnNOvKxUXU6cG2pR2mZWH7f g1Eg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=/Izbl9wexLtHbO918dhEysjpUsFaAvsDqZEji3I0p88=; b=U3jKXzV2bW/F8/yUU3oxpc287TVWG1SQ1qEH6UjY+ACm+nB9v9GH+KytHSPhGrkETz 8qY9uw1GQx2jEvr8BmA0MrBVUZFjvLH2Sa3gf0jv0P6VAWDOkAMJY+Ja7R1dpjqQ5z7u q4fuSCnASN4qI35OjvajFwms3TWlwk+/enDOYfVULOU34WhGIFZuwVN4gWIpTDyU6xdn ZXti6MCIp8xh7TnXRHmrBwgXC4XzkHkaw1G6TBoAeXRDN2mEr9NJlq0riYvs3TX2Sk16 bXSPnk/TbVIUV3CVGy3KR/1jwLFgrYfUnDMCoq0y/vCJHC9GHFu5TOz9zWyx8ht5PScp CTdA== X-Gm-Message-State: AOAM53139ydoXWJ0Pm4CZCDQBVZYeGCNH0db2WW9KuRq23zjwAMSzygE xF5YbdsUtT/0XBKsNG+fKUvXX+gRmAc= X-Google-Smtp-Source: ABdhPJyhWxaQwe9EP0CU95zOKqltM4/q8wHQ3RRdQLw02APqeD9lk2akoDtUccTTBtDhLUNMuNEnLw== X-Received: by 2002:a05:600c:19c6:: with SMTP id u6mr30573529wmq.14.1630335479291; Mon, 30 Aug 2021 07:57:59 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y15sm15350855wrw.64.2021.08.30.07.57.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:57:59 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:39 +0000 Subject: [PATCH 03/19] reftable: add error related functionality Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable/ directory is structured as a library, so it cannot crash on misuse. Instead, it returns an error codes. In addition, the error code can be used to signal conditions from lower levels of the library to be handled by higher levels of the library. For example, a transaction might legitimately write an empty reftable file, but in that case, we'd want to shortcut the transaction overhead. Signed-off-by: Han-Wen Nienhuys --- reftable/error.c | 41 ++++++++++++++++++++++++++ reftable/reftable-error.h | 62 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 103 insertions(+) create mode 100644 reftable/error.c create mode 100644 reftable/reftable-error.h diff --git a/reftable/error.c b/reftable/error.c new file mode 100644 index 00000000000..f6f16def921 --- /dev/null +++ b/reftable/error.c @@ -0,0 +1,41 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "reftable-error.h" + +#include + +const char *reftable_error_str(int err) +{ + static char buf[250]; + switch (err) { + case REFTABLE_IO_ERROR: + return "I/O error"; + case REFTABLE_FORMAT_ERROR: + return "corrupt reftable file"; + case REFTABLE_NOT_EXIST_ERROR: + return "file does not exist"; + case REFTABLE_LOCK_ERROR: + return "data is outdated"; + case REFTABLE_API_ERROR: + return "misuse of the reftable API"; + case REFTABLE_ZLIB_ERROR: + return "zlib failure"; + case REFTABLE_NAME_CONFLICT: + return "file/directory conflict"; + case REFTABLE_EMPTY_TABLE_ERROR: + return "wrote empty table"; + case REFTABLE_REFNAME_ERROR: + return "invalid refname"; + case -1: + return "general error"; + default: + snprintf(buf, sizeof(buf), "unknown error code %d", err); + return buf; + } +} diff --git a/reftable/reftable-error.h b/reftable/reftable-error.h new file mode 100644 index 00000000000..6f89bedf1a5 --- /dev/null +++ b/reftable/reftable-error.h @@ -0,0 +1,62 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_ERROR_H +#define REFTABLE_ERROR_H + +/* + * Errors in reftable calls are signaled with negative integer return values. 0 + * means success. + */ +enum reftable_error { + /* Unexpected file system behavior */ + REFTABLE_IO_ERROR = -2, + + /* Format inconsistency on reading data */ + REFTABLE_FORMAT_ERROR = -3, + + /* File does not exist. Returned from block_source_from_file(), because + * it needs special handling in stack. + */ + REFTABLE_NOT_EXIST_ERROR = -4, + + /* Trying to write out-of-date data. */ + REFTABLE_LOCK_ERROR = -5, + + /* Misuse of the API: + * - on writing a record with NULL refname. + * - on writing a reftable_ref_record outside the table limits + * - on writing a ref or log record before the stack's + * next_update_inde*x + * - on writing a log record with multiline message with + * exact_log_message unset + * - on reading a reftable_ref_record from log iterator, or vice versa. + * + * When a call misuses the API, the internal state of the library is + * kept unchanged. + */ + REFTABLE_API_ERROR = -6, + + /* Decompression error */ + REFTABLE_ZLIB_ERROR = -7, + + /* Wrote a table without blocks. */ + REFTABLE_EMPTY_TABLE_ERROR = -8, + + /* Dir/file conflict. */ + REFTABLE_NAME_CONFLICT = -9, + + /* Invalid ref name. */ + REFTABLE_REFNAME_ERROR = -10, +}; + +/* convert the numeric error code to a string. The string should not be + * deallocated. */ +const char *reftable_error_str(int err); + +#endif From patchwork Mon Aug 30 14:57:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465399 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32AD7C4320A for ; Mon, 30 Aug 2021 14:58:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 18CE960E77 for ; Mon, 30 Aug 2021 14:58:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237333AbhH3O7B (ORCPT ); Mon, 30 Aug 2021 10:59:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55534 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237283AbhH3O6z (ORCPT ); Mon, 30 Aug 2021 10:58:55 -0400 Received: from mail-wr1-x42a.google.com (mail-wr1-x42a.google.com [IPv6:2a00:1450:4864:20::42a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8D988C061764 for ; Mon, 30 Aug 2021 07:58:01 -0700 (PDT) Received: by mail-wr1-x42a.google.com with SMTP id q14so22896636wrp.3 for ; Mon, 30 Aug 2021 07:58:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=hXzGSCcq5MSaKKIXKdwGsv5gUPU+shT8X/TrCOw8vKc=; b=B7INNNoetoNVRJwqLozF9WVyKE+J1VpB/qCNIQzzLHPORH7f+IdEFqZ3L7MPbcRi4Y BEX5oU80vOYbtVbu5Bmr9iilD76lwI62Uy5GoQdJ9JZJgp4Di1fHs8unts1QVxO7QPXE Gh9/Y4g8JIrBdL3eCssV8VsNOLywm+j1RBkqaFH9gFQ7WWLcKd+6cltzrpW9NvFyOKdn X+AMrRukJ8ctaRktzo0Dnvo87+3XywPKioHeyXuuubibIrGk2W/T0ClsiyHXzJ2deJ46 u+eKShZs60BZDxkJOnlD3xCiqBMCYpovq0o74gp+BN5UdNjNqb1sP9JX/IMwkN1h5R8C 5T6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=hXzGSCcq5MSaKKIXKdwGsv5gUPU+shT8X/TrCOw8vKc=; b=qCkYJMFsvJTbAlo8QXWh6QCCFgmYdsDhO5IWfe1aSq3rnL8AuBPeLBFok9e0P8d4bA 3hW4M1l7hLcicqacjQUNbUFRLMRW0fC7rwJkEdNaafz86u7sQVLhLp+Ywkj7RMYHmYtS muuoZymzRupj4kIRsdg4Pn1leyNd6puXWW+ygtnmdg9cJvvalCkRKrLpvHUvhdmLBDjd NA+O2KTTv8N9DsE3jIjPcr81fScI3VbWkfCC3tDTGYG2UMVyDO2ydjy/L7UXovc2GHI2 1U72l4/sISKaUrIE72Xuzp882Qo4//8lTj1486rtL0HT252KWg7iIp/HCZPi/X4LiiV5 xv7Q== X-Gm-Message-State: AOAM531QTyHwgSBu1IuOJb4lOdl0TqbfdW3/yzCHkggrCL+93VvzE54g /+NZvBMi3P22nZCMZ7v6xGy/TVMUI7c= X-Google-Smtp-Source: ABdhPJyFK0Qvfyz4G+R2FaHtXz749BIIFZ4s03cUUu9LTY0R4z/YQmDD009Z7IglHe2+cDHbRTO2vA== X-Received: by 2002:adf:914e:: with SMTP id j72mr26842888wrj.218.1630335479851; Mon, 30 Aug 2021 07:57:59 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id s13sm20239773wmc.47.2021.08.30.07.57.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:57:59 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:40 +0000 Subject: [PATCH 04/19] reftable: utility functions Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This commit provides basic utility classes for the reftable library. Signed-off-by: Han-Wen Nienhuys Helped-by: Johannes Schindelin --- Makefile | 25 +++- contrib/buildsystems/CMakeLists.txt | 14 ++- contrib/buildsystems/Generators/Vcxproj.pm | 11 +- reftable/basics.c | 128 +++++++++++++++++++++ reftable/basics.h | 60 ++++++++++ reftable/basics_test.c | 98 ++++++++++++++++ reftable/publicbasics.c | 58 ++++++++++ reftable/reftable-malloc.h | 18 +++ reftable/reftable-tests.h | 22 ++++ reftable/system.h | 24 ++++ reftable/test_framework.c | 23 ++++ reftable/test_framework.h | 53 +++++++++ t/helper/test-reftable.c | 9 ++ t/helper/test-tool.c | 3 +- t/helper/test-tool.h | 1 + t/t0032-reftable-unittest.sh | 15 +++ 16 files changed, 555 insertions(+), 7 deletions(-) create mode 100644 reftable/basics.c create mode 100644 reftable/basics.h create mode 100644 reftable/basics_test.c create mode 100644 reftable/publicbasics.c create mode 100644 reftable/reftable-malloc.h create mode 100644 reftable/reftable-tests.h create mode 100644 reftable/system.h create mode 100644 reftable/test_framework.c create mode 100644 reftable/test_framework.h create mode 100644 t/helper/test-reftable.c create mode 100755 t/t0032-reftable-unittest.sh diff --git a/Makefile b/Makefile index d1feab008fc..98c965c61fd 100644 --- a/Makefile +++ b/Makefile @@ -743,6 +743,7 @@ TEST_BUILTINS_OBJS += test-read-cache.o TEST_BUILTINS_OBJS += test-read-graph.o TEST_BUILTINS_OBJS += test-read-midx.o TEST_BUILTINS_OBJS += test-ref-store.o +TEST_BUILTINS_OBJS += test-reftable.o TEST_BUILTINS_OBJS += test-regex.o TEST_BUILTINS_OBJS += test-repository.o TEST_BUILTINS_OBJS += test-revision-walking.o @@ -821,6 +822,8 @@ TEST_SHELL_PATH = $(SHELL_PATH) LIB_FILE = libgit.a XDIFF_LIB = xdiff/lib.a +REFTABLE_LIB = reftable/libreftable.a +REFTABLE_TEST_LIB = reftable/libreftable_test.a GENERATED_H += command-list.h GENERATED_H += config-list.h @@ -1195,7 +1198,7 @@ THIRD_PARTY_SOURCES += compat/regex/% THIRD_PARTY_SOURCES += sha1collisiondetection/% THIRD_PARTY_SOURCES += sha1dc/% -GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) +GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) EXTLIBS = GIT_USER_AGENT = git/$(GIT_VERSION) @@ -2446,7 +2449,15 @@ XDIFF_OBJS += xdiff/xutils.o .PHONY: xdiff-objs xdiff-objs: $(XDIFF_OBJS) +REFTABLE_OBJS += reftable/basics.o +REFTABLE_OBJS += reftable/error.o +REFTABLE_OBJS += reftable/publicbasics.o + +REFTABLE_TEST_OBJS += reftable/test_framework.o +REFTABLE_TEST_OBJS += reftable/basics_test.o + TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) + .PHONY: test-objs test-objs: $(TEST_OBJS) @@ -2462,6 +2473,8 @@ OBJECTS += $(PROGRAM_OBJS) OBJECTS += $(TEST_OBJS) OBJECTS += $(XDIFF_OBJS) OBJECTS += $(FUZZ_OBJS) +OBJECTS += $(REFTABLE_OBJS) $(REFTABLE_TEST_OBJS) + ifndef NO_CURL OBJECTS += http.o http-walker.o remote-curl.o endif @@ -2612,6 +2625,12 @@ $(LIB_FILE): $(LIB_OBJS) $(XDIFF_LIB): $(XDIFF_OBJS) $(QUIET_AR)$(AR) $(ARFLAGS) $@ $^ +$(REFTABLE_LIB): $(REFTABLE_OBJS) + $(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^ + +$(REFTABLE_TEST_LIB): $(REFTABLE_TEST_OBJS) + $(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^ + export DEFAULT_EDITOR DEFAULT_PAGER Documentation/GIT-EXCLUDED-PROGRAMS: FORCE @@ -2904,7 +2923,7 @@ perf: all t/helper/test-tool$X: $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) -t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) +t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB) $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(filter %.a,$^) $(LIBS) check-sha1:: t/helper/test-tool$X @@ -3234,7 +3253,7 @@ cocciclean: clean: profile-clean coverage-clean cocciclean $(RM) *.res $(RM) $(OBJECTS) - $(RM) $(LIB_FILE) $(XDIFF_LIB) + $(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X $(RM) $(TEST_PROGRAMS) $(RM) $(FUZZ_PROGRAMS) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 171b4124afe..c2bf5bdffc6 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -640,6 +640,12 @@ parse_makefile_for_sources(libxdiff_SOURCES "XDIFF_OBJS") list(TRANSFORM libxdiff_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/") add_library(xdiff STATIC ${libxdiff_SOURCES}) +#reftable +parse_makefile_for_sources(reftable_SOURCES "REFTABLE_OBJS") + +list(TRANSFORM reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/") +add_library(reftable STATIC ${reftable_SOURCES}) + if(WIN32) if(NOT MSVC)#use windres when compiling with gcc and clang add_custom_command(OUTPUT ${CMAKE_BINARY_DIR}/git.res @@ -662,7 +668,7 @@ endif() #link all required libraries to common-main add_library(common-main OBJECT ${CMAKE_SOURCE_DIR}/common-main.c) -target_link_libraries(common-main libgit xdiff ${ZLIB_LIBRARIES}) +target_link_libraries(common-main libgit xdiff reftable ${ZLIB_LIBRARIES}) if(Intl_FOUND) target_link_libraries(common-main ${Intl_LIBRARIES}) endif() @@ -902,11 +908,15 @@ if(BUILD_TESTING) add_executable(test-fake-ssh ${CMAKE_SOURCE_DIR}/t/helper/test-fake-ssh.c) target_link_libraries(test-fake-ssh common-main) +#reftable-tests +parse_makefile_for_sources(test-reftable_SOURCES "REFTABLE_TEST_OBJS") +list(TRANSFORM test-reftable_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/") + #test-tool parse_makefile_for_sources(test-tool_SOURCES "TEST_BUILTINS_OBJS") list(TRANSFORM test-tool_SOURCES PREPEND "${CMAKE_SOURCE_DIR}/t/helper/") -add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES}) +add_executable(test-tool ${CMAKE_SOURCE_DIR}/t/helper/test-tool.c ${test-tool_SOURCES} ${test-reftable_SOURCES}) target_link_libraries(test-tool common-main) set_target_properties(test-fake-ssh test-tool diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm index d2584450ba1..1a25789d285 100644 --- a/contrib/buildsystems/Generators/Vcxproj.pm +++ b/contrib/buildsystems/Generators/Vcxproj.pm @@ -77,7 +77,7 @@ sub createProject { my $libs_release = "\n "; my $libs_debug = "\n "; if (!$static_library) { - $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}})); + $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}})); $libs_debug = $libs_release; $libs_debug =~ s/zlib\.lib/zlibd\.lib/g; $libs_debug =~ s/libexpat\.lib/libexpatd\.lib/g; @@ -232,6 +232,7 @@ EOM EOM if (!$static_library || $target =~ 'vcs-svn' || $target =~ 'xdiff') { my $uuid_libgit = $$build_structure{"LIBS_libgit_GUID"}; + my $uuid_libreftable = $$build_structure{"LIBS_reftable/libreftable_GUID"}; my $uuid_xdiff_lib = $$build_structure{"LIBS_xdiff/lib_GUID"}; print F << "EOM"; @@ -241,6 +242,14 @@ EOM false EOM + if (!($name =~ /xdiff|libreftable/)) { + print F << "EOM"; + + $uuid_libreftable + false + +EOM + } if (!($name =~ 'xdiff')) { print F << "EOM"; diff --git a/reftable/basics.c b/reftable/basics.c new file mode 100644 index 00000000000..f761e48028c --- /dev/null +++ b/reftable/basics.c @@ -0,0 +1,128 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "basics.h" + +void put_be24(uint8_t *out, uint32_t i) +{ + out[0] = (uint8_t)((i >> 16) & 0xff); + out[1] = (uint8_t)((i >> 8) & 0xff); + out[2] = (uint8_t)(i & 0xff); +} + +uint32_t get_be24(uint8_t *in) +{ + return (uint32_t)(in[0]) << 16 | (uint32_t)(in[1]) << 8 | + (uint32_t)(in[2]); +} + +void put_be16(uint8_t *out, uint16_t i) +{ + out[0] = (uint8_t)((i >> 8) & 0xff); + out[1] = (uint8_t)(i & 0xff); +} + +int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args) +{ + size_t lo = 0; + size_t hi = sz; + + /* Invariants: + * + * (hi == sz) || f(hi) == true + * (lo == 0 && f(0) == true) || fi(lo) == false + */ + while (hi - lo > 1) { + size_t mid = lo + (hi - lo) / 2; + + if (f(mid, args)) + hi = mid; + else + lo = mid; + } + + if (lo) + return hi; + + return f(0, args) ? 0 : 1; +} + +void free_names(char **a) +{ + char **p; + if (!a) { + return; + } + for (p = a; *p; p++) { + reftable_free(*p); + } + reftable_free(a); +} + +int names_length(char **names) +{ + char **p = names; + for (; *p; p++) { + /* empty */ + } + return p - names; +} + +void parse_names(char *buf, int size, char ***namesp) +{ + char **names = NULL; + size_t names_cap = 0; + size_t names_len = 0; + + char *p = buf; + char *end = buf + size; + while (p < end) { + char *next = strchr(p, '\n'); + if (next && next < end) { + *next = 0; + } else { + next = end; + } + if (p < next) { + if (names_len == names_cap) { + names_cap = 2 * names_cap + 1; + names = reftable_realloc( + names, names_cap * sizeof(*names)); + } + names[names_len++] = xstrdup(p); + } + p = next + 1; + } + + names = reftable_realloc(names, (names_len + 1) * sizeof(*names)); + names[names_len] = NULL; + *namesp = names; +} + +int names_equal(char **a, char **b) +{ + int i = 0; + for (; a[i] && b[i]; i++) { + if (strcmp(a[i], b[i])) { + return 0; + } + } + + return a[i] == b[i]; +} + +int common_prefix_size(struct strbuf *a, struct strbuf *b) +{ + int p = 0; + for (; p < a->len && p < b->len; p++) { + if (a->buf[p] != b->buf[p]) + break; + } + + return p; +} diff --git a/reftable/basics.h b/reftable/basics.h new file mode 100644 index 00000000000..096b36862b9 --- /dev/null +++ b/reftable/basics.h @@ -0,0 +1,60 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef BASICS_H +#define BASICS_H + +/* + * miscellaneous utilities that are not provided by Git. + */ + +#include "system.h" + +/* Bigendian en/decoding of integers */ + +void put_be24(uint8_t *out, uint32_t i); +uint32_t get_be24(uint8_t *in); +void put_be16(uint8_t *out, uint16_t i); + +/* + * find smallest index i in [0, sz) at which f(i) is true, assuming + * that f is ascending. Return sz if f(i) is false for all indices. + * + * Contrary to bsearch(3), this returns something useful if the argument is not + * found. + */ +int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args); + +/* + * Frees a NULL terminated array of malloced strings. The array itself is also + * freed. + */ +void free_names(char **a); + +/* parse a newline separated list of names. `size` is the length of the buffer, + * without terminating '\0'. Empty names are discarded. */ +void parse_names(char *buf, int size, char ***namesp); + +/* compares two NULL-terminated arrays of strings. */ +int names_equal(char **a, char **b); + +/* returns the array size of a NULL-terminated array of strings. */ +int names_length(char **names); + +/* Allocation routines; they invoke the functions set through + * reftable_set_alloc() */ +void *reftable_malloc(size_t sz); +void *reftable_realloc(void *p, size_t sz); +void reftable_free(void *p); +void *reftable_calloc(size_t sz); + +/* Find the longest shared prefix size of `a` and `b` */ +struct strbuf; +int common_prefix_size(struct strbuf *a, struct strbuf *b); + +#endif diff --git a/reftable/basics_test.c b/reftable/basics_test.c new file mode 100644 index 00000000000..1fcd2297256 --- /dev/null +++ b/reftable/basics_test.c @@ -0,0 +1,98 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" + +#include "basics.h" +#include "test_framework.h" +#include "reftable-tests.h" + +struct binsearch_args { + int key; + int *arr; +}; + +static int binsearch_func(size_t i, void *void_args) +{ + struct binsearch_args *args = void_args; + + return args->key < args->arr[i]; +} + +static void test_binsearch(void) +{ + int arr[] = { 2, 4, 6, 8, 10 }; + size_t sz = ARRAY_SIZE(arr); + struct binsearch_args args = { + .arr = arr, + }; + + int i = 0; + for (i = 1; i < 11; i++) { + int res; + args.key = i; + res = binsearch(sz, &binsearch_func, &args); + + if (res < sz) { + EXPECT(args.key < arr[res]); + if (res > 0) { + EXPECT(args.key >= arr[res - 1]); + } + } else { + EXPECT(args.key == 10 || args.key == 11); + } + } +} + +static void test_names_length(void) +{ + char *a[] = { "a", "b", NULL }; + EXPECT(names_length(a) == 2); +} + +static void test_parse_names_normal(void) +{ + char in[] = "a\nb\n"; + char **out = NULL; + parse_names(in, strlen(in), &out); + EXPECT(!strcmp(out[0], "a")); + EXPECT(!strcmp(out[1], "b")); + EXPECT(!out[2]); + free_names(out); +} + +static void test_parse_names_drop_empty(void) +{ + char in[] = "a\n\n"; + char **out = NULL; + parse_names(in, strlen(in), &out); + EXPECT(!strcmp(out[0], "a")); + EXPECT(!out[1]); + free_names(out); +} + +static void test_common_prefix(void) +{ + struct strbuf s1 = STRBUF_INIT; + struct strbuf s2 = STRBUF_INIT; + strbuf_addstr(&s1, "abcdef"); + strbuf_addstr(&s2, "abc"); + EXPECT(common_prefix_size(&s1, &s2) == 3); + strbuf_release(&s1); + strbuf_release(&s2); +} + +int basics_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_common_prefix); + RUN_TEST(test_parse_names_normal); + RUN_TEST(test_parse_names_drop_empty); + RUN_TEST(test_binsearch); + RUN_TEST(test_names_length); + return 0; +} diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c new file mode 100644 index 00000000000..bd0a02d3f68 --- /dev/null +++ b/reftable/publicbasics.c @@ -0,0 +1,58 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "reftable-malloc.h" + +#include "basics.h" +#include "system.h" + +static void *(*reftable_malloc_ptr)(size_t sz) = &malloc; +static void *(*reftable_realloc_ptr)(void *, size_t) = &realloc; +static void (*reftable_free_ptr)(void *) = &free; + +void *reftable_malloc(size_t sz) +{ + return (*reftable_malloc_ptr)(sz); +} + +void *reftable_realloc(void *p, size_t sz) +{ + return (*reftable_realloc_ptr)(p, sz); +} + +void reftable_free(void *p) +{ + reftable_free_ptr(p); +} + +void *reftable_calloc(size_t sz) +{ + void *p = reftable_malloc(sz); + memset(p, 0, sz); + return p; +} + +void reftable_set_alloc(void *(*malloc)(size_t), + void *(*realloc)(void *, size_t), void (*free)(void *)) +{ + reftable_malloc_ptr = malloc; + reftable_realloc_ptr = realloc; + reftable_free_ptr = free; +} + +int hash_size(uint32_t id) +{ + switch (id) { + case 0: + case GIT_SHA1_FORMAT_ID: + return GIT_SHA1_RAWSZ; + case GIT_SHA256_FORMAT_ID: + return GIT_SHA256_RAWSZ; + } + abort(); +} diff --git a/reftable/reftable-malloc.h b/reftable/reftable-malloc.h new file mode 100644 index 00000000000..5f2185f1f34 --- /dev/null +++ b/reftable/reftable-malloc.h @@ -0,0 +1,18 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_H +#define REFTABLE_H + +#include + +/* Overrides the functions to use for memory management. */ +void reftable_set_alloc(void *(*malloc)(size_t), + void *(*realloc)(void *, size_t), void (*free)(void *)); + +#endif diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h new file mode 100644 index 00000000000..5e7698ae654 --- /dev/null +++ b/reftable/reftable-tests.h @@ -0,0 +1,22 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_TESTS_H +#define REFTABLE_TESTS_H + +int basics_test_main(int argc, const char **argv); +int block_test_main(int argc, const char **argv); +int merged_test_main(int argc, const char **argv); +int record_test_main(int argc, const char **argv); +int refname_test_main(int argc, const char **argv); +int reftable_test_main(int argc, const char **argv); +int stack_test_main(int argc, const char **argv); +int tree_test_main(int argc, const char **argv); +int reftable_dump_main(int argc, char *const *argv); + +#endif diff --git a/reftable/system.h b/reftable/system.h new file mode 100644 index 00000000000..4f62827b83b --- /dev/null +++ b/reftable/system.h @@ -0,0 +1,24 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef SYSTEM_H +#define SYSTEM_H + +/* This header glues the reftable library to the rest of Git */ + +#include "git-compat-util.h" +#include "strbuf.h" +#include "hash.h" /* hash ID, sizes.*/ +#include "dir.h" /* remove_dir_recursively, for tests.*/ + +#include + +struct strbuf; +int hash_size(uint32_t id); + +#endif diff --git a/reftable/test_framework.c b/reftable/test_framework.c new file mode 100644 index 00000000000..84ac972cad0 --- /dev/null +++ b/reftable/test_framework.c @@ -0,0 +1,23 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" +#include "test_framework.h" + +#include "basics.h" + +void set_test_hash(uint8_t *p, int i) +{ + memset(p, (uint8_t)i, hash_size(GIT_SHA1_FORMAT_ID)); +} + +ssize_t strbuf_add_void(void *b, const void *data, size_t sz) +{ + strbuf_add(b, data, sz); + return sz; +} diff --git a/reftable/test_framework.h b/reftable/test_framework.h new file mode 100644 index 00000000000..774cb275bf6 --- /dev/null +++ b/reftable/test_framework.h @@ -0,0 +1,53 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef TEST_FRAMEWORK_H +#define TEST_FRAMEWORK_H + +#include "system.h" +#include "reftable-error.h" + +#define EXPECT_ERR(c) \ + if (c != 0) { \ + fflush(stderr); \ + fflush(stdout); \ + fprintf(stderr, "%s: %d: error == %d (%s), want 0\n", \ + __FILE__, __LINE__, c, reftable_error_str(c)); \ + abort(); \ + } + +#define EXPECT_STREQ(a, b) \ + if (strcmp(a, b)) { \ + fflush(stderr); \ + fflush(stdout); \ + fprintf(stderr, "%s:%d: %s (%s) != %s (%s)\n", __FILE__, \ + __LINE__, #a, a, #b, b); \ + abort(); \ + } + +#define EXPECT(c) \ + if (!(c)) { \ + fflush(stderr); \ + fflush(stdout); \ + fprintf(stderr, "%s: %d: failed assertion %s\n", __FILE__, \ + __LINE__, #c); \ + abort(); \ + } + +#define RUN_TEST(f) \ + fprintf(stderr, "running %s\n", #f); \ + fflush(stderr); \ + f(); + +void set_test_hash(uint8_t *p, int i); + +/* Like strbuf_add, but suitable for passing to reftable_new_writer + */ +ssize_t strbuf_add_void(void *b, const void *data, size_t sz); + +#endif diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c new file mode 100644 index 00000000000..3b58e423e7b --- /dev/null +++ b/t/helper/test-reftable.c @@ -0,0 +1,9 @@ +#include "reftable/reftable-tests.h" +#include "test-tool.h" + +int cmd__reftable(int argc, const char **argv) +{ + basics_test_main(argc, argv); + + return 0; +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index 3ce5585e53a..f7c888ffda7 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -53,13 +53,14 @@ static struct test_cmd cmds[] = { { "pcre2-config", cmd__pcre2_config }, { "pkt-line", cmd__pkt_line }, { "prio-queue", cmd__prio_queue }, - { "proc-receive", cmd__proc_receive}, + { "proc-receive", cmd__proc_receive }, { "progress", cmd__progress }, { "reach", cmd__reach }, { "read-cache", cmd__read_cache }, { "read-graph", cmd__read_graph }, { "read-midx", cmd__read_midx }, { "ref-store", cmd__ref_store }, + { "reftable", cmd__reftable }, { "regex", cmd__regex }, { "repository", cmd__repository }, { "revision-walking", cmd__revision_walking }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index 9f0f5228508..25f77469146 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -49,6 +49,7 @@ int cmd__read_cache(int argc, const char **argv); int cmd__read_graph(int argc, const char **argv); int cmd__read_midx(int argc, const char **argv); int cmd__ref_store(int argc, const char **argv); +int cmd__reftable(int argc, const char **argv); int cmd__regex(int argc, const char **argv); int cmd__repository(int argc, const char **argv); int cmd__revision_walking(int argc, const char **argv); diff --git a/t/t0032-reftable-unittest.sh b/t/t0032-reftable-unittest.sh new file mode 100755 index 00000000000..0ed14971a58 --- /dev/null +++ b/t/t0032-reftable-unittest.sh @@ -0,0 +1,15 @@ +#!/bin/sh +# +# Copyright (c) 2020 Google LLC +# + +test_description='reftable unittests' + +. ./test-lib.sh + +test_expect_success 'unittests' ' + TMPDIR=$(pwd) && export TMPDIR && + test-tool reftable +' + +test_done From patchwork Mon Aug 30 14:57:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465401 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0391DC432BE for ; Mon, 30 Aug 2021 14:58:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E03A760E77 for ; Mon, 30 Aug 2021 14:58:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237343AbhH3O7E (ORCPT ); Mon, 30 Aug 2021 10:59:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55536 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237286AbhH3O6z (ORCPT ); Mon, 30 Aug 2021 10:58:55 -0400 Received: from mail-wm1-x334.google.com (mail-wm1-x334.google.com [IPv6:2a00:1450:4864:20::334]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C9A19C0613D9 for ; Mon, 30 Aug 2021 07:58:01 -0700 (PDT) Received: by mail-wm1-x334.google.com with SMTP id d22-20020a1c1d16000000b002e7777970f0so14920156wmd.3 for ; Mon, 30 Aug 2021 07:58:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=GVw8b7BR+NpgZBMxUfqH6h7kkYaWBomx29mkcbLHXpg=; b=fT+6ft1UHaDFwCycQF0thpnC8V4SPalDxfdWTiagKMDTtvthDwCxHTvDqiBBmRhwk8 p+YDfWZ6040UWHoQWgwephNrEvRc9q7IFb8nUI7X8XvmVZTFJV6QleiH5+RkQMo9UNFi ReyTnxo/WVnNThRnmbzJZbHS/9E2VS1Sbh467LCiSpnmxvPscCvY0EV618q4b8XmdroX PBvjiqypmm0d2SCcFXEzUJmkcXACyL71jbuMfeIMXlhqk7/C+tmOirM7RlW8a0Kd0cff A6GLTWhcM0LG3ozFJRVf+e39taBiJT9XrQWVYaROx/P+hDduRnNvTGScH8UbycRYEy86 VZng== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=GVw8b7BR+NpgZBMxUfqH6h7kkYaWBomx29mkcbLHXpg=; b=UG5uSTii4dRbcvCn6RA2C0dDVkM8F9kmUsIc2N3b2T0rajqHjMHbNc7IckGJERchWB HulZLybr0q9XFJayqpdC5hfMRc+Q3xkmPn9NU0/nfMUVV8CajggCK6WDhi3NQZ55PapC PMvFn35hojjjm8svV6QjqYNL8DPa+DHHG54XJ4AqmlqjAfY5wrNLe5LjbOKQbtSyNBm7 KTQay0A/IHTEeF5A7LGtmBIAJqUfwqN2UBNviJ6iQFyoNGVgLTQVgZJ0z9rDXPU+uO+r HNBfEjn0EDt3OUHnDtMf9hJmlMjOVimP+6bGSDx9K7Wn0wlpgKy7no1G2ljEIxHEW4zk IdEg== X-Gm-Message-State: AOAM530/jFl/kTO+qOuUw8B6L7trsS4VJ9mKEIICGds3D9YlvL78bfKr sJCQmaBNh4fiqo61FkGq6qWErCW2oOw= X-Google-Smtp-Source: ABdhPJxVRJbe0ofIi382qptcFjTMZq45tj99hhBc1mC8KA6WxJhPZR1gO9t0rnDtL/1CZDqd3PL0+g== X-Received: by 2002:a1c:19c1:: with SMTP id 184mr33680767wmz.98.1630335480445; Mon, 30 Aug 2021 07:58:00 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id n4sm18644424wro.81.2021.08.30.07.58.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:00 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:41 +0000 Subject: [PATCH 05/19] reftable: add blocksource, an abstraction for random access reads Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable format is usually used with files for storage. However, we abstract away this using the blocksource data structure. This has two advantages: * log blocks are zlib compressed, and handling them is simplified if we can discard byte segments from within the block layer. * for unittests, it is useful to read and write in-memory. The blocksource allows us to abstract the data away from on-disk files. Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/blocksource.c | 148 ++++++++++++++++++++++++++++++++ reftable/blocksource.h | 22 +++++ reftable/reftable-blocksource.h | 49 +++++++++++ 4 files changed, 220 insertions(+) create mode 100644 reftable/blocksource.c create mode 100644 reftable/blocksource.h create mode 100644 reftable/reftable-blocksource.h diff --git a/Makefile b/Makefile index 98c965c61fd..621ac53d09f 100644 --- a/Makefile +++ b/Makefile @@ -2451,6 +2451,7 @@ xdiff-objs: $(XDIFF_OBJS) REFTABLE_OBJS += reftable/basics.o REFTABLE_OBJS += reftable/error.o +REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_TEST_OBJS += reftable/test_framework.o diff --git a/reftable/blocksource.c b/reftable/blocksource.c new file mode 100644 index 00000000000..0044eecd9aa --- /dev/null +++ b/reftable/blocksource.c @@ -0,0 +1,148 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" + +#include "basics.h" +#include "blocksource.h" +#include "reftable-blocksource.h" +#include "reftable-error.h" + +static void strbuf_return_block(void *b, struct reftable_block *dest) +{ + memset(dest->data, 0xff, dest->len); + reftable_free(dest->data); +} + +static void strbuf_close(void *b) +{ +} + +static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off, + uint32_t size) +{ + struct strbuf *b = v; + assert(off + size <= b->len); + dest->data = reftable_calloc(size); + memcpy(dest->data, b->buf + off, size); + dest->len = size; + return size; +} + +static uint64_t strbuf_size(void *b) +{ + return ((struct strbuf *)b)->len; +} + +static struct reftable_block_source_vtable strbuf_vtable = { + .size = &strbuf_size, + .read_block = &strbuf_read_block, + .return_block = &strbuf_return_block, + .close = &strbuf_close, +}; + +void block_source_from_strbuf(struct reftable_block_source *bs, + struct strbuf *buf) +{ + assert(!bs->ops); + bs->ops = &strbuf_vtable; + bs->arg = buf; +} + +static void malloc_return_block(void *b, struct reftable_block *dest) +{ + memset(dest->data, 0xff, dest->len); + reftable_free(dest->data); +} + +static struct reftable_block_source_vtable malloc_vtable = { + .return_block = &malloc_return_block, +}; + +static struct reftable_block_source malloc_block_source_instance = { + .ops = &malloc_vtable, +}; + +struct reftable_block_source malloc_block_source(void) +{ + return malloc_block_source_instance; +} + +struct file_block_source { + int fd; + uint64_t size; +}; + +static uint64_t file_size(void *b) +{ + return ((struct file_block_source *)b)->size; +} + +static void file_return_block(void *b, struct reftable_block *dest) +{ + memset(dest->data, 0xff, dest->len); + reftable_free(dest->data); +} + +static void file_close(void *b) +{ + int fd = ((struct file_block_source *)b)->fd; + if (fd > 0) { + close(fd); + ((struct file_block_source *)b)->fd = 0; + } + + reftable_free(b); +} + +static int file_read_block(void *v, struct reftable_block *dest, uint64_t off, + uint32_t size) +{ + struct file_block_source *b = v; + assert(off + size <= b->size); + dest->data = reftable_malloc(size); + if (pread(b->fd, dest->data, size, off) != size) + return -1; + dest->len = size; + return size; +} + +static struct reftable_block_source_vtable file_vtable = { + .size = &file_size, + .read_block = &file_read_block, + .return_block = &file_return_block, + .close = &file_close, +}; + +int reftable_block_source_from_file(struct reftable_block_source *bs, + const char *name) +{ + struct stat st = { 0 }; + int err = 0; + int fd = open(name, O_RDONLY); + struct file_block_source *p = NULL; + if (fd < 0) { + if (errno == ENOENT) { + return REFTABLE_NOT_EXIST_ERROR; + } + return -1; + } + + err = fstat(fd, &st); + if (err < 0) + return -1; + + p = reftable_calloc(sizeof(struct file_block_source)); + p->size = st.st_size; + p->fd = fd; + + assert(!bs->ops); + bs->ops = &file_vtable; + bs->arg = p; + return 0; +} diff --git a/reftable/blocksource.h b/reftable/blocksource.h new file mode 100644 index 00000000000..072e2727ad2 --- /dev/null +++ b/reftable/blocksource.h @@ -0,0 +1,22 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef BLOCKSOURCE_H +#define BLOCKSOURCE_H + +#include "system.h" + +struct reftable_block_source; + +/* Create an in-memory block source for reading reftables */ +void block_source_from_strbuf(struct reftable_block_source *bs, + struct strbuf *buf); + +struct reftable_block_source malloc_block_source(void); + +#endif diff --git a/reftable/reftable-blocksource.h b/reftable/reftable-blocksource.h new file mode 100644 index 00000000000..5aa3990a573 --- /dev/null +++ b/reftable/reftable-blocksource.h @@ -0,0 +1,49 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_BLOCKSOURCE_H +#define REFTABLE_BLOCKSOURCE_H + +#include + +/* block_source is a generic wrapper for a seekable readable file. + */ +struct reftable_block_source { + struct reftable_block_source_vtable *ops; + void *arg; +}; + +/* a contiguous segment of bytes. It keeps track of its generating block_source + * so it can return itself into the pool. */ +struct reftable_block { + uint8_t *data; + int len; + struct reftable_block_source source; +}; + +/* block_source_vtable are the operations that make up block_source */ +struct reftable_block_source_vtable { + /* returns the size of a block source */ + uint64_t (*size)(void *source); + + /* reads a segment from the block source. It is an error to read + beyond the end of the block */ + int (*read_block)(void *source, struct reftable_block *dest, + uint64_t off, uint32_t size); + /* mark the block as read; may return the data back to malloc */ + void (*return_block)(void *source, struct reftable_block *blockp); + + /* release all resources associated with the block source */ + void (*close)(void *source); +}; + +/* opens a file on the file system as a block_source */ +int reftable_block_source_from_file(struct reftable_block_source *block_src, + const char *name); + +#endif From patchwork Mon Aug 30 14:57:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465403 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07364C43214 for ; Mon, 30 Aug 2021 14:58:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E0E4E60E77 for ; Mon, 30 Aug 2021 14:58:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237353AbhH3O7G (ORCPT ); Mon, 30 Aug 2021 10:59:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55546 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237162AbhH3O65 (ORCPT ); Mon, 30 Aug 2021 10:58:57 -0400 Received: from mail-wm1-x32e.google.com (mail-wm1-x32e.google.com [IPv6:2a00:1450:4864:20::32e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2B540C061760 for ; Mon, 30 Aug 2021 07:58:03 -0700 (PDT) Received: by mail-wm1-x32e.google.com with SMTP id u15so9030775wmj.1 for ; Mon, 30 Aug 2021 07:58:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=dxTzGtHYqlQcpu36qLXBI0c7awBAuGi1fRoe7WfYPfo=; b=WCmecRTRpi1XCYMzhjCGkaBJrZQ1yWJt2JbpnwN4ZHbLK9D8XWcN7vsm1XCKi+7CO4 CqKrXS+mIMCZqlipJgxb0ceecCawXVBC/ptnI/A8ke8reQNBT5TpH+dclCMdN/TntOaF AlxjI0ImvjFCKnqkEl34MBm2fJZcVOu/uf7swzArJMr3pPc2AwP9Eg2cHwv3KSEfei17 QRKjwu7Co80M8uEczHs1R2GAGpGAzGI1DhPaJ9EX4634bPU3SKDkDNvb7Xhbc0W6q6+Q fmugXyYnDhTPKU1ZMJjzxYuuNccbCxmuBXGh5zhz7Ki2XnXGkk5kzthJJNaA1npPuobr QAYA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=dxTzGtHYqlQcpu36qLXBI0c7awBAuGi1fRoe7WfYPfo=; b=MFZ6iYeJ/RiYhl/Y6nP36yROyUIjXE0mJUkOnLvwkKvSChzMK7x0pWodUDv4FU1hvX tlrroyqG8EAHDZAIO373J8aRVwDkxguw8znuIp9+0S+DQCnwq4QKfnjgxwbFhZXozTo4 fDKSe7+NySIkdR5SPQSds7C00keOTB6+XyHx98A8wuVBvTnpOywByjncXoRkjLcBy2pY 09tucQHIKf9LZOvN8cahuc6uYueukpj0s2UdryqGmb2FV4rjKTYYlde703UYAH+8FHfK 8D72FVdYRf5yf5U+kNfTbsAUSvsoWEz6UaPYT4r4ZVAsJc4gJBgeYekU7B7l3NxOlN6Y sVcg== X-Gm-Message-State: AOAM531cJvqUhjAqxWzzj9US+CI7XeYJ98uPEIjY51aQWuiLkVudDDZU LPMF14L7TLkkYYivgSzaSftlOzTf7Ns= X-Google-Smtp-Source: ABdhPJxd3XRrZosxH7154UPDCak+lIkpIviwOG5tO+l/JUw5pMWkxan0lqrJWE3p2K1y337GOgdFqw== X-Received: by 2002:a1c:a793:: with SMTP id q141mr32817728wme.157.1630335481025; Mon, 30 Aug 2021 07:58:01 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id b15sm17305245wru.1.2021.08.30.07.58.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:00 -0700 (PDT) Message-Id: <23977331cf801212df96084838df10c83712a8a5.1630335476.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:42 +0000 Subject: [PATCH 06/19] reftable: (de)serialization for the polymorphic record type. Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable format is structured as a sequence of blocks, and each block contains a sequence of prefix-compressed key-value records. There are 4 types of records, and they have similarities in how they must be handled. This is achieved by introducing a polymorphic 'record' type that encapsulates ref, log, index and object records. Signed-off-by: Han-Wen Nienhuys --- Makefile | 2 + reftable/constants.h | 21 + reftable/record.c | 1212 ++++++++++++++++++++++++++++++++++++ reftable/record.h | 139 +++++ reftable/record_test.c | 412 ++++++++++++ reftable/reftable-record.h | 114 ++++ t/helper/test-reftable.c | 2 +- 7 files changed, 1901 insertions(+), 1 deletion(-) create mode 100644 reftable/constants.h create mode 100644 reftable/record.c create mode 100644 reftable/record.h create mode 100644 reftable/record_test.c create mode 100644 reftable/reftable-record.h diff --git a/Makefile b/Makefile index 621ac53d09f..02a83a67467 100644 --- a/Makefile +++ b/Makefile @@ -2453,7 +2453,9 @@ REFTABLE_OBJS += reftable/basics.o REFTABLE_OBJS += reftable/error.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o +REFTABLE_OBJS += reftable/record.o +REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/basics_test.o diff --git a/reftable/constants.h b/reftable/constants.h new file mode 100644 index 00000000000..5eee72c4c11 --- /dev/null +++ b/reftable/constants.h @@ -0,0 +1,21 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef CONSTANTS_H +#define CONSTANTS_H + +#define BLOCK_TYPE_LOG 'g' +#define BLOCK_TYPE_INDEX 'i' +#define BLOCK_TYPE_REF 'r' +#define BLOCK_TYPE_OBJ 'o' +#define BLOCK_TYPE_ANY 0 + +#define MAX_RESTARTS ((1 << 16) - 1) +#define DEFAULT_BLOCK_SIZE 4096 + +#endif diff --git a/reftable/record.c b/reftable/record.c new file mode 100644 index 00000000000..6a5dac32dc6 --- /dev/null +++ b/reftable/record.c @@ -0,0 +1,1212 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +/* record.c - methods for different types of records. */ + +#include "record.h" + +#include "system.h" +#include "constants.h" +#include "reftable-error.h" +#include "basics.h" + +int get_var_int(uint64_t *dest, struct string_view *in) +{ + int ptr = 0; + uint64_t val; + + if (in->len == 0) + return -1; + val = in->buf[ptr] & 0x7f; + + while (in->buf[ptr] & 0x80) { + ptr++; + if (ptr > in->len) { + return -1; + } + val = (val + 1) << 7 | (uint64_t)(in->buf[ptr] & 0x7f); + } + + *dest = val; + return ptr + 1; +} + +int put_var_int(struct string_view *dest, uint64_t val) +{ + uint8_t buf[10] = { 0 }; + int i = 9; + int n = 0; + buf[i] = (uint8_t)(val & 0x7f); + i--; + while (1) { + val >>= 7; + if (!val) { + break; + } + val--; + buf[i] = 0x80 | (uint8_t)(val & 0x7f); + i--; + } + + n = sizeof(buf) - i - 1; + if (dest->len < n) + return -1; + memcpy(dest->buf, &buf[i + 1], n); + return n; +} + +int reftable_is_block_type(uint8_t typ) +{ + switch (typ) { + case BLOCK_TYPE_REF: + case BLOCK_TYPE_LOG: + case BLOCK_TYPE_OBJ: + case BLOCK_TYPE_INDEX: + return 1; + } + return 0; +} + +uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec) +{ + switch (rec->value_type) { + case REFTABLE_REF_VAL1: + return rec->value.val1; + case REFTABLE_REF_VAL2: + return rec->value.val2.value; + default: + return NULL; + } +} + +uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec) +{ + switch (rec->value_type) { + case REFTABLE_REF_VAL2: + return rec->value.val2.target_value; + default: + return NULL; + } +} + +static int decode_string(struct strbuf *dest, struct string_view in) +{ + int start_len = in.len; + uint64_t tsize = 0; + int n = get_var_int(&tsize, &in); + if (n <= 0) + return -1; + string_view_consume(&in, n); + if (in.len < tsize) + return -1; + + strbuf_reset(dest); + strbuf_add(dest, in.buf, tsize); + string_view_consume(&in, tsize); + + return start_len - in.len; +} + +static int encode_string(char *str, struct string_view s) +{ + struct string_view start = s; + int l = strlen(str); + int n = put_var_int(&s, l); + if (n < 0) + return -1; + string_view_consume(&s, n); + if (s.len < l) + return -1; + memcpy(s.buf, str, l); + string_view_consume(&s, l); + + return start.len - s.len; +} + +int reftable_encode_key(int *restart, struct string_view dest, + struct strbuf prev_key, struct strbuf key, + uint8_t extra) +{ + struct string_view start = dest; + int prefix_len = common_prefix_size(&prev_key, &key); + uint64_t suffix_len = key.len - prefix_len; + int n = put_var_int(&dest, (uint64_t)prefix_len); + if (n < 0) + return -1; + string_view_consume(&dest, n); + + *restart = (prefix_len == 0); + + n = put_var_int(&dest, suffix_len << 3 | (uint64_t)extra); + if (n < 0) + return -1; + string_view_consume(&dest, n); + + if (dest.len < suffix_len) + return -1; + memcpy(dest.buf, key.buf + prefix_len, suffix_len); + string_view_consume(&dest, suffix_len); + + return start.len - dest.len; +} + +int reftable_decode_key(struct strbuf *key, uint8_t *extra, + struct strbuf last_key, struct string_view in) +{ + int start_len = in.len; + uint64_t prefix_len = 0; + uint64_t suffix_len = 0; + int n = get_var_int(&prefix_len, &in); + if (n < 0) + return -1; + string_view_consume(&in, n); + + if (prefix_len > last_key.len) + return -1; + + n = get_var_int(&suffix_len, &in); + if (n <= 0) + return -1; + string_view_consume(&in, n); + + *extra = (uint8_t)(suffix_len & 0x7); + suffix_len >>= 3; + + if (in.len < suffix_len) + return -1; + + strbuf_reset(key); + strbuf_add(key, last_key.buf, prefix_len); + strbuf_add(key, in.buf, suffix_len); + string_view_consume(&in, suffix_len); + + return start_len - in.len; +} + +static void reftable_ref_record_key(const void *r, struct strbuf *dest) +{ + const struct reftable_ref_record *rec = + (const struct reftable_ref_record *)r; + strbuf_reset(dest); + strbuf_addstr(dest, rec->refname); +} + +static void reftable_ref_record_copy_from(void *rec, const void *src_rec, + int hash_size) +{ + struct reftable_ref_record *ref = rec; + const struct reftable_ref_record *src = src_rec; + assert(hash_size > 0); + + /* This is simple and correct, but we could probably reuse the hash + * fields. */ + reftable_ref_record_release(ref); + if (src->refname) { + ref->refname = xstrdup(src->refname); + } + ref->update_index = src->update_index; + ref->value_type = src->value_type; + switch (src->value_type) { + case REFTABLE_REF_DELETION: + break; + case REFTABLE_REF_VAL1: + ref->value.val1 = reftable_malloc(hash_size); + memcpy(ref->value.val1, src->value.val1, hash_size); + break; + case REFTABLE_REF_VAL2: + ref->value.val2.value = reftable_malloc(hash_size); + memcpy(ref->value.val2.value, src->value.val2.value, hash_size); + ref->value.val2.target_value = reftable_malloc(hash_size); + memcpy(ref->value.val2.target_value, + src->value.val2.target_value, hash_size); + break; + case REFTABLE_REF_SYMREF: + ref->value.symref = xstrdup(src->value.symref); + break; + } +} + +static char hexdigit(int c) +{ + if (c <= 9) + return '0' + c; + return 'a' + (c - 10); +} + +static void hex_format(char *dest, uint8_t *src, int hash_size) +{ + assert(hash_size > 0); + if (src) { + int i = 0; + for (i = 0; i < hash_size; i++) { + dest[2 * i] = hexdigit(src[i] >> 4); + dest[2 * i + 1] = hexdigit(src[i] & 0xf); + } + dest[2 * hash_size] = 0; + } +} + +void reftable_ref_record_print(struct reftable_ref_record *ref, + uint32_t hash_id) +{ + char hex[2 * GIT_SHA256_RAWSZ + 1] = { 0 }; /* BUG */ + printf("ref{%s(%" PRIu64 ") ", ref->refname, ref->update_index); + switch (ref->value_type) { + case REFTABLE_REF_SYMREF: + printf("=> %s", ref->value.symref); + break; + case REFTABLE_REF_VAL2: + hex_format(hex, ref->value.val2.value, hash_size(hash_id)); + printf("val 2 %s", hex); + hex_format(hex, ref->value.val2.target_value, + hash_size(hash_id)); + printf("(T %s)", hex); + break; + case REFTABLE_REF_VAL1: + hex_format(hex, ref->value.val1, hash_size(hash_id)); + printf("val 1 %s", hex); + break; + case REFTABLE_REF_DELETION: + printf("delete"); + break; + } + printf("}\n"); +} + +static void reftable_ref_record_release_void(void *rec) +{ + reftable_ref_record_release(rec); +} + +void reftable_ref_record_release(struct reftable_ref_record *ref) +{ + switch (ref->value_type) { + case REFTABLE_REF_SYMREF: + reftable_free(ref->value.symref); + break; + case REFTABLE_REF_VAL2: + reftable_free(ref->value.val2.target_value); + reftable_free(ref->value.val2.value); + break; + case REFTABLE_REF_VAL1: + reftable_free(ref->value.val1); + break; + case REFTABLE_REF_DELETION: + break; + default: + abort(); + } + + reftable_free(ref->refname); + memset(ref, 0, sizeof(struct reftable_ref_record)); +} + +static uint8_t reftable_ref_record_val_type(const void *rec) +{ + const struct reftable_ref_record *r = + (const struct reftable_ref_record *)rec; + return r->value_type; +} + +static int reftable_ref_record_encode(const void *rec, struct string_view s, + int hash_size) +{ + const struct reftable_ref_record *r = + (const struct reftable_ref_record *)rec; + struct string_view start = s; + int n = put_var_int(&s, r->update_index); + assert(hash_size > 0); + if (n < 0) + return -1; + string_view_consume(&s, n); + + switch (r->value_type) { + case REFTABLE_REF_SYMREF: + n = encode_string(r->value.symref, s); + if (n < 0) { + return -1; + } + string_view_consume(&s, n); + break; + case REFTABLE_REF_VAL2: + if (s.len < 2 * hash_size) { + return -1; + } + memcpy(s.buf, r->value.val2.value, hash_size); + string_view_consume(&s, hash_size); + memcpy(s.buf, r->value.val2.target_value, hash_size); + string_view_consume(&s, hash_size); + break; + case REFTABLE_REF_VAL1: + if (s.len < hash_size) { + return -1; + } + memcpy(s.buf, r->value.val1, hash_size); + string_view_consume(&s, hash_size); + break; + case REFTABLE_REF_DELETION: + break; + default: + abort(); + } + + return start.len - s.len; +} + +static int reftable_ref_record_decode(void *rec, struct strbuf key, + uint8_t val_type, struct string_view in, + int hash_size) +{ + struct reftable_ref_record *r = rec; + struct string_view start = in; + uint64_t update_index = 0; + int n = get_var_int(&update_index, &in); + if (n < 0) + return n; + string_view_consume(&in, n); + + reftable_ref_record_release(r); + + assert(hash_size > 0); + + r->refname = reftable_realloc(r->refname, key.len + 1); + memcpy(r->refname, key.buf, key.len); + r->update_index = update_index; + r->refname[key.len] = 0; + r->value_type = val_type; + switch (val_type) { + case REFTABLE_REF_VAL1: + if (in.len < hash_size) { + return -1; + } + + r->value.val1 = reftable_malloc(hash_size); + memcpy(r->value.val1, in.buf, hash_size); + string_view_consume(&in, hash_size); + break; + + case REFTABLE_REF_VAL2: + if (in.len < 2 * hash_size) { + return -1; + } + + r->value.val2.value = reftable_malloc(hash_size); + memcpy(r->value.val2.value, in.buf, hash_size); + string_view_consume(&in, hash_size); + + r->value.val2.target_value = reftable_malloc(hash_size); + memcpy(r->value.val2.target_value, in.buf, hash_size); + string_view_consume(&in, hash_size); + break; + + case REFTABLE_REF_SYMREF: { + struct strbuf dest = STRBUF_INIT; + int n = decode_string(&dest, in); + if (n < 0) { + return -1; + } + string_view_consume(&in, n); + r->value.symref = dest.buf; + } break; + + case REFTABLE_REF_DELETION: + break; + default: + abort(); + break; + } + + return start.len - in.len; +} + +static int reftable_ref_record_is_deletion_void(const void *p) +{ + return reftable_ref_record_is_deletion( + (const struct reftable_ref_record *)p); +} + +static struct reftable_record_vtable reftable_ref_record_vtable = { + .key = &reftable_ref_record_key, + .type = BLOCK_TYPE_REF, + .copy_from = &reftable_ref_record_copy_from, + .val_type = &reftable_ref_record_val_type, + .encode = &reftable_ref_record_encode, + .decode = &reftable_ref_record_decode, + .release = &reftable_ref_record_release_void, + .is_deletion = &reftable_ref_record_is_deletion_void, +}; + +static void reftable_obj_record_key(const void *r, struct strbuf *dest) +{ + const struct reftable_obj_record *rec = + (const struct reftable_obj_record *)r; + strbuf_reset(dest); + strbuf_add(dest, rec->hash_prefix, rec->hash_prefix_len); +} + +static void reftable_obj_record_release(void *rec) +{ + struct reftable_obj_record *obj = rec; + FREE_AND_NULL(obj->hash_prefix); + FREE_AND_NULL(obj->offsets); + memset(obj, 0, sizeof(struct reftable_obj_record)); +} + +static void reftable_obj_record_copy_from(void *rec, const void *src_rec, + int hash_size) +{ + struct reftable_obj_record *obj = rec; + const struct reftable_obj_record *src = + (const struct reftable_obj_record *)src_rec; + + reftable_obj_record_release(obj); + *obj = *src; + obj->hash_prefix = reftable_malloc(obj->hash_prefix_len); + memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len); + + obj->offsets = reftable_malloc(obj->offset_len * sizeof(uint64_t)); + COPY_ARRAY(obj->offsets, src->offsets, obj->offset_len); +} + +static uint8_t reftable_obj_record_val_type(const void *rec) +{ + const struct reftable_obj_record *r = rec; + if (r->offset_len > 0 && r->offset_len < 8) + return r->offset_len; + return 0; +} + +static int reftable_obj_record_encode(const void *rec, struct string_view s, + int hash_size) +{ + const struct reftable_obj_record *r = rec; + struct string_view start = s; + int i = 0; + int n = 0; + uint64_t last = 0; + if (r->offset_len == 0 || r->offset_len >= 8) { + n = put_var_int(&s, r->offset_len); + if (n < 0) { + return -1; + } + string_view_consume(&s, n); + } + if (r->offset_len == 0) + return start.len - s.len; + n = put_var_int(&s, r->offsets[0]); + if (n < 0) + return -1; + string_view_consume(&s, n); + + last = r->offsets[0]; + for (i = 1; i < r->offset_len; i++) { + int n = put_var_int(&s, r->offsets[i] - last); + if (n < 0) { + return -1; + } + string_view_consume(&s, n); + last = r->offsets[i]; + } + return start.len - s.len; +} + +static int reftable_obj_record_decode(void *rec, struct strbuf key, + uint8_t val_type, struct string_view in, + int hash_size) +{ + struct string_view start = in; + struct reftable_obj_record *r = rec; + uint64_t count = val_type; + int n = 0; + uint64_t last; + int j; + r->hash_prefix = reftable_malloc(key.len); + memcpy(r->hash_prefix, key.buf, key.len); + r->hash_prefix_len = key.len; + + if (val_type == 0) { + n = get_var_int(&count, &in); + if (n < 0) { + return n; + } + + string_view_consume(&in, n); + } + + r->offsets = NULL; + r->offset_len = 0; + if (count == 0) + return start.len - in.len; + + r->offsets = reftable_malloc(count * sizeof(uint64_t)); + r->offset_len = count; + + n = get_var_int(&r->offsets[0], &in); + if (n < 0) + return n; + string_view_consume(&in, n); + + last = r->offsets[0]; + j = 1; + while (j < count) { + uint64_t delta = 0; + int n = get_var_int(&delta, &in); + if (n < 0) { + return n; + } + string_view_consume(&in, n); + + last = r->offsets[j] = (delta + last); + j++; + } + return start.len - in.len; +} + +static int not_a_deletion(const void *p) +{ + return 0; +} + +static struct reftable_record_vtable reftable_obj_record_vtable = { + .key = &reftable_obj_record_key, + .type = BLOCK_TYPE_OBJ, + .copy_from = &reftable_obj_record_copy_from, + .val_type = &reftable_obj_record_val_type, + .encode = &reftable_obj_record_encode, + .decode = &reftable_obj_record_decode, + .release = &reftable_obj_record_release, + .is_deletion = not_a_deletion, +}; + +void reftable_log_record_print(struct reftable_log_record *log, + uint32_t hash_id) +{ + char hex[GIT_SHA256_RAWSZ + 1] = { 0 }; + + switch (log->value_type) { + case REFTABLE_LOG_DELETION: + printf("log{%s(%" PRIu64 ") delete", log->refname, + log->update_index); + break; + case REFTABLE_LOG_UPDATE: + printf("log{%s(%" PRIu64 ") %s <%s> %" PRIu64 " %04d\n", + log->refname, log->update_index, log->value.update.name, + log->value.update.email, log->value.update.time, + log->value.update.tz_offset); + hex_format(hex, log->value.update.old_hash, hash_size(hash_id)); + printf("%s => ", hex); + hex_format(hex, log->value.update.new_hash, hash_size(hash_id)); + printf("%s\n\n%s\n}\n", hex, log->value.update.message); + break; + } +} + +static void reftable_log_record_key(const void *r, struct strbuf *dest) +{ + const struct reftable_log_record *rec = + (const struct reftable_log_record *)r; + int len = strlen(rec->refname); + uint8_t i64[8]; + uint64_t ts = 0; + strbuf_reset(dest); + strbuf_add(dest, (uint8_t *)rec->refname, len + 1); + + ts = (~ts) - rec->update_index; + put_be64(&i64[0], ts); + strbuf_add(dest, i64, sizeof(i64)); +} + +static void reftable_log_record_copy_from(void *rec, const void *src_rec, + int hash_size) +{ + struct reftable_log_record *dst = rec; + const struct reftable_log_record *src = + (const struct reftable_log_record *)src_rec; + + reftable_log_record_release(dst); + *dst = *src; + if (dst->refname) { + dst->refname = xstrdup(dst->refname); + } + switch (dst->value_type) { + case REFTABLE_LOG_DELETION: + break; + case REFTABLE_LOG_UPDATE: + if (dst->value.update.email) { + dst->value.update.email = + xstrdup(dst->value.update.email); + } + if (dst->value.update.name) { + dst->value.update.name = + xstrdup(dst->value.update.name); + } + if (dst->value.update.message) { + dst->value.update.message = + xstrdup(dst->value.update.message); + } + + if (dst->value.update.new_hash) { + dst->value.update.new_hash = reftable_malloc(hash_size); + memcpy(dst->value.update.new_hash, + src->value.update.new_hash, hash_size); + } + if (dst->value.update.old_hash) { + dst->value.update.old_hash = reftable_malloc(hash_size); + memcpy(dst->value.update.old_hash, + src->value.update.old_hash, hash_size); + } + break; + } +} + +static void reftable_log_record_release_void(void *rec) +{ + struct reftable_log_record *r = rec; + reftable_log_record_release(r); +} + +void reftable_log_record_release(struct reftable_log_record *r) +{ + reftable_free(r->refname); + switch (r->value_type) { + case REFTABLE_LOG_DELETION: + break; + case REFTABLE_LOG_UPDATE: + reftable_free(r->value.update.new_hash); + reftable_free(r->value.update.old_hash); + reftable_free(r->value.update.name); + reftable_free(r->value.update.email); + reftable_free(r->value.update.message); + break; + } + memset(r, 0, sizeof(struct reftable_log_record)); +} + +static uint8_t reftable_log_record_val_type(const void *rec) +{ + const struct reftable_log_record *log = + (const struct reftable_log_record *)rec; + + return reftable_log_record_is_deletion(log) ? 0 : 1; +} + +static uint8_t zero[GIT_SHA256_RAWSZ] = { 0 }; + +static int reftable_log_record_encode(const void *rec, struct string_view s, + int hash_size) +{ + const struct reftable_log_record *r = rec; + struct string_view start = s; + int n = 0; + uint8_t *oldh = NULL; + uint8_t *newh = NULL; + if (reftable_log_record_is_deletion(r)) + return 0; + + oldh = r->value.update.old_hash; + newh = r->value.update.new_hash; + if (!oldh) { + oldh = zero; + } + if (!newh) { + newh = zero; + } + + if (s.len < 2 * hash_size) + return -1; + + memcpy(s.buf, oldh, hash_size); + memcpy(s.buf + hash_size, newh, hash_size); + string_view_consume(&s, 2 * hash_size); + + n = encode_string(r->value.update.name ? r->value.update.name : "", s); + if (n < 0) + return -1; + string_view_consume(&s, n); + + n = encode_string(r->value.update.email ? r->value.update.email : "", + s); + if (n < 0) + return -1; + string_view_consume(&s, n); + + n = put_var_int(&s, r->value.update.time); + if (n < 0) + return -1; + string_view_consume(&s, n); + + if (s.len < 2) + return -1; + + put_be16(s.buf, r->value.update.tz_offset); + string_view_consume(&s, 2); + + n = encode_string( + r->value.update.message ? r->value.update.message : "", s); + if (n < 0) + return -1; + string_view_consume(&s, n); + + return start.len - s.len; +} + +static int reftable_log_record_decode(void *rec, struct strbuf key, + uint8_t val_type, struct string_view in, + int hash_size) +{ + struct string_view start = in; + struct reftable_log_record *r = rec; + uint64_t max = 0; + uint64_t ts = 0; + struct strbuf dest = STRBUF_INIT; + int n; + + if (key.len <= 9 || key.buf[key.len - 9] != 0) + return REFTABLE_FORMAT_ERROR; + + r->refname = reftable_realloc(r->refname, key.len - 8); + memcpy(r->refname, key.buf, key.len - 8); + ts = get_be64(key.buf + key.len - 8); + + r->update_index = (~max) - ts; + + if (val_type != r->value_type) { + switch (r->value_type) { + case REFTABLE_LOG_UPDATE: + FREE_AND_NULL(r->value.update.old_hash); + FREE_AND_NULL(r->value.update.new_hash); + FREE_AND_NULL(r->value.update.message); + FREE_AND_NULL(r->value.update.email); + FREE_AND_NULL(r->value.update.name); + break; + case REFTABLE_LOG_DELETION: + break; + } + } + + r->value_type = val_type; + if (val_type == REFTABLE_LOG_DELETION) + return 0; + + if (in.len < 2 * hash_size) + return REFTABLE_FORMAT_ERROR; + + r->value.update.old_hash = + reftable_realloc(r->value.update.old_hash, hash_size); + r->value.update.new_hash = + reftable_realloc(r->value.update.new_hash, hash_size); + + memcpy(r->value.update.old_hash, in.buf, hash_size); + memcpy(r->value.update.new_hash, in.buf + hash_size, hash_size); + + string_view_consume(&in, 2 * hash_size); + + n = decode_string(&dest, in); + if (n < 0) + goto done; + string_view_consume(&in, n); + + r->value.update.name = + reftable_realloc(r->value.update.name, dest.len + 1); + memcpy(r->value.update.name, dest.buf, dest.len); + r->value.update.name[dest.len] = 0; + + strbuf_reset(&dest); + n = decode_string(&dest, in); + if (n < 0) + goto done; + string_view_consume(&in, n); + + r->value.update.email = + reftable_realloc(r->value.update.email, dest.len + 1); + memcpy(r->value.update.email, dest.buf, dest.len); + r->value.update.email[dest.len] = 0; + + ts = 0; + n = get_var_int(&ts, &in); + if (n < 0) + goto done; + string_view_consume(&in, n); + r->value.update.time = ts; + if (in.len < 2) + goto done; + + r->value.update.tz_offset = get_be16(in.buf); + string_view_consume(&in, 2); + + strbuf_reset(&dest); + n = decode_string(&dest, in); + if (n < 0) + goto done; + string_view_consume(&in, n); + + r->value.update.message = + reftable_realloc(r->value.update.message, dest.len + 1); + memcpy(r->value.update.message, dest.buf, dest.len); + r->value.update.message[dest.len] = 0; + + strbuf_release(&dest); + return start.len - in.len; + +done: + strbuf_release(&dest); + return REFTABLE_FORMAT_ERROR; +} + +static int null_streq(char *a, char *b) +{ + char *empty = ""; + if (!a) + a = empty; + + if (!b) + b = empty; + + return 0 == strcmp(a, b); +} + +static int zero_hash_eq(uint8_t *a, uint8_t *b, int sz) +{ + if (!a) + a = zero; + + if (!b) + b = zero; + + return !memcmp(a, b, sz); +} + +int reftable_log_record_equal(struct reftable_log_record *a, + struct reftable_log_record *b, int hash_size) +{ + if (!(null_streq(a->refname, b->refname) && + a->update_index == b->update_index && + a->value_type == b->value_type)) + return 0; + + switch (a->value_type) { + case REFTABLE_LOG_DELETION: + return 1; + case REFTABLE_LOG_UPDATE: + return null_streq(a->value.update.name, b->value.update.name) && + a->value.update.time == b->value.update.time && + a->value.update.tz_offset == b->value.update.tz_offset && + null_streq(a->value.update.email, + b->value.update.email) && + null_streq(a->value.update.message, + b->value.update.message) && + zero_hash_eq(a->value.update.old_hash, + b->value.update.old_hash, hash_size) && + zero_hash_eq(a->value.update.new_hash, + b->value.update.new_hash, hash_size); + } + + abort(); +} + +static int reftable_log_record_is_deletion_void(const void *p) +{ + return reftable_log_record_is_deletion( + (const struct reftable_log_record *)p); +} + +static struct reftable_record_vtable reftable_log_record_vtable = { + .key = &reftable_log_record_key, + .type = BLOCK_TYPE_LOG, + .copy_from = &reftable_log_record_copy_from, + .val_type = &reftable_log_record_val_type, + .encode = &reftable_log_record_encode, + .decode = &reftable_log_record_decode, + .release = &reftable_log_record_release_void, + .is_deletion = &reftable_log_record_is_deletion_void, +}; + +struct reftable_record reftable_new_record(uint8_t typ) +{ + struct reftable_record rec = { NULL }; + switch (typ) { + case BLOCK_TYPE_REF: { + struct reftable_ref_record *r = + reftable_calloc(sizeof(struct reftable_ref_record)); + reftable_record_from_ref(&rec, r); + return rec; + } + + case BLOCK_TYPE_OBJ: { + struct reftable_obj_record *r = + reftable_calloc(sizeof(struct reftable_obj_record)); + reftable_record_from_obj(&rec, r); + return rec; + } + case BLOCK_TYPE_LOG: { + struct reftable_log_record *r = + reftable_calloc(sizeof(struct reftable_log_record)); + reftable_record_from_log(&rec, r); + return rec; + } + case BLOCK_TYPE_INDEX: { + struct reftable_index_record empty = { .last_key = + STRBUF_INIT }; + struct reftable_index_record *r = + reftable_calloc(sizeof(struct reftable_index_record)); + *r = empty; + reftable_record_from_index(&rec, r); + return rec; + } + } + abort(); + return rec; +} + +/* clear out the record, yielding the reftable_record data that was + * encapsulated. */ +static void *reftable_record_yield(struct reftable_record *rec) +{ + void *p = rec->data; + rec->data = NULL; + return p; +} + +void reftable_record_destroy(struct reftable_record *rec) +{ + reftable_record_release(rec); + reftable_free(reftable_record_yield(rec)); +} + +static void reftable_index_record_key(const void *r, struct strbuf *dest) +{ + const struct reftable_index_record *rec = r; + strbuf_reset(dest); + strbuf_addbuf(dest, &rec->last_key); +} + +static void reftable_index_record_copy_from(void *rec, const void *src_rec, + int hash_size) +{ + struct reftable_index_record *dst = rec; + const struct reftable_index_record *src = src_rec; + + strbuf_reset(&dst->last_key); + strbuf_addbuf(&dst->last_key, &src->last_key); + dst->offset = src->offset; +} + +static void reftable_index_record_release(void *rec) +{ + struct reftable_index_record *idx = rec; + strbuf_release(&idx->last_key); +} + +static uint8_t reftable_index_record_val_type(const void *rec) +{ + return 0; +} + +static int reftable_index_record_encode(const void *rec, struct string_view out, + int hash_size) +{ + const struct reftable_index_record *r = + (const struct reftable_index_record *)rec; + struct string_view start = out; + + int n = put_var_int(&out, r->offset); + if (n < 0) + return n; + + string_view_consume(&out, n); + + return start.len - out.len; +} + +static int reftable_index_record_decode(void *rec, struct strbuf key, + uint8_t val_type, struct string_view in, + int hash_size) +{ + struct string_view start = in; + struct reftable_index_record *r = rec; + int n = 0; + + strbuf_reset(&r->last_key); + strbuf_addbuf(&r->last_key, &key); + + n = get_var_int(&r->offset, &in); + if (n < 0) + return n; + + string_view_consume(&in, n); + return start.len - in.len; +} + +static struct reftable_record_vtable reftable_index_record_vtable = { + .key = &reftable_index_record_key, + .type = BLOCK_TYPE_INDEX, + .copy_from = &reftable_index_record_copy_from, + .val_type = &reftable_index_record_val_type, + .encode = &reftable_index_record_encode, + .decode = &reftable_index_record_decode, + .release = &reftable_index_record_release, + .is_deletion = ¬_a_deletion, +}; + +void reftable_record_key(struct reftable_record *rec, struct strbuf *dest) +{ + rec->ops->key(rec->data, dest); +} + +uint8_t reftable_record_type(struct reftable_record *rec) +{ + return rec->ops->type; +} + +int reftable_record_encode(struct reftable_record *rec, struct string_view dest, + int hash_size) +{ + return rec->ops->encode(rec->data, dest, hash_size); +} + +void reftable_record_copy_from(struct reftable_record *rec, + struct reftable_record *src, int hash_size) +{ + assert(src->ops->type == rec->ops->type); + + rec->ops->copy_from(rec->data, src->data, hash_size); +} + +uint8_t reftable_record_val_type(struct reftable_record *rec) +{ + return rec->ops->val_type(rec->data); +} + +int reftable_record_decode(struct reftable_record *rec, struct strbuf key, + uint8_t extra, struct string_view src, int hash_size) +{ + return rec->ops->decode(rec->data, key, extra, src, hash_size); +} + +void reftable_record_release(struct reftable_record *rec) +{ + rec->ops->release(rec->data); +} + +int reftable_record_is_deletion(struct reftable_record *rec) +{ + return rec->ops->is_deletion(rec->data); +} + +void reftable_record_from_ref(struct reftable_record *rec, + struct reftable_ref_record *ref_rec) +{ + assert(!rec->ops); + rec->data = ref_rec; + rec->ops = &reftable_ref_record_vtable; +} + +void reftable_record_from_obj(struct reftable_record *rec, + struct reftable_obj_record *obj_rec) +{ + assert(!rec->ops); + rec->data = obj_rec; + rec->ops = &reftable_obj_record_vtable; +} + +void reftable_record_from_index(struct reftable_record *rec, + struct reftable_index_record *index_rec) +{ + assert(!rec->ops); + rec->data = index_rec; + rec->ops = &reftable_index_record_vtable; +} + +void reftable_record_from_log(struct reftable_record *rec, + struct reftable_log_record *log_rec) +{ + assert(!rec->ops); + rec->data = log_rec; + rec->ops = &reftable_log_record_vtable; +} + +struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *rec) +{ + assert(reftable_record_type(rec) == BLOCK_TYPE_REF); + return rec->data; +} + +struct reftable_log_record *reftable_record_as_log(struct reftable_record *rec) +{ + assert(reftable_record_type(rec) == BLOCK_TYPE_LOG); + return rec->data; +} + +static int hash_equal(uint8_t *a, uint8_t *b, int hash_size) +{ + if (a && b) + return !memcmp(a, b, hash_size); + + return a == b; +} + +int reftable_ref_record_equal(struct reftable_ref_record *a, + struct reftable_ref_record *b, int hash_size) +{ + assert(hash_size > 0); + if (!(0 == strcmp(a->refname, b->refname) && + a->update_index == b->update_index && + a->value_type == b->value_type)) + return 0; + + switch (a->value_type) { + case REFTABLE_REF_SYMREF: + return !strcmp(a->value.symref, b->value.symref); + case REFTABLE_REF_VAL2: + return hash_equal(a->value.val2.value, b->value.val2.value, + hash_size) && + hash_equal(a->value.val2.target_value, + b->value.val2.target_value, hash_size); + case REFTABLE_REF_VAL1: + return hash_equal(a->value.val1, b->value.val1, hash_size); + case REFTABLE_REF_DELETION: + return 1; + default: + abort(); + } +} + +int reftable_ref_record_compare_name(const void *a, const void *b) +{ + return strcmp(((struct reftable_ref_record *)a)->refname, + ((struct reftable_ref_record *)b)->refname); +} + +int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref) +{ + return ref->value_type == REFTABLE_REF_DELETION; +} + +int reftable_log_record_compare_key(const void *a, const void *b) +{ + const struct reftable_log_record *la = a; + const struct reftable_log_record *lb = b; + + int cmp = strcmp(la->refname, lb->refname); + if (cmp) + return cmp; + if (la->update_index > lb->update_index) + return -1; + return (la->update_index < lb->update_index) ? 1 : 0; +} + +int reftable_log_record_is_deletion(const struct reftable_log_record *log) +{ + return (log->value_type == REFTABLE_LOG_DELETION); +} + +void string_view_consume(struct string_view *s, int n) +{ + s->buf += n; + s->len -= n; +} diff --git a/reftable/record.h b/reftable/record.h new file mode 100644 index 00000000000..498e8c50bf4 --- /dev/null +++ b/reftable/record.h @@ -0,0 +1,139 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef RECORD_H +#define RECORD_H + +#include "system.h" + +#include + +#include "reftable-record.h" + +/* + * A substring of existing string data. This structure takes no responsibility + * for the lifetime of the data it points to. + */ +struct string_view { + uint8_t *buf; + size_t len; +}; + +/* Advance `s.buf` by `n`, and decrease length. */ +void string_view_consume(struct string_view *s, int n); + +/* utilities for de/encoding varints */ + +int get_var_int(uint64_t *dest, struct string_view *in); +int put_var_int(struct string_view *dest, uint64_t val); + +/* Methods for records. */ +struct reftable_record_vtable { + /* encode the key of to a uint8_t strbuf. */ + void (*key)(const void *rec, struct strbuf *dest); + + /* The record type of ('r' for ref). */ + uint8_t type; + + void (*copy_from)(void *dest, const void *src, int hash_size); + + /* a value of [0..7], indicating record subvariants (eg. ref vs. symref + * vs ref deletion) */ + uint8_t (*val_type)(const void *rec); + + /* encodes rec into dest, returning how much space was used. */ + int (*encode)(const void *rec, struct string_view dest, int hash_size); + + /* decode data from `src` into the record. */ + int (*decode)(void *rec, struct strbuf key, uint8_t extra, + struct string_view src, int hash_size); + + /* deallocate and null the record. */ + void (*release)(void *rec); + + /* is this a tombstone? */ + int (*is_deletion)(const void *rec); +}; + +/* record is a generic wrapper for different types of records. */ +struct reftable_record { + void *data; + struct reftable_record_vtable *ops; +}; + +/* returns true for recognized block types. Block start with the block type. */ +int reftable_is_block_type(uint8_t typ); + +/* creates a malloced record of the given type. Dispose with record_destroy */ +struct reftable_record reftable_new_record(uint8_t typ); + +/* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns + * number of bytes written. */ +int reftable_encode_key(int *is_restart, struct string_view dest, + struct strbuf prev_key, struct strbuf key, + uint8_t extra); + +/* Decode into `key` and `extra` from `in` */ +int reftable_decode_key(struct strbuf *key, uint8_t *extra, + struct strbuf last_key, struct string_view in); + +/* reftable_index_record are used internally to speed up lookups. */ +struct reftable_index_record { + uint64_t offset; /* Offset of block */ + struct strbuf last_key; /* Last key of the block. */ +}; + +/* reftable_obj_record stores an object ID => ref mapping. */ +struct reftable_obj_record { + uint8_t *hash_prefix; /* leading bytes of the object ID */ + int hash_prefix_len; /* number of leading bytes. Constant + * across a single table. */ + uint64_t *offsets; /* a vector of file offsets. */ + int offset_len; +}; + +/* see struct record_vtable */ + +void reftable_record_key(struct reftable_record *rec, struct strbuf *dest); +uint8_t reftable_record_type(struct reftable_record *rec); +void reftable_record_copy_from(struct reftable_record *rec, + struct reftable_record *src, int hash_size); +uint8_t reftable_record_val_type(struct reftable_record *rec); +int reftable_record_encode(struct reftable_record *rec, struct string_view dest, + int hash_size); +int reftable_record_decode(struct reftable_record *rec, struct strbuf key, + uint8_t extra, struct string_view src, + int hash_size); +int reftable_record_is_deletion(struct reftable_record *rec); + +/* zeroes out the embedded record */ +void reftable_record_release(struct reftable_record *rec); + +/* clear and deallocate embedded record, and zero `rec`. */ +void reftable_record_destroy(struct reftable_record *rec); + +/* initialize generic records from concrete records. The generic record should + * be zeroed out. */ +void reftable_record_from_obj(struct reftable_record *rec, + struct reftable_obj_record *objrec); +void reftable_record_from_index(struct reftable_record *rec, + struct reftable_index_record *idxrec); +void reftable_record_from_ref(struct reftable_record *rec, + struct reftable_ref_record *refrec); +void reftable_record_from_log(struct reftable_record *rec, + struct reftable_log_record *logrec); +struct reftable_ref_record *reftable_record_as_ref(struct reftable_record *ref); +struct reftable_log_record *reftable_record_as_log(struct reftable_record *ref); + +/* for qsort. */ +int reftable_ref_record_compare_name(const void *a, const void *b); + +/* for qsort. */ +int reftable_log_record_compare_key(const void *a, const void *b); + +#endif diff --git a/reftable/record_test.c b/reftable/record_test.c new file mode 100644 index 00000000000..f4ad7cace41 --- /dev/null +++ b/reftable/record_test.c @@ -0,0 +1,412 @@ +/* + Copyright 2020 Google LLC + + Use of this source code is governed by a BSD-style + license that can be found in the LICENSE file or at + https://developers.google.com/open-source/licenses/bsd +*/ + +#include "record.h" + +#include "system.h" +#include "basics.h" +#include "constants.h" +#include "test_framework.h" +#include "reftable-tests.h" + +static void test_copy(struct reftable_record *rec) +{ + struct reftable_record copy = + reftable_new_record(reftable_record_type(rec)); + reftable_record_copy_from(©, rec, GIT_SHA1_RAWSZ); + /* do it twice to catch memory leaks */ + reftable_record_copy_from(©, rec, GIT_SHA1_RAWSZ); + switch (reftable_record_type(©)) { + case BLOCK_TYPE_REF: + EXPECT(reftable_ref_record_equal(reftable_record_as_ref(©), + reftable_record_as_ref(rec), + GIT_SHA1_RAWSZ)); + break; + case BLOCK_TYPE_LOG: + EXPECT(reftable_log_record_equal(reftable_record_as_log(©), + reftable_record_as_log(rec), + GIT_SHA1_RAWSZ)); + break; + } + reftable_record_destroy(©); +} + +static void test_varint_roundtrip(void) +{ + uint64_t inputs[] = { 0, + 1, + 27, + 127, + 128, + 257, + 4096, + ((uint64_t)1 << 63), + ((uint64_t)1 << 63) + ((uint64_t)1 << 63) - 1 }; + int i = 0; + for (i = 0; i < ARRAY_SIZE(inputs); i++) { + uint8_t dest[10]; + + struct string_view out = { + .buf = dest, + .len = sizeof(dest), + }; + uint64_t in = inputs[i]; + int n = put_var_int(&out, in); + uint64_t got = 0; + + EXPECT(n > 0); + out.len = n; + n = get_var_int(&got, &out); + EXPECT(n > 0); + + EXPECT(got == in); + } +} + +static void test_common_prefix(void) +{ + struct { + const char *a, *b; + int want; + } cases[] = { + { "abc", "ab", 2 }, + { "", "abc", 0 }, + { "abc", "abd", 2 }, + { "abc", "pqr", 0 }, + }; + + int i = 0; + for (i = 0; i < ARRAY_SIZE(cases); i++) { + struct strbuf a = STRBUF_INIT; + struct strbuf b = STRBUF_INIT; + strbuf_addstr(&a, cases[i].a); + strbuf_addstr(&b, cases[i].b); + EXPECT(common_prefix_size(&a, &b) == cases[i].want); + + strbuf_release(&a); + strbuf_release(&b); + } +} + +static void set_hash(uint8_t *h, int j) +{ + int i = 0; + for (i = 0; i < hash_size(GIT_SHA1_FORMAT_ID); i++) { + h[i] = (j >> i) & 0xff; + } +} + +static void test_reftable_ref_record_roundtrip(void) +{ + int i = 0; + + for (i = REFTABLE_REF_DELETION; i < REFTABLE_NR_REF_VALUETYPES; i++) { + struct reftable_ref_record in = { NULL }; + struct reftable_ref_record out = { NULL }; + struct reftable_record rec_out = { NULL }; + struct strbuf key = STRBUF_INIT; + struct reftable_record rec = { NULL }; + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + + int n, m; + + in.value_type = i; + switch (i) { + case REFTABLE_REF_DELETION: + break; + case REFTABLE_REF_VAL1: + in.value.val1 = reftable_malloc(GIT_SHA1_RAWSZ); + set_hash(in.value.val1, 1); + break; + case REFTABLE_REF_VAL2: + in.value.val2.value = reftable_malloc(GIT_SHA1_RAWSZ); + set_hash(in.value.val2.value, 1); + in.value.val2.target_value = + reftable_malloc(GIT_SHA1_RAWSZ); + set_hash(in.value.val2.target_value, 2); + break; + case REFTABLE_REF_SYMREF: + in.value.symref = xstrdup("target"); + break; + } + in.refname = xstrdup("refs/heads/master"); + + reftable_record_from_ref(&rec, &in); + test_copy(&rec); + + EXPECT(reftable_record_val_type(&rec) == i); + + reftable_record_key(&rec, &key); + n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ); + EXPECT(n > 0); + + /* decode into a non-zero reftable_record to test for leaks. */ + + reftable_record_from_ref(&rec_out, &out); + m = reftable_record_decode(&rec_out, key, i, dest, + GIT_SHA1_RAWSZ); + EXPECT(n == m); + + EXPECT(reftable_ref_record_equal(&in, &out, GIT_SHA1_RAWSZ)); + reftable_record_release(&rec_out); + + strbuf_release(&key); + reftable_ref_record_release(&in); + } +} + +static void test_reftable_log_record_equal(void) +{ + struct reftable_log_record in[2] = { + { + .refname = xstrdup("refs/heads/master"), + .update_index = 42, + }, + { + .refname = xstrdup("refs/heads/master"), + .update_index = 22, + } + }; + + EXPECT(!reftable_log_record_equal(&in[0], &in[1], GIT_SHA1_RAWSZ)); + in[1].update_index = in[0].update_index; + EXPECT(reftable_log_record_equal(&in[0], &in[1], GIT_SHA1_RAWSZ)); + reftable_log_record_release(&in[0]); + reftable_log_record_release(&in[1]); +} + +static void test_reftable_log_record_roundtrip(void) +{ + int i; + struct reftable_log_record in[2] = { + { + .refname = xstrdup("refs/heads/master"), + .update_index = 42, + .value_type = REFTABLE_LOG_UPDATE, + .value = { + .update = { + .old_hash = reftable_malloc(GIT_SHA1_RAWSZ), + .new_hash = reftable_malloc(GIT_SHA1_RAWSZ), + .name = xstrdup("han-wen"), + .email = xstrdup("hanwen@google.com"), + .message = xstrdup("test"), + .time = 1577123507, + .tz_offset = 100, + }, + } + }, + { + .refname = xstrdup("refs/heads/master"), + .update_index = 22, + .value_type = REFTABLE_LOG_DELETION, + } + }; + set_test_hash(in[0].value.update.new_hash, 1); + set_test_hash(in[0].value.update.old_hash, 2); + for (i = 0; i < ARRAY_SIZE(in); i++) { + struct reftable_record rec = { NULL }; + struct strbuf key = STRBUF_INIT; + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + /* populate out, to check for leaks. */ + struct reftable_log_record out = { + .refname = xstrdup("old name"), + .value_type = REFTABLE_LOG_UPDATE, + .value = { + .update = { + .new_hash = reftable_calloc(GIT_SHA1_RAWSZ), + .old_hash = reftable_calloc(GIT_SHA1_RAWSZ), + .name = xstrdup("old name"), + .email = xstrdup("old@email"), + .message = xstrdup("old message"), + }, + }, + }; + struct reftable_record rec_out = { NULL }; + int n, m, valtype; + + reftable_record_from_log(&rec, &in[i]); + + test_copy(&rec); + + reftable_record_key(&rec, &key); + + n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ); + EXPECT(n >= 0); + reftable_record_from_log(&rec_out, &out); + valtype = reftable_record_val_type(&rec); + m = reftable_record_decode(&rec_out, key, valtype, dest, + GIT_SHA1_RAWSZ); + EXPECT(n == m); + + EXPECT(reftable_log_record_equal(&in[i], &out, GIT_SHA1_RAWSZ)); + reftable_log_record_release(&in[i]); + strbuf_release(&key); + reftable_record_release(&rec_out); + } +} + +static void test_u24_roundtrip(void) +{ + uint32_t in = 0x112233; + uint8_t dest[3]; + uint32_t out; + put_be24(dest, in); + out = get_be24(dest); + EXPECT(in == out); +} + +static void test_key_roundtrip(void) +{ + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + struct strbuf last_key = STRBUF_INIT; + struct strbuf key = STRBUF_INIT; + struct strbuf roundtrip = STRBUF_INIT; + int restart; + uint8_t extra; + int n, m; + uint8_t rt_extra; + + strbuf_addstr(&last_key, "refs/heads/master"); + strbuf_addstr(&key, "refs/tags/bla"); + extra = 6; + n = reftable_encode_key(&restart, dest, last_key, key, extra); + EXPECT(!restart); + EXPECT(n > 0); + + m = reftable_decode_key(&roundtrip, &rt_extra, last_key, dest); + EXPECT(n == m); + EXPECT(0 == strbuf_cmp(&key, &roundtrip)); + EXPECT(rt_extra == extra); + + strbuf_release(&last_key); + strbuf_release(&key); + strbuf_release(&roundtrip); +} + +static void test_reftable_obj_record_roundtrip(void) +{ + uint8_t testHash1[GIT_SHA1_RAWSZ] = { 1, 2, 3, 4, 0 }; + uint64_t till9[] = { 1, 2, 3, 4, 500, 600, 700, 800, 9000 }; + struct reftable_obj_record recs[3] = { { + .hash_prefix = testHash1, + .hash_prefix_len = 5, + .offsets = till9, + .offset_len = 3, + }, + { + .hash_prefix = testHash1, + .hash_prefix_len = 5, + .offsets = till9, + .offset_len = 9, + }, + { + .hash_prefix = testHash1, + .hash_prefix_len = 5, + } }; + int i = 0; + for (i = 0; i < ARRAY_SIZE(recs); i++) { + struct reftable_obj_record in = recs[i]; + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + struct reftable_record rec = { NULL }; + struct strbuf key = STRBUF_INIT; + struct reftable_obj_record out = { NULL }; + struct reftable_record rec_out = { NULL }; + int n, m; + uint8_t extra; + + reftable_record_from_obj(&rec, &in); + test_copy(&rec); + reftable_record_key(&rec, &key); + n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ); + EXPECT(n > 0); + extra = reftable_record_val_type(&rec); + reftable_record_from_obj(&rec_out, &out); + m = reftable_record_decode(&rec_out, key, extra, dest, + GIT_SHA1_RAWSZ); + EXPECT(n == m); + + EXPECT(in.hash_prefix_len == out.hash_prefix_len); + EXPECT(in.offset_len == out.offset_len); + + EXPECT(!memcmp(in.hash_prefix, out.hash_prefix, + in.hash_prefix_len)); + EXPECT(0 == memcmp(in.offsets, out.offsets, + sizeof(uint64_t) * in.offset_len)); + strbuf_release(&key); + reftable_record_release(&rec_out); + } +} + +static void test_reftable_index_record_roundtrip(void) +{ + struct reftable_index_record in = { + .offset = 42, + .last_key = STRBUF_INIT, + }; + uint8_t buffer[1024] = { 0 }; + struct string_view dest = { + .buf = buffer, + .len = sizeof(buffer), + }; + struct strbuf key = STRBUF_INIT; + struct reftable_record rec = { NULL }; + struct reftable_index_record out = { .last_key = STRBUF_INIT }; + struct reftable_record out_rec = { NULL }; + int n, m; + uint8_t extra; + + strbuf_addstr(&in.last_key, "refs/heads/master"); + reftable_record_from_index(&rec, &in); + reftable_record_key(&rec, &key); + test_copy(&rec); + + EXPECT(0 == strbuf_cmp(&key, &in.last_key)); + n = reftable_record_encode(&rec, dest, GIT_SHA1_RAWSZ); + EXPECT(n > 0); + + extra = reftable_record_val_type(&rec); + reftable_record_from_index(&out_rec, &out); + m = reftable_record_decode(&out_rec, key, extra, dest, GIT_SHA1_RAWSZ); + EXPECT(m == n); + + EXPECT(in.offset == out.offset); + + reftable_record_release(&out_rec); + strbuf_release(&key); + strbuf_release(&in.last_key); +} + +int record_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_reftable_log_record_equal); + RUN_TEST(test_reftable_log_record_roundtrip); + RUN_TEST(test_reftable_ref_record_roundtrip); + RUN_TEST(test_varint_roundtrip); + RUN_TEST(test_key_roundtrip); + RUN_TEST(test_common_prefix); + RUN_TEST(test_reftable_obj_record_roundtrip); + RUN_TEST(test_reftable_index_record_roundtrip); + RUN_TEST(test_u24_roundtrip); + return 0; +} diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h new file mode 100644 index 00000000000..5370d2288c7 --- /dev/null +++ b/reftable/reftable-record.h @@ -0,0 +1,114 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_RECORD_H +#define REFTABLE_RECORD_H + +#include + +/* + * Basic data types + * + * Reftables store the state of each ref in struct reftable_ref_record, and they + * store a sequence of reflog updates in struct reftable_log_record. + */ + +/* reftable_ref_record holds a ref database entry target_value */ +struct reftable_ref_record { + char *refname; /* Name of the ref, malloced. */ + uint64_t update_index; /* Logical timestamp at which this value is + * written */ + + enum { + /* tombstone to hide deletions from earlier tables */ + REFTABLE_REF_DELETION = 0x0, + + /* a simple ref */ + REFTABLE_REF_VAL1 = 0x1, + /* a tag, plus its peeled hash */ + REFTABLE_REF_VAL2 = 0x2, + + /* a symbolic reference */ + REFTABLE_REF_SYMREF = 0x3, +#define REFTABLE_NR_REF_VALUETYPES 4 + } value_type; + union { + uint8_t *val1; /* malloced hash. */ + struct { + uint8_t *value; /* first value, malloced hash */ + uint8_t *target_value; /* second value, malloced hash */ + } val2; + char *symref; /* referent, malloced 0-terminated string */ + } value; +}; + +/* Returns the first hash, or NULL if `rec` is not of type + * REFTABLE_REF_VAL1 or REFTABLE_REF_VAL2. */ +uint8_t *reftable_ref_record_val1(struct reftable_ref_record *rec); + +/* Returns the second hash, or NULL if `rec` is not of type + * REFTABLE_REF_VAL2. */ +uint8_t *reftable_ref_record_val2(struct reftable_ref_record *rec); + +/* returns whether 'ref' represents a deletion */ +int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref); + +/* prints a reftable_ref_record onto stdout. Useful for debugging. */ +void reftable_ref_record_print(struct reftable_ref_record *ref, + uint32_t hash_id); + +/* frees and nulls all pointer values inside `ref`. */ +void reftable_ref_record_release(struct reftable_ref_record *ref); + +/* returns whether two reftable_ref_records are the same. Useful for testing. */ +int reftable_ref_record_equal(struct reftable_ref_record *a, + struct reftable_ref_record *b, int hash_size); + +/* reftable_log_record holds a reflog entry */ +struct reftable_log_record { + char *refname; + uint64_t update_index; /* logical timestamp of a transactional update. + */ + + enum { + /* tombstone to hide deletions from earlier tables */ + REFTABLE_LOG_DELETION = 0x0, + + /* a simple update */ + REFTABLE_LOG_UPDATE = 0x1, +#define REFTABLE_NR_LOG_VALUETYPES 2 + } value_type; + + union { + struct { + uint8_t *new_hash; + uint8_t *old_hash; + char *name; + char *email; + uint64_t time; + int16_t tz_offset; + char *message; + } update; + } value; +}; + +/* returns whether 'ref' represents the deletion of a log record. */ +int reftable_log_record_is_deletion(const struct reftable_log_record *log); + +/* frees and nulls all pointer values. */ +void reftable_log_record_release(struct reftable_log_record *log); + +/* returns whether two records are equal. Useful for testing. */ +int reftable_log_record_equal(struct reftable_log_record *a, + struct reftable_log_record *b, int hash_size); + +/* dumps a reftable_log_record on stdout, for debugging/testing. */ +void reftable_log_record_print(struct reftable_log_record *log, + uint32_t hash_id); + +#endif diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 3b58e423e7b..09d4b83ef9b 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -4,6 +4,6 @@ int cmd__reftable(int argc, const char **argv) { basics_test_main(argc, argv); - + record_test_main(argc, argv); return 0; } From patchwork Mon Aug 30 14:57:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465405 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18082C4320E for ; Mon, 30 Aug 2021 14:58:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F07DC60E90 for ; Mon, 30 Aug 2021 14:58:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237347AbhH3O7F (ORCPT ); Mon, 30 Aug 2021 10:59:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55544 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237296AbhH3O65 (ORCPT ); Mon, 30 Aug 2021 10:58:57 -0400 Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1F4DCC06175F for ; Mon, 30 Aug 2021 07:58:03 -0700 (PDT) Received: by mail-wr1-x434.google.com with SMTP id v10so22895884wrd.4 for ; Mon, 30 Aug 2021 07:58:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=eKJUW+EIuem3kOkVGjcbRvWGrU2xysP1MRhhcXMjRGE=; b=YgyFY2sY219ld3jtXwjA1EI1WbIW3M08Clo8tRJa1FsLsgkAFRHqcpDM8icEbkvB1D TnaHc9OPsOu7nwqQXUQFurIGBBcgRkH8Fxm6KC9Ly3jch5gfU6Lw0jULDpKwLevgvfar 6dFZIqKtEOcMXyfqeNSUEkDa+CoAxhsqpOMPnXqyNs5wyB4aENNcwvhlrC3GK3C6Zzla gtR4+NAWGcPZ+Gd4ECgw+QdcyHdzNC4qSAeFmB77T4TzZA2ZQ+CUzf+S7J8ROdVVn4AW bQNMhIt8abozkSEw0m1waWZRjJ4/fgIID5apY8Wd19v+a2GKZuTELzJqSj+VTvugs/Js 062A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=eKJUW+EIuem3kOkVGjcbRvWGrU2xysP1MRhhcXMjRGE=; b=O6BgcHzTXofJz9aXGlfSlLuKFGGzRPff2TuNjRP46I1fJ/c6G1X0vEJYSaMbW4Obeb lB59cDSGoNTGElK7dNGIujqM+p6SeAZK00MPDeomEbSRFb8IwYUhKdDPA7jFudmofRCu p3Ua3Sln/CcdJenY/xEC1M8PJSoSwRsd7hEbn37+0Aox3mq9pG27kCid/FgMNbyrhw37 vKnfiyDLDvbDxeUVI0llJdKmszM2ZOFS5N6uHg+17rqKul4kkka3sQxcKda0Y3MYISl6 5dK/nq94nBzaaqG95RetlXW8SKF+mN4GnOE5qRn+kvl0EjKpIcO7OrzKf5n7jKkF+tzU JqRw== X-Gm-Message-State: AOAM532RK+9ZAS56a8Z/uzuPvx5mUq2uvXxmPfJW6XVVYoBdSI/+PFiW xVJsWDj4FdNtxeEQ26luFOh4e7RFnOg= X-Google-Smtp-Source: ABdhPJxsg1Y+ayAbmfdExq8/GANGVeKANMrydnzlzaVUFMaKYZROPOxHLG6oBJyEY0dd/UUkiWuhWA== X-Received: by 2002:adf:a197:: with SMTP id u23mr12636412wru.253.1630335481585; Mon, 30 Aug 2021 07:58:01 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id s14sm13611544wmc.25.2021.08.30.07.58.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:01 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:43 +0000 Subject: [PATCH 07/19] Provide zlib's uncompress2 from compat/zlib-compat.c MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This will be needed for reading reflog blocks in reftable. Helped-by: Carlo Marcelo Arenas Belón Signed-off-by: Han-Wen Nienhuys --- Makefile | 7 +++ ci/lib.sh | 1 + compat/.gitattributes | 1 + compat/zlib-uncompress2.c | 92 +++++++++++++++++++++++++++++++++++++++ config.mak.uname | 1 + configure.ac | 13 ++++++ 6 files changed, 115 insertions(+) create mode 100644 compat/.gitattributes create mode 100644 compat/zlib-uncompress2.c diff --git a/Makefile b/Makefile index 02a83a67467..ec89320080b 100644 --- a/Makefile +++ b/Makefile @@ -256,6 +256,8 @@ all:: # # Define NO_DEFLATE_BOUND if your zlib does not have deflateBound. # +# Define NO_UNCOMPRESS2 if your zlib does not have uncompress2. +# # Define NO_NORETURN if using buggy versions of gcc 4.6+ and profile feedback, # as the compiler can crash (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49299) # @@ -1738,6 +1740,11 @@ ifdef NO_DEFLATE_BOUND BASIC_CFLAGS += -DNO_DEFLATE_BOUND endif +ifdef NO_UNCOMPRESS2 + BASIC_CFLAGS += -DNO_UNCOMPRESS2 + REFTABLE_OBJS += compat/zlib-uncompress2.o +endif + ifdef NO_POSIX_GOODIES BASIC_CFLAGS += -DNO_POSIX_GOODIES endif diff --git a/ci/lib.sh b/ci/lib.sh index 476c3f369f5..5711c63979d 100755 --- a/ci/lib.sh +++ b/ci/lib.sh @@ -224,6 +224,7 @@ linux-gcc-default) ;; Linux32) CC=gcc + MAKEFLAGS="$MAKEFLAGS NO_UNCOMPRESS2=1" ;; linux-musl) CC=gcc diff --git a/compat/.gitattributes b/compat/.gitattributes new file mode 100644 index 00000000000..40dbfb170da --- /dev/null +++ b/compat/.gitattributes @@ -0,0 +1 @@ +/zlib-uncompress2.c whitespace=-indent-with-non-tab,-trailing-space diff --git a/compat/zlib-uncompress2.c b/compat/zlib-uncompress2.c new file mode 100644 index 00000000000..6893bb469ce --- /dev/null +++ b/compat/zlib-uncompress2.c @@ -0,0 +1,92 @@ +/* taken from zlib's uncompr.c + + commit cacf7f1d4e3d44d871b605da3b647f07d718623f + Author: Mark Adler + Date: Sun Jan 15 09:18:46 2017 -0800 + + zlib 1.2.11 + +*/ + +/* + * Copyright (C) 1995-2003, 2010, 2014, 2016 Jean-loup Gailly, Mark Adler + * For conditions of distribution and use, see copyright notice in zlib.h + */ + +#include + +/* clang-format off */ + +/* =========================================================================== + Decompresses the source buffer into the destination buffer. *sourceLen is + the byte length of the source buffer. Upon entry, *destLen is the total size + of the destination buffer, which must be large enough to hold the entire + uncompressed data. (The size of the uncompressed data must have been saved + previously by the compressor and transmitted to the decompressor by some + mechanism outside the scope of this compression library.) Upon exit, + *destLen is the size of the decompressed data and *sourceLen is the number + of source bytes consumed. Upon return, source + *sourceLen points to the + first unused input byte. + + uncompress returns Z_OK if success, Z_MEM_ERROR if there was not enough + memory, Z_BUF_ERROR if there was not enough room in the output buffer, or + Z_DATA_ERROR if the input data was corrupted, including if the input data is + an incomplete zlib stream. +*/ +int ZEXPORT uncompress2 ( + Bytef *dest, + uLongf *destLen, + const Bytef *source, + uLong *sourceLen) { + z_stream stream; + int err; + const uInt max = (uInt)-1; + uLong len, left; + Byte buf[1]; /* for detection of incomplete stream when *destLen == 0 */ + + len = *sourceLen; + if (*destLen) { + left = *destLen; + *destLen = 0; + } + else { + left = 1; + dest = buf; + } + + stream.next_in = (z_const Bytef *)source; + stream.avail_in = 0; + stream.zalloc = (alloc_func)0; + stream.zfree = (free_func)0; + stream.opaque = (voidpf)0; + + err = inflateInit(&stream); + if (err != Z_OK) return err; + + stream.next_out = dest; + stream.avail_out = 0; + + do { + if (stream.avail_out == 0) { + stream.avail_out = left > (uLong)max ? max : (uInt)left; + left -= stream.avail_out; + } + if (stream.avail_in == 0) { + stream.avail_in = len > (uLong)max ? max : (uInt)len; + len -= stream.avail_in; + } + err = inflate(&stream, Z_NO_FLUSH); + } while (err == Z_OK); + + *sourceLen -= len + stream.avail_in; + if (dest != buf) + *destLen = stream.total_out; + else if (stream.total_out && err == Z_BUF_ERROR) + left = 1; + + inflateEnd(&stream); + return err == Z_STREAM_END ? Z_OK : + err == Z_NEED_DICT ? Z_DATA_ERROR : + err == Z_BUF_ERROR && left + stream.avail_out ? Z_DATA_ERROR : + err; +} diff --git a/config.mak.uname b/config.mak.uname index 76516aaa9a5..c1890e4bc8d 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -258,6 +258,7 @@ ifeq ($(uname_S),FreeBSD) FILENO_IS_A_MACRO = UnfortunatelyYes endif ifeq ($(uname_S),OpenBSD) + NO_UNCOMPRESS2 = YesPlease NO_STRCASESTR = YesPlease NO_MEMMEM = YesPlease USE_ST_TIMESPEC = YesPlease diff --git a/configure.ac b/configure.ac index 031e8d3fee8..c3a913103d0 100644 --- a/configure.ac +++ b/configure.ac @@ -672,9 +672,22 @@ AC_LINK_IFELSE([ZLIBTEST_SRC], NO_DEFLATE_BOUND=yes]) LIBS="$old_LIBS" +AC_DEFUN([ZLIBTEST_UNCOMPRESS2_SRC], [ +AC_LANG_PROGRAM([#include ], + [uncompress2(NULL,NULL,NULL,NULL);])]) +AC_MSG_CHECKING([for uncompress2 in -lz]) +old_LIBS="$LIBS" +LIBS="$LIBS -lz" +AC_LINK_IFELSE([ZLIBTEST_UNCOMPRESS2_SRC], + [AC_MSG_RESULT([yes])], + [AC_MSG_RESULT([no]) + NO_UNCOMPRESS2=yes]) +LIBS="$old_LIBS" + GIT_UNSTASH_FLAGS($ZLIB_PATH) GIT_CONF_SUBST([NO_DEFLATE_BOUND]) +GIT_CONF_SUBST([NO_UNCOMPRESS2]) # # Define NEEDS_SOCKET if linking with libc is not enough (SunOS, From patchwork Mon Aug 30 14:57:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465409 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15EEBC4320E for ; Mon, 30 Aug 2021 14:58:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B63A960E90 for ; Mon, 30 Aug 2021 14:58:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237375AbhH3O7K (ORCPT ); Mon, 30 Aug 2021 10:59:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55550 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237308AbhH3O66 (ORCPT ); Mon, 30 Aug 2021 10:58:58 -0400 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7565C061575 for ; Mon, 30 Aug 2021 07:58:03 -0700 (PDT) Received: by mail-wr1-x42c.google.com with SMTP id i6so22940280wrv.2 for ; Mon, 30 Aug 2021 07:58:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=adjC2+IQehLMytCpog3hQurF1wZf9avOOh66tWYta9w=; b=NK3zvAEpu5y2BhIVCFLV1JpcZc9VtdHqCKC7pN08ofy3C/xEkgd2aY+S45SKEqPcFh pHQFq1XycogYGUxuJZcMLze9S6cZQVKuM3q/xdPWPulDh5aOb3GUIrguanGisjJBiZic ajPrcLVDid1ynTRmAIdlUA7dOU6JIs6nWvjdta4PkvBO/dKNzj1PehLjik5bZAvt91Xs QUnmF/ysaRB6zUebrl1cF+P1e4ORxUIwd0V/OOwcmwqkdj+MFkvnmI9TI0RNPLL1TASj VP8g3RaDhp1DCNMrSaVB2j8uXTCXxxWAdzHfFTlH2BxHroJfO78qBXUKyLEafvV7DV7y ioVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=adjC2+IQehLMytCpog3hQurF1wZf9avOOh66tWYta9w=; b=aVK15PP2K2rmghY6h62JtrtfEqGtvoKjOg8W+mh1Q+E80v2iXGH/DNqEDi44BLgaUW BGtW9xt5Cf7L5qCuty4VLoIzNE9N/Fuh+r0tPWspQOTl7WzqCwtAHBOEGwMmsNFZ2fAV /tef7yjg1hfIr/Fo8+XXaqOo2iZCpnGdt4mSD23Pk61dzmAPjY0nTcxe1Usjpm+1SMhh N9+JC5pF3SLUIlUe1axXL7rde3WofHOYj+lwW89iE0/opILamuAwcjJZlwzAyELGg/WU uEFV62FkVgajDrLMuQwCQm9D3KpKqbV5Bd2i+mAvJr5B3OnqM00l1F3fTrOuoj4k4wmO fxdw== X-Gm-Message-State: AOAM5311TrycqN/nd9aNHn3KN/nNpQFTbrDrDpw0ast4VfK0sn7OrrWm SxOtb7ltE8bfx580TwibHxtprT4kCbU= X-Google-Smtp-Source: ABdhPJxaYuf9IH4DA/QEsXodeVtdIOzPLzsHlutGFbHaOyuk4rOusej5tnyhkiQjXqyict2BR8IVqw== X-Received: by 2002:adf:f50a:: with SMTP id q10mr26104470wro.271.1630335482184; Mon, 30 Aug 2021 07:58:02 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id b18sm15764644wrr.89.2021.08.30.07.58.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:01 -0700 (PDT) Message-Id: <2648e05c51061474c89541d126ba145e95d6b092.1630335476.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:44 +0000 Subject: [PATCH 08/19] reftable: reading/writing blocks Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable format is structured as a sequence of block. Within a block, records are prefix compressed, with an index of offsets for fully expand keys to enable binary search within blocks. This commit provides the logic to read and write these blocks. Signed-off-by: Han-Wen Nienhuys --- Makefile | 2 + reftable/block.c | 448 +++++++++++++++++++++++++++++++++++++++ reftable/block.h | 127 +++++++++++ reftable/block_test.c | 120 +++++++++++ t/helper/test-reftable.c | 1 + 5 files changed, 698 insertions(+) create mode 100644 reftable/block.c create mode 100644 reftable/block.h create mode 100644 reftable/block_test.c diff --git a/Makefile b/Makefile index ec89320080b..65bd39ecdc0 100644 --- a/Makefile +++ b/Makefile @@ -2458,10 +2458,12 @@ xdiff-objs: $(XDIFF_OBJS) REFTABLE_OBJS += reftable/basics.o REFTABLE_OBJS += reftable/error.o +REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_OBJS += reftable/record.o +REFTABLE_TEST_OBJS += reftable/block_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/basics_test.o diff --git a/reftable/block.c b/reftable/block.c new file mode 100644 index 00000000000..eb5268dd3a6 --- /dev/null +++ b/reftable/block.c @@ -0,0 +1,448 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "block.h" + +#include "blocksource.h" +#include "constants.h" +#include "record.h" +#include "reftable-error.h" +#include "system.h" +#include + +#ifdef NO_UNCOMPRESS2 +/* + * This is uncompress2, which is only available in zlib >= 1.2.9 + * (released as of early 2017) + */ +int uncompress2(Bytef *dest, uLongf *destLen, const Bytef *source, + uLong *sourceLen); +#endif + +int header_size(int version) +{ + switch (version) { + case 1: + return 24; + case 2: + return 28; + } + abort(); +} + +int footer_size(int version) +{ + switch (version) { + case 1: + return 68; + case 2: + return 72; + } + abort(); +} + +static int block_writer_register_restart(struct block_writer *w, int n, + int is_restart, struct strbuf *key) +{ + int rlen = w->restart_len; + if (rlen >= MAX_RESTARTS) { + is_restart = 0; + } + + if (is_restart) { + rlen++; + } + if (2 + 3 * rlen + n > w->block_size - w->next) + return -1; + if (is_restart) { + if (w->restart_len == w->restart_cap) { + w->restart_cap = w->restart_cap * 2 + 1; + w->restarts = reftable_realloc( + w->restarts, sizeof(uint32_t) * w->restart_cap); + } + + w->restarts[w->restart_len++] = w->next; + } + + w->next += n; + + strbuf_reset(&w->last_key); + strbuf_addbuf(&w->last_key, key); + w->entries++; + return 0; +} + +void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf, + uint32_t block_size, uint32_t header_off, int hash_size) +{ + bw->buf = buf; + bw->hash_size = hash_size; + bw->block_size = block_size; + bw->header_off = header_off; + bw->buf[header_off] = typ; + bw->next = header_off + 4; + bw->restart_interval = 16; + bw->entries = 0; + bw->restart_len = 0; + bw->last_key.len = 0; +} + +uint8_t block_writer_type(struct block_writer *bw) +{ + return bw->buf[bw->header_off]; +} + +/* adds the reftable_record to the block. Returns -1 if it does not fit, 0 on + success */ +int block_writer_add(struct block_writer *w, struct reftable_record *rec) +{ + struct strbuf empty = STRBUF_INIT; + struct strbuf last = + w->entries % w->restart_interval == 0 ? empty : w->last_key; + struct string_view out = { + .buf = w->buf + w->next, + .len = w->block_size - w->next, + }; + + struct string_view start = out; + + int is_restart = 0; + struct strbuf key = STRBUF_INIT; + int n = 0; + + reftable_record_key(rec, &key); + n = reftable_encode_key(&is_restart, out, last, key, + reftable_record_val_type(rec)); + if (n < 0) + goto done; + string_view_consume(&out, n); + + n = reftable_record_encode(rec, out, w->hash_size); + if (n < 0) + goto done; + string_view_consume(&out, n); + + if (block_writer_register_restart(w, start.len - out.len, is_restart, + &key) < 0) + goto done; + + strbuf_release(&key); + return 0; + +done: + strbuf_release(&key); + return -1; +} + +int block_writer_finish(struct block_writer *w) +{ + int i = 0; + for (i = 0; i < w->restart_len; i++) { + put_be24(w->buf + w->next, w->restarts[i]); + w->next += 3; + } + + put_be16(w->buf + w->next, w->restart_len); + w->next += 2; + put_be24(w->buf + 1 + w->header_off, w->next); + + if (block_writer_type(w) == BLOCK_TYPE_LOG) { + int block_header_skip = 4 + w->header_off; + uint8_t *compressed = NULL; + int zresult = 0; + uLongf src_len = w->next - block_header_skip; + size_t dest_cap = src_len; + + compressed = reftable_malloc(dest_cap); + while (1) { + uLongf out_dest_len = dest_cap; + + zresult = compress2(compressed, &out_dest_len, + w->buf + block_header_skip, src_len, + 9); + if (zresult == Z_BUF_ERROR) { + dest_cap *= 2; + compressed = + reftable_realloc(compressed, dest_cap); + continue; + } + + if (Z_OK != zresult) { + reftable_free(compressed); + return REFTABLE_ZLIB_ERROR; + } + + memcpy(w->buf + block_header_skip, compressed, + out_dest_len); + w->next = out_dest_len + block_header_skip; + reftable_free(compressed); + break; + } + } + return w->next; +} + +uint8_t block_reader_type(struct block_reader *r) +{ + return r->block.data[r->header_off]; +} + +int block_reader_init(struct block_reader *br, struct reftable_block *block, + uint32_t header_off, uint32_t table_block_size, + int hash_size) +{ + uint32_t full_block_size = table_block_size; + uint8_t typ = block->data[header_off]; + uint32_t sz = get_be24(block->data + header_off + 1); + + uint16_t restart_count = 0; + uint32_t restart_start = 0; + uint8_t *restart_bytes = NULL; + + if (!reftable_is_block_type(typ)) + return REFTABLE_FORMAT_ERROR; + + if (typ == BLOCK_TYPE_LOG) { + int block_header_skip = 4 + header_off; + uLongf dst_len = sz - block_header_skip; /* total size of dest + buffer. */ + uLongf src_len = block->len - block_header_skip; + /* Log blocks specify the *uncompressed* size in their header. + */ + uint8_t *uncompressed = reftable_malloc(sz); + + /* Copy over the block header verbatim. It's not compressed. */ + memcpy(uncompressed, block->data, block_header_skip); + + /* Uncompress */ + if (Z_OK != + uncompress2(uncompressed + block_header_skip, &dst_len, + block->data + block_header_skip, &src_len)) { + reftable_free(uncompressed); + return REFTABLE_ZLIB_ERROR; + } + + if (dst_len + block_header_skip != sz) + return REFTABLE_FORMAT_ERROR; + + /* We're done with the input data. */ + reftable_block_done(block); + block->data = uncompressed; + block->len = sz; + block->source = malloc_block_source(); + full_block_size = src_len + block_header_skip; + } else if (full_block_size == 0) { + full_block_size = sz; + } else if (sz < full_block_size && sz < block->len && + block->data[sz] != 0) { + /* If the block is smaller than the full block size, it is + padded (data followed by '\0') or the next block is + unaligned. */ + full_block_size = sz; + } + + restart_count = get_be16(block->data + sz - 2); + restart_start = sz - 2 - 3 * restart_count; + restart_bytes = block->data + restart_start; + + /* transfer ownership. */ + br->block = *block; + block->data = NULL; + block->len = 0; + + br->hash_size = hash_size; + br->block_len = restart_start; + br->full_block_size = full_block_size; + br->header_off = header_off; + br->restart_count = restart_count; + br->restart_bytes = restart_bytes; + + return 0; +} + +static uint32_t block_reader_restart_offset(struct block_reader *br, int i) +{ + return get_be24(br->restart_bytes + 3 * i); +} + +void block_reader_start(struct block_reader *br, struct block_iter *it) +{ + it->br = br; + strbuf_reset(&it->last_key); + it->next_off = br->header_off + 4; +} + +struct restart_find_args { + int error; + struct strbuf key; + struct block_reader *r; +}; + +static int restart_key_less(size_t idx, void *args) +{ + struct restart_find_args *a = args; + uint32_t off = block_reader_restart_offset(a->r, idx); + struct string_view in = { + .buf = a->r->block.data + off, + .len = a->r->block_len - off, + }; + + /* the restart key is verbatim in the block, so this could avoid the + alloc for decoding the key */ + struct strbuf rkey = STRBUF_INIT; + struct strbuf last_key = STRBUF_INIT; + uint8_t unused_extra; + int n = reftable_decode_key(&rkey, &unused_extra, last_key, in); + int result; + if (n < 0) { + a->error = 1; + return -1; + } + + result = strbuf_cmp(&a->key, &rkey); + strbuf_release(&rkey); + return result; +} + +void block_iter_copy_from(struct block_iter *dest, struct block_iter *src) +{ + dest->br = src->br; + dest->next_off = src->next_off; + strbuf_reset(&dest->last_key); + strbuf_addbuf(&dest->last_key, &src->last_key); +} + +int block_iter_next(struct block_iter *it, struct reftable_record *rec) +{ + struct string_view in = { + .buf = it->br->block.data + it->next_off, + .len = it->br->block_len - it->next_off, + }; + struct string_view start = in; + struct strbuf key = STRBUF_INIT; + uint8_t extra = 0; + int n = 0; + + if (it->next_off >= it->br->block_len) + return 1; + + n = reftable_decode_key(&key, &extra, it->last_key, in); + if (n < 0) + return -1; + + string_view_consume(&in, n); + n = reftable_record_decode(rec, key, extra, in, it->br->hash_size); + if (n < 0) + return -1; + string_view_consume(&in, n); + + strbuf_reset(&it->last_key); + strbuf_addbuf(&it->last_key, &key); + it->next_off += start.len - in.len; + strbuf_release(&key); + return 0; +} + +int block_reader_first_key(struct block_reader *br, struct strbuf *key) +{ + struct strbuf empty = STRBUF_INIT; + int off = br->header_off + 4; + struct string_view in = { + .buf = br->block.data + off, + .len = br->block_len - off, + }; + + uint8_t extra = 0; + int n = reftable_decode_key(key, &extra, empty, in); + if (n < 0) + return n; + + return 0; +} + +int block_iter_seek(struct block_iter *it, struct strbuf *want) +{ + return block_reader_seek(it->br, it, want); +} + +void block_iter_close(struct block_iter *it) +{ + strbuf_release(&it->last_key); +} + +int block_reader_seek(struct block_reader *br, struct block_iter *it, + struct strbuf *want) +{ + struct restart_find_args args = { + .key = *want, + .r = br, + }; + struct reftable_record rec = reftable_new_record(block_reader_type(br)); + struct strbuf key = STRBUF_INIT; + int err = 0; + struct block_iter next = { + .last_key = STRBUF_INIT, + }; + + int i = binsearch(br->restart_count, &restart_key_less, &args); + if (args.error) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + it->br = br; + if (i > 0) { + i--; + it->next_off = block_reader_restart_offset(br, i); + } else { + it->next_off = br->header_off + 4; + } + + /* We're looking for the last entry less/equal than the wanted key, so + we have to go one entry too far and then back up. + */ + while (1) { + block_iter_copy_from(&next, it); + err = block_iter_next(&next, &rec); + if (err < 0) + goto done; + + reftable_record_key(&rec, &key); + if (err > 0 || strbuf_cmp(&key, want) >= 0) { + err = 0; + goto done; + } + + block_iter_copy_from(it, &next); + } + +done: + strbuf_release(&key); + strbuf_release(&next.last_key); + reftable_record_destroy(&rec); + + return err; +} + +void block_writer_release(struct block_writer *bw) +{ + FREE_AND_NULL(bw->restarts); + strbuf_release(&bw->last_key); + /* the block is not owned. */ +} + +void reftable_block_done(struct reftable_block *blockp) +{ + struct reftable_block_source source = blockp->source; + if (blockp && source.ops) + source.ops->return_block(source.arg, blockp); + blockp->data = NULL; + blockp->len = 0; + blockp->source.ops = NULL; + blockp->source.arg = NULL; +} diff --git a/reftable/block.h b/reftable/block.h new file mode 100644 index 00000000000..e207706a644 --- /dev/null +++ b/reftable/block.h @@ -0,0 +1,127 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef BLOCK_H +#define BLOCK_H + +#include "basics.h" +#include "record.h" +#include "reftable-blocksource.h" + +/* + * Writes reftable blocks. The block_writer is reused across blocks to minimize + * allocation overhead. + */ +struct block_writer { + uint8_t *buf; + uint32_t block_size; + + /* Offset ofof the global header. Nonzero in the first block only. */ + uint32_t header_off; + + /* How often to restart keys. */ + int restart_interval; + int hash_size; + + /* Offset of next uint8_t to write. */ + uint32_t next; + uint32_t *restarts; + uint32_t restart_len; + uint32_t restart_cap; + + struct strbuf last_key; + int entries; +}; + +/* + * initializes the blockwriter to write `typ` entries, using `buf` as temporary + * storage. `buf` is not owned by the block_writer. */ +void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf, + uint32_t block_size, uint32_t header_off, int hash_size); + +/* returns the block type (eg. 'r' for ref records. */ +uint8_t block_writer_type(struct block_writer *bw); + +/* appends the record, or -1 if it doesn't fit. */ +int block_writer_add(struct block_writer *w, struct reftable_record *rec); + +/* appends the key restarts, and compress the block if necessary. */ +int block_writer_finish(struct block_writer *w); + +/* clears out internally allocated block_writer members. */ +void block_writer_release(struct block_writer *bw); + +/* Read a block. */ +struct block_reader { + /* offset of the block header; nonzero for the first block in a + * reftable. */ + uint32_t header_off; + + /* the memory block */ + struct reftable_block block; + int hash_size; + + /* size of the data, excluding restart data. */ + uint32_t block_len; + uint8_t *restart_bytes; + uint16_t restart_count; + + /* size of the data in the file. For log blocks, this is the compressed + * size. */ + uint32_t full_block_size; +}; + +/* Iterate over entries in a block */ +struct block_iter { + /* offset within the block of the next entry to read. */ + uint32_t next_off; + struct block_reader *br; + + /* key for last entry we read. */ + struct strbuf last_key; +}; + +/* initializes a block reader. */ +int block_reader_init(struct block_reader *br, struct reftable_block *bl, + uint32_t header_off, uint32_t table_block_size, + int hash_size); + +/* Position `it` at start of the block */ +void block_reader_start(struct block_reader *br, struct block_iter *it); + +/* Position `it` to the `want` key in the block */ +int block_reader_seek(struct block_reader *br, struct block_iter *it, + struct strbuf *want); + +/* Returns the block type (eg. 'r' for refs) */ +uint8_t block_reader_type(struct block_reader *r); + +/* Decodes the first key in the block */ +int block_reader_first_key(struct block_reader *br, struct strbuf *key); + +void block_iter_copy_from(struct block_iter *dest, struct block_iter *src); + +/* return < 0 for error, 0 for OK, > 0 for EOF. */ +int block_iter_next(struct block_iter *it, struct reftable_record *rec); + +/* Seek to `want` with in the block pointed to by `it` */ +int block_iter_seek(struct block_iter *it, struct strbuf *want); + +/* deallocate memory for `it`. The block reader and its block is left intact. */ +void block_iter_close(struct block_iter *it); + +/* size of file header, depending on format version */ +int header_size(int version); + +/* size of file footer, depending on format version */ +int footer_size(int version); + +/* returns a block to its source. */ +void reftable_block_done(struct reftable_block *ret); + +#endif diff --git a/reftable/block_test.c b/reftable/block_test.c new file mode 100644 index 00000000000..4b3ea262dcb --- /dev/null +++ b/reftable/block_test.c @@ -0,0 +1,120 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "block.h" + +#include "system.h" +#include "blocksource.h" +#include "basics.h" +#include "constants.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-tests.h" + +static void test_block_read_write(void) +{ + const int header_off = 21; /* random */ + char *names[30]; + const int N = ARRAY_SIZE(names); + const int block_size = 1024; + struct reftable_block block = { NULL }; + struct block_writer bw = { + .last_key = STRBUF_INIT, + }; + struct reftable_ref_record ref = { NULL }; + struct reftable_record rec = { NULL }; + int i = 0; + int n; + struct block_reader br = { 0 }; + struct block_iter it = { .last_key = STRBUF_INIT }; + int j = 0; + struct strbuf want = STRBUF_INIT; + + block.data = reftable_calloc(block_size); + block.len = block_size; + block.source = malloc_block_source(); + block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size, + header_off, hash_size(GIT_SHA1_FORMAT_ID)); + reftable_record_from_ref(&rec, &ref); + + for (i = 0; i < N; i++) { + char name[100]; + uint8_t hash[GIT_SHA1_RAWSZ]; + snprintf(name, sizeof(name), "branch%02d", i); + memset(hash, i, sizeof(hash)); + + ref.refname = name; + ref.value_type = REFTABLE_REF_VAL1; + ref.value.val1 = hash; + + names[i] = xstrdup(name); + n = block_writer_add(&bw, &rec); + ref.refname = NULL; + ref.value_type = REFTABLE_REF_DELETION; + EXPECT(n == 0); + } + + n = block_writer_finish(&bw); + EXPECT(n > 0); + + block_writer_release(&bw); + + block_reader_init(&br, &block, header_off, block_size, GIT_SHA1_RAWSZ); + + block_reader_start(&br, &it); + + while (1) { + int r = block_iter_next(&it, &rec); + EXPECT(r >= 0); + if (r > 0) { + break; + } + EXPECT_STREQ(names[j], ref.refname); + j++; + } + + reftable_record_release(&rec); + block_iter_close(&it); + + for (i = 0; i < N; i++) { + struct block_iter it = { .last_key = STRBUF_INIT }; + strbuf_reset(&want); + strbuf_addstr(&want, names[i]); + + n = block_reader_seek(&br, &it, &want); + EXPECT(n == 0); + + n = block_iter_next(&it, &rec); + EXPECT(n == 0); + + EXPECT_STREQ(names[i], ref.refname); + + want.len--; + n = block_reader_seek(&br, &it, &want); + EXPECT(n == 0); + + n = block_iter_next(&it, &rec); + EXPECT(n == 0); + EXPECT_STREQ(names[10 * (i / 10)], ref.refname); + + block_iter_close(&it); + } + + reftable_record_release(&rec); + reftable_block_done(&br.block); + strbuf_release(&want); + for (i = 0; i < N; i++) { + reftable_free(names[i]); + } +} + +int block_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_block_read_write); + return 0; +} diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 09d4b83ef9b..c9deeaf08c7 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -4,6 +4,7 @@ int cmd__reftable(int argc, const char **argv) { basics_test_main(argc, argv); + block_test_main(argc, argv); record_test_main(argc, argv); return 0; } From patchwork Mon Aug 30 14:57:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465407 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8BF9AC432BE for ; Mon, 30 Aug 2021 14:58:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6DB6860E77 for ; Mon, 30 Aug 2021 14:58:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237358AbhH3O7I (ORCPT ); Mon, 30 Aug 2021 10:59:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55552 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237305AbhH3O66 (ORCPT ); Mon, 30 Aug 2021 10:58:58 -0400 Received: from mail-wr1-x42f.google.com (mail-wr1-x42f.google.com [IPv6:2a00:1450:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2A818C06175F for ; Mon, 30 Aug 2021 07:58:04 -0700 (PDT) Received: by mail-wr1-x42f.google.com with SMTP id d26so22974490wrc.0 for ; Mon, 30 Aug 2021 07:58:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=FrBrCx9Fa7CZ3etygCJp/FvxBrRfhn+6RYsFjEAmHJc=; b=LS+3dFa5XtWi73UsdfV8Yl2j5YDsrm25HxS6bewxhuygD3wcNI8DEk7QCHxmYJWMFS jyFwYtOClrg8up8QjU2turO+LaJAkNEqC5A2Mh39ilfdAScnuXxitKPJU9F9CDSGsSDE kZn/LuuXzYTFC/6waeejj9g5W9aMM9KVlzNSU5VRN1eJQCdVs2EI166AAoZ85SkNzGCL BhhZJWnOSIFhtwmzT782CzZyJGey/jSoqN6xyrxbdj7aPtrCsEWbuZigVBr60WXxrXb5 ZdpgujrgRhKJJ6wCeFLVCgiCzSr85r+m/CjV48c5NcIPSV/jxnKoyvjDgel3VNRyOpn8 EbjA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=FrBrCx9Fa7CZ3etygCJp/FvxBrRfhn+6RYsFjEAmHJc=; b=Ks1ucAoBrmTv6ddraFkFVUUDilparVrKSR/2fYs1DlEsP7KkKtZLnAb1fOQvS3jXaI nuXyPnI1PS3ST2s2ilP27CJL4fFzJcaRys1AAmYqN4jRnGFZLspTUoQG7yTmC1l26loh nNG+Ha7PM5UY+ywG4+LwKrNPy/PcoT1qPfKART78SiD9SsYgKr65X2yxF7QB+eeGAUXI jx3DKvi+FFpcSTqmIGVzCYg+2zgH1jlaGKruugMMQYCCvo/oy/6YSM478WNDV6XRGApa +9lBSa1/dpDHGRPHSGQ/4mMTovRJh4Llboai1zblH6mLpuAZppa8pK5ESaaXMn6am+18 zfCQ== X-Gm-Message-State: AOAM5314xgV0tnleb9PrT1uSYtJl3/tnUHRBY9p/HRgB//VYinpf9MdH rN8YIqtbQji8flIDOfhZ3k8moyqWXiY= X-Google-Smtp-Source: ABdhPJzjebeQrryFgn8U7hdCJCc7gx770ouNEX8N357dLCd9bU/48sjaliv9H+IRR22VqofDSFA8pA== X-Received: by 2002:adf:e68b:: with SMTP id r11mr26488586wrm.394.1630335482758; Mon, 30 Aug 2021 07:58:02 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id m3sm19888760wrg.45.2021.08.30.07.58.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:02 -0700 (PDT) Message-Id: <49a9a00465dc65463eec44089f9b571a2145901d.1630335476.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:45 +0000 Subject: [PATCH 09/19] reftable: a generic binary tree implementation Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The reftable format includes support for an (OID => ref) map. This map can speed up visibility and reachability checks. In particular, various operations along the fetch/push path within Gerrit have ben sped up by using this structure. The map is constructed with help of a binary tree. Object IDs are hashes, so they are uniformly distributed. Hence, the tree does not attempt forced rebalancing. Signed-off-by: Han-Wen Nienhuys --- Makefile | 4 ++- reftable/tree.c | 63 ++++++++++++++++++++++++++++++++++++++++ reftable/tree.h | 34 ++++++++++++++++++++++ reftable/tree_test.c | 61 ++++++++++++++++++++++++++++++++++++++ t/helper/test-reftable.c | 1 + 5 files changed, 162 insertions(+), 1 deletion(-) create mode 100644 reftable/tree.c create mode 100644 reftable/tree.h create mode 100644 reftable/tree_test.c diff --git a/Makefile b/Makefile index 65bd39ecdc0..2e5a9f40ed1 100644 --- a/Makefile +++ b/Makefile @@ -2462,11 +2462,13 @@ REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_OBJS += reftable/record.o +REFTABLE_OBJS += reftable/tree.o +REFTABLE_TEST_OBJS += reftable/basics_test.o REFTABLE_TEST_OBJS += reftable/block_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o -REFTABLE_TEST_OBJS += reftable/basics_test.o +REFTABLE_TEST_OBJS += reftable/tree_test.o TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS)) diff --git a/reftable/tree.c b/reftable/tree.c new file mode 100644 index 00000000000..82db7995dd6 --- /dev/null +++ b/reftable/tree.c @@ -0,0 +1,63 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "tree.h" + +#include "basics.h" +#include "system.h" + +struct tree_node *tree_search(void *key, struct tree_node **rootp, + int (*compare)(const void *, const void *), + int insert) +{ + int res; + if (*rootp == NULL) { + if (!insert) { + return NULL; + } else { + struct tree_node *n = + reftable_calloc(sizeof(struct tree_node)); + n->key = key; + *rootp = n; + return *rootp; + } + } + + res = compare(key, (*rootp)->key); + if (res < 0) + return tree_search(key, &(*rootp)->left, compare, insert); + else if (res > 0) + return tree_search(key, &(*rootp)->right, compare, insert); + return *rootp; +} + +void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key), + void *arg) +{ + if (t->left) { + infix_walk(t->left, action, arg); + } + action(arg, t->key); + if (t->right) { + infix_walk(t->right, action, arg); + } +} + +void tree_free(struct tree_node *t) +{ + if (t == NULL) { + return; + } + if (t->left) { + tree_free(t->left); + } + if (t->right) { + tree_free(t->right); + } + reftable_free(t); +} diff --git a/reftable/tree.h b/reftable/tree.h new file mode 100644 index 00000000000..fbdd002e23a --- /dev/null +++ b/reftable/tree.h @@ -0,0 +1,34 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef TREE_H +#define TREE_H + +/* tree_node is a generic binary search tree. */ +struct tree_node { + void *key; + struct tree_node *left, *right; +}; + +/* looks for `key` in `rootp` using `compare` as comparison function. If insert + * is set, insert the key if it's not found. Else, return NULL. + */ +struct tree_node *tree_search(void *key, struct tree_node **rootp, + int (*compare)(const void *, const void *), + int insert); + +/* performs an infix walk of the tree. */ +void infix_walk(struct tree_node *t, void (*action)(void *arg, void *key), + void *arg); + +/* + * deallocates the tree nodes recursively. Keys should be deallocated separately + * by walking over the tree. */ +void tree_free(struct tree_node *t); + +#endif diff --git a/reftable/tree_test.c b/reftable/tree_test.c new file mode 100644 index 00000000000..09a970e17b9 --- /dev/null +++ b/reftable/tree_test.c @@ -0,0 +1,61 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "tree.h" + +#include "basics.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-tests.h" + +static int test_compare(const void *a, const void *b) +{ + return (char *)a - (char *)b; +} + +struct curry { + void *last; +}; + +static void check_increasing(void *arg, void *key) +{ + struct curry *c = arg; + if (c->last) { + assert(test_compare(c->last, key) < 0); + } + c->last = key; +} + +static void test_tree(void) +{ + struct tree_node *root = NULL; + + void *values[11] = { NULL }; + struct tree_node *nodes[11] = { NULL }; + int i = 1; + struct curry c = { NULL }; + do { + nodes[i] = tree_search(values + i, &root, &test_compare, 1); + i = (i * 7) % 11; + } while (i != 1); + + for (i = 1; i < ARRAY_SIZE(nodes); i++) { + assert(values + i == nodes[i]->key); + assert(nodes[i] == + tree_search(values + i, &root, &test_compare, 0)); + } + + infix_walk(root, check_increasing, &c); + tree_free(root); +} + +int tree_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_tree); + return 0; +} diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index c9deeaf08c7..050551fa698 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -6,5 +6,6 @@ int cmd__reftable(int argc, const char **argv) basics_test_main(argc, argv); block_test_main(argc, argv); record_test_main(argc, argv); + tree_test_main(argc, argv); return 0; } From patchwork Mon Aug 30 14:57:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465415 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2B824C4320A for ; Mon, 30 Aug 2021 14:58:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 07AC660E98 for ; Mon, 30 Aug 2021 14:58:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237396AbhH3O7K (ORCPT ); Mon, 30 Aug 2021 10:59:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55558 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237310AbhH3O67 (ORCPT ); Mon, 30 Aug 2021 10:58:59 -0400 Received: from mail-wm1-x32f.google.com (mail-wm1-x32f.google.com [IPv6:2a00:1450:4864:20::32f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D34E6C061760 for ; Mon, 30 Aug 2021 07:58:04 -0700 (PDT) Received: by mail-wm1-x32f.google.com with SMTP id v20-20020a1cf714000000b002e71f4d2026so129115wmh.1 for ; Mon, 30 Aug 2021 07:58:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=+fL667T8EOHhnYjyHfLs3Yf9mp+6HNM5mbvfrx8LlRg=; b=ZKjqzKgysB3JDGgYdzfRjH2A3rCudEEOmRLl2DGB/JC3M4uRw1+3UJpKO3oi4qgF1D 2hpbAyy3cfFsW+moLR1YPZZ5xJLyx1gI2ExvRVZC9Iy/MZmM0GrrwTn/W0GPanOcytrP F/8o/7hM02BONvTVKLoCKf6ZjOU7sNY0hF7FtnWwqpc5FVbOjyB/quQlWcIT962Y9Qfu wcxIcC1R5xC1wd67RUil/p0to+c0ddW40eFHuAMHjNZhiJ6v6yELrtkcDI8cjIgZZzgo My4UMFfou7m/SIs//gVBpMCfl2a/9eHCh0KGZgXmK2ZjNYYe79tYfH3gU8WX1l9dxpC/ IcoA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=+fL667T8EOHhnYjyHfLs3Yf9mp+6HNM5mbvfrx8LlRg=; b=HseMkCixVVg9jvr+IrkREHY6oYXDLWmF9BiTtcCUyFhVqgH0fswCFsw3rX6XDvsS+B NvOAr2DGnpLIl+LZoi3u0zKm2gGy/6dO1AVi3hKWmz1opkbXM1Tj9ip6HzOMugVEDPmZ z6JIn0KlqtMllBY3qigvhMg63KjLGyiWcKQTvIy0GzlbMzF0S3HGAmmVBgj9zLTZYWEQ SO2Kdr8ugqbNtzcBeFXsMqiwPn9v34pDMujG3KJQ1l2KrdkGmXVx66vPTDBlhp4fop4K s2Sx2Tf24J0ff51MQrFcQWH0ifb4oR5jQm9IlXPqqLjVQouFeIL/h5wbcpjl3xkTcDsy bUnw== X-Gm-Message-State: AOAM531/zc+zrtk1u9GMrgKvaPANFE4WUrx8GstDRpFdQm8OeC97E8MD PA6uzINJlaItbIUmChJmT1AHbrrCot0= X-Google-Smtp-Source: ABdhPJyYA9d+lFjPhz6kQw5Cek5spUXv3MVkYYxS0TzprEOMdYIZwloU4AYIVbw8ardFSXfzyg60oQ== X-Received: by 2002:a05:600c:1912:: with SMTP id j18mr6722312wmq.29.1630335483321; Mon, 30 Aug 2021 07:58:03 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id o12sm14686629wmr.2.2021.08.30.07.58.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:03 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:46 +0000 Subject: [PATCH 10/19] reftable: write reftable files Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/reftable-writer.h | 148 ++++++++ reftable/writer.c | 690 +++++++++++++++++++++++++++++++++++++ reftable/writer.h | 50 +++ 4 files changed, 889 insertions(+) create mode 100644 reftable/reftable-writer.h create mode 100644 reftable/writer.c create mode 100644 reftable/writer.h diff --git a/Makefile b/Makefile index 2e5a9f40ed1..05be43355a6 100644 --- a/Makefile +++ b/Makefile @@ -2463,6 +2463,7 @@ REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_OBJS += reftable/record.o REFTABLE_OBJS += reftable/tree.o +REFTABLE_OBJS += reftable/writer.o REFTABLE_TEST_OBJS += reftable/basics_test.o REFTABLE_TEST_OBJS += reftable/block_test.o diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h new file mode 100644 index 00000000000..af36462ced5 --- /dev/null +++ b/reftable/reftable-writer.h @@ -0,0 +1,148 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_WRITER_H +#define REFTABLE_WRITER_H + +#include "reftable-record.h" + +#include +#include /* ssize_t */ + +/* Writing single reftables */ + +/* reftable_write_options sets options for writing a single reftable. */ +struct reftable_write_options { + /* boolean: do not pad out blocks to block size. */ + unsigned unpadded : 1; + + /* the blocksize. Should be less than 2^24. */ + uint32_t block_size; + + /* boolean: do not generate a SHA1 => ref index. */ + unsigned skip_index_objects : 1; + + /* how often to write complete keys in each block. */ + int restart_interval; + + /* 4-byte identifier ("sha1", "s256") of the hash. + * Defaults to SHA1 if unset + */ + uint32_t hash_id; + + /* boolean: do not check ref names for validity or dir/file conflicts. + */ + unsigned skip_name_check : 1; + + /* boolean: copy log messages exactly. If unset, check that the message + * is a single line, and add '\n' if missing. + */ + unsigned exact_log_message : 1; +}; + +/* reftable_block_stats holds statistics for a single block type */ +struct reftable_block_stats { + /* total number of entries written */ + int entries; + /* total number of key restarts */ + int restarts; + /* total number of blocks */ + int blocks; + /* total number of index blocks */ + int index_blocks; + /* depth of the index */ + int max_index_level; + + /* offset of the first block for this type */ + uint64_t offset; + /* offset of the top level index block for this type, or 0 if not + * present */ + uint64_t index_offset; +}; + +/* stats holds overall statistics for a single reftable */ +struct reftable_stats { + /* total number of blocks written. */ + int blocks; + /* stats for ref data */ + struct reftable_block_stats ref_stats; + /* stats for the SHA1 to ref map. */ + struct reftable_block_stats obj_stats; + /* stats for index blocks */ + struct reftable_block_stats idx_stats; + /* stats for log blocks */ + struct reftable_block_stats log_stats; + + /* disambiguation length of shortened object IDs. */ + int object_id_len; +}; + +/* reftable_new_writer creates a new writer */ +struct reftable_writer * +reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t), + void *writer_arg, struct reftable_write_options *opts); + +/* Set the range of update indices for the records we will add. When writing a + table into a stack, the min should be at least + reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned. + + For transactional updates to a stack, typically min==max, and the + update_index can be obtained by inspeciting the stack. When converting an + existing ref database into a single reftable, this would be a range of + update-index timestamps. + */ +void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, + uint64_t max); + +/* + Add a reftable_ref_record. The record should have names that come after + already added records. + + The update_index must be within the limits set by + reftable_writer_set_limits(), or REFTABLE_API_ERROR is returned. It is an + REFTABLE_API_ERROR error to write a ref record after a log record. +*/ +int reftable_writer_add_ref(struct reftable_writer *w, + struct reftable_ref_record *ref); + +/* + Convenience function to add multiple reftable_ref_records; the function sorts + the records before adding them, reordering the records array passed in. +*/ +int reftable_writer_add_refs(struct reftable_writer *w, + struct reftable_ref_record *refs, int n); + +/* + adds reftable_log_records. Log records are keyed by (refname, decreasing + update_index). The key for the record added must come after the already added + log records. +*/ +int reftable_writer_add_log(struct reftable_writer *w, + struct reftable_log_record *log); + +/* + Convenience function to add multiple reftable_log_records; the function sorts + the records before adding them, reordering records array passed in. +*/ +int reftable_writer_add_logs(struct reftable_writer *w, + struct reftable_log_record *logs, int n); + +/* reftable_writer_close finalizes the reftable. The writer is retained so + * statistics can be inspected. */ +int reftable_writer_close(struct reftable_writer *w); + +/* writer_stats returns the statistics on the reftable being written. + + This struct becomes invalid when the writer is freed. + */ +const struct reftable_stats *writer_stats(struct reftable_writer *w); + +/* reftable_writer_free deallocates memory for the writer */ +void reftable_writer_free(struct reftable_writer *w); + +#endif diff --git a/reftable/writer.c b/reftable/writer.c new file mode 100644 index 00000000000..3ca721e9f64 --- /dev/null +++ b/reftable/writer.c @@ -0,0 +1,690 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "writer.h" + +#include "system.h" + +#include "block.h" +#include "constants.h" +#include "record.h" +#include "tree.h" +#include "reftable-error.h" + +/* finishes a block, and writes it to storage */ +static int writer_flush_block(struct reftable_writer *w); + +/* deallocates memory related to the index */ +static void writer_clear_index(struct reftable_writer *w); + +/* finishes writing a 'r' (refs) or 'g' (reflogs) section */ +static int writer_finish_public_section(struct reftable_writer *w); + +static struct reftable_block_stats * +writer_reftable_block_stats(struct reftable_writer *w, uint8_t typ) +{ + switch (typ) { + case 'r': + return &w->stats.ref_stats; + case 'o': + return &w->stats.obj_stats; + case 'i': + return &w->stats.idx_stats; + case 'g': + return &w->stats.log_stats; + } + abort(); + return NULL; +} + +/* write data, queuing the padding for the next write. Returns negative for + * error. */ +static int padded_write(struct reftable_writer *w, uint8_t *data, size_t len, + int padding) +{ + int n = 0; + if (w->pending_padding > 0) { + uint8_t *zeroed = reftable_calloc(w->pending_padding); + int n = w->write(w->write_arg, zeroed, w->pending_padding); + if (n < 0) + return n; + + w->pending_padding = 0; + reftable_free(zeroed); + } + + w->pending_padding = padding; + n = w->write(w->write_arg, data, len); + if (n < 0) + return n; + n += padding; + return 0; +} + +static void options_set_defaults(struct reftable_write_options *opts) +{ + if (opts->restart_interval == 0) { + opts->restart_interval = 16; + } + + if (opts->hash_id == 0) { + opts->hash_id = GIT_SHA1_FORMAT_ID; + } + if (opts->block_size == 0) { + opts->block_size = DEFAULT_BLOCK_SIZE; + } +} + +static int writer_version(struct reftable_writer *w) +{ + return (w->opts.hash_id == 0 || w->opts.hash_id == GIT_SHA1_FORMAT_ID) ? + 1 : + 2; +} + +static int writer_write_header(struct reftable_writer *w, uint8_t *dest) +{ + memcpy(dest, "REFT", 4); + + dest[4] = writer_version(w); + + put_be24(dest + 5, w->opts.block_size); + put_be64(dest + 8, w->min_update_index); + put_be64(dest + 16, w->max_update_index); + if (writer_version(w) == 2) { + put_be32(dest + 24, w->opts.hash_id); + } + return header_size(writer_version(w)); +} + +static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ) +{ + int block_start = 0; + if (w->next == 0) { + block_start = header_size(writer_version(w)); + } + + strbuf_release(&w->last_key); + block_writer_init(&w->block_writer_data, typ, w->block, + w->opts.block_size, block_start, + hash_size(w->opts.hash_id)); + w->block_writer = &w->block_writer_data; + w->block_writer->restart_interval = w->opts.restart_interval; +} + +static struct strbuf reftable_empty_strbuf = STRBUF_INIT; + +struct reftable_writer * +reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t), + void *writer_arg, struct reftable_write_options *opts) +{ + struct reftable_writer *wp = + reftable_calloc(sizeof(struct reftable_writer)); + strbuf_init(&wp->block_writer_data.last_key, 0); + options_set_defaults(opts); + if (opts->block_size >= (1 << 24)) { + /* TODO - error return? */ + abort(); + } + wp->last_key = reftable_empty_strbuf; + wp->block = reftable_calloc(opts->block_size); + wp->write = writer_func; + wp->write_arg = writer_arg; + wp->opts = *opts; + writer_reinit_block_writer(wp, BLOCK_TYPE_REF); + + return wp; +} + +void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, + uint64_t max) +{ + w->min_update_index = min; + w->max_update_index = max; +} + +void reftable_writer_free(struct reftable_writer *w) +{ + reftable_free(w->block); + reftable_free(w); +} + +struct obj_index_tree_node { + struct strbuf hash; + uint64_t *offsets; + size_t offset_len; + size_t offset_cap; +}; + +#define OBJ_INDEX_TREE_NODE_INIT \ + { \ + .hash = STRBUF_INIT \ + } + +static int obj_index_tree_node_compare(const void *a, const void *b) +{ + return strbuf_cmp(&((const struct obj_index_tree_node *)a)->hash, + &((const struct obj_index_tree_node *)b)->hash); +} + +static void writer_index_hash(struct reftable_writer *w, struct strbuf *hash) +{ + uint64_t off = w->next; + + struct obj_index_tree_node want = { .hash = *hash }; + + struct tree_node *node = tree_search(&want, &w->obj_index_tree, + &obj_index_tree_node_compare, 0); + struct obj_index_tree_node *key = NULL; + if (node == NULL) { + struct obj_index_tree_node empty = OBJ_INDEX_TREE_NODE_INIT; + key = reftable_malloc(sizeof(struct obj_index_tree_node)); + *key = empty; + + strbuf_reset(&key->hash); + strbuf_addbuf(&key->hash, hash); + tree_search((void *)key, &w->obj_index_tree, + &obj_index_tree_node_compare, 1); + } else { + key = node->key; + } + + if (key->offset_len > 0 && key->offsets[key->offset_len - 1] == off) { + return; + } + + if (key->offset_len == key->offset_cap) { + key->offset_cap = 2 * key->offset_cap + 1; + key->offsets = reftable_realloc( + key->offsets, sizeof(uint64_t) * key->offset_cap); + } + + key->offsets[key->offset_len++] = off; +} + +static int writer_add_record(struct reftable_writer *w, + struct reftable_record *rec) +{ + struct strbuf key = STRBUF_INIT; + int err = -1; + reftable_record_key(rec, &key); + if (strbuf_cmp(&w->last_key, &key) >= 0) { + err = REFTABLE_API_ERROR; + goto done; + } + + strbuf_reset(&w->last_key); + strbuf_addbuf(&w->last_key, &key); + if (w->block_writer == NULL) { + writer_reinit_block_writer(w, reftable_record_type(rec)); + } + + assert(block_writer_type(w->block_writer) == reftable_record_type(rec)); + + if (block_writer_add(w->block_writer, rec) == 0) { + err = 0; + goto done; + } + + err = writer_flush_block(w); + if (err < 0) { + goto done; + } + + writer_reinit_block_writer(w, reftable_record_type(rec)); + err = block_writer_add(w->block_writer, rec); + if (err < 0) { + goto done; + } + + err = 0; +done: + strbuf_release(&key); + return err; +} + +int reftable_writer_add_ref(struct reftable_writer *w, + struct reftable_ref_record *ref) +{ + struct reftable_record rec = { NULL }; + struct reftable_ref_record copy = *ref; + int err = 0; + + if (ref->refname == NULL) + return REFTABLE_API_ERROR; + if (ref->update_index < w->min_update_index || + ref->update_index > w->max_update_index) + return REFTABLE_API_ERROR; + + reftable_record_from_ref(&rec, ©); + copy.update_index -= w->min_update_index; + + err = writer_add_record(w, &rec); + if (err < 0) + return err; + + if (!w->opts.skip_index_objects && reftable_ref_record_val1(ref)) { + struct strbuf h = STRBUF_INIT; + strbuf_add(&h, (char *)reftable_ref_record_val1(ref), + hash_size(w->opts.hash_id)); + writer_index_hash(w, &h); + strbuf_release(&h); + } + + if (!w->opts.skip_index_objects && reftable_ref_record_val2(ref)) { + struct strbuf h = STRBUF_INIT; + strbuf_add(&h, reftable_ref_record_val2(ref), + hash_size(w->opts.hash_id)); + writer_index_hash(w, &h); + strbuf_release(&h); + } + return 0; +} + +int reftable_writer_add_refs(struct reftable_writer *w, + struct reftable_ref_record *refs, int n) +{ + int err = 0; + int i = 0; + QSORT(refs, n, reftable_ref_record_compare_name); + for (i = 0; err == 0 && i < n; i++) { + err = reftable_writer_add_ref(w, &refs[i]); + } + return err; +} + +static int reftable_writer_add_log_verbatim(struct reftable_writer *w, + struct reftable_log_record *log) +{ + struct reftable_record rec = { NULL }; + if (w->block_writer && + block_writer_type(w->block_writer) == BLOCK_TYPE_REF) { + int err = writer_finish_public_section(w); + if (err < 0) + return err; + } + + w->next -= w->pending_padding; + w->pending_padding = 0; + + reftable_record_from_log(&rec, log); + return writer_add_record(w, &rec); +} + +int reftable_writer_add_log(struct reftable_writer *w, + struct reftable_log_record *log) +{ + char *input_log_message = NULL; + struct strbuf cleaned_message = STRBUF_INIT; + int err = 0; + + if (log->value_type == REFTABLE_LOG_DELETION) + return reftable_writer_add_log_verbatim(w, log); + + if (log->refname == NULL) + return REFTABLE_API_ERROR; + + input_log_message = log->value.update.message; + if (!w->opts.exact_log_message && log->value.update.message) { + strbuf_addstr(&cleaned_message, log->value.update.message); + while (cleaned_message.len && + cleaned_message.buf[cleaned_message.len - 1] == '\n') + strbuf_setlen(&cleaned_message, + cleaned_message.len - 1); + if (strchr(cleaned_message.buf, '\n')) { + /* multiple lines not allowed. */ + err = REFTABLE_API_ERROR; + goto done; + } + strbuf_addstr(&cleaned_message, "\n"); + log->value.update.message = cleaned_message.buf; + } + + err = reftable_writer_add_log_verbatim(w, log); + log->value.update.message = input_log_message; +done: + strbuf_release(&cleaned_message); + return err; +} + +int reftable_writer_add_logs(struct reftable_writer *w, + struct reftable_log_record *logs, int n) +{ + int err = 0; + int i = 0; + QSORT(logs, n, reftable_log_record_compare_key); + + for (i = 0; err == 0 && i < n; i++) { + err = reftable_writer_add_log(w, &logs[i]); + } + return err; +} + +static int writer_finish_section(struct reftable_writer *w) +{ + uint8_t typ = block_writer_type(w->block_writer); + uint64_t index_start = 0; + int max_level = 0; + int threshold = w->opts.unpadded ? 1 : 3; + int before_blocks = w->stats.idx_stats.blocks; + int err = writer_flush_block(w); + int i = 0; + struct reftable_block_stats *bstats = NULL; + if (err < 0) + return err; + + while (w->index_len > threshold) { + struct reftable_index_record *idx = NULL; + int idx_len = 0; + + max_level++; + index_start = w->next; + writer_reinit_block_writer(w, BLOCK_TYPE_INDEX); + + idx = w->index; + idx_len = w->index_len; + + w->index = NULL; + w->index_len = 0; + w->index_cap = 0; + for (i = 0; i < idx_len; i++) { + struct reftable_record rec = { NULL }; + reftable_record_from_index(&rec, idx + i); + if (block_writer_add(w->block_writer, &rec) == 0) { + continue; + } + + err = writer_flush_block(w); + if (err < 0) + return err; + + writer_reinit_block_writer(w, BLOCK_TYPE_INDEX); + + err = block_writer_add(w->block_writer, &rec); + if (err != 0) { + /* write into fresh block should always succeed + */ + abort(); + } + } + for (i = 0; i < idx_len; i++) { + strbuf_release(&idx[i].last_key); + } + reftable_free(idx); + } + + writer_clear_index(w); + + err = writer_flush_block(w); + if (err < 0) + return err; + + bstats = writer_reftable_block_stats(w, typ); + bstats->index_blocks = w->stats.idx_stats.blocks - before_blocks; + bstats->index_offset = index_start; + bstats->max_index_level = max_level; + + /* Reinit lastKey, as the next section can start with any key. */ + w->last_key.len = 0; + + return 0; +} + +struct common_prefix_arg { + struct strbuf *last; + int max; +}; + +static void update_common(void *void_arg, void *key) +{ + struct common_prefix_arg *arg = void_arg; + struct obj_index_tree_node *entry = key; + if (arg->last) { + int n = common_prefix_size(&entry->hash, arg->last); + if (n > arg->max) { + arg->max = n; + } + } + arg->last = &entry->hash; +} + +struct write_record_arg { + struct reftable_writer *w; + int err; +}; + +static void write_object_record(void *void_arg, void *key) +{ + struct write_record_arg *arg = void_arg; + struct obj_index_tree_node *entry = key; + struct reftable_obj_record obj_rec = { + .hash_prefix = (uint8_t *)entry->hash.buf, + .hash_prefix_len = arg->w->stats.object_id_len, + .offsets = entry->offsets, + .offset_len = entry->offset_len, + }; + struct reftable_record rec = { NULL }; + if (arg->err < 0) + goto done; + + reftable_record_from_obj(&rec, &obj_rec); + arg->err = block_writer_add(arg->w->block_writer, &rec); + if (arg->err == 0) + goto done; + + arg->err = writer_flush_block(arg->w); + if (arg->err < 0) + goto done; + + writer_reinit_block_writer(arg->w, BLOCK_TYPE_OBJ); + arg->err = block_writer_add(arg->w->block_writer, &rec); + if (arg->err == 0) + goto done; + obj_rec.offset_len = 0; + arg->err = block_writer_add(arg->w->block_writer, &rec); + + /* Should be able to write into a fresh block. */ + assert(arg->err == 0); + +done:; +} + +static void object_record_free(void *void_arg, void *key) +{ + struct obj_index_tree_node *entry = key; + + FREE_AND_NULL(entry->offsets); + strbuf_release(&entry->hash); + reftable_free(entry); +} + +static int writer_dump_object_index(struct reftable_writer *w) +{ + struct write_record_arg closure = { .w = w }; + struct common_prefix_arg common = { NULL }; + if (w->obj_index_tree) { + infix_walk(w->obj_index_tree, &update_common, &common); + } + w->stats.object_id_len = common.max + 1; + + writer_reinit_block_writer(w, BLOCK_TYPE_OBJ); + + if (w->obj_index_tree) { + infix_walk(w->obj_index_tree, &write_object_record, &closure); + } + + if (closure.err < 0) + return closure.err; + return writer_finish_section(w); +} + +static int writer_finish_public_section(struct reftable_writer *w) +{ + uint8_t typ = 0; + int err = 0; + + if (w->block_writer == NULL) + return 0; + + typ = block_writer_type(w->block_writer); + err = writer_finish_section(w); + if (err < 0) + return err; + if (typ == BLOCK_TYPE_REF && !w->opts.skip_index_objects && + w->stats.ref_stats.index_blocks > 0) { + err = writer_dump_object_index(w); + if (err < 0) + return err; + } + + if (w->obj_index_tree) { + infix_walk(w->obj_index_tree, &object_record_free, NULL); + tree_free(w->obj_index_tree); + w->obj_index_tree = NULL; + } + + w->block_writer = NULL; + return 0; +} + +int reftable_writer_close(struct reftable_writer *w) +{ + uint8_t footer[72]; + uint8_t *p = footer; + int err = writer_finish_public_section(w); + int empty_table = w->next == 0; + if (err != 0) + goto done; + w->pending_padding = 0; + if (empty_table) { + /* Empty tables need a header anyway. */ + uint8_t header[28]; + int n = writer_write_header(w, header); + err = padded_write(w, header, n, 0); + if (err < 0) + goto done; + } + + p += writer_write_header(w, footer); + put_be64(p, w->stats.ref_stats.index_offset); + p += 8; + put_be64(p, (w->stats.obj_stats.offset) << 5 | w->stats.object_id_len); + p += 8; + put_be64(p, w->stats.obj_stats.index_offset); + p += 8; + + put_be64(p, w->stats.log_stats.offset); + p += 8; + put_be64(p, w->stats.log_stats.index_offset); + p += 8; + + put_be32(p, crc32(0, footer, p - footer)); + p += 4; + + err = padded_write(w, footer, footer_size(writer_version(w)), 0); + if (err < 0) + goto done; + + if (empty_table) { + err = REFTABLE_EMPTY_TABLE_ERROR; + goto done; + } + +done: + /* free up memory. */ + block_writer_release(&w->block_writer_data); + writer_clear_index(w); + strbuf_release(&w->last_key); + return err; +} + +static void writer_clear_index(struct reftable_writer *w) +{ + int i = 0; + for (i = 0; i < w->index_len; i++) { + strbuf_release(&w->index[i].last_key); + } + + FREE_AND_NULL(w->index); + w->index_len = 0; + w->index_cap = 0; +} + +static const int debug = 0; + +static int writer_flush_nonempty_block(struct reftable_writer *w) +{ + uint8_t typ = block_writer_type(w->block_writer); + struct reftable_block_stats *bstats = + writer_reftable_block_stats(w, typ); + uint64_t block_typ_off = (bstats->blocks == 0) ? w->next : 0; + int raw_bytes = block_writer_finish(w->block_writer); + int padding = 0; + int err = 0; + struct reftable_index_record ir = { .last_key = STRBUF_INIT }; + if (raw_bytes < 0) + return raw_bytes; + + if (!w->opts.unpadded && typ != BLOCK_TYPE_LOG) { + padding = w->opts.block_size - raw_bytes; + } + + if (block_typ_off > 0) { + bstats->offset = block_typ_off; + } + + bstats->entries += w->block_writer->entries; + bstats->restarts += w->block_writer->restart_len; + bstats->blocks++; + w->stats.blocks++; + + if (debug) { + fprintf(stderr, "block %c off %" PRIu64 " sz %d (%d)\n", typ, + w->next, raw_bytes, + get_be24(w->block + w->block_writer->header_off + 1)); + } + + if (w->next == 0) { + writer_write_header(w, w->block); + } + + err = padded_write(w, w->block, raw_bytes, padding); + if (err < 0) + return err; + + if (w->index_cap == w->index_len) { + w->index_cap = 2 * w->index_cap + 1; + w->index = reftable_realloc( + w->index, + sizeof(struct reftable_index_record) * w->index_cap); + } + + ir.offset = w->next; + strbuf_reset(&ir.last_key); + strbuf_addbuf(&ir.last_key, &w->block_writer->last_key); + w->index[w->index_len] = ir; + + w->index_len++; + w->next += padding + raw_bytes; + w->block_writer = NULL; + return 0; +} + +static int writer_flush_block(struct reftable_writer *w) +{ + if (w->block_writer == NULL) + return 0; + if (w->block_writer->entries == 0) + return 0; + return writer_flush_nonempty_block(w); +} + +const struct reftable_stats *writer_stats(struct reftable_writer *w) +{ + return &w->stats; +} diff --git a/reftable/writer.h b/reftable/writer.h new file mode 100644 index 00000000000..09b88673d97 --- /dev/null +++ b/reftable/writer.h @@ -0,0 +1,50 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef WRITER_H +#define WRITER_H + +#include "basics.h" +#include "block.h" +#include "tree.h" +#include "reftable-writer.h" + +struct reftable_writer { + ssize_t (*write)(void *, const void *, size_t); + void *write_arg; + int pending_padding; + struct strbuf last_key; + + /* offset of next block to write. */ + uint64_t next; + uint64_t min_update_index, max_update_index; + struct reftable_write_options opts; + + /* memory buffer for writing */ + uint8_t *block; + + /* writer for the current section. NULL or points to + * block_writer_data */ + struct block_writer *block_writer; + + struct block_writer block_writer_data; + + /* pending index records for the current section */ + struct reftable_index_record *index; + size_t index_len; + size_t index_cap; + + /* + * tree for use with tsearch; used to populate the 'o' inverse OID + * map */ + struct tree_node *obj_index_tree; + + struct reftable_stats stats; +}; + +#endif From patchwork Mon Aug 30 14:57:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465411 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2B84EC43216 for ; Mon, 30 Aug 2021 14:58:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 14D3960F91 for ; Mon, 30 Aug 2021 14:58:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237411AbhH3O7L (ORCPT ); Mon, 30 Aug 2021 10:59:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55560 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237311AbhH3O67 (ORCPT ); Mon, 30 Aug 2021 10:58:59 -0400 Received: from mail-wr1-x431.google.com (mail-wr1-x431.google.com [IPv6:2a00:1450:4864:20::431]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4B68FC061764 for ; Mon, 30 Aug 2021 07:58:05 -0700 (PDT) Received: by mail-wr1-x431.google.com with SMTP id u9so22849790wrg.8 for ; Mon, 30 Aug 2021 07:58:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=GBpKfLYu14KsZ7P/PmrbFPXgiHHOfo1ERkJGTQojyMY=; b=nrVkDgOc0ehJOQuC0hhr/6vBp/raaaPKp2roiAy8/mhmO/Zo4GJwwRDrp123U0FkrD apRnnPgfY6T9pZvG83B1HuiuDQSnLcIHth5a5w81sVDuFn7E97i2KsuRRP7tE5ocikGu Fqar2l3Z/m3pH7Een88WwoaA7Eu6Pe/H495txsQ3SeqwW6PwmIegNU6yB9juLbR/aUaD 4CtT8Vn2dvP3V2ADMg+YvDlFJ95R1/Z3a515k56iklqbZ57mOgJVvaOMJQZQSkOLlru5 +JychBzuderVrpBwIoxf9tH10SO9Mu+xl1K1WegclWjdKvbm6BIYz9tnI2HL/UGv1gDC JB8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=GBpKfLYu14KsZ7P/PmrbFPXgiHHOfo1ERkJGTQojyMY=; b=HECVaD0Ss7yE1h1Zibmd2BiONPvh+tkx/VB6qDNNcPQ27WE0ts8epQloMxvjpv+Cbb T3ys2xVu3oRYZdNBoM92R0i3aSMcJ4U/1raTmrUvQJ8+8b/0AkwD+8M4OQ51h5hjHBEO Zo5plaudcnSaXywaO4tInMStoyJkczzBF/AHG1Eb/lBQ0rd4qwYYchGlrVnFJd+tSp7S 9m39TzQZiE1N5bhLWpruat6VVmxO6p5sGUjyFfjoXGWRUO9gJzPbITXSLzv3QZyOWziE JiW6jx6peMsE9TR1/QFMe25aAuwyAhZZ/3ZIzm6eJl/5O81S8N+pjrcrVobEgYbyV7hD 8phw== X-Gm-Message-State: AOAM530+jQPz1p/wZWso+om9hO8o+2P2Ivq2HoHE+pO2RtNqQZyv7SBu 41Jk7893s58Xggkb6x0kQO9p9Ed64Ss= X-Google-Smtp-Source: ABdhPJzRuU6T/OypyLUl2zGTk/eBJ7LazlcWSUFCyrd5sFQJgPiiAWFhb3fRHXH6DDfqiJuAVgsreg== X-Received: by 2002:a5d:464c:: with SMTP id j12mr25509674wrs.27.1630335483886; Mon, 30 Aug 2021 07:58:03 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id o5sm15446427wrw.17.2021.08.30.07.58.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:03 -0700 (PDT) Message-Id: <0f32d6f4c91b2717877ea3f5fa33d7661cb2db7b.1630335476.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:47 +0000 Subject: [PATCH 11/19] reftable: generic interface to tables Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys Signed-off-by: Han-Wen Nienhuys --- Makefile | 3 + reftable/generic.c | 169 +++++++++++++++++++++++++++++++++++ reftable/generic.h | 32 +++++++ reftable/reftable-generic.h | 47 ++++++++++ reftable/reftable-iterator.h | 39 ++++++++ reftable/reftable.c | 115 ++++++++++++++++++++++++ 6 files changed, 405 insertions(+) create mode 100644 reftable/generic.c create mode 100644 reftable/generic.h create mode 100644 reftable/reftable-generic.h create mode 100644 reftable/reftable-iterator.h create mode 100644 reftable/reftable.c diff --git a/Makefile b/Makefile index 05be43355a6..18093366acb 100644 --- a/Makefile +++ b/Makefile @@ -2462,6 +2462,9 @@ REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/publicbasics.o REFTABLE_OBJS += reftable/record.o +REFTABLE_OBJS += reftable/refname.o +REFTABLE_OBJS += reftable/generic.o +REFTABLE_OBJS += reftable/stack.o REFTABLE_OBJS += reftable/tree.o REFTABLE_OBJS += reftable/writer.o diff --git a/reftable/generic.c b/reftable/generic.c new file mode 100644 index 00000000000..7a8a738d860 --- /dev/null +++ b/reftable/generic.c @@ -0,0 +1,169 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "basics.h" +#include "record.h" +#include "generic.h" +#include "reftable-iterator.h" +#include "reftable-generic.h" + +int reftable_table_seek_ref(struct reftable_table *tab, + struct reftable_iterator *it, const char *name) +{ + struct reftable_ref_record ref = { + .refname = (char *)name, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, &ref); + return tab->ops->seek_record(tab->table_arg, it, &rec); +} + +int reftable_table_seek_log(struct reftable_table *tab, + struct reftable_iterator *it, const char *name) +{ + struct reftable_log_record log = { + .refname = (char *)name, + .update_index = ~((uint64_t)0), + }; + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, &log); + return tab->ops->seek_record(tab->table_arg, it, &rec); +} + +int reftable_table_read_ref(struct reftable_table *tab, const char *name, + struct reftable_ref_record *ref) +{ + struct reftable_iterator it = { NULL }; + int err = reftable_table_seek_ref(tab, &it, name); + if (err) + goto done; + + err = reftable_iterator_next_ref(&it, ref); + if (err) + goto done; + + if (strcmp(ref->refname, name) || + reftable_ref_record_is_deletion(ref)) { + reftable_ref_record_release(ref); + err = 1; + goto done; + } + +done: + reftable_iterator_destroy(&it); + return err; +} + +int reftable_table_print(struct reftable_table *tab) { + struct reftable_iterator it = { NULL }; + struct reftable_ref_record ref = { NULL }; + struct reftable_log_record log = { NULL }; + uint32_t hash_id = reftable_table_hash_id(tab); + int err = reftable_table_seek_ref(tab, &it, ""); + if (err < 0) { + return err; + } + + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (err < 0) { + return err; + } + reftable_ref_record_print(&ref, hash_id); + } + reftable_iterator_destroy(&it); + reftable_ref_record_release(&ref); + + err = reftable_table_seek_log(tab, &it, ""); + if (err < 0) { + return err; + } + while (1) { + err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + break; + } + if (err < 0) { + return err; + } + reftable_log_record_print(&log, hash_id); + } + reftable_iterator_destroy(&it); + reftable_log_record_release(&log); + return 0; +} + +uint64_t reftable_table_max_update_index(struct reftable_table *tab) +{ + return tab->ops->max_update_index(tab->table_arg); +} + +uint64_t reftable_table_min_update_index(struct reftable_table *tab) +{ + return tab->ops->min_update_index(tab->table_arg); +} + +uint32_t reftable_table_hash_id(struct reftable_table *tab) +{ + return tab->ops->hash_id(tab->table_arg); +} + +void reftable_iterator_destroy(struct reftable_iterator *it) +{ + if (!it->ops) { + return; + } + it->ops->close(it->iter_arg); + it->ops = NULL; + FREE_AND_NULL(it->iter_arg); +} + +int reftable_iterator_next_ref(struct reftable_iterator *it, + struct reftable_ref_record *ref) +{ + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, ref); + return iterator_next(it, &rec); +} + +int reftable_iterator_next_log(struct reftable_iterator *it, + struct reftable_log_record *log) +{ + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, log); + return iterator_next(it, &rec); +} + +int iterator_next(struct reftable_iterator *it, struct reftable_record *rec) +{ + return it->ops->next(it->iter_arg, rec); +} + +static int empty_iterator_next(void *arg, struct reftable_record *rec) +{ + return 1; +} + +static void empty_iterator_close(void *arg) +{ +} + +static struct reftable_iterator_vtable empty_vtable = { + .next = &empty_iterator_next, + .close = &empty_iterator_close, +}; + +void iterator_set_empty(struct reftable_iterator *it) +{ + assert(!it->ops); + it->iter_arg = NULL; + it->ops = &empty_vtable; +} diff --git a/reftable/generic.h b/reftable/generic.h new file mode 100644 index 00000000000..98886a06402 --- /dev/null +++ b/reftable/generic.h @@ -0,0 +1,32 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef GENERIC_H +#define GENERIC_H + +#include "record.h" +#include "reftable-generic.h" + +/* generic interface to reftables */ +struct reftable_table_vtable { + int (*seek_record)(void *tab, struct reftable_iterator *it, + struct reftable_record *); + uint32_t (*hash_id)(void *tab); + uint64_t (*min_update_index)(void *tab); + uint64_t (*max_update_index)(void *tab); +}; + +struct reftable_iterator_vtable { + int (*next)(void *iter_arg, struct reftable_record *rec); + void (*close)(void *iter_arg); +}; + +void iterator_set_empty(struct reftable_iterator *it); +int iterator_next(struct reftable_iterator *it, struct reftable_record *rec); + +#endif diff --git a/reftable/reftable-generic.h b/reftable/reftable-generic.h new file mode 100644 index 00000000000..d239751a778 --- /dev/null +++ b/reftable/reftable-generic.h @@ -0,0 +1,47 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_GENERIC_H +#define REFTABLE_GENERIC_H + +#include "reftable-iterator.h" + +struct reftable_table_vtable; + +/* + * Provides a unified API for reading tables, either merged tables, or single + * readers. */ +struct reftable_table { + struct reftable_table_vtable *ops; + void *table_arg; +}; + +int reftable_table_seek_log(struct reftable_table *tab, + struct reftable_iterator *it, const char *name); + +int reftable_table_seek_ref(struct reftable_table *tab, + struct reftable_iterator *it, const char *name); + +/* returns the hash ID from a generic reftable_table */ +uint32_t reftable_table_hash_id(struct reftable_table *tab); + +/* returns the max update_index covered by this table. */ +uint64_t reftable_table_max_update_index(struct reftable_table *tab); + +/* returns the min update_index covered by this table. */ +uint64_t reftable_table_min_update_index(struct reftable_table *tab); + +/* convenience function to read a single ref. Returns < 0 for error, 0 + for success, and 1 if ref not found. */ +int reftable_table_read_ref(struct reftable_table *tab, const char *name, + struct reftable_ref_record *ref); + +/* dump table contents onto stdout for debugging */ +int reftable_table_print(struct reftable_table *tab); + +#endif diff --git a/reftable/reftable-iterator.h b/reftable/reftable-iterator.h new file mode 100644 index 00000000000..d3eee7af357 --- /dev/null +++ b/reftable/reftable-iterator.h @@ -0,0 +1,39 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_ITERATOR_H +#define REFTABLE_ITERATOR_H + +#include "reftable-record.h" + +struct reftable_iterator_vtable; + +/* iterator is the generic interface for walking over data stored in a + * reftable. + */ +struct reftable_iterator { + struct reftable_iterator_vtable *ops; + void *iter_arg; +}; + +/* reads the next reftable_ref_record. Returns < 0 for error, 0 for OK and > 0: + * end of iteration. + */ +int reftable_iterator_next_ref(struct reftable_iterator *it, + struct reftable_ref_record *ref); + +/* reads the next reftable_log_record. Returns < 0 for error, 0 for OK and > 0: + * end of iteration. + */ +int reftable_iterator_next_log(struct reftable_iterator *it, + struct reftable_log_record *log); + +/* releases resources associated with an iterator. */ +void reftable_iterator_destroy(struct reftable_iterator *it); + +#endif diff --git a/reftable/reftable.c b/reftable/reftable.c new file mode 100644 index 00000000000..0e4607a7cd6 --- /dev/null +++ b/reftable/reftable.c @@ -0,0 +1,115 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "basics.h" +#include "record.h" +#include "generic.h" +#include "reftable-iterator.h" +#include "reftable-generic.h" + +int reftable_table_seek_ref(struct reftable_table *tab, + struct reftable_iterator *it, const char *name) +{ + struct reftable_ref_record ref = { + .refname = (char *)name, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, &ref); + return tab->ops->seek_record(tab->table_arg, it, &rec); +} + +int reftable_table_read_ref(struct reftable_table *tab, const char *name, + struct reftable_ref_record *ref) +{ + struct reftable_iterator it = { NULL }; + int err = reftable_table_seek_ref(tab, &it, name); + if (err) + goto done; + + err = reftable_iterator_next_ref(&it, ref); + if (err) + goto done; + + if (strcmp(ref->refname, name) || + reftable_ref_record_is_deletion(ref)) { + reftable_ref_record_release(ref); + err = 1; + goto done; + } + +done: + reftable_iterator_destroy(&it); + return err; +} + +uint64_t reftable_table_max_update_index(struct reftable_table *tab) +{ + return tab->ops->max_update_index(tab->table_arg); +} + +uint64_t reftable_table_min_update_index(struct reftable_table *tab) +{ + return tab->ops->min_update_index(tab->table_arg); +} + +uint32_t reftable_table_hash_id(struct reftable_table *tab) +{ + return tab->ops->hash_id(tab->table_arg); +} + +void reftable_iterator_destroy(struct reftable_iterator *it) +{ + if (!it->ops) { + return; + } + it->ops->close(it->iter_arg); + it->ops = NULL; + FREE_AND_NULL(it->iter_arg); +} + +int reftable_iterator_next_ref(struct reftable_iterator *it, + struct reftable_ref_record *ref) +{ + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, ref); + return iterator_next(it, &rec); +} + +int reftable_iterator_next_log(struct reftable_iterator *it, + struct reftable_log_record *log) +{ + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, log); + return iterator_next(it, &rec); +} + +int iterator_next(struct reftable_iterator *it, struct reftable_record *rec) +{ + return it->ops->next(it->iter_arg, rec); +} + +static int empty_iterator_next(void *arg, struct reftable_record *rec) +{ + return 1; +} + +static void empty_iterator_close(void *arg) +{ +} + +static struct reftable_iterator_vtable empty_vtable = { + .next = &empty_iterator_next, + .close = &empty_iterator_close, +}; + +void iterator_set_empty(struct reftable_iterator *it) +{ + assert(!it->ops); + it->iter_arg = NULL; + it->ops = &empty_vtable; +} From patchwork Mon Aug 30 14:57:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465419 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41901C43214 for ; Mon, 30 Aug 2021 14:58:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3312D60E90 for ; Mon, 30 Aug 2021 14:58:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237445AbhH3O7N (ORCPT ); Mon, 30 Aug 2021 10:59:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55566 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237317AbhH3O7A (ORCPT ); Mon, 30 Aug 2021 10:59:00 -0400 Received: from mail-wr1-x431.google.com (mail-wr1-x431.google.com [IPv6:2a00:1450:4864:20::431]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F2836C06175F for ; Mon, 30 Aug 2021 07:58:05 -0700 (PDT) Received: by mail-wr1-x431.google.com with SMTP id x6so14662520wrv.13 for ; Mon, 30 Aug 2021 07:58:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=/zHFMiag5sAgo5a7kB0ICwEVe77ME1C5zvuGZpozD1Y=; b=SfM0d/1ynq2ksiSqN0u/xD+o9wvJP31Rhiej/1aGp8FyYLI02TAZ+wzZvl2q9vUWQM w1xJZHSLBYtg8CgZiIEquhz3ESTsDZkeyG1COu+ytLhXoDqWasnw5Uk8ZdAabi8UZwew wv5rAMW6n3QCeS9oD1PIbjO0WK5Kkbdc5gd4xhlozGAxyc2SCwZtu+LjaQnphGUOD4uC kfNZlAvuOE5BcGJ3wYTpZ6gi7kB3pkQgCvzQJR0eQtjeRqUiF9CXQhdm5OEZf2oiQdfZ EsgUxUcl06I2RG+HGaiwR4K6uCywc5KXEo8k5p5XOj09gDiHb9Ysi2U+ldwmIDYWcOri 29cg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=/zHFMiag5sAgo5a7kB0ICwEVe77ME1C5zvuGZpozD1Y=; b=a1u0syfwS3+HmPNfztWWBvaYJuqKY3Y/hh+BZwVagwTMxqCol2ylG91URnShb2LaHm znUkoJBPqdJJt2pIyVXgUyf0t2wkahbp7nucDBVY7g0ontjlSjtqsoo9VxQRCsrlLfch 11zyXpGewPkErDA4VhWW67OhTwPuluUD9ixCD6ga6y41cGHzja97gWy6bRGB4q0RzMKY sOE69Zqh394KFBJZ1ZP5rj/9M6xfyQ4E21sm9pbwJQ/UFlM9vpiEpTd6edu05wOoBJ2o i8/LgPDILXzxpV3dQqq+Jg8ASnZCzgxPdad0hOknLoQn6VNP8FVo1oZHLJuke7jWKzaR p5qA== X-Gm-Message-State: AOAM5300t/CMeJpR3bO9kRiIPOk4muQXYHR9bLSkus9UJI5LpT8QWEja OFwab5/Dyh0fOtZK0lq03BFuLLdQDlk= X-Google-Smtp-Source: ABdhPJwIoOj10tOXMyeBSDd7wXmTjZe8wkK1AMpCYvkRuHd4i8BDgAzGIHGwpEBJSmeJy9GAgl5S/A== X-Received: by 2002:adf:efc2:: with SMTP id i2mr26639856wrp.94.1630335484467; Mon, 30 Aug 2021 07:58:04 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a6sm7675871wrh.97.2021.08.30.07.58.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:04 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:48 +0000 Subject: [PATCH 12/19] reftable: read reftable files Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This supports reading a single reftable file. The commit introduces an abstract iterator type, which captures the usecases both of reading individual refs, and iterating over a segment of the ref namespace. Signed-off-by: Han-Wen Nienhuys --- Makefile | 2 + reftable/iter.c | 194 +++++++++ reftable/iter.h | 69 ++++ reftable/reader.c | 801 +++++++++++++++++++++++++++++++++++++ reftable/reader.h | 66 +++ reftable/reftable-reader.h | 101 +++++ 6 files changed, 1233 insertions(+) create mode 100644 reftable/iter.c create mode 100644 reftable/iter.h create mode 100644 reftable/reader.c create mode 100644 reftable/reader.h create mode 100644 reftable/reftable-reader.h diff --git a/Makefile b/Makefile index 18093366acb..91952268e51 100644 --- a/Makefile +++ b/Makefile @@ -2460,7 +2460,9 @@ REFTABLE_OBJS += reftable/basics.o REFTABLE_OBJS += reftable/error.o REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o +REFTABLE_OBJS += reftable/iter.o REFTABLE_OBJS += reftable/publicbasics.o +REFTABLE_OBJS += reftable/reader.o REFTABLE_OBJS += reftable/record.o REFTABLE_OBJS += reftable/refname.o REFTABLE_OBJS += reftable/generic.o diff --git a/reftable/iter.c b/reftable/iter.c new file mode 100644 index 00000000000..93d04f735b8 --- /dev/null +++ b/reftable/iter.c @@ -0,0 +1,194 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "iter.h" + +#include "system.h" + +#include "block.h" +#include "generic.h" +#include "constants.h" +#include "reader.h" +#include "reftable-error.h" + +int iterator_is_null(struct reftable_iterator *it) +{ + return !it->ops; +} + +static void filtering_ref_iterator_close(void *iter_arg) +{ + struct filtering_ref_iterator *fri = iter_arg; + strbuf_release(&fri->oid); + reftable_iterator_destroy(&fri->it); +} + +static int filtering_ref_iterator_next(void *iter_arg, + struct reftable_record *rec) +{ + struct filtering_ref_iterator *fri = iter_arg; + struct reftable_ref_record *ref = rec->data; + int err = 0; + while (1) { + err = reftable_iterator_next_ref(&fri->it, ref); + if (err != 0) { + break; + } + + if (fri->double_check) { + struct reftable_iterator it = { NULL }; + + err = reftable_table_seek_ref(&fri->tab, &it, + ref->refname); + if (err == 0) { + err = reftable_iterator_next_ref(&it, ref); + } + + reftable_iterator_destroy(&it); + + if (err < 0) { + break; + } + + if (err > 0) { + continue; + } + } + + if (ref->value_type == REFTABLE_REF_VAL2 && + (!memcmp(fri->oid.buf, ref->value.val2.target_value, + fri->oid.len) || + !memcmp(fri->oid.buf, ref->value.val2.value, + fri->oid.len))) + return 0; + + if (ref->value_type == REFTABLE_REF_VAL1 && + !memcmp(fri->oid.buf, ref->value.val1, fri->oid.len)) { + return 0; + } + } + + reftable_ref_record_release(ref); + return err; +} + +static struct reftable_iterator_vtable filtering_ref_iterator_vtable = { + .next = &filtering_ref_iterator_next, + .close = &filtering_ref_iterator_close, +}; + +void iterator_from_filtering_ref_iterator(struct reftable_iterator *it, + struct filtering_ref_iterator *fri) +{ + assert(!it->ops); + it->iter_arg = fri; + it->ops = &filtering_ref_iterator_vtable; +} + +static void indexed_table_ref_iter_close(void *p) +{ + struct indexed_table_ref_iter *it = p; + block_iter_close(&it->cur); + reftable_block_done(&it->block_reader.block); + reftable_free(it->offsets); + strbuf_release(&it->oid); +} + +static int indexed_table_ref_iter_next_block(struct indexed_table_ref_iter *it) +{ + uint64_t off; + int err = 0; + if (it->offset_idx == it->offset_len) { + it->is_finished = 1; + return 1; + } + + reftable_block_done(&it->block_reader.block); + + off = it->offsets[it->offset_idx++]; + err = reader_init_block_reader(it->r, &it->block_reader, off, + BLOCK_TYPE_REF); + if (err < 0) { + return err; + } + if (err > 0) { + /* indexed block does not exist. */ + return REFTABLE_FORMAT_ERROR; + } + block_reader_start(&it->block_reader, &it->cur); + return 0; +} + +static int indexed_table_ref_iter_next(void *p, struct reftable_record *rec) +{ + struct indexed_table_ref_iter *it = p; + struct reftable_ref_record *ref = rec->data; + + while (1) { + int err = block_iter_next(&it->cur, rec); + if (err < 0) { + return err; + } + + if (err > 0) { + err = indexed_table_ref_iter_next_block(it); + if (err < 0) { + return err; + } + + if (it->is_finished) { + return 1; + } + continue; + } + /* BUG */ + if (!memcmp(it->oid.buf, ref->value.val2.target_value, + it->oid.len) || + !memcmp(it->oid.buf, ref->value.val2.value, it->oid.len)) { + return 0; + } + } +} + +int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest, + struct reftable_reader *r, uint8_t *oid, + int oid_len, uint64_t *offsets, int offset_len) +{ + struct indexed_table_ref_iter empty = INDEXED_TABLE_REF_ITER_INIT; + struct indexed_table_ref_iter *itr = + reftable_calloc(sizeof(struct indexed_table_ref_iter)); + int err = 0; + + *itr = empty; + itr->r = r; + strbuf_add(&itr->oid, oid, oid_len); + + itr->offsets = offsets; + itr->offset_len = offset_len; + + err = indexed_table_ref_iter_next_block(itr); + if (err < 0) { + reftable_free(itr); + } else { + *dest = itr; + } + return err; +} + +static struct reftable_iterator_vtable indexed_table_ref_iter_vtable = { + .next = &indexed_table_ref_iter_next, + .close = &indexed_table_ref_iter_close, +}; + +void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it, + struct indexed_table_ref_iter *itr) +{ + assert(!it->ops); + it->iter_arg = itr; + it->ops = &indexed_table_ref_iter_vtable; +} diff --git a/reftable/iter.h b/reftable/iter.h new file mode 100644 index 00000000000..09eb0cbfa59 --- /dev/null +++ b/reftable/iter.h @@ -0,0 +1,69 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef ITER_H +#define ITER_H + +#include "system.h" +#include "block.h" +#include "record.h" + +#include "reftable-iterator.h" +#include "reftable-generic.h" + +/* Returns true for a zeroed out iterator, such as the one returned from + * iterator_destroy. */ +int iterator_is_null(struct reftable_iterator *it); + +/* iterator that produces only ref records that point to `oid` */ +struct filtering_ref_iterator { + int double_check; + struct reftable_table tab; + struct strbuf oid; + struct reftable_iterator it; +}; +#define FILTERING_REF_ITERATOR_INIT \ + { \ + .oid = STRBUF_INIT \ + } + +void iterator_from_filtering_ref_iterator(struct reftable_iterator *, + struct filtering_ref_iterator *); + +/* iterator that produces only ref records that point to `oid`, + * but using the object index. + */ +struct indexed_table_ref_iter { + struct reftable_reader *r; + struct strbuf oid; + + /* mutable */ + uint64_t *offsets; + + /* Points to the next offset to read. */ + int offset_idx; + int offset_len; + struct block_reader block_reader; + struct block_iter cur; + int is_finished; +}; + +#define INDEXED_TABLE_REF_ITER_INIT \ + { \ + .cur = { .last_key = STRBUF_INIT }, .oid = STRBUF_INIT, \ + } + +void iterator_from_indexed_table_ref_iter(struct reftable_iterator *it, + struct indexed_table_ref_iter *itr); + +/* Takes ownership of `offsets` */ +int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest, + struct reftable_reader *r, uint8_t *oid, + int oid_len, uint64_t *offsets, int offset_len); + +#endif diff --git a/reftable/reader.c b/reftable/reader.c new file mode 100644 index 00000000000..ab7a981eb8e --- /dev/null +++ b/reftable/reader.c @@ -0,0 +1,801 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "reader.h" + +#include "system.h" +#include "block.h" +#include "constants.h" +#include "generic.h" +#include "iter.h" +#include "record.h" +#include "reftable-error.h" +#include "reftable-generic.h" +#include "tree.h" + +uint64_t block_source_size(struct reftable_block_source *source) +{ + return source->ops->size(source->arg); +} + +int block_source_read_block(struct reftable_block_source *source, + struct reftable_block *dest, uint64_t off, + uint32_t size) +{ + int result = source->ops->read_block(source->arg, dest, off, size); + dest->source = *source; + return result; +} + +void block_source_close(struct reftable_block_source *source) +{ + if (!source->ops) { + return; + } + + source->ops->close(source->arg); + source->ops = NULL; +} + +static struct reftable_reader_offsets * +reader_offsets_for(struct reftable_reader *r, uint8_t typ) +{ + switch (typ) { + case BLOCK_TYPE_REF: + return &r->ref_offsets; + case BLOCK_TYPE_LOG: + return &r->log_offsets; + case BLOCK_TYPE_OBJ: + return &r->obj_offsets; + } + abort(); +} + +static int reader_get_block(struct reftable_reader *r, + struct reftable_block *dest, uint64_t off, + uint32_t sz) +{ + if (off >= r->size) + return 0; + + if (off + sz > r->size) { + sz = r->size - off; + } + + return block_source_read_block(&r->source, dest, off, sz); +} + +uint32_t reftable_reader_hash_id(struct reftable_reader *r) +{ + return r->hash_id; +} + +const char *reader_name(struct reftable_reader *r) +{ + return r->name; +} + +static int parse_footer(struct reftable_reader *r, uint8_t *footer, + uint8_t *header) +{ + uint8_t *f = footer; + uint8_t first_block_typ; + int err = 0; + uint32_t computed_crc; + uint32_t file_crc; + + if (memcmp(f, "REFT", 4)) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + f += 4; + + if (memcmp(footer, header, header_size(r->version))) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + f++; + r->block_size = get_be24(f); + + f += 3; + r->min_update_index = get_be64(f); + f += 8; + r->max_update_index = get_be64(f); + f += 8; + + if (r->version == 1) { + r->hash_id = GIT_SHA1_FORMAT_ID; + } else { + r->hash_id = get_be32(f); + switch (r->hash_id) { + case GIT_SHA1_FORMAT_ID: + break; + case GIT_SHA256_FORMAT_ID: + break; + default: + err = REFTABLE_FORMAT_ERROR; + goto done; + } + f += 4; + } + + r->ref_offsets.index_offset = get_be64(f); + f += 8; + + r->obj_offsets.offset = get_be64(f); + f += 8; + + r->object_id_len = r->obj_offsets.offset & ((1 << 5) - 1); + r->obj_offsets.offset >>= 5; + + r->obj_offsets.index_offset = get_be64(f); + f += 8; + r->log_offsets.offset = get_be64(f); + f += 8; + r->log_offsets.index_offset = get_be64(f); + f += 8; + + computed_crc = crc32(0, footer, f - footer); + file_crc = get_be32(f); + f += 4; + if (computed_crc != file_crc) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + first_block_typ = header[header_size(r->version)]; + r->ref_offsets.is_present = (first_block_typ == BLOCK_TYPE_REF); + r->ref_offsets.offset = 0; + r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG || + r->log_offsets.offset > 0); + r->obj_offsets.is_present = r->obj_offsets.offset > 0; + err = 0; +done: + return err; +} + +int init_reader(struct reftable_reader *r, struct reftable_block_source *source, + const char *name) +{ + struct reftable_block footer = { NULL }; + struct reftable_block header = { NULL }; + int err = 0; + uint64_t file_size = block_source_size(source); + + /* Need +1 to read type of first block. */ + uint32_t read_size = header_size(2) + 1; /* read v2 because it's larger. */ + memset(r, 0, sizeof(struct reftable_reader)); + + if (read_size > file_size) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + err = block_source_read_block(source, &header, 0, read_size); + if (err != read_size) { + err = REFTABLE_IO_ERROR; + goto done; + } + + if (memcmp(header.data, "REFT", 4)) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + r->version = header.data[4]; + if (r->version != 1 && r->version != 2) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } + + r->size = file_size - footer_size(r->version); + r->source = *source; + r->name = xstrdup(name); + r->hash_id = 0; + + err = block_source_read_block(source, &footer, r->size, + footer_size(r->version)); + if (err != footer_size(r->version)) { + err = REFTABLE_IO_ERROR; + goto done; + } + + err = parse_footer(r, footer.data, header.data); +done: + reftable_block_done(&footer); + reftable_block_done(&header); + return err; +} + +struct table_iter { + struct reftable_reader *r; + uint8_t typ; + uint64_t block_off; + struct block_iter bi; + int is_finished; +}; +#define TABLE_ITER_INIT \ + { \ + .bi = {.last_key = STRBUF_INIT } \ + } + +static void table_iter_copy_from(struct table_iter *dest, + struct table_iter *src) +{ + dest->r = src->r; + dest->typ = src->typ; + dest->block_off = src->block_off; + dest->is_finished = src->is_finished; + block_iter_copy_from(&dest->bi, &src->bi); +} + +static int table_iter_next_in_block(struct table_iter *ti, + struct reftable_record *rec) +{ + int res = block_iter_next(&ti->bi, rec); + if (res == 0 && reftable_record_type(rec) == BLOCK_TYPE_REF) { + ((struct reftable_ref_record *)rec->data)->update_index += + ti->r->min_update_index; + } + + return res; +} + +static void table_iter_block_done(struct table_iter *ti) +{ + if (!ti->bi.br) { + return; + } + reftable_block_done(&ti->bi.br->block); + FREE_AND_NULL(ti->bi.br); + + ti->bi.last_key.len = 0; + ti->bi.next_off = 0; +} + +static int32_t extract_block_size(uint8_t *data, uint8_t *typ, uint64_t off, + int version) +{ + int32_t result = 0; + + if (off == 0) { + data += header_size(version); + } + + *typ = data[0]; + if (reftable_is_block_type(*typ)) { + result = get_be24(data + 1); + } + return result; +} + +int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br, + uint64_t next_off, uint8_t want_typ) +{ + int32_t guess_block_size = r->block_size ? r->block_size : + DEFAULT_BLOCK_SIZE; + struct reftable_block block = { NULL }; + uint8_t block_typ = 0; + int err = 0; + uint32_t header_off = next_off ? 0 : header_size(r->version); + int32_t block_size = 0; + + if (next_off >= r->size) + return 1; + + err = reader_get_block(r, &block, next_off, guess_block_size); + if (err < 0) + return err; + + block_size = extract_block_size(block.data, &block_typ, next_off, + r->version); + if (block_size < 0) + return block_size; + + if (want_typ != BLOCK_TYPE_ANY && block_typ != want_typ) { + reftable_block_done(&block); + return 1; + } + + if (block_size > guess_block_size) { + reftable_block_done(&block); + err = reader_get_block(r, &block, next_off, block_size); + if (err < 0) { + return err; + } + } + + return block_reader_init(br, &block, header_off, r->block_size, + hash_size(r->hash_id)); +} + +static int table_iter_next_block(struct table_iter *dest, + struct table_iter *src) +{ + uint64_t next_block_off = src->block_off + src->bi.br->full_block_size; + struct block_reader br = { 0 }; + int err = 0; + + dest->r = src->r; + dest->typ = src->typ; + dest->block_off = next_block_off; + + err = reader_init_block_reader(src->r, &br, next_block_off, src->typ); + if (err > 0) { + dest->is_finished = 1; + return 1; + } + if (err != 0) + return err; + else { + struct block_reader *brp = + reftable_malloc(sizeof(struct block_reader)); + *brp = br; + + dest->is_finished = 0; + block_reader_start(brp, &dest->bi); + } + return 0; +} + +static int table_iter_next(struct table_iter *ti, struct reftable_record *rec) +{ + if (reftable_record_type(rec) != ti->typ) + return REFTABLE_API_ERROR; + + while (1) { + struct table_iter next = TABLE_ITER_INIT; + int err = 0; + if (ti->is_finished) { + return 1; + } + + err = table_iter_next_in_block(ti, rec); + if (err <= 0) { + return err; + } + + err = table_iter_next_block(&next, ti); + if (err != 0) { + ti->is_finished = 1; + } + table_iter_block_done(ti); + if (err != 0) { + return err; + } + table_iter_copy_from(ti, &next); + block_iter_close(&next.bi); + } +} + +static int table_iter_next_void(void *ti, struct reftable_record *rec) +{ + return table_iter_next(ti, rec); +} + +static void table_iter_close(void *p) +{ + struct table_iter *ti = p; + table_iter_block_done(ti); + block_iter_close(&ti->bi); +} + +static struct reftable_iterator_vtable table_iter_vtable = { + .next = &table_iter_next_void, + .close = &table_iter_close, +}; + +static void iterator_from_table_iter(struct reftable_iterator *it, + struct table_iter *ti) +{ + assert(!it->ops); + it->iter_arg = ti; + it->ops = &table_iter_vtable; +} + +static int reader_table_iter_at(struct reftable_reader *r, + struct table_iter *ti, uint64_t off, + uint8_t typ) +{ + struct block_reader br = { 0 }; + struct block_reader *brp = NULL; + + int err = reader_init_block_reader(r, &br, off, typ); + if (err != 0) + return err; + + brp = reftable_malloc(sizeof(struct block_reader)); + *brp = br; + ti->r = r; + ti->typ = block_reader_type(brp); + ti->block_off = off; + block_reader_start(brp, &ti->bi); + return 0; +} + +static int reader_start(struct reftable_reader *r, struct table_iter *ti, + uint8_t typ, int index) +{ + struct reftable_reader_offsets *offs = reader_offsets_for(r, typ); + uint64_t off = offs->offset; + if (index) { + off = offs->index_offset; + if (off == 0) { + return 1; + } + typ = BLOCK_TYPE_INDEX; + } + + return reader_table_iter_at(r, ti, off, typ); +} + +static int reader_seek_linear(struct reftable_reader *r, struct table_iter *ti, + struct reftable_record *want) +{ + struct reftable_record rec = + reftable_new_record(reftable_record_type(want)); + struct strbuf want_key = STRBUF_INIT; + struct strbuf got_key = STRBUF_INIT; + struct table_iter next = TABLE_ITER_INIT; + int err = -1; + + reftable_record_key(want, &want_key); + + while (1) { + err = table_iter_next_block(&next, ti); + if (err < 0) + goto done; + + if (err > 0) { + break; + } + + err = block_reader_first_key(next.bi.br, &got_key); + if (err < 0) + goto done; + + if (strbuf_cmp(&got_key, &want_key) > 0) { + table_iter_block_done(&next); + break; + } + + table_iter_block_done(ti); + table_iter_copy_from(ti, &next); + } + + err = block_iter_seek(&ti->bi, &want_key); + if (err < 0) + goto done; + err = 0; + +done: + block_iter_close(&next.bi); + reftable_record_destroy(&rec); + strbuf_release(&want_key); + strbuf_release(&got_key); + return err; +} + +static int reader_seek_indexed(struct reftable_reader *r, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + struct reftable_index_record want_index = { .last_key = STRBUF_INIT }; + struct reftable_record want_index_rec = { NULL }; + struct reftable_index_record index_result = { .last_key = STRBUF_INIT }; + struct reftable_record index_result_rec = { NULL }; + struct table_iter index_iter = TABLE_ITER_INIT; + struct table_iter next = TABLE_ITER_INIT; + int err = 0; + + reftable_record_key(rec, &want_index.last_key); + reftable_record_from_index(&want_index_rec, &want_index); + reftable_record_from_index(&index_result_rec, &index_result); + + err = reader_start(r, &index_iter, reftable_record_type(rec), 1); + if (err < 0) + goto done; + + err = reader_seek_linear(r, &index_iter, &want_index_rec); + while (1) { + err = table_iter_next(&index_iter, &index_result_rec); + table_iter_block_done(&index_iter); + if (err != 0) + goto done; + + err = reader_table_iter_at(r, &next, index_result.offset, 0); + if (err != 0) + goto done; + + err = block_iter_seek(&next.bi, &want_index.last_key); + if (err < 0) + goto done; + + if (next.typ == reftable_record_type(rec)) { + err = 0; + break; + } + + if (next.typ != BLOCK_TYPE_INDEX) { + err = REFTABLE_FORMAT_ERROR; + break; + } + + table_iter_copy_from(&index_iter, &next); + } + + if (err == 0) { + struct table_iter empty = TABLE_ITER_INIT; + struct table_iter *malloced = + reftable_calloc(sizeof(struct table_iter)); + *malloced = empty; + table_iter_copy_from(malloced, &next); + iterator_from_table_iter(it, malloced); + } +done: + block_iter_close(&next.bi); + table_iter_close(&index_iter); + reftable_record_release(&want_index_rec); + reftable_record_release(&index_result_rec); + return err; +} + +static int reader_seek_internal(struct reftable_reader *r, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + struct reftable_reader_offsets *offs = + reader_offsets_for(r, reftable_record_type(rec)); + uint64_t idx = offs->index_offset; + struct table_iter ti = TABLE_ITER_INIT; + int err = 0; + if (idx > 0) + return reader_seek_indexed(r, it, rec); + + err = reader_start(r, &ti, reftable_record_type(rec), 0); + if (err < 0) + return err; + err = reader_seek_linear(r, &ti, rec); + if (err < 0) + return err; + else { + struct table_iter *p = + reftable_malloc(sizeof(struct table_iter)); + *p = ti; + iterator_from_table_iter(it, p); + } + + return 0; +} + +int reader_seek(struct reftable_reader *r, struct reftable_iterator *it, + struct reftable_record *rec) +{ + uint8_t typ = reftable_record_type(rec); + + struct reftable_reader_offsets *offs = reader_offsets_for(r, typ); + if (!offs->is_present) { + iterator_set_empty(it); + return 0; + } + + return reader_seek_internal(r, it, rec); +} + +int reftable_reader_seek_ref(struct reftable_reader *r, + struct reftable_iterator *it, const char *name) +{ + struct reftable_ref_record ref = { + .refname = (char *)name, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, &ref); + return reader_seek(r, it, &rec); +} + +int reftable_reader_seek_log_at(struct reftable_reader *r, + struct reftable_iterator *it, const char *name, + uint64_t update_index) +{ + struct reftable_log_record log = { + .refname = (char *)name, + .update_index = update_index, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, &log); + return reader_seek(r, it, &rec); +} + +int reftable_reader_seek_log(struct reftable_reader *r, + struct reftable_iterator *it, const char *name) +{ + uint64_t max = ~((uint64_t)0); + return reftable_reader_seek_log_at(r, it, name, max); +} + +void reader_close(struct reftable_reader *r) +{ + block_source_close(&r->source); + FREE_AND_NULL(r->name); +} + +int reftable_new_reader(struct reftable_reader **p, + struct reftable_block_source *src, char const *name) +{ + struct reftable_reader *rd = + reftable_calloc(sizeof(struct reftable_reader)); + int err = init_reader(rd, src, name); + if (err == 0) { + *p = rd; + } else { + block_source_close(src); + reftable_free(rd); + } + return err; +} + +void reftable_reader_free(struct reftable_reader *r) +{ + reader_close(r); + reftable_free(r); +} + +static int reftable_reader_refs_for_indexed(struct reftable_reader *r, + struct reftable_iterator *it, + uint8_t *oid) +{ + struct reftable_obj_record want = { + .hash_prefix = oid, + .hash_prefix_len = r->object_id_len, + }; + struct reftable_record want_rec = { NULL }; + struct reftable_iterator oit = { NULL }; + struct reftable_obj_record got = { NULL }; + struct reftable_record got_rec = { NULL }; + int err = 0; + struct indexed_table_ref_iter *itr = NULL; + + /* Look through the reverse index. */ + reftable_record_from_obj(&want_rec, &want); + err = reader_seek(r, &oit, &want_rec); + if (err != 0) + goto done; + + /* read out the reftable_obj_record */ + reftable_record_from_obj(&got_rec, &got); + err = iterator_next(&oit, &got_rec); + if (err < 0) + goto done; + + if (err > 0 || + memcmp(want.hash_prefix, got.hash_prefix, r->object_id_len)) { + /* didn't find it; return empty iterator */ + iterator_set_empty(it); + err = 0; + goto done; + } + + err = new_indexed_table_ref_iter(&itr, r, oid, hash_size(r->hash_id), + got.offsets, got.offset_len); + if (err < 0) + goto done; + got.offsets = NULL; + iterator_from_indexed_table_ref_iter(it, itr); + +done: + reftable_iterator_destroy(&oit); + reftable_record_release(&got_rec); + return err; +} + +static int reftable_reader_refs_for_unindexed(struct reftable_reader *r, + struct reftable_iterator *it, + uint8_t *oid) +{ + struct table_iter ti_empty = TABLE_ITER_INIT; + struct table_iter *ti = reftable_calloc(sizeof(struct table_iter)); + struct filtering_ref_iterator *filter = NULL; + struct filtering_ref_iterator empty = FILTERING_REF_ITERATOR_INIT; + int oid_len = hash_size(r->hash_id); + int err; + + *ti = ti_empty; + err = reader_start(r, ti, BLOCK_TYPE_REF, 0); + if (err < 0) { + reftable_free(ti); + return err; + } + + filter = reftable_malloc(sizeof(struct filtering_ref_iterator)); + *filter = empty; + + strbuf_add(&filter->oid, oid, oid_len); + reftable_table_from_reader(&filter->tab, r); + filter->double_check = 0; + iterator_from_table_iter(&filter->it, ti); + + iterator_from_filtering_ref_iterator(it, filter); + return 0; +} + +int reftable_reader_refs_for(struct reftable_reader *r, + struct reftable_iterator *it, uint8_t *oid) +{ + if (r->obj_offsets.is_present) + return reftable_reader_refs_for_indexed(r, it, oid); + return reftable_reader_refs_for_unindexed(r, it, oid); +} + +uint64_t reftable_reader_max_update_index(struct reftable_reader *r) +{ + return r->max_update_index; +} + +uint64_t reftable_reader_min_update_index(struct reftable_reader *r) +{ + return r->min_update_index; +} + +/* generic table interface. */ + +static int reftable_reader_seek_void(void *tab, struct reftable_iterator *it, + struct reftable_record *rec) +{ + return reader_seek(tab, it, rec); +} + +static uint32_t reftable_reader_hash_id_void(void *tab) +{ + return reftable_reader_hash_id(tab); +} + +static uint64_t reftable_reader_min_update_index_void(void *tab) +{ + return reftable_reader_min_update_index(tab); +} + +static uint64_t reftable_reader_max_update_index_void(void *tab) +{ + return reftable_reader_max_update_index(tab); +} + +static struct reftable_table_vtable reader_vtable = { + .seek_record = reftable_reader_seek_void, + .hash_id = reftable_reader_hash_id_void, + .min_update_index = reftable_reader_min_update_index_void, + .max_update_index = reftable_reader_max_update_index_void, +}; + +void reftable_table_from_reader(struct reftable_table *tab, + struct reftable_reader *reader) +{ + assert(!tab->ops); + tab->ops = &reader_vtable; + tab->table_arg = reader; +} + + +int reftable_reader_print_file(const char *tablename) +{ + struct reftable_block_source src = { NULL }; + int err = reftable_block_source_from_file(&src, tablename); + struct reftable_reader *r = NULL; + struct reftable_table tab = { NULL }; + if (err < 0) + goto done; + + err = reftable_new_reader(&r, &src, tablename); + if (err < 0) + goto done; + + reftable_table_from_reader(&tab, r); + err = reftable_table_print(&tab); +done: + reftable_reader_free(r); + return err; +} diff --git a/reftable/reader.h b/reftable/reader.h new file mode 100644 index 00000000000..39583e5dbcd --- /dev/null +++ b/reftable/reader.h @@ -0,0 +1,66 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef READER_H +#define READER_H + +#include "block.h" +#include "record.h" +#include "reftable-iterator.h" +#include "reftable-reader.h" + +uint64_t block_source_size(struct reftable_block_source *source); + +int block_source_read_block(struct reftable_block_source *source, + struct reftable_block *dest, uint64_t off, + uint32_t size); +void block_source_close(struct reftable_block_source *source); + +/* metadata for a block type */ +struct reftable_reader_offsets { + int is_present; + uint64_t offset; + uint64_t index_offset; +}; + +/* The state for reading a reftable file. */ +struct reftable_reader { + /* for convience, associate a name with the instance. */ + char *name; + struct reftable_block_source source; + + /* Size of the file, excluding the footer. */ + uint64_t size; + + /* 'sha1' for SHA1, 's256' for SHA-256 */ + uint32_t hash_id; + + uint32_t block_size; + uint64_t min_update_index; + uint64_t max_update_index; + /* Length of the OID keys in the 'o' section */ + int object_id_len; + int version; + + struct reftable_reader_offsets ref_offsets; + struct reftable_reader_offsets obj_offsets; + struct reftable_reader_offsets log_offsets; +}; + +int init_reader(struct reftable_reader *r, struct reftable_block_source *source, + const char *name); +int reader_seek(struct reftable_reader *r, struct reftable_iterator *it, + struct reftable_record *rec); +void reader_close(struct reftable_reader *r); +const char *reader_name(struct reftable_reader *r); + +/* initialize a block reader to read from `r` */ +int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br, + uint64_t next_off, uint8_t want_typ); + +#endif diff --git a/reftable/reftable-reader.h b/reftable/reftable-reader.h new file mode 100644 index 00000000000..4a4bc2fdf85 --- /dev/null +++ b/reftable/reftable-reader.h @@ -0,0 +1,101 @@ +/* + Copyright 2020 Google LLC + + Use of this source code is governed by a BSD-style + license that can be found in the LICENSE file or at + https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_READER_H +#define REFTABLE_READER_H + +#include "reftable-iterator.h" +#include "reftable-blocksource.h" + +/* + * Reading single tables + * + * The follow routines are for reading single files. For an + * application-level interface, skip ahead to struct + * reftable_merged_table and struct reftable_stack. + */ + +/* The reader struct is a handle to an open reftable file. */ +struct reftable_reader; + +/* Generic table. */ +struct reftable_table; + +/* reftable_new_reader opens a reftable for reading. If successful, + * returns 0 code and sets pp. The name is used for creating a + * stack. Typically, it is the basename of the file. The block source + * `src` is owned by the reader, and is closed on calling + * reftable_reader_destroy(). On error, the block source `src` is + * closed as well. + */ +int reftable_new_reader(struct reftable_reader **pp, + struct reftable_block_source *src, const char *name); + +/* reftable_reader_seek_ref returns an iterator where 'name' would be inserted + in the table. To seek to the start of the table, use name = "". + + example: + + struct reftable_reader *r = NULL; + int err = reftable_new_reader(&r, &src, "filename"); + if (err < 0) { ... } + struct reftable_iterator it = {0}; + err = reftable_reader_seek_ref(r, &it, "refs/heads/master"); + if (err < 0) { ... } + struct reftable_ref_record ref = {0}; + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (err < 0) { + ..error handling.. + } + ..found.. + } + reftable_iterator_destroy(&it); + reftable_ref_record_release(&ref); +*/ +int reftable_reader_seek_ref(struct reftable_reader *r, + struct reftable_iterator *it, const char *name); + +/* returns the hash ID used in this table. */ +uint32_t reftable_reader_hash_id(struct reftable_reader *r); + +/* seek to logs for the given name, older than update_index. To seek to the + start of the table, use name = "". +*/ +int reftable_reader_seek_log_at(struct reftable_reader *r, + struct reftable_iterator *it, const char *name, + uint64_t update_index); + +/* seek to newest log entry for given name. */ +int reftable_reader_seek_log(struct reftable_reader *r, + struct reftable_iterator *it, const char *name); + +/* closes and deallocates a reader. */ +void reftable_reader_free(struct reftable_reader *); + +/* return an iterator for the refs pointing to `oid`. */ +int reftable_reader_refs_for(struct reftable_reader *r, + struct reftable_iterator *it, uint8_t *oid); + +/* return the max_update_index for a table */ +uint64_t reftable_reader_max_update_index(struct reftable_reader *r); + +/* return the min_update_index for a table */ +uint64_t reftable_reader_min_update_index(struct reftable_reader *r); + +/* creates a generic table from a file reader. */ +void reftable_table_from_reader(struct reftable_table *tab, + struct reftable_reader *reader); + +/* print table onto stdout for debugging. */ +int reftable_reader_print_file(const char *tablename); + +#endif From patchwork Mon Aug 30 14:57:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465417 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41CE4C19F35 for ; Mon, 30 Aug 2021 14:58:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2556F60E77 for ; Mon, 30 Aug 2021 14:58:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237430AbhH3O7M (ORCPT ); Mon, 30 Aug 2021 10:59:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55572 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237218AbhH3O7A (ORCPT ); Mon, 30 Aug 2021 10:59:00 -0400 Received: from mail-wm1-x32f.google.com (mail-wm1-x32f.google.com [IPv6:2a00:1450:4864:20::32f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8483CC061575 for ; Mon, 30 Aug 2021 07:58:06 -0700 (PDT) Received: by mail-wm1-x32f.google.com with SMTP id c8-20020a7bc008000000b002e6e462e95fso14940471wmb.2 for ; Mon, 30 Aug 2021 07:58:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=A7JfykMCLtQjBTFNcBnY1N6Wfn1HDuvqiYim23adbww=; b=E4I5KSmxtZf5ZXht6mVCSoD6MGFrYh32EjlwoR5f5DStAc0UAVgvssfVO6yRHuVLV5 LCnbCbMNsPuur18ttsOKSUPlPLE+xReiHP3xPKEJmURRXCabzOJLBGd6n35KejmrWq0Z jEd87w4QkbvXXQe3n9sQRSskvn2PO/zCClFIg8q18JR+imiaQtqa0UU5ASGP7mrmIQ+S UUwXllaTX4cSAInOKssuA5/yvcEWNzD9LUnz67NzMIohW+gmZlkzTSdcWR4atPKVBbbU QagGLybZz7g6WgM/2cvNQn3f/sEAF8HvuvIygJeDsswRww2fHRMuj6fViTj4aG5UFGIa p3FA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=A7JfykMCLtQjBTFNcBnY1N6Wfn1HDuvqiYim23adbww=; b=GNkrxvAvY7wNoqqLlqwnhw3nfHCgOQjEAFYgfInvCc4gC6CgisKFJ9aS7bdSeiyxZW SeqdeLtkVrS9u35PEfn7polCybM58Np6/MTUfv2wxO+SpcFW2n0mYAymalz+xmvEkIxu DT+RNCYSlDSVezo0hoV0eWtHFeInMMfMZqDQMPwmfyKXYtHq1l60uItPRYWgW4CPSzql mBKxnVtLUdZG4l62ppoihun2Fo5w6PeL37FgGRdo7JFNq0Y8jY4xUlSENlU56U96SRI2 BMFvUApYDtmkYAIpNwn4XE42TUF1xC8XKaSRkoi2F9IVy76xTLDCDR6uJnifIVdVeKBJ aVUw== X-Gm-Message-State: AOAM530kF4PyTfkJRpRcHZgoCLBnJf2vYMZUKpUO3aqr4C/SldXqe/ZJ pMhye93PvLL15XuqMeAqR69PDXVz7jw= X-Google-Smtp-Source: ABdhPJxERz0gqgltvUvllA1sfVqTnpM24zgGQ5dYw9vBjKv0Dsh9gDNp7Jq365MjUGAqSKVX0bVGSg== X-Received: by 2002:a1c:46:: with SMTP id 67mr22883942wma.29.1630335485095; Mon, 30 Aug 2021 07:58:05 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u8sm20582958wmq.45.2021.08.30.07.58.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:04 -0700 (PDT) Message-Id: <4bbefe977ae3dec1920e363a9b2d9700c54e5943.1630335476.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:49 +0000 Subject: [PATCH 13/19] reftable: reftable file level tests Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys With support for reading and writing files in place, we can construct files (in memory) and attempt to read them back. Because some sections of the format are optional (eg. indices, log entries), we have to exercise this code using multiple sizes of input data Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/readwrite_test.c | 652 ++++++++++++++++++++++++++++++++++++++ reftable/reftable-tests.h | 2 +- t/helper/test-reftable.c | 1 + 4 files changed, 655 insertions(+), 1 deletion(-) create mode 100644 reftable/readwrite_test.c diff --git a/Makefile b/Makefile index 91952268e51..4466aa99765 100644 --- a/Makefile +++ b/Makefile @@ -2473,6 +2473,7 @@ REFTABLE_OBJS += reftable/writer.o REFTABLE_TEST_OBJS += reftable/basics_test.o REFTABLE_TEST_OBJS += reftable/block_test.o REFTABLE_TEST_OBJS += reftable/record_test.o +REFTABLE_TEST_OBJS += reftable/readwrite_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/tree_test.o diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c new file mode 100644 index 00000000000..5f6bcc2f775 --- /dev/null +++ b/reftable/readwrite_test.c @@ -0,0 +1,652 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" + +#include "basics.h" +#include "block.h" +#include "blocksource.h" +#include "constants.h" +#include "reader.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-tests.h" +#include "reftable-writer.h" + +static const int update_index = 5; + +static void test_buffer(void) +{ + struct strbuf buf = STRBUF_INIT; + struct reftable_block_source source = { NULL }; + struct reftable_block out = { NULL }; + int n; + uint8_t in[] = "hello"; + strbuf_add(&buf, in, sizeof(in)); + block_source_from_strbuf(&source, &buf); + EXPECT(block_source_size(&source) == 6); + n = block_source_read_block(&source, &out, 0, sizeof(in)); + EXPECT(n == sizeof(in)); + EXPECT(!memcmp(in, out.data, n)); + reftable_block_done(&out); + + n = block_source_read_block(&source, &out, 1, 2); + EXPECT(n == 2); + EXPECT(!memcmp(out.data, "el", 2)); + + reftable_block_done(&out); + block_source_close(&source); + strbuf_release(&buf); +} + +static void write_table(char ***names, struct strbuf *buf, int N, + int block_size, uint32_t hash_id) +{ + struct reftable_write_options opts = { + .block_size = block_size, + .hash_id = hash_id, + }; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, buf, &opts); + struct reftable_ref_record ref = { NULL }; + int i = 0, n; + struct reftable_log_record log = { NULL }; + const struct reftable_stats *stats = NULL; + *names = reftable_calloc(sizeof(char *) * (N + 1)); + reftable_writer_set_limits(w, update_index, update_index); + for (i = 0; i < N; i++) { + uint8_t hash[GIT_SHA256_RAWSZ] = { 0 }; + char name[100]; + int n; + + set_test_hash(hash, i); + + snprintf(name, sizeof(name), "refs/heads/branch%02d", i); + + ref.refname = name; + ref.update_index = update_index; + ref.value_type = REFTABLE_REF_VAL1; + ref.value.val1 = hash; + (*names)[i] = xstrdup(name); + + n = reftable_writer_add_ref(w, &ref); + EXPECT(n == 0); + } + + for (i = 0; i < N; i++) { + uint8_t hash[GIT_SHA256_RAWSZ] = { 0 }; + char name[100]; + int n; + + set_test_hash(hash, i); + + snprintf(name, sizeof(name), "refs/heads/branch%02d", i); + + log.refname = name; + log.update_index = update_index; + log.value_type = REFTABLE_LOG_UPDATE; + log.value.update.new_hash = hash; + log.value.update.message = "message"; + + n = reftable_writer_add_log(w, &log); + EXPECT(n == 0); + } + + n = reftable_writer_close(w); + EXPECT(n == 0); + + stats = writer_stats(w); + for (i = 0; i < stats->ref_stats.blocks; i++) { + int off = i * opts.block_size; + if (off == 0) { + off = header_size( + (hash_id == GIT_SHA256_FORMAT_ID) ? 2 : 1); + } + EXPECT(buf->buf[off] == 'r'); + } + + EXPECT(stats->log_stats.blocks > 0); + reftable_writer_free(w); +} + +static void test_log_buffer_size(void) +{ + struct strbuf buf = STRBUF_INIT; + struct reftable_write_options opts = { + .block_size = 4096, + }; + int err; + int i; + struct reftable_log_record + log = { .refname = "refs/heads/master", + .update_index = 0xa, + .value_type = REFTABLE_LOG_UPDATE, + .value = { .update = { + .name = "Han-Wen Nienhuys", + .email = "hanwen@google.com", + .tz_offset = 100, + .time = 0x5e430672, + .message = "commit: 9\n", + } } }; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + + /* This tests buffer extension for log compression. Must use a random + hash, to ensure that the compressed part is larger than the original. + */ + uint8_t hash1[GIT_SHA1_RAWSZ], hash2[GIT_SHA1_RAWSZ]; + for (i = 0; i < GIT_SHA1_RAWSZ; i++) { + hash1[i] = (uint8_t)(rand() % 256); + hash2[i] = (uint8_t)(rand() % 256); + } + log.value.update.old_hash = hash1; + log.value.update.new_hash = hash2; + reftable_writer_set_limits(w, update_index, update_index); + err = reftable_writer_add_log(w, &log); + EXPECT_ERR(err); + err = reftable_writer_close(w); + EXPECT_ERR(err); + reftable_writer_free(w); + strbuf_release(&buf); +} + +static void test_log_write_read(void) +{ + int N = 2; + char **names = reftable_calloc(sizeof(char *) * (N + 1)); + int err; + struct reftable_write_options opts = { + .block_size = 256, + }; + struct reftable_ref_record ref = { NULL }; + int i = 0; + struct reftable_log_record log = { NULL }; + int n; + struct reftable_iterator it = { NULL }; + struct reftable_reader rd = { NULL }; + struct reftable_block_source source = { NULL }; + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + const struct reftable_stats *stats = NULL; + reftable_writer_set_limits(w, 0, N); + for (i = 0; i < N; i++) { + char name[256]; + struct reftable_ref_record ref = { NULL }; + snprintf(name, sizeof(name), "b%02d%0*d", i, 130, 7); + names[i] = xstrdup(name); + ref.refname = name; + ref.update_index = i; + + err = reftable_writer_add_ref(w, &ref); + EXPECT_ERR(err); + } + for (i = 0; i < N; i++) { + uint8_t hash1[GIT_SHA1_RAWSZ], hash2[GIT_SHA1_RAWSZ]; + struct reftable_log_record log = { NULL }; + set_test_hash(hash1, i); + set_test_hash(hash2, i + 1); + + log.refname = names[i]; + log.update_index = i; + log.value_type = REFTABLE_LOG_UPDATE; + log.value.update.old_hash = hash1; + log.value.update.new_hash = hash2; + + err = reftable_writer_add_log(w, &log); + EXPECT_ERR(err); + } + + n = reftable_writer_close(w); + EXPECT(n == 0); + + stats = writer_stats(w); + EXPECT(stats->log_stats.blocks > 0); + reftable_writer_free(w); + w = NULL; + + block_source_from_strbuf(&source, &buf); + + err = init_reader(&rd, &source, "file.log"); + EXPECT_ERR(err); + + err = reftable_reader_seek_ref(&rd, &it, names[N - 1]); + EXPECT_ERR(err); + + err = reftable_iterator_next_ref(&it, &ref); + EXPECT_ERR(err); + + /* end of iteration. */ + err = reftable_iterator_next_ref(&it, &ref); + EXPECT(0 < err); + + reftable_iterator_destroy(&it); + reftable_ref_record_release(&ref); + + err = reftable_reader_seek_log(&rd, &it, ""); + EXPECT_ERR(err); + + i = 0; + while (1) { + int err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + break; + } + + EXPECT_ERR(err); + EXPECT_STREQ(names[i], log.refname); + EXPECT(i == log.update_index); + i++; + reftable_log_record_release(&log); + } + + EXPECT(i == N); + reftable_iterator_destroy(&it); + + /* cleanup. */ + strbuf_release(&buf); + free_names(names); + reader_close(&rd); +} + +static void test_table_read_write_sequential(void) +{ + char **names; + struct strbuf buf = STRBUF_INIT; + int N = 50; + struct reftable_iterator it = { NULL }; + struct reftable_block_source source = { NULL }; + struct reftable_reader rd = { NULL }; + int err = 0; + int j = 0; + + write_table(&names, &buf, N, 256, GIT_SHA1_FORMAT_ID); + + block_source_from_strbuf(&source, &buf); + + err = init_reader(&rd, &source, "file.ref"); + EXPECT_ERR(err); + + err = reftable_reader_seek_ref(&rd, &it, ""); + EXPECT_ERR(err); + + while (1) { + struct reftable_ref_record ref = { NULL }; + int r = reftable_iterator_next_ref(&it, &ref); + EXPECT(r >= 0); + if (r > 0) { + break; + } + EXPECT(0 == strcmp(names[j], ref.refname)); + EXPECT(update_index == ref.update_index); + + j++; + reftable_ref_record_release(&ref); + } + EXPECT(j == N); + reftable_iterator_destroy(&it); + strbuf_release(&buf); + free_names(names); + + reader_close(&rd); +} + +static void test_table_write_small_table(void) +{ + char **names; + struct strbuf buf = STRBUF_INIT; + int N = 1; + write_table(&names, &buf, N, 4096, GIT_SHA1_FORMAT_ID); + EXPECT(buf.len < 200); + strbuf_release(&buf); + free_names(names); +} + +static void test_table_read_api(void) +{ + char **names; + struct strbuf buf = STRBUF_INIT; + int N = 50; + struct reftable_reader rd = { NULL }; + struct reftable_block_source source = { NULL }; + int err; + int i; + struct reftable_log_record log = { NULL }; + struct reftable_iterator it = { NULL }; + + write_table(&names, &buf, N, 256, GIT_SHA1_FORMAT_ID); + + block_source_from_strbuf(&source, &buf); + + err = init_reader(&rd, &source, "file.ref"); + EXPECT_ERR(err); + + err = reftable_reader_seek_ref(&rd, &it, names[0]); + EXPECT_ERR(err); + + err = reftable_iterator_next_log(&it, &log); + EXPECT(err == REFTABLE_API_ERROR); + + strbuf_release(&buf); + for (i = 0; i < N; i++) { + reftable_free(names[i]); + } + reftable_iterator_destroy(&it); + reftable_free(names); + reader_close(&rd); + strbuf_release(&buf); +} + +static void test_table_read_write_seek(int index, int hash_id) +{ + char **names; + struct strbuf buf = STRBUF_INIT; + int N = 50; + struct reftable_reader rd = { NULL }; + struct reftable_block_source source = { NULL }; + int err; + int i = 0; + + struct reftable_iterator it = { NULL }; + struct strbuf pastLast = STRBUF_INIT; + struct reftable_ref_record ref = { NULL }; + + write_table(&names, &buf, N, 256, hash_id); + + block_source_from_strbuf(&source, &buf); + + err = init_reader(&rd, &source, "file.ref"); + EXPECT_ERR(err); + EXPECT(hash_id == reftable_reader_hash_id(&rd)); + + if (!index) { + rd.ref_offsets.index_offset = 0; + } else { + EXPECT(rd.ref_offsets.index_offset > 0); + } + + for (i = 1; i < N; i++) { + int err = reftable_reader_seek_ref(&rd, &it, names[i]); + EXPECT_ERR(err); + err = reftable_iterator_next_ref(&it, &ref); + EXPECT_ERR(err); + EXPECT(0 == strcmp(names[i], ref.refname)); + EXPECT(REFTABLE_REF_VAL1 == ref.value_type); + EXPECT(i == ref.value.val1[0]); + + reftable_ref_record_release(&ref); + reftable_iterator_destroy(&it); + } + + strbuf_addstr(&pastLast, names[N - 1]); + strbuf_addstr(&pastLast, "/"); + + err = reftable_reader_seek_ref(&rd, &it, pastLast.buf); + if (err == 0) { + struct reftable_ref_record ref = { NULL }; + int err = reftable_iterator_next_ref(&it, &ref); + EXPECT(err > 0); + } else { + EXPECT(err > 0); + } + + strbuf_release(&pastLast); + reftable_iterator_destroy(&it); + + strbuf_release(&buf); + for (i = 0; i < N; i++) { + reftable_free(names[i]); + } + reftable_free(names); + reader_close(&rd); +} + +static void test_table_read_write_seek_linear(void) +{ + test_table_read_write_seek(0, GIT_SHA1_FORMAT_ID); +} + +static void test_table_read_write_seek_linear_sha256(void) +{ + test_table_read_write_seek(0, GIT_SHA256_FORMAT_ID); +} + +static void test_table_read_write_seek_index(void) +{ + test_table_read_write_seek(1, GIT_SHA1_FORMAT_ID); +} + +static void test_table_refs_for(int indexed) +{ + int N = 50; + char **want_names = reftable_calloc(sizeof(char *) * (N + 1)); + int want_names_len = 0; + uint8_t want_hash[GIT_SHA1_RAWSZ]; + + struct reftable_write_options opts = { + .block_size = 256, + }; + struct reftable_ref_record ref = { NULL }; + int i = 0; + int n; + int err; + struct reftable_reader rd; + struct reftable_block_source source = { NULL }; + + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + + struct reftable_iterator it = { NULL }; + int j; + + set_test_hash(want_hash, 4); + + for (i = 0; i < N; i++) { + uint8_t hash[GIT_SHA1_RAWSZ]; + char fill[51] = { 0 }; + char name[100]; + uint8_t hash1[GIT_SHA1_RAWSZ]; + uint8_t hash2[GIT_SHA1_RAWSZ]; + struct reftable_ref_record ref = { NULL }; + + memset(hash, i, sizeof(hash)); + memset(fill, 'x', 50); + /* Put the variable part in the start */ + snprintf(name, sizeof(name), "br%02d%s", i, fill); + name[40] = 0; + ref.refname = name; + + set_test_hash(hash1, i / 4); + set_test_hash(hash2, 3 + i / 4); + ref.value_type = REFTABLE_REF_VAL2; + ref.value.val2.value = hash1; + ref.value.val2.target_value = hash2; + + /* 80 bytes / entry, so 3 entries per block. Yields 17 + */ + /* blocks. */ + n = reftable_writer_add_ref(w, &ref); + EXPECT(n == 0); + + if (!memcmp(hash1, want_hash, GIT_SHA1_RAWSZ) || + !memcmp(hash2, want_hash, GIT_SHA1_RAWSZ)) { + want_names[want_names_len++] = xstrdup(name); + } + } + + n = reftable_writer_close(w); + EXPECT(n == 0); + + reftable_writer_free(w); + w = NULL; + + block_source_from_strbuf(&source, &buf); + + err = init_reader(&rd, &source, "file.ref"); + EXPECT_ERR(err); + if (!indexed) { + rd.obj_offsets.is_present = 0; + } + + err = reftable_reader_seek_ref(&rd, &it, ""); + EXPECT_ERR(err); + reftable_iterator_destroy(&it); + + err = reftable_reader_refs_for(&rd, &it, want_hash); + EXPECT_ERR(err); + + j = 0; + while (1) { + int err = reftable_iterator_next_ref(&it, &ref); + EXPECT(err >= 0); + if (err > 0) { + break; + } + + EXPECT(j < want_names_len); + EXPECT(0 == strcmp(ref.refname, want_names[j])); + j++; + reftable_ref_record_release(&ref); + } + EXPECT(j == want_names_len); + + strbuf_release(&buf); + free_names(want_names); + reftable_iterator_destroy(&it); + reader_close(&rd); +} + +static void test_table_refs_for_no_index(void) +{ + test_table_refs_for(0); +} + +static void test_table_refs_for_obj_index(void) +{ + test_table_refs_for(1); +} + +static void test_write_empty_table(void) +{ + struct reftable_write_options opts = { 0 }; + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + struct reftable_block_source source = { NULL }; + struct reftable_reader *rd = NULL; + struct reftable_ref_record rec = { NULL }; + struct reftable_iterator it = { NULL }; + int err; + + reftable_writer_set_limits(w, 1, 1); + + err = reftable_writer_close(w); + EXPECT(err == REFTABLE_EMPTY_TABLE_ERROR); + reftable_writer_free(w); + + EXPECT(buf.len == header_size(1) + footer_size(1)); + + block_source_from_strbuf(&source, &buf); + + err = reftable_new_reader(&rd, &source, "filename"); + EXPECT_ERR(err); + + err = reftable_reader_seek_ref(rd, &it, ""); + EXPECT_ERR(err); + + err = reftable_iterator_next_ref(&it, &rec); + EXPECT(err > 0); + + reftable_iterator_destroy(&it); + reftable_reader_free(rd); + strbuf_release(&buf); +} + +static void test_write_key_order(void) +{ + struct reftable_write_options opts = { 0 }; + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + struct reftable_ref_record refs[2] = { + { + .refname = "b", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value = { + .symref = "target", + }, + }, { + .refname = "a", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value = { + .symref = "target", + }, + } + }; + int err; + + reftable_writer_set_limits(w, 1, 1); + err = reftable_writer_add_ref(w, &refs[0]); + EXPECT_ERR(err); + err = reftable_writer_add_ref(w, &refs[1]); + printf("%d\n", err); + EXPECT(err == REFTABLE_API_ERROR); + reftable_writer_close(w); + reftable_writer_free(w); + strbuf_release(&buf); +} + +static void test_corrupt_table_empty(void) +{ + struct strbuf buf = STRBUF_INIT; + struct reftable_block_source source = { NULL }; + struct reftable_reader rd = { NULL }; + int err; + + block_source_from_strbuf(&source, &buf); + err = init_reader(&rd, &source, "file.log"); + EXPECT(err == REFTABLE_FORMAT_ERROR); +} + +static void test_corrupt_table(void) +{ + uint8_t zeros[1024] = { 0 }; + struct strbuf buf = STRBUF_INIT; + struct reftable_block_source source = { NULL }; + struct reftable_reader rd = { NULL }; + int err; + strbuf_add(&buf, zeros, sizeof(zeros)); + + block_source_from_strbuf(&source, &buf); + err = init_reader(&rd, &source, "file.log"); + EXPECT(err == REFTABLE_FORMAT_ERROR); + strbuf_release(&buf); +} + +int readwrite_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_corrupt_table); + RUN_TEST(test_corrupt_table_empty); + RUN_TEST(test_log_write_read); + RUN_TEST(test_write_key_order); + RUN_TEST(test_table_read_write_seek_linear_sha256); + RUN_TEST(test_log_buffer_size); + RUN_TEST(test_table_write_small_table); + RUN_TEST(test_buffer); + RUN_TEST(test_table_read_api); + RUN_TEST(test_table_read_write_sequential); + RUN_TEST(test_table_read_write_seek_linear); + RUN_TEST(test_table_read_write_seek_index); + RUN_TEST(test_table_refs_for_no_index); + RUN_TEST(test_table_refs_for_obj_index); + RUN_TEST(test_write_empty_table); + return 0; +} diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h index 5e7698ae654..3d541fa5c0c 100644 --- a/reftable/reftable-tests.h +++ b/reftable/reftable-tests.h @@ -14,7 +14,7 @@ int block_test_main(int argc, const char **argv); int merged_test_main(int argc, const char **argv); int record_test_main(int argc, const char **argv); int refname_test_main(int argc, const char **argv); -int reftable_test_main(int argc, const char **argv); +int readwrite_test_main(int argc, const char **argv); int stack_test_main(int argc, const char **argv); int tree_test_main(int argc, const char **argv); int reftable_dump_main(int argc, char *const *argv); diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 050551fa698..898aba836fd 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -6,6 +6,7 @@ int cmd__reftable(int argc, const char **argv) basics_test_main(argc, argv); block_test_main(argc, argv); record_test_main(argc, argv); + readwrite_test_main(argc, argv); tree_test_main(argc, argv); return 0; } From patchwork Mon Aug 30 14:57:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465413 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BDC94C19F3A for ; Mon, 30 Aug 2021 14:58:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A74AB60E77 for ; Mon, 30 Aug 2021 14:58:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237162AbhH3O7O (ORCPT ); Mon, 30 Aug 2021 10:59:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55574 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237322AbhH3O7A (ORCPT ); Mon, 30 Aug 2021 10:59:00 -0400 Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 060EFC061760 for ; Mon, 30 Aug 2021 07:58:07 -0700 (PDT) Received: by mail-wr1-x42e.google.com with SMTP id u16so22895185wrn.5 for ; Mon, 30 Aug 2021 07:58:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=FrLum25fQFHyxBIkYr1hJSR3vApp6bfuuFlsgLZ7JT4=; b=qONlfZ5m6fWdPfag0HHSX8ri3Eatye/PKZDvTg4hWXsVAn/LKtkIgRVT+ti/SRXkTD gvXPC2NW3bWxvVluVGUHdxx5K+cFhneJ0Xl2QgfDbqyN2xz5de8U9+5A+IwN/tQqRHVv AT4M6B5bdsFB4rByjPjWGmy/OPVLRPI2ZJVajwBTrTOGkLnhZmDEWdBaftWtUynBxnEH MaLG7fgIoXJLDRKGjPf4CgFsmGjWKOHNG9Z/BuQAIccb2827b4tlowhNTBhfWb/izmY8 4q2L7HSlmwuxvF/uIz7pZTdMjAytnVgvnaQDeuq5lGdvunnRRUYfZhrvf6/GGs2E4Yd1 XMyQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=FrLum25fQFHyxBIkYr1hJSR3vApp6bfuuFlsgLZ7JT4=; b=bE0LINaFrGWLjfENe4augNq5ZTHTe7ouMajVaLClWhrS2YtdwSz570IN2FKJ1efmPk 5qo6+NbLj4em3mBaTLIQjDPA+HF9vhnMkzNot+FYzXuXLu3rJGACVnCp0Pxiz5EoI6hq 65B0c/7S8jq+MN8a1USuA9ajdMyMTv7lJwWIyzidlLOUfFDZY6PHcVZURGtgxXGeTOcS 8rMxYG+3Tixn2UCDQ/J4KPNS/5UeRIaGN/CCtu/YA4im0yAGUBtO715P8Or0ctozz56P wBWV/AxWP7S+uqhqvcjaoaWI+CgRcsLuVm3G9Yesowst83ibZ76xvzgZYlKqK/w0Z1P9 mEfw== X-Gm-Message-State: AOAM533oSERPuT2CCxxFN/tJw+awDywHoNxAFodsmRNQS4oayOw7gQue +gfi7fL+DzEsLMcNDVdv5Rgfpf/OEJs= X-Google-Smtp-Source: ABdhPJwX48WdWaW8znlXIt8iIGrO6JzbrOYgaHRKQLQiMGp7xuug6k+JYrZsnw48AHVPpFm73b5sYA== X-Received: by 2002:adf:ebc8:: with SMTP id v8mr26056322wrn.153.1630335485672; Mon, 30 Aug 2021 07:58:05 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u27sm16264927wru.2.2021.08.30.07.58.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:05 -0700 (PDT) Message-Id: <7ee46496081bfeef4c953e8839539616f50fecbb.1630335476.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:50 +0000 Subject: [PATCH 14/19] reftable: add a heap-based priority queue for reftable records Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This is needed to create a merged view multiple reftables Signed-off-by: Han-Wen Nienhuys --- Makefile | 2 + reftable/pq.c | 115 ++++++++++++++++++++++++++++++++++++++ reftable/pq.h | 32 +++++++++++ reftable/pq_test.c | 72 ++++++++++++++++++++++++ reftable/reftable-tests.h | 1 + t/helper/test-reftable.c | 1 + 6 files changed, 223 insertions(+) create mode 100644 reftable/pq.c create mode 100644 reftable/pq.h create mode 100644 reftable/pq_test.c diff --git a/Makefile b/Makefile index 4466aa99765..e80f253ecc3 100644 --- a/Makefile +++ b/Makefile @@ -2462,6 +2462,7 @@ REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/iter.o REFTABLE_OBJS += reftable/publicbasics.o +REFTABLE_OBJS += reftable/pq.o REFTABLE_OBJS += reftable/reader.o REFTABLE_OBJS += reftable/record.o REFTABLE_OBJS += reftable/refname.o @@ -2472,6 +2473,7 @@ REFTABLE_OBJS += reftable/writer.o REFTABLE_TEST_OBJS += reftable/basics_test.o REFTABLE_TEST_OBJS += reftable/block_test.o +REFTABLE_TEST_OBJS += reftable/pq_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/readwrite_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o diff --git a/reftable/pq.c b/reftable/pq.c new file mode 100644 index 00000000000..8918d158e2d --- /dev/null +++ b/reftable/pq.c @@ -0,0 +1,115 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "pq.h" + +#include "reftable-record.h" +#include "system.h" +#include "basics.h" + +static int pq_less(struct pq_entry a, struct pq_entry b) +{ + struct strbuf ak = STRBUF_INIT; + struct strbuf bk = STRBUF_INIT; + int cmp = 0; + reftable_record_key(&a.rec, &ak); + reftable_record_key(&b.rec, &bk); + + cmp = strbuf_cmp(&ak, &bk); + + strbuf_release(&ak); + strbuf_release(&bk); + + if (cmp == 0) + return a.index > b.index; + + return cmp < 0; +} + +struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq) +{ + return pq.heap[0]; +} + +int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq) +{ + return pq.len == 0; +} + +void merged_iter_pqueue_check(struct merged_iter_pqueue pq) +{ + int i = 0; + for (i = 1; i < pq.len; i++) { + int parent = (i - 1) / 2; + + assert(pq_less(pq.heap[parent], pq.heap[i])); + } +} + +struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq) +{ + int i = 0; + struct pq_entry e = pq->heap[0]; + pq->heap[0] = pq->heap[pq->len - 1]; + pq->len--; + + i = 0; + while (i < pq->len) { + int min = i; + int j = 2 * i + 1; + int k = 2 * i + 2; + if (j < pq->len && pq_less(pq->heap[j], pq->heap[i])) { + min = j; + } + if (k < pq->len && pq_less(pq->heap[k], pq->heap[min])) { + min = k; + } + + if (min == i) { + break; + } + + SWAP(pq->heap[i], pq->heap[min]); + i = min; + } + + return e; +} + +void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e) +{ + int i = 0; + if (pq->len == pq->cap) { + pq->cap = 2 * pq->cap + 1; + pq->heap = reftable_realloc(pq->heap, + pq->cap * sizeof(struct pq_entry)); + } + + pq->heap[pq->len++] = e; + i = pq->len - 1; + while (i > 0) { + int j = (i - 1) / 2; + if (pq_less(pq->heap[j], pq->heap[i])) { + break; + } + + SWAP(pq->heap[j], pq->heap[i]); + + i = j; + } +} + +void merged_iter_pqueue_release(struct merged_iter_pqueue *pq) +{ + int i = 0; + for (i = 0; i < pq->len; i++) { + reftable_record_destroy(&pq->heap[i].rec); + } + FREE_AND_NULL(pq->heap); + pq->len = pq->cap = 0; +} diff --git a/reftable/pq.h b/reftable/pq.h new file mode 100644 index 00000000000..385d2fb139a --- /dev/null +++ b/reftable/pq.h @@ -0,0 +1,32 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef PQ_H +#define PQ_H + +#include "record.h" + +struct pq_entry { + int index; + struct reftable_record rec; +}; + +struct merged_iter_pqueue { + struct pq_entry *heap; + size_t len; + size_t cap; +}; + +struct pq_entry merged_iter_pqueue_top(struct merged_iter_pqueue pq); +int merged_iter_pqueue_is_empty(struct merged_iter_pqueue pq); +void merged_iter_pqueue_check(struct merged_iter_pqueue pq); +struct pq_entry merged_iter_pqueue_remove(struct merged_iter_pqueue *pq); +void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, struct pq_entry e); +void merged_iter_pqueue_release(struct merged_iter_pqueue *pq); + +#endif diff --git a/reftable/pq_test.c b/reftable/pq_test.c new file mode 100644 index 00000000000..ad21673e854 --- /dev/null +++ b/reftable/pq_test.c @@ -0,0 +1,72 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" + +#include "basics.h" +#include "constants.h" +#include "pq.h" +#include "record.h" +#include "reftable-tests.h" +#include "test_framework.h" + +static void test_pq(void) +{ + char *names[54] = { NULL }; + int N = ARRAY_SIZE(names) - 1; + + struct merged_iter_pqueue pq = { NULL }; + const char *last = NULL; + + int i = 0; + for (i = 0; i < N; i++) { + char name[100]; + snprintf(name, sizeof(name), "%02d", i); + names[i] = xstrdup(name); + } + + i = 1; + do { + struct reftable_record rec = + reftable_new_record(BLOCK_TYPE_REF); + struct pq_entry e = { 0 }; + + reftable_record_as_ref(&rec)->refname = names[i]; + e.rec = rec; + merged_iter_pqueue_add(&pq, e); + merged_iter_pqueue_check(pq); + i = (i * 7) % N; + } while (i != 1); + + while (!merged_iter_pqueue_is_empty(pq)) { + struct pq_entry e = merged_iter_pqueue_remove(&pq); + struct reftable_ref_record *ref = + reftable_record_as_ref(&e.rec); + + merged_iter_pqueue_check(pq); + + if (last) { + assert(strcmp(last, ref->refname) < 0); + } + last = ref->refname; + ref->refname = NULL; + reftable_free(ref); + } + + for (i = 0; i < N; i++) { + reftable_free(names[i]); + } + + merged_iter_pqueue_release(&pq); +} + +int pq_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_pq); + return 0; +} diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h index 3d541fa5c0c..0019cbcfa49 100644 --- a/reftable/reftable-tests.h +++ b/reftable/reftable-tests.h @@ -12,6 +12,7 @@ https://developers.google.com/open-source/licenses/bsd int basics_test_main(int argc, const char **argv); int block_test_main(int argc, const char **argv); int merged_test_main(int argc, const char **argv); +int pq_test_main(int argc, const char **argv); int record_test_main(int argc, const char **argv); int refname_test_main(int argc, const char **argv); int readwrite_test_main(int argc, const char **argv); diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 898aba836fd..0b5a1701df1 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -5,6 +5,7 @@ int cmd__reftable(int argc, const char **argv) { basics_test_main(argc, argv); block_test_main(argc, argv); + pq_test_main(argc, argv); record_test_main(argc, argv); readwrite_test_main(argc, argv); tree_test_main(argc, argv); From patchwork Mon Aug 30 14:57:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465421 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A663C19F38 for ; Mon, 30 Aug 2021 14:58:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 402A260F23 for ; Mon, 30 Aug 2021 14:58:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237310AbhH3O7P (ORCPT ); Mon, 30 Aug 2021 10:59:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55566 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237331AbhH3O7B (ORCPT ); Mon, 30 Aug 2021 10:59:01 -0400 Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8AFAC061796 for ; Mon, 30 Aug 2021 07:58:07 -0700 (PDT) Received: by mail-wm1-x32c.google.com with SMTP id g138so9016251wmg.4 for ; Mon, 30 Aug 2021 07:58:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=qw7JB7KHxGuY/IOom2tDieDPgwFdfD2xXzEr61GRUmQ=; b=cV97sk5rTWfvHE6IFESHepuDmtX1TThtH4oJua9PGVkrp68kSgq7X4/MtjDK9dtpvW LARo0BaSZBJEdOJYYNXxYx5wzhFyvrZ+Hmvq41pcZrXT9jaJ8YpGR2YE7hIW17jQ/27b Qnm1//TDG/Rv72JvCgG4JKBHw5f/lNkjN0Gmgh5KxgLEqsalgu/5h+ti8R/iAoymqbuW 9utlqRxDb/06QfFWluzZZ0p+eeSVeBOXE+81mWf8JrerlRjJXBVjkfpk7dR+6j4R1vjX TQwLJoss911GHVeLmPWu2mg9vLhoo5IrFqjysQSwSVjeKOIMgkt+ElxUwI+AUaKNTCHN n1UA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=qw7JB7KHxGuY/IOom2tDieDPgwFdfD2xXzEr61GRUmQ=; b=dEvPMUellfI4DBfHRiIFWEWP8WBld+LX5rR3/fzSwgEIoYWEP9JTeKbN9ZjMXd109l TqIuwKF+U3vEHf94nqaKg6VuPa/jn+b23rgrwSahDKAQwRN3OmWVG3tQ8mQLd7s2qP94 EYCDKbwE2/vUuHd3fjR87n9orBKbus7ERUwUJjjQTndlJbFQ9ylIJtNUwxKzkRnXDgk8 2SyDfLTLhj6b9clQuw6CkOHBUGi1lqznRKrsAyNoTqMs5vKo8+Pi/1UponCSTrIwIadt 1+Tt5eY2JBTft3xFfhNYRsJFDUvHjXtDJfzIn2TZpNnX/lDaKDMoOzrqIV1awnAtFKpV Wi6Q== X-Gm-Message-State: AOAM5325dmx27xQnIrHB1Rt+CmZMWNLZKnq5hdQQ6oOIqbU43eLAYdSd lkude7+ZfCiP0D8gt1UiiB8rjhTX2Fg= X-Google-Smtp-Source: ABdhPJzSrqfq5TSet2Rk7TucF2NM5XeyRVtZWw8f9wA2YliP6kIMR7ojVDXQgxJTvIq/9jwvzu03Yw== X-Received: by 2002:a1c:21c3:: with SMTP id h186mr22785852wmh.186.1630335486209; Mon, 30 Aug 2021 07:58:06 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id w1sm19107135wmc.19.2021.08.30.07.58.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:05 -0700 (PDT) Message-Id: <8bc889aaae87d4702713219baa0cb03c2cf10b84.1630335476.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:51 +0000 Subject: [PATCH 15/19] reftable: add merged table view Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This adds an abstract, read-only interface to the ref database. This primitive is used to construct the read view of the ref database (the read view is constructed by merging several *.ref files). It also provides the mechanism to provide a unified view of the refs in the main repository and the per-worktree refs. Signed-off-by: Han-Wen Nienhuys --- Makefile | 2 + reftable/merged.c | 362 +++++++++++++++++++++++++++++++++++++ reftable/merged.h | 35 ++++ reftable/merged_test.c | 292 ++++++++++++++++++++++++++++++ reftable/reftable-merged.h | 72 ++++++++ t/helper/test-reftable.c | 1 + 6 files changed, 764 insertions(+) create mode 100644 reftable/merged.c create mode 100644 reftable/merged.h create mode 100644 reftable/merged_test.c create mode 100644 reftable/reftable-merged.h diff --git a/Makefile b/Makefile index e80f253ecc3..6c8545b9eb7 100644 --- a/Makefile +++ b/Makefile @@ -2462,6 +2462,7 @@ REFTABLE_OBJS += reftable/block.o REFTABLE_OBJS += reftable/blocksource.o REFTABLE_OBJS += reftable/iter.o REFTABLE_OBJS += reftable/publicbasics.o +REFTABLE_OBJS += reftable/merged.o REFTABLE_OBJS += reftable/pq.o REFTABLE_OBJS += reftable/reader.o REFTABLE_OBJS += reftable/record.o @@ -2473,6 +2474,7 @@ REFTABLE_OBJS += reftable/writer.o REFTABLE_TEST_OBJS += reftable/basics_test.o REFTABLE_TEST_OBJS += reftable/block_test.o +REFTABLE_TEST_OBJS += reftable/merged_test.o REFTABLE_TEST_OBJS += reftable/pq_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/readwrite_test.o diff --git a/reftable/merged.c b/reftable/merged.c new file mode 100644 index 00000000000..e5b53da6db3 --- /dev/null +++ b/reftable/merged.c @@ -0,0 +1,362 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "merged.h" + +#include "constants.h" +#include "iter.h" +#include "pq.h" +#include "reader.h" +#include "record.h" +#include "generic.h" +#include "reftable-merged.h" +#include "reftable-error.h" +#include "system.h" + +static int merged_iter_init(struct merged_iter *mi) +{ + int i = 0; + for (i = 0; i < mi->stack_len; i++) { + struct reftable_record rec = reftable_new_record(mi->typ); + int err = iterator_next(&mi->stack[i], &rec); + if (err < 0) { + return err; + } + + if (err > 0) { + reftable_iterator_destroy(&mi->stack[i]); + reftable_record_destroy(&rec); + } else { + struct pq_entry e = { + .rec = rec, + .index = i, + }; + merged_iter_pqueue_add(&mi->pq, e); + } + } + + return 0; +} + +static void merged_iter_close(void *p) +{ + struct merged_iter *mi = p; + int i = 0; + merged_iter_pqueue_release(&mi->pq); + for (i = 0; i < mi->stack_len; i++) { + reftable_iterator_destroy(&mi->stack[i]); + } + reftable_free(mi->stack); +} + +static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi, + size_t idx) +{ + struct reftable_record rec = reftable_new_record(mi->typ); + struct pq_entry e = { + .rec = rec, + .index = idx, + }; + int err = iterator_next(&mi->stack[idx], &rec); + if (err < 0) + return err; + + if (err > 0) { + reftable_iterator_destroy(&mi->stack[idx]); + reftable_record_destroy(&rec); + return 0; + } + + merged_iter_pqueue_add(&mi->pq, e); + return 0; +} + +static int merged_iter_advance_subiter(struct merged_iter *mi, size_t idx) +{ + if (iterator_is_null(&mi->stack[idx])) + return 0; + return merged_iter_advance_nonnull_subiter(mi, idx); +} + +static int merged_iter_next_entry(struct merged_iter *mi, + struct reftable_record *rec) +{ + struct strbuf entry_key = STRBUF_INIT; + struct pq_entry entry = { 0 }; + int err = 0; + + if (merged_iter_pqueue_is_empty(mi->pq)) + return 1; + + entry = merged_iter_pqueue_remove(&mi->pq); + err = merged_iter_advance_subiter(mi, entry.index); + if (err < 0) + return err; + + /* + One can also use reftable as datacenter-local storage, where the ref + database is maintained in globally consistent database (eg. + CockroachDB or Spanner). In this scenario, replication delays together + with compaction may cause newer tables to contain older entries. In + such a deployment, the loop below must be changed to collect all + entries for the same key, and return new the newest one. + */ + reftable_record_key(&entry.rec, &entry_key); + while (!merged_iter_pqueue_is_empty(mi->pq)) { + struct pq_entry top = merged_iter_pqueue_top(mi->pq); + struct strbuf k = STRBUF_INIT; + int err = 0, cmp = 0; + + reftable_record_key(&top.rec, &k); + + cmp = strbuf_cmp(&k, &entry_key); + strbuf_release(&k); + + if (cmp > 0) { + break; + } + + merged_iter_pqueue_remove(&mi->pq); + err = merged_iter_advance_subiter(mi, top.index); + if (err < 0) { + return err; + } + reftable_record_destroy(&top.rec); + } + + reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id)); + reftable_record_destroy(&entry.rec); + strbuf_release(&entry_key); + return 0; +} + +static int merged_iter_next(struct merged_iter *mi, struct reftable_record *rec) +{ + while (1) { + int err = merged_iter_next_entry(mi, rec); + if (err == 0 && mi->suppress_deletions && + reftable_record_is_deletion(rec)) { + continue; + } + + return err; + } +} + +static int merged_iter_next_void(void *p, struct reftable_record *rec) +{ + struct merged_iter *mi = p; + if (merged_iter_pqueue_is_empty(mi->pq)) + return 1; + + return merged_iter_next(mi, rec); +} + +static struct reftable_iterator_vtable merged_iter_vtable = { + .next = &merged_iter_next_void, + .close = &merged_iter_close, +}; + +static void iterator_from_merged_iter(struct reftable_iterator *it, + struct merged_iter *mi) +{ + assert(!it->ops); + it->iter_arg = mi; + it->ops = &merged_iter_vtable; +} + +int reftable_new_merged_table(struct reftable_merged_table **dest, + struct reftable_table *stack, int n, + uint32_t hash_id) +{ + struct reftable_merged_table *m = NULL; + uint64_t last_max = 0; + uint64_t first_min = 0; + int i = 0; + for (i = 0; i < n; i++) { + uint64_t min = reftable_table_min_update_index(&stack[i]); + uint64_t max = reftable_table_max_update_index(&stack[i]); + + if (reftable_table_hash_id(&stack[i]) != hash_id) { + return REFTABLE_FORMAT_ERROR; + } + if (i == 0 || min < first_min) { + first_min = min; + } + if (i == 0 || max > last_max) { + last_max = max; + } + } + + m = reftable_calloc(sizeof(struct reftable_merged_table)); + m->stack = stack; + m->stack_len = n; + m->min = first_min; + m->max = last_max; + m->hash_id = hash_id; + *dest = m; + return 0; +} + +/* clears the list of subtable, without affecting the readers themselves. */ +void merged_table_release(struct reftable_merged_table *mt) +{ + FREE_AND_NULL(mt->stack); + mt->stack_len = 0; +} + +void reftable_merged_table_free(struct reftable_merged_table *mt) +{ + if (!mt) { + return; + } + merged_table_release(mt); + reftable_free(mt); +} + +uint64_t +reftable_merged_table_max_update_index(struct reftable_merged_table *mt) +{ + return mt->max; +} + +uint64_t +reftable_merged_table_min_update_index(struct reftable_merged_table *mt) +{ + return mt->min; +} + +static int reftable_table_seek_record(struct reftable_table *tab, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + return tab->ops->seek_record(tab->table_arg, it, rec); +} + +static int merged_table_seek_record(struct reftable_merged_table *mt, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + struct reftable_iterator *iters = reftable_calloc( + sizeof(struct reftable_iterator) * mt->stack_len); + struct merged_iter merged = { + .stack = iters, + .typ = reftable_record_type(rec), + .hash_id = mt->hash_id, + .suppress_deletions = mt->suppress_deletions, + }; + int n = 0; + int err = 0; + int i = 0; + for (i = 0; i < mt->stack_len && err == 0; i++) { + int e = reftable_table_seek_record(&mt->stack[i], &iters[n], + rec); + if (e < 0) { + err = e; + } + if (e == 0) { + n++; + } + } + if (err < 0) { + int i = 0; + for (i = 0; i < n; i++) { + reftable_iterator_destroy(&iters[i]); + } + reftable_free(iters); + return err; + } + + merged.stack_len = n; + err = merged_iter_init(&merged); + if (err < 0) { + merged_iter_close(&merged); + return err; + } else { + struct merged_iter *p = + reftable_malloc(sizeof(struct merged_iter)); + *p = merged; + iterator_from_merged_iter(it, p); + } + return 0; +} + +int reftable_merged_table_seek_ref(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name) +{ + struct reftable_ref_record ref = { + .refname = (char *)name, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_ref(&rec, &ref); + return merged_table_seek_record(mt, it, &rec); +} + +int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name, uint64_t update_index) +{ + struct reftable_log_record log = { + .refname = (char *)name, + .update_index = update_index, + }; + struct reftable_record rec = { NULL }; + reftable_record_from_log(&rec, &log); + return merged_table_seek_record(mt, it, &rec); +} + +int reftable_merged_table_seek_log(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name) +{ + uint64_t max = ~((uint64_t)0); + return reftable_merged_table_seek_log_at(mt, it, name, max); +} + +uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *mt) +{ + return mt->hash_id; +} + +static int reftable_merged_table_seek_void(void *tab, + struct reftable_iterator *it, + struct reftable_record *rec) +{ + return merged_table_seek_record(tab, it, rec); +} + +static uint32_t reftable_merged_table_hash_id_void(void *tab) +{ + return reftable_merged_table_hash_id(tab); +} + +static uint64_t reftable_merged_table_min_update_index_void(void *tab) +{ + return reftable_merged_table_min_update_index(tab); +} + +static uint64_t reftable_merged_table_max_update_index_void(void *tab) +{ + return reftable_merged_table_max_update_index(tab); +} + +static struct reftable_table_vtable merged_table_vtable = { + .seek_record = reftable_merged_table_seek_void, + .hash_id = reftable_merged_table_hash_id_void, + .min_update_index = reftable_merged_table_min_update_index_void, + .max_update_index = reftable_merged_table_max_update_index_void, +}; + +void reftable_table_from_merged_table(struct reftable_table *tab, + struct reftable_merged_table *merged) +{ + assert(!tab->ops); + tab->ops = &merged_table_vtable; + tab->table_arg = merged; +} diff --git a/reftable/merged.h b/reftable/merged.h new file mode 100644 index 00000000000..8c4d4d58d77 --- /dev/null +++ b/reftable/merged.h @@ -0,0 +1,35 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef MERGED_H +#define MERGED_H + +#include "pq.h" + +struct reftable_merged_table { + struct reftable_table *stack; + size_t stack_len; + uint32_t hash_id; + int suppress_deletions; + + uint64_t min; + uint64_t max; +}; + +struct merged_iter { + struct reftable_iterator *stack; + uint32_t hash_id; + size_t stack_len; + uint8_t typ; + int suppress_deletions; + struct merged_iter_pqueue pq; +}; + +void merged_table_release(struct reftable_merged_table *mt); + +#endif diff --git a/reftable/merged_test.c b/reftable/merged_test.c new file mode 100644 index 00000000000..1e2afe37b8b --- /dev/null +++ b/reftable/merged_test.c @@ -0,0 +1,292 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "merged.h" + +#include "system.h" + +#include "basics.h" +#include "blocksource.h" +#include "constants.h" +#include "reader.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-merged.h" +#include "reftable-tests.h" +#include "reftable-generic.h" +#include "reftable-writer.h" + +static void write_test_table(struct strbuf *buf, + struct reftable_ref_record refs[], int n) +{ + int min = 0xffffffff; + int max = 0; + int i = 0; + int err; + + struct reftable_write_options opts = { + .block_size = 256, + }; + struct reftable_writer *w = NULL; + for (i = 0; i < n; i++) { + uint64_t ui = refs[i].update_index; + if (ui > max) { + max = ui; + } + if (ui < min) { + min = ui; + } + } + + w = reftable_new_writer(&strbuf_add_void, buf, &opts); + reftable_writer_set_limits(w, min, max); + + for (i = 0; i < n; i++) { + uint64_t before = refs[i].update_index; + int n = reftable_writer_add_ref(w, &refs[i]); + assert(n == 0); + assert(before == refs[i].update_index); + } + + err = reftable_writer_close(w); + EXPECT_ERR(err); + + reftable_writer_free(w); +} + +static struct reftable_merged_table * +merged_table_from_records(struct reftable_ref_record **refs, + struct reftable_block_source **source, + struct reftable_reader ***readers, int *sizes, + struct strbuf *buf, int n) +{ + int i = 0; + struct reftable_merged_table *mt = NULL; + int err; + struct reftable_table *tabs = + reftable_calloc(n * sizeof(struct reftable_table)); + *readers = reftable_calloc(n * sizeof(struct reftable_reader *)); + *source = reftable_calloc(n * sizeof(**source)); + for (i = 0; i < n; i++) { + write_test_table(&buf[i], refs[i], sizes[i]); + block_source_from_strbuf(&(*source)[i], &buf[i]); + + err = reftable_new_reader(&(*readers)[i], &(*source)[i], + "name"); + EXPECT_ERR(err); + reftable_table_from_reader(&tabs[i], (*readers)[i]); + } + + err = reftable_new_merged_table(&mt, tabs, n, GIT_SHA1_FORMAT_ID); + EXPECT_ERR(err); + return mt; +} + +static void readers_destroy(struct reftable_reader **readers, size_t n) +{ + int i = 0; + for (; i < n; i++) + reftable_reader_free(readers[i]); + reftable_free(readers); +} + +static void test_merged_between(void) +{ + uint8_t hash1[GIT_SHA1_RAWSZ] = { 1, 2, 3, 0 }; + + struct reftable_ref_record r1[] = { { + .refname = "b", + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + } }; + struct reftable_ref_record r2[] = { { + .refname = "a", + .update_index = 2, + .value_type = REFTABLE_REF_DELETION, + } }; + + struct reftable_ref_record *refs[] = { r1, r2 }; + int sizes[] = { 1, 1 }; + struct strbuf bufs[2] = { STRBUF_INIT, STRBUF_INIT }; + struct reftable_block_source *bs = NULL; + struct reftable_reader **readers = NULL; + struct reftable_merged_table *mt = + merged_table_from_records(refs, &bs, &readers, sizes, bufs, 2); + int i; + struct reftable_ref_record ref = { NULL }; + struct reftable_iterator it = { NULL }; + int err = reftable_merged_table_seek_ref(mt, &it, "a"); + EXPECT_ERR(err); + + err = reftable_iterator_next_ref(&it, &ref); + EXPECT_ERR(err); + EXPECT(ref.update_index == 2); + reftable_ref_record_release(&ref); + reftable_iterator_destroy(&it); + readers_destroy(readers, 2); + reftable_merged_table_free(mt); + for (i = 0; i < ARRAY_SIZE(bufs); i++) { + strbuf_release(&bufs[i]); + } + reftable_free(bs); +} + +static void test_merged(void) +{ + uint8_t hash1[GIT_SHA1_RAWSZ] = { 1 }; + uint8_t hash2[GIT_SHA1_RAWSZ] = { 2 }; + struct reftable_ref_record r1[] = { + { + .refname = "a", + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + }, + { + .refname = "b", + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + }, + { + .refname = "c", + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + } + }; + struct reftable_ref_record r2[] = { { + .refname = "a", + .update_index = 2, + .value_type = REFTABLE_REF_DELETION, + } }; + struct reftable_ref_record r3[] = { + { + .refname = "c", + .update_index = 3, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash2, + }, + { + .refname = "d", + .update_index = 3, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash1, + }, + }; + + struct reftable_ref_record want[] = { + r2[0], + r1[1], + r3[0], + r3[1], + }; + + struct reftable_ref_record *refs[] = { r1, r2, r3 }; + int sizes[3] = { 3, 1, 2 }; + struct strbuf bufs[3] = { STRBUF_INIT, STRBUF_INIT, STRBUF_INIT }; + struct reftable_block_source *bs = NULL; + struct reftable_reader **readers = NULL; + struct reftable_merged_table *mt = + merged_table_from_records(refs, &bs, &readers, sizes, bufs, 3); + + struct reftable_iterator it = { NULL }; + int err = reftable_merged_table_seek_ref(mt, &it, "a"); + struct reftable_ref_record *out = NULL; + size_t len = 0; + size_t cap = 0; + int i = 0; + + EXPECT_ERR(err); + while (len < 100) { /* cap loops/recursion. */ + struct reftable_ref_record ref = { NULL }; + int err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (len == cap) { + cap = 2 * cap + 1; + out = reftable_realloc( + out, sizeof(struct reftable_ref_record) * cap); + } + out[len++] = ref; + } + reftable_iterator_destroy(&it); + + assert(ARRAY_SIZE(want) == len); + for (i = 0; i < len; i++) { + assert(reftable_ref_record_equal(&want[i], &out[i], + GIT_SHA1_RAWSZ)); + } + for (i = 0; i < len; i++) { + reftable_ref_record_release(&out[i]); + } + reftable_free(out); + + for (i = 0; i < 3; i++) { + strbuf_release(&bufs[i]); + } + readers_destroy(readers, 3); + reftable_merged_table_free(mt); + reftable_free(bs); +} + +static void test_default_write_opts(void) +{ + struct reftable_write_options opts = { 0 }; + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + + struct reftable_ref_record rec = { + .refname = "master", + .update_index = 1, + }; + int err; + struct reftable_block_source source = { NULL }; + struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1); + uint32_t hash_id; + struct reftable_reader *rd = NULL; + struct reftable_merged_table *merged = NULL; + + reftable_writer_set_limits(w, 1, 1); + + err = reftable_writer_add_ref(w, &rec); + EXPECT_ERR(err); + + err = reftable_writer_close(w); + EXPECT_ERR(err); + reftable_writer_free(w); + + block_source_from_strbuf(&source, &buf); + + err = reftable_new_reader(&rd, &source, "filename"); + EXPECT_ERR(err); + + hash_id = reftable_reader_hash_id(rd); + assert(hash_id == GIT_SHA1_FORMAT_ID); + + reftable_table_from_reader(&tab[0], rd); + err = reftable_new_merged_table(&merged, tab, 1, GIT_SHA1_FORMAT_ID); + EXPECT_ERR(err); + + reftable_reader_free(rd); + reftable_merged_table_free(merged); + strbuf_release(&buf); +} + +/* XXX test refs_for(oid) */ + +int merged_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_merged_between); + RUN_TEST(test_merged); + RUN_TEST(test_default_write_opts); + return 0; +} diff --git a/reftable/reftable-merged.h b/reftable/reftable-merged.h new file mode 100644 index 00000000000..1a6d16915ab --- /dev/null +++ b/reftable/reftable-merged.h @@ -0,0 +1,72 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_MERGED_H +#define REFTABLE_MERGED_H + +#include "reftable-iterator.h" + +/* + * Merged tables + * + * A ref database kept in a sequence of table files. The merged_table presents a + * unified view to reading (seeking, iterating) a sequence of immutable tables. + * + * The merged tables are on purpose kept disconnected from their actual storage + * (eg. files on disk), because it is useful to merge tables aren't files. For + * example, the per-workspace and global ref namespace can be implemented as a + * merged table of two stacks of file-backed reftables. + */ + +/* A merged table is implements seeking/iterating over a stack of tables. */ +struct reftable_merged_table; + +/* A generic reftable; see below. */ +struct reftable_table; + +/* reftable_new_merged_table creates a new merged table. It takes ownership of + the stack array. +*/ +int reftable_new_merged_table(struct reftable_merged_table **dest, + struct reftable_table *stack, int n, + uint32_t hash_id); + +/* returns an iterator positioned just before 'name' */ +int reftable_merged_table_seek_ref(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name); + +/* returns an iterator for log entry, at given update_index */ +int reftable_merged_table_seek_log_at(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name, uint64_t update_index); + +/* like reftable_merged_table_seek_log_at but look for the newest entry. */ +int reftable_merged_table_seek_log(struct reftable_merged_table *mt, + struct reftable_iterator *it, + const char *name); + +/* returns the max update_index covered by this merged table. */ +uint64_t +reftable_merged_table_max_update_index(struct reftable_merged_table *mt); + +/* returns the min update_index covered by this merged table. */ +uint64_t +reftable_merged_table_min_update_index(struct reftable_merged_table *mt); + +/* releases memory for the merged_table */ +void reftable_merged_table_free(struct reftable_merged_table *m); + +/* return the hash ID of the merged table. */ +uint32_t reftable_merged_table_hash_id(struct reftable_merged_table *m); + +/* create a generic table from reftable_merged_table */ +void reftable_table_from_merged_table(struct reftable_table *tab, + struct reftable_merged_table *table); + +#endif diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 0b5a1701df1..8087f2da4e6 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -5,6 +5,7 @@ int cmd__reftable(int argc, const char **argv) { basics_test_main(argc, argv); block_test_main(argc, argv); + merged_test_main(argc, argv); pq_test_main(argc, argv); record_test_main(argc, argv); readwrite_test_main(argc, argv); From patchwork Mon Aug 30 14:57:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465423 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1B99AC432BE for ; Mon, 30 Aug 2021 14:58:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 091ED60F23 for ; Mon, 30 Aug 2021 14:58:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237331AbhH3O7P (ORCPT ); Mon, 30 Aug 2021 10:59:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55558 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237304AbhH3O7C (ORCPT ); Mon, 30 Aug 2021 10:59:02 -0400 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 39C46C061575 for ; Mon, 30 Aug 2021 07:58:08 -0700 (PDT) Received: by mail-wr1-x42d.google.com with SMTP id n5so22792269wro.12 for ; Mon, 30 Aug 2021 07:58:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=gQoUCngqd43hLyoaaIICK/6KW/z+wqpzZf9crQac1Xc=; b=iSwhe3Vb8sALrdVbaTaoZuH4tx5PTlaJ9faWy79RlzhsiClbrdAl+SD8QEjENJ5q4p iaD+AqwUTAdNdNXSXuRNVMP1LjxdLgw9YAwHzafJV4eyCBaB6UMRRgdq/OXEJqD7pa6C TNIRikbLHWe28M0J9tQpJtJqnKjZMhjrDZB1PgaE29SFMKbvOtpKGQx6bIh8sPQ6GQwj A9ZLbVML8SSmLtdN8lYnUTSEBxvk+Wud+5XcpxUcMv9lVhjhZiL6tGWwi4qSm/KEf6pn U24fjcDAt56spYroWJUcHuErsJz2gpJUALZaxrOLVTrbiafFZqOlj5NPpY3oxVLUkASV 5YaA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=gQoUCngqd43hLyoaaIICK/6KW/z+wqpzZf9crQac1Xc=; b=SOtjea7j6DGR/LkjmO6vKUVHOF3wWHoobcu+aIh5mZA2WYcu1Zn0/hnSdAz8Tj1DOe ZKB3R2dB51MMIzpbkyodyE0kKlE6Hk3QWDxtEKuYSLzJ3w8qc+l8AUjrq6knBZj2GSmq AP7jMEFl58I5fO+CAsthpEqBM4xZ6bTXFWDNdSYi31eCnFp2If8oHZFV4BoQrK4/MwyR ICHAi/+RrOPqW7WP9e7L+Gf1+GmiYHWANTkUO1iq2KLFES8tTqIHpZlMI04qQgC7DqZU kFilRUGVykFHEm0xM22J5npchZsy5OenpTlgBX3yLqgJWS65cSILaKzg1DkmF9SuMFkr IEeQ== X-Gm-Message-State: AOAM532K43m67h0S56L5Emk0nFiV/7gpYp66MdtVzJBHXS5/tnRu5jeB DQz7nxbRTVTFokZm78P6+isAHA/ZLW8= X-Google-Smtp-Source: ABdhPJxU35rJ33vCxmKGNbHi+Xl0+t/GrTlVIgo0Zw0fmIXOB+fWJPjPkEx96EtcYzLkSOgcKOUuHA== X-Received: by 2002:adf:eb89:: with SMTP id t9mr27474134wrn.66.1630335486832; Mon, 30 Aug 2021 07:58:06 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id m1sm7118912wmq.10.2021.08.30.07.58.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:06 -0700 (PDT) Message-Id: <9a30c1c4666fd27e6bdaf6677b11c45aa869260c.1630335476.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:52 +0000 Subject: [PATCH 16/19] reftable: implement refname validation Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys The packed/loose format has restrictions on refnames: a and a/b cannot coexist. This limitation does not apply to reftable per se, but must be maintained for interoperability. This code adds validation routines to abort transactions that are trying to add invalid names. Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/refname.c | 209 +++++++++++++++++++++++++++++++++++++++ reftable/refname.h | 29 ++++++ reftable/refname_test.c | 102 +++++++++++++++++++ t/helper/test-reftable.c | 1 + 5 files changed, 342 insertions(+) create mode 100644 reftable/refname.c create mode 100644 reftable/refname.h create mode 100644 reftable/refname_test.c diff --git a/Makefile b/Makefile index 6c8545b9eb7..827a7190140 100644 --- a/Makefile +++ b/Makefile @@ -2478,6 +2478,7 @@ REFTABLE_TEST_OBJS += reftable/merged_test.o REFTABLE_TEST_OBJS += reftable/pq_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/readwrite_test.o +REFTABLE_TEST_OBJS += reftable/refname_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/tree_test.o diff --git a/reftable/refname.c b/reftable/refname.c new file mode 100644 index 00000000000..95734969324 --- /dev/null +++ b/reftable/refname.c @@ -0,0 +1,209 @@ +/* + Copyright 2020 Google LLC + + Use of this source code is governed by a BSD-style + license that can be found in the LICENSE file or at + https://developers.google.com/open-source/licenses/bsd +*/ + +#include "system.h" +#include "reftable-error.h" +#include "basics.h" +#include "refname.h" +#include "reftable-iterator.h" + +struct find_arg { + char **names; + const char *want; +}; + +static int find_name(size_t k, void *arg) +{ + struct find_arg *f_arg = arg; + return strcmp(f_arg->names[k], f_arg->want) >= 0; +} + +static int modification_has_ref(struct modification *mod, const char *name) +{ + struct reftable_ref_record ref = { NULL }; + int err = 0; + + if (mod->add_len > 0) { + struct find_arg arg = { + .names = mod->add, + .want = name, + }; + int idx = binsearch(mod->add_len, find_name, &arg); + if (idx < mod->add_len && !strcmp(mod->add[idx], name)) { + return 0; + } + } + + if (mod->del_len > 0) { + struct find_arg arg = { + .names = mod->del, + .want = name, + }; + int idx = binsearch(mod->del_len, find_name, &arg); + if (idx < mod->del_len && !strcmp(mod->del[idx], name)) { + return 1; + } + } + + err = reftable_table_read_ref(&mod->tab, name, &ref); + reftable_ref_record_release(&ref); + return err; +} + +static void modification_release(struct modification *mod) +{ + /* don't delete the strings themselves; they're owned by ref records. + */ + FREE_AND_NULL(mod->add); + FREE_AND_NULL(mod->del); + mod->add_len = 0; + mod->del_len = 0; +} + +static int modification_has_ref_with_prefix(struct modification *mod, + const char *prefix) +{ + struct reftable_iterator it = { NULL }; + struct reftable_ref_record ref = { NULL }; + int err = 0; + + if (mod->add_len > 0) { + struct find_arg arg = { + .names = mod->add, + .want = prefix, + }; + int idx = binsearch(mod->add_len, find_name, &arg); + if (idx < mod->add_len && + !strncmp(prefix, mod->add[idx], strlen(prefix))) + goto done; + } + err = reftable_table_seek_ref(&mod->tab, &it, prefix); + if (err) + goto done; + + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err) + goto done; + + if (mod->del_len > 0) { + struct find_arg arg = { + .names = mod->del, + .want = ref.refname, + }; + int idx = binsearch(mod->del_len, find_name, &arg); + if (idx < mod->del_len && + !strcmp(ref.refname, mod->del[idx])) { + continue; + } + } + + if (strncmp(ref.refname, prefix, strlen(prefix))) { + err = 1; + goto done; + } + err = 0; + goto done; + } + +done: + reftable_ref_record_release(&ref); + reftable_iterator_destroy(&it); + return err; +} + +static int validate_refname(const char *name) +{ + while (1) { + char *next = strchr(name, '/'); + if (!*name) { + return REFTABLE_REFNAME_ERROR; + } + if (!next) { + return 0; + } + if (next - name == 0 || (next - name == 1 && *name == '.') || + (next - name == 2 && name[0] == '.' && name[1] == '.')) + return REFTABLE_REFNAME_ERROR; + name = next + 1; + } + return 0; +} + +int validate_ref_record_addition(struct reftable_table tab, + struct reftable_ref_record *recs, size_t sz) +{ + struct modification mod = { + .tab = tab, + .add = reftable_calloc(sizeof(char *) * sz), + .del = reftable_calloc(sizeof(char *) * sz), + }; + int i = 0; + int err = 0; + for (; i < sz; i++) { + if (reftable_ref_record_is_deletion(&recs[i])) { + mod.del[mod.del_len++] = recs[i].refname; + } else { + mod.add[mod.add_len++] = recs[i].refname; + } + } + + err = modification_validate(&mod); + modification_release(&mod); + return err; +} + +static void strbuf_trim_component(struct strbuf *sl) +{ + while (sl->len > 0) { + int is_slash = (sl->buf[sl->len - 1] == '/'); + strbuf_setlen(sl, sl->len - 1); + if (is_slash) + break; + } +} + +int modification_validate(struct modification *mod) +{ + struct strbuf slashed = STRBUF_INIT; + int err = 0; + int i = 0; + for (; i < mod->add_len; i++) { + err = validate_refname(mod->add[i]); + if (err) + goto done; + strbuf_reset(&slashed); + strbuf_addstr(&slashed, mod->add[i]); + strbuf_addstr(&slashed, "/"); + + err = modification_has_ref_with_prefix(mod, slashed.buf); + if (err == 0) { + err = REFTABLE_NAME_CONFLICT; + goto done; + } + if (err < 0) + goto done; + + strbuf_reset(&slashed); + strbuf_addstr(&slashed, mod->add[i]); + while (slashed.len) { + strbuf_trim_component(&slashed); + err = modification_has_ref(mod, slashed.buf); + if (err == 0) { + err = REFTABLE_NAME_CONFLICT; + goto done; + } + if (err < 0) + goto done; + } + } + err = 0; +done: + strbuf_release(&slashed); + return err; +} diff --git a/reftable/refname.h b/reftable/refname.h new file mode 100644 index 00000000000..a24b40fcb42 --- /dev/null +++ b/reftable/refname.h @@ -0,0 +1,29 @@ +/* + Copyright 2020 Google LLC + + Use of this source code is governed by a BSD-style + license that can be found in the LICENSE file or at + https://developers.google.com/open-source/licenses/bsd +*/ +#ifndef REFNAME_H +#define REFNAME_H + +#include "reftable-record.h" +#include "reftable-generic.h" + +struct modification { + struct reftable_table tab; + + char **add; + size_t add_len; + + char **del; + size_t del_len; +}; + +int validate_ref_record_addition(struct reftable_table tab, + struct reftable_ref_record *recs, size_t sz); + +int modification_validate(struct modification *mod); + +#endif diff --git a/reftable/refname_test.c b/reftable/refname_test.c new file mode 100644 index 00000000000..8645cd93bbd --- /dev/null +++ b/reftable/refname_test.c @@ -0,0 +1,102 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "basics.h" +#include "block.h" +#include "blocksource.h" +#include "constants.h" +#include "reader.h" +#include "record.h" +#include "refname.h" +#include "reftable-error.h" +#include "reftable-writer.h" +#include "system.h" + +#include "test_framework.h" +#include "reftable-tests.h" + +struct testcase { + char *add; + char *del; + int error_code; +}; + +static void test_conflict(void) +{ + struct reftable_write_options opts = { 0 }; + struct strbuf buf = STRBUF_INIT; + struct reftable_writer *w = + reftable_new_writer(&strbuf_add_void, &buf, &opts); + struct reftable_ref_record rec = { + .refname = "a/b", + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "destination", /* make sure it's not a symref. + */ + .update_index = 1, + }; + int err; + int i; + struct reftable_block_source source = { NULL }; + struct reftable_reader *rd = NULL; + struct reftable_table tab = { NULL }; + struct testcase cases[] = { + { "a/b/c", NULL, REFTABLE_NAME_CONFLICT }, + { "b", NULL, 0 }, + { "a", NULL, REFTABLE_NAME_CONFLICT }, + { "a", "a/b", 0 }, + + { "p/", NULL, REFTABLE_REFNAME_ERROR }, + { "p//q", NULL, REFTABLE_REFNAME_ERROR }, + { "p/./q", NULL, REFTABLE_REFNAME_ERROR }, + { "p/../q", NULL, REFTABLE_REFNAME_ERROR }, + + { "a/b/c", "a/b", 0 }, + { NULL, "a//b", 0 }, + }; + reftable_writer_set_limits(w, 1, 1); + + err = reftable_writer_add_ref(w, &rec); + EXPECT_ERR(err); + + err = reftable_writer_close(w); + EXPECT_ERR(err); + reftable_writer_free(w); + + block_source_from_strbuf(&source, &buf); + err = reftable_new_reader(&rd, &source, "filename"); + EXPECT_ERR(err); + + reftable_table_from_reader(&tab, rd); + + for (i = 0; i < ARRAY_SIZE(cases); i++) { + struct modification mod = { + .tab = tab, + }; + + if (cases[i].add) { + mod.add = &cases[i].add; + mod.add_len = 1; + } + if (cases[i].del) { + mod.del = &cases[i].del; + mod.del_len = 1; + } + + err = modification_validate(&mod); + EXPECT(err == cases[i].error_code); + } + + reftable_reader_free(rd); + strbuf_release(&buf); +} + +int refname_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_conflict); + return 0; +} diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 8087f2da4e6..c8db6852c35 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -8,6 +8,7 @@ int cmd__reftable(int argc, const char **argv) merged_test_main(argc, argv); pq_test_main(argc, argv); record_test_main(argc, argv); + refname_test_main(argc, argv); readwrite_test_main(argc, argv); tree_test_main(argc, argv); return 0; From patchwork Mon Aug 30 14:57:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465429 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0943FC432BE for ; Mon, 30 Aug 2021 14:58:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D5B6760E97 for ; Mon, 30 Aug 2021 14:58:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237370AbhH3O7X (ORCPT ); Mon, 30 Aug 2021 10:59:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55588 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237338AbhH3O7D (ORCPT ); Mon, 30 Aug 2021 10:59:03 -0400 Received: from mail-wr1-x431.google.com (mail-wr1-x431.google.com [IPv6:2a00:1450:4864:20::431]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 58DE6C06175F for ; Mon, 30 Aug 2021 07:58:09 -0700 (PDT) Received: by mail-wr1-x431.google.com with SMTP id v10so22896280wrd.4 for ; Mon, 30 Aug 2021 07:58:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=1RExQ1QckJMddybPe5jO9BZ4yfUkxI+j8mcPqR17XyI=; b=bfoGh/FkqXOnGLaPs/r3c5Y51M9Xu/wjojPq987oC+7YweOQ6EwvW6EnOlHQqJCJdu Acx/Ptuklf7PWB7/vwyLaq5p8uaBqPY6JP6QtdiBxQQuT2AT2QTlApJUNa6PdMyy22jJ CULfi99tA5xcd0CmeXNQUVR1r3qpcSTxdIYkzfk/d55jdfYNtutEVshaqWiGEc9MQzVR kR4S66HK4Ffa+QENt62ykIFpSuhw3/Q5agHM9EgF1lH+ohrtKs5CA/YNq0gO+6vUU250 PFskUT2OS1c1XVDxmpiHF1WWjcMmf8R4Y9jaC4k0VAjWSdVNeV6J+VGnXU2fbKZhel2n GfvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=1RExQ1QckJMddybPe5jO9BZ4yfUkxI+j8mcPqR17XyI=; b=jCdPRdlFiLLCCdWeBEdi0uScgbfPrh+z9DCNgeroYtjJl+BTD7gzVeoBSTMimqFzN9 tJ8dB+mo/Tp68zWEuQzDP5QIWWeTLPj7k4iep/uzpK9MAopsU5UdLRCBUJ+6j7R8X6sj chaOeB2x7R9yWKpYKsKz6CEu0/ksyYX6TMUQUxIfxoX5xZ4EbbNDnggFntBeUY/gbyan RvpAiy3h5G/ai+oPooyVml7Oqp8JRSrxje2CEl/20/lG5LG1UpMbNtXHgVSY3KjGK0D+ g6HXkv9XXa5s1NUjg24o9cVFdypBrXTBhYRPUEYhermzWfnELITW/f7Q28yTDtNbqCNw Pt+g== X-Gm-Message-State: AOAM530RWLwrHeZV2qwnl237BEV9bC7IDa9Eq3SuJQcKY1o3Ay9O8bnk tjI9IJs0qOo2yD2yfVxlYtrkiqtHbec= X-Google-Smtp-Source: ABdhPJzvDbf5uhi0McipEHALI6s5bQqCXeQlUa9Qxf9S2SLsXSnmCAo0WaQHahEPz9I85zZU8cfVgA== X-Received: by 2002:adf:c390:: with SMTP id p16mr27443394wrf.105.1630335487484; Mon, 30 Aug 2021 07:58:07 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id d145sm19563223wmd.3.2021.08.30.07.58.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:07 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:53 +0000 Subject: [PATCH 17/19] reftable: implement stack, a mutable database of reftable files. Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + reftable/reftable-stack.h | 128 ++++ reftable/stack.c | 1396 +++++++++++++++++++++++++++++++++++++ reftable/stack.h | 41 ++ reftable/stack_test.c | 949 +++++++++++++++++++++++++ t/helper/test-reftable.c | 1 + 6 files changed, 2516 insertions(+) create mode 100644 reftable/reftable-stack.h create mode 100644 reftable/stack.c create mode 100644 reftable/stack.h create mode 100644 reftable/stack_test.c diff --git a/Makefile b/Makefile index 827a7190140..8be7e05ff99 100644 --- a/Makefile +++ b/Makefile @@ -2479,6 +2479,7 @@ REFTABLE_TEST_OBJS += reftable/pq_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/readwrite_test.o REFTABLE_TEST_OBJS += reftable/refname_test.o +REFTABLE_TEST_OBJS += reftable/stack_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/tree_test.o diff --git a/reftable/reftable-stack.h b/reftable/reftable-stack.h new file mode 100644 index 00000000000..1b602dda58a --- /dev/null +++ b/reftable/reftable-stack.h @@ -0,0 +1,128 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef REFTABLE_STACK_H +#define REFTABLE_STACK_H + +#include "reftable-writer.h" + +/* + * The stack presents an interface to a mutable sequence of reftables. + + * A stack can be mutated by pushing a table to the top of the stack. + + * The reftable_stack automatically compacts files on disk to ensure good + * amortized performance. + * + * For windows and other platforms that cannot have open files as rename + * destinations, concurrent access from multiple processes needs the rand() + * random seed to be randomized. + */ +struct reftable_stack; + +/* open a new reftable stack. The tables along with the table list will be + * stored in 'dir'. Typically, this should be .git/reftables. + */ +int reftable_new_stack(struct reftable_stack **dest, const char *dir, + struct reftable_write_options config); + +/* returns the update_index at which a next table should be written. */ +uint64_t reftable_stack_next_update_index(struct reftable_stack *st); + +/* holds a transaction to add tables at the top of a stack. */ +struct reftable_addition; + +/* + * returns a new transaction to add reftables to the given stack. As a side + * effect, the ref database is locked. + */ +int reftable_stack_new_addition(struct reftable_addition **dest, + struct reftable_stack *st); + +/* Adds a reftable to transaction. */ +int reftable_addition_add(struct reftable_addition *add, + int (*write_table)(struct reftable_writer *wr, + void *arg), + void *arg); + +/* Commits the transaction, releasing the lock. After calling this, + * reftable_addition_destroy should still be called. + */ +int reftable_addition_commit(struct reftable_addition *add); + +/* Release all non-committed data from the transaction, and deallocate the + * transaction. Releases the lock if held. */ +void reftable_addition_destroy(struct reftable_addition *add); + +/* add a new table to the stack. The write_table function must call + * reftable_writer_set_limits, add refs and return an error value. */ +int reftable_stack_add(struct reftable_stack *st, + int (*write_table)(struct reftable_writer *wr, + void *write_arg), + void *write_arg); + +/* returns the merged_table for seeking. This table is valid until the + * next write or reload, and should not be closed or deleted. + */ +struct reftable_merged_table * +reftable_stack_merged_table(struct reftable_stack *st); + +/* frees all resources associated with the stack. */ +void reftable_stack_destroy(struct reftable_stack *st); + +/* Reloads the stack if necessary. This is very cheap to run if the stack was up + * to date */ +int reftable_stack_reload(struct reftable_stack *st); + +/* Policy for expiring reflog entries. */ +struct reftable_log_expiry_config { + /* Drop entries older than this timestamp */ + uint64_t time; + + /* Drop older entries */ + uint64_t min_update_index; +}; + +/* compacts all reftables into a giant table. Expire reflog entries if config is + * non-NULL */ +int reftable_stack_compact_all(struct reftable_stack *st, + struct reftable_log_expiry_config *config); + +/* heuristically compact unbalanced table stack. */ +int reftable_stack_auto_compact(struct reftable_stack *st); + +/* delete stale .ref tables. */ +int reftable_stack_clean(struct reftable_stack *st); + +/* convenience function to read a single ref. Returns < 0 for error, 0 for + * success, and 1 if ref not found. */ +int reftable_stack_read_ref(struct reftable_stack *st, const char *refname, + struct reftable_ref_record *ref); + +/* convenience function to read a single log. Returns < 0 for error, 0 for + * success, and 1 if ref not found. */ +int reftable_stack_read_log(struct reftable_stack *st, const char *refname, + struct reftable_log_record *log); + +/* statistics on past compactions. */ +struct reftable_compaction_stats { + uint64_t bytes; /* total number of bytes written */ + uint64_t entries_written; /* total number of entries written, including + failures. */ + int attempts; /* how often we tried to compact */ + int failures; /* failures happen on concurrent updates */ +}; + +/* return statistics for compaction up till now. */ +struct reftable_compaction_stats * +reftable_stack_compaction_stats(struct reftable_stack *st); + +/* print the entire stack represented by the directory */ +int reftable_stack_print_directory(const char *stackdir, uint32_t hash_id); + +#endif diff --git a/reftable/stack.c b/reftable/stack.c new file mode 100644 index 00000000000..48e22a6c184 --- /dev/null +++ b/reftable/stack.c @@ -0,0 +1,1396 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "stack.h" + +#include "system.h" +#include "merged.h" +#include "reader.h" +#include "refname.h" +#include "reftable-error.h" +#include "reftable-record.h" +#include "reftable-merged.h" +#include "writer.h" + +static int stack_try_add(struct reftable_stack *st, + int (*write_table)(struct reftable_writer *wr, + void *arg), + void *arg); +static int stack_write_compact(struct reftable_stack *st, + struct reftable_writer *wr, int first, int last, + struct reftable_log_expiry_config *config); +static int stack_check_addition(struct reftable_stack *st, + const char *new_tab_name); +static void reftable_addition_close(struct reftable_addition *add); +static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st, + int reuse_open); + +static void stack_filename(struct strbuf *dest, struct reftable_stack *st, + const char *name) +{ + strbuf_reset(dest); + strbuf_addstr(dest, st->reftable_dir); + strbuf_addstr(dest, "/"); + strbuf_addstr(dest, name); +} + +static ssize_t reftable_fd_write(void *arg, const void *data, size_t sz) +{ + int *fdp = (int *)arg; + return write(*fdp, data, sz); +} + +int reftable_new_stack(struct reftable_stack **dest, const char *dir, + struct reftable_write_options config) +{ + struct reftable_stack *p = + reftable_calloc(sizeof(struct reftable_stack)); + struct strbuf list_file_name = STRBUF_INIT; + int err = 0; + + if (config.hash_id == 0) { + config.hash_id = GIT_SHA1_FORMAT_ID; + } + + *dest = NULL; + + strbuf_reset(&list_file_name); + strbuf_addstr(&list_file_name, dir); + strbuf_addstr(&list_file_name, "/tables.list"); + + p->list_file = strbuf_detach(&list_file_name, NULL); + p->reftable_dir = xstrdup(dir); + p->config = config; + + err = reftable_stack_reload_maybe_reuse(p, 1); + if (err < 0) { + reftable_stack_destroy(p); + } else { + *dest = p; + } + return err; +} + +static int fd_read_lines(int fd, char ***namesp) +{ + off_t size = lseek(fd, 0, SEEK_END); + char *buf = NULL; + int err = 0; + if (size < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + err = lseek(fd, 0, SEEK_SET); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + buf = reftable_malloc(size + 1); + if (read(fd, buf, size) != size) { + err = REFTABLE_IO_ERROR; + goto done; + } + buf[size] = 0; + + parse_names(buf, size, namesp); + +done: + reftable_free(buf); + return err; +} + +int read_lines(const char *filename, char ***namesp) +{ + int fd = open(filename, O_RDONLY, 0644); + int err = 0; + if (fd < 0) { + if (errno == ENOENT) { + *namesp = reftable_calloc(sizeof(char *)); + return 0; + } + + return REFTABLE_IO_ERROR; + } + err = fd_read_lines(fd, namesp); + close(fd); + return err; +} + +struct reftable_merged_table * +reftable_stack_merged_table(struct reftable_stack *st) +{ + return st->merged; +} + +static int has_name(char **names, const char *name) +{ + while (*names) { + if (!strcmp(*names, name)) + return 1; + names++; + } + return 0; +} + +/* Close and free the stack */ +void reftable_stack_destroy(struct reftable_stack *st) +{ + char **names = NULL; + int err = 0; + if (st->merged) { + reftable_merged_table_free(st->merged); + st->merged = NULL; + } + + err = read_lines(st->list_file, &names); + if (err < 0) { + FREE_AND_NULL(names); + } + + if (st->readers) { + int i = 0; + struct strbuf filename = STRBUF_INIT; + for (i = 0; i < st->readers_len; i++) { + const char *name = reader_name(st->readers[i]); + strbuf_reset(&filename); + if (names && !has_name(names, name)) { + stack_filename(&filename, st, name); + } + reftable_reader_free(st->readers[i]); + + if (filename.len) { + /* On Windows, can only unlink after closing. */ + unlink(filename.buf); + } + } + strbuf_release(&filename); + st->readers_len = 0; + FREE_AND_NULL(st->readers); + } + FREE_AND_NULL(st->list_file); + FREE_AND_NULL(st->reftable_dir); + reftable_free(st); + free_names(names); +} + +static struct reftable_reader **stack_copy_readers(struct reftable_stack *st, + int cur_len) +{ + struct reftable_reader **cur = + reftable_calloc(sizeof(struct reftable_reader *) * cur_len); + int i = 0; + for (i = 0; i < cur_len; i++) { + cur[i] = st->readers[i]; + } + return cur; +} + +static int reftable_stack_reload_once(struct reftable_stack *st, char **names, + int reuse_open) +{ + int cur_len = !st->merged ? 0 : st->merged->stack_len; + struct reftable_reader **cur = stack_copy_readers(st, cur_len); + int err = 0; + int names_len = names_length(names); + struct reftable_reader **new_readers = + reftable_calloc(sizeof(struct reftable_reader *) * names_len); + struct reftable_table *new_tables = + reftable_calloc(sizeof(struct reftable_table) * names_len); + int new_readers_len = 0; + struct reftable_merged_table *new_merged = NULL; + int i; + + while (*names) { + struct reftable_reader *rd = NULL; + char *name = *names++; + + /* this is linear; we assume compaction keeps the number of + tables under control so this is not quadratic. */ + int j = 0; + for (j = 0; reuse_open && j < cur_len; j++) { + if (cur[j] && 0 == strcmp(cur[j]->name, name)) { + rd = cur[j]; + cur[j] = NULL; + break; + } + } + + if (!rd) { + struct reftable_block_source src = { NULL }; + struct strbuf table_path = STRBUF_INIT; + stack_filename(&table_path, st, name); + + err = reftable_block_source_from_file(&src, + table_path.buf); + strbuf_release(&table_path); + + if (err < 0) + goto done; + + err = reftable_new_reader(&rd, &src, name); + if (err < 0) + goto done; + } + + new_readers[new_readers_len] = rd; + reftable_table_from_reader(&new_tables[new_readers_len], rd); + new_readers_len++; + } + + /* success! */ + err = reftable_new_merged_table(&new_merged, new_tables, + new_readers_len, st->config.hash_id); + if (err < 0) + goto done; + + new_tables = NULL; + st->readers_len = new_readers_len; + if (st->merged) { + merged_table_release(st->merged); + reftable_merged_table_free(st->merged); + } + if (st->readers) { + reftable_free(st->readers); + } + st->readers = new_readers; + new_readers = NULL; + new_readers_len = 0; + + new_merged->suppress_deletions = 1; + st->merged = new_merged; + for (i = 0; i < cur_len; i++) { + if (cur[i]) { + const char *name = reader_name(cur[i]); + struct strbuf filename = STRBUF_INIT; + stack_filename(&filename, st, name); + + reader_close(cur[i]); + reftable_reader_free(cur[i]); + + /* On Windows, can only unlink after closing. */ + unlink(filename.buf); + + strbuf_release(&filename); + } + } + +done: + for (i = 0; i < new_readers_len; i++) { + reader_close(new_readers[i]); + reftable_reader_free(new_readers[i]); + } + reftable_free(new_readers); + reftable_free(new_tables); + reftable_free(cur); + return err; +} + +/* return negative if a before b. */ +static int tv_cmp(struct timeval *a, struct timeval *b) +{ + time_t diff = a->tv_sec - b->tv_sec; + int udiff = a->tv_usec - b->tv_usec; + + if (diff != 0) + return diff; + + return udiff; +} + +static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st, + int reuse_open) +{ + struct timeval deadline = { 0 }; + int err = gettimeofday(&deadline, NULL); + int64_t delay = 0; + int tries = 0; + if (err < 0) + return err; + + deadline.tv_sec += 3; + while (1) { + char **names = NULL; + char **names_after = NULL; + struct timeval now = { 0 }; + int err = gettimeofday(&now, NULL); + int err2 = 0; + if (err < 0) { + return err; + } + + /* Only look at deadlines after the first few times. This + simplifies debugging in GDB */ + tries++; + if (tries > 3 && tv_cmp(&now, &deadline) >= 0) { + break; + } + + err = read_lines(st->list_file, &names); + if (err < 0) { + free_names(names); + return err; + } + err = reftable_stack_reload_once(st, names, reuse_open); + if (err == 0) { + free_names(names); + break; + } + if (err != REFTABLE_NOT_EXIST_ERROR) { + free_names(names); + return err; + } + + /* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent + writer. Check if there was one by checking if the name list + changed. + */ + err2 = read_lines(st->list_file, &names_after); + if (err2 < 0) { + free_names(names); + return err2; + } + + if (names_equal(names_after, names)) { + free_names(names); + free_names(names_after); + return err; + } + free_names(names); + free_names(names_after); + + delay = delay + (delay * rand()) / RAND_MAX + 1; + sleep_millisec(delay); + } + + return 0; +} + +/* -1 = error + 0 = up to date + 1 = changed. */ +static int stack_uptodate(struct reftable_stack *st) +{ + char **names = NULL; + int err = read_lines(st->list_file, &names); + int i = 0; + if (err < 0) + return err; + + for (i = 0; i < st->readers_len; i++) { + if (!names[i]) { + err = 1; + goto done; + } + + if (strcmp(st->readers[i]->name, names[i])) { + err = 1; + goto done; + } + } + + if (names[st->merged->stack_len]) { + err = 1; + goto done; + } + +done: + free_names(names); + return err; +} + +int reftable_stack_reload(struct reftable_stack *st) +{ + int err = stack_uptodate(st); + if (err > 0) + return reftable_stack_reload_maybe_reuse(st, 1); + return err; +} + +int reftable_stack_add(struct reftable_stack *st, + int (*write)(struct reftable_writer *wr, void *arg), + void *arg) +{ + int err = stack_try_add(st, write, arg); + if (err < 0) { + if (err == REFTABLE_LOCK_ERROR) { + /* Ignore error return, we want to propagate + REFTABLE_LOCK_ERROR. + */ + reftable_stack_reload(st); + } + return err; + } + + if (!st->disable_auto_compact) + return reftable_stack_auto_compact(st); + + return 0; +} + +static void format_name(struct strbuf *dest, uint64_t min, uint64_t max) +{ + char buf[100]; + uint32_t rnd = (uint32_t)rand(); + snprintf(buf, sizeof(buf), "0x%012" PRIx64 "-0x%012" PRIx64 "-%08x", + min, max, rnd); + strbuf_reset(dest); + strbuf_addstr(dest, buf); +} + +struct reftable_addition { + int lock_file_fd; + struct strbuf lock_file_name; + struct reftable_stack *stack; + + char **new_tables; + int new_tables_len; + uint64_t next_update_index; +}; + +#define REFTABLE_ADDITION_INIT \ + { \ + .lock_file_name = STRBUF_INIT \ + } + +static int reftable_stack_init_addition(struct reftable_addition *add, + struct reftable_stack *st) +{ + int err = 0; + add->stack = st; + + strbuf_reset(&add->lock_file_name); + strbuf_addstr(&add->lock_file_name, st->list_file); + strbuf_addstr(&add->lock_file_name, ".lock"); + + add->lock_file_fd = open(add->lock_file_name.buf, + O_EXCL | O_CREAT | O_WRONLY, 0644); + if (add->lock_file_fd < 0) { + if (errno == EEXIST) { + err = REFTABLE_LOCK_ERROR; + } else { + err = REFTABLE_IO_ERROR; + } + goto done; + } + err = stack_uptodate(st); + if (err < 0) + goto done; + + if (err > 1) { + err = REFTABLE_LOCK_ERROR; + goto done; + } + + add->next_update_index = reftable_stack_next_update_index(st); +done: + if (err) { + reftable_addition_close(add); + } + return err; +} + +static void reftable_addition_close(struct reftable_addition *add) +{ + int i = 0; + struct strbuf nm = STRBUF_INIT; + for (i = 0; i < add->new_tables_len; i++) { + stack_filename(&nm, add->stack, add->new_tables[i]); + unlink(nm.buf); + reftable_free(add->new_tables[i]); + add->new_tables[i] = NULL; + } + reftable_free(add->new_tables); + add->new_tables = NULL; + add->new_tables_len = 0; + + if (add->lock_file_fd > 0) { + close(add->lock_file_fd); + add->lock_file_fd = 0; + } + if (add->lock_file_name.len > 0) { + unlink(add->lock_file_name.buf); + strbuf_release(&add->lock_file_name); + } + + strbuf_release(&nm); +} + +void reftable_addition_destroy(struct reftable_addition *add) +{ + if (!add) { + return; + } + reftable_addition_close(add); + reftable_free(add); +} + +int reftable_addition_commit(struct reftable_addition *add) +{ + struct strbuf table_list = STRBUF_INIT; + int i = 0; + int err = 0; + if (add->new_tables_len == 0) + goto done; + + for (i = 0; i < add->stack->merged->stack_len; i++) { + strbuf_addstr(&table_list, add->stack->readers[i]->name); + strbuf_addstr(&table_list, "\n"); + } + for (i = 0; i < add->new_tables_len; i++) { + strbuf_addstr(&table_list, add->new_tables[i]); + strbuf_addstr(&table_list, "\n"); + } + + err = write(add->lock_file_fd, table_list.buf, table_list.len); + strbuf_release(&table_list); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + err = close(add->lock_file_fd); + add->lock_file_fd = 0; + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + err = rename(add->lock_file_name.buf, add->stack->list_file); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + /* success, no more state to clean up. */ + strbuf_release(&add->lock_file_name); + for (i = 0; i < add->new_tables_len; i++) { + reftable_free(add->new_tables[i]); + } + reftable_free(add->new_tables); + add->new_tables = NULL; + add->new_tables_len = 0; + + err = reftable_stack_reload(add->stack); +done: + reftable_addition_close(add); + return err; +} + +int reftable_stack_new_addition(struct reftable_addition **dest, + struct reftable_stack *st) +{ + int err = 0; + struct reftable_addition empty = REFTABLE_ADDITION_INIT; + *dest = reftable_calloc(sizeof(**dest)); + **dest = empty; + err = reftable_stack_init_addition(*dest, st); + if (err) { + reftable_free(*dest); + *dest = NULL; + } + return err; +} + +static int stack_try_add(struct reftable_stack *st, + int (*write_table)(struct reftable_writer *wr, + void *arg), + void *arg) +{ + struct reftable_addition add = REFTABLE_ADDITION_INIT; + int err = reftable_stack_init_addition(&add, st); + if (err < 0) + goto done; + if (err > 0) { + err = REFTABLE_LOCK_ERROR; + goto done; + } + + err = reftable_addition_add(&add, write_table, arg); + if (err < 0) + goto done; + + err = reftable_addition_commit(&add); +done: + reftable_addition_close(&add); + return err; +} + +int reftable_addition_add(struct reftable_addition *add, + int (*write_table)(struct reftable_writer *wr, + void *arg), + void *arg) +{ + struct strbuf temp_tab_file_name = STRBUF_INIT; + struct strbuf tab_file_name = STRBUF_INIT; + struct strbuf next_name = STRBUF_INIT; + struct reftable_writer *wr = NULL; + int err = 0; + int tab_fd = 0; + + strbuf_reset(&next_name); + format_name(&next_name, add->next_update_index, add->next_update_index); + + stack_filename(&temp_tab_file_name, add->stack, next_name.buf); + strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX"); + + tab_fd = mkstemp(temp_tab_file_name.buf); + if (tab_fd < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + wr = reftable_new_writer(reftable_fd_write, &tab_fd, + &add->stack->config); + err = write_table(wr, arg); + if (err < 0) + goto done; + + err = reftable_writer_close(wr); + if (err == REFTABLE_EMPTY_TABLE_ERROR) { + err = 0; + goto done; + } + if (err < 0) + goto done; + + err = close(tab_fd); + tab_fd = 0; + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + err = stack_check_addition(add->stack, temp_tab_file_name.buf); + if (err < 0) + goto done; + + if (wr->min_update_index < add->next_update_index) { + err = REFTABLE_API_ERROR; + goto done; + } + + format_name(&next_name, wr->min_update_index, wr->max_update_index); + strbuf_addstr(&next_name, ".ref"); + + stack_filename(&tab_file_name, add->stack, next_name.buf); + + /* + On windows, this relies on rand() picking a unique destination name. + Maybe we should do retry loop as well? + */ + err = rename(temp_tab_file_name.buf, tab_file_name.buf); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + + add->new_tables = reftable_realloc(add->new_tables, + sizeof(*add->new_tables) * + (add->new_tables_len + 1)); + add->new_tables[add->new_tables_len] = strbuf_detach(&next_name, NULL); + add->new_tables_len++; +done: + if (tab_fd > 0) { + close(tab_fd); + tab_fd = 0; + } + if (temp_tab_file_name.len > 0) { + unlink(temp_tab_file_name.buf); + } + + strbuf_release(&temp_tab_file_name); + strbuf_release(&tab_file_name); + strbuf_release(&next_name); + reftable_writer_free(wr); + return err; +} + +uint64_t reftable_stack_next_update_index(struct reftable_stack *st) +{ + int sz = st->merged->stack_len; + if (sz > 0) + return reftable_reader_max_update_index(st->readers[sz - 1]) + + 1; + return 1; +} + +static int stack_compact_locked(struct reftable_stack *st, int first, int last, + struct strbuf *temp_tab, + struct reftable_log_expiry_config *config) +{ + struct strbuf next_name = STRBUF_INIT; + int tab_fd = -1; + struct reftable_writer *wr = NULL; + int err = 0; + + format_name(&next_name, + reftable_reader_min_update_index(st->readers[first]), + reftable_reader_max_update_index(st->readers[last])); + + stack_filename(temp_tab, st, next_name.buf); + strbuf_addstr(temp_tab, ".temp.XXXXXX"); + + tab_fd = mkstemp(temp_tab->buf); + wr = reftable_new_writer(reftable_fd_write, &tab_fd, &st->config); + + err = stack_write_compact(st, wr, first, last, config); + if (err < 0) + goto done; + err = reftable_writer_close(wr); + if (err < 0) + goto done; + + err = close(tab_fd); + tab_fd = 0; + +done: + reftable_writer_free(wr); + if (tab_fd > 0) { + close(tab_fd); + tab_fd = 0; + } + if (err != 0 && temp_tab->len > 0) { + unlink(temp_tab->buf); + strbuf_release(temp_tab); + } + strbuf_release(&next_name); + return err; +} + +static int stack_write_compact(struct reftable_stack *st, + struct reftable_writer *wr, int first, int last, + struct reftable_log_expiry_config *config) +{ + int subtabs_len = last - first + 1; + struct reftable_table *subtabs = reftable_calloc( + sizeof(struct reftable_table) * (last - first + 1)); + struct reftable_merged_table *mt = NULL; + int err = 0; + struct reftable_iterator it = { NULL }; + struct reftable_ref_record ref = { NULL }; + struct reftable_log_record log = { NULL }; + + uint64_t entries = 0; + + int i = 0, j = 0; + for (i = first, j = 0; i <= last; i++) { + struct reftable_reader *t = st->readers[i]; + reftable_table_from_reader(&subtabs[j++], t); + st->stats.bytes += t->size; + } + reftable_writer_set_limits(wr, st->readers[first]->min_update_index, + st->readers[last]->max_update_index); + + err = reftable_new_merged_table(&mt, subtabs, subtabs_len, + st->config.hash_id); + if (err < 0) { + reftable_free(subtabs); + goto done; + } + + err = reftable_merged_table_seek_ref(mt, &it, ""); + if (err < 0) + goto done; + + while (1) { + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + err = 0; + break; + } + if (err < 0) { + break; + } + + if (first == 0 && reftable_ref_record_is_deletion(&ref)) { + continue; + } + + err = reftable_writer_add_ref(wr, &ref); + if (err < 0) { + break; + } + entries++; + } + reftable_iterator_destroy(&it); + + err = reftable_merged_table_seek_log(mt, &it, ""); + if (err < 0) + goto done; + + while (1) { + err = reftable_iterator_next_log(&it, &log); + if (err > 0) { + err = 0; + break; + } + if (err < 0) { + break; + } + if (first == 0 && reftable_log_record_is_deletion(&log)) { + continue; + } + + if (config && config->min_update_index > 0 && + log.update_index < config->min_update_index) { + continue; + } + + if (config && config->time > 0 && + log.value.update.time < config->time) { + continue; + } + + err = reftable_writer_add_log(wr, &log); + if (err < 0) { + break; + } + entries++; + } + +done: + reftable_iterator_destroy(&it); + if (mt) { + merged_table_release(mt); + reftable_merged_table_free(mt); + } + reftable_ref_record_release(&ref); + reftable_log_record_release(&log); + st->stats.entries_written += entries; + return err; +} + +/* < 0: error. 0 == OK, > 0 attempt failed; could retry. */ +static int stack_compact_range(struct reftable_stack *st, int first, int last, + struct reftable_log_expiry_config *expiry) +{ + struct strbuf temp_tab_file_name = STRBUF_INIT; + struct strbuf new_table_name = STRBUF_INIT; + struct strbuf lock_file_name = STRBUF_INIT; + struct strbuf ref_list_contents = STRBUF_INIT; + struct strbuf new_table_path = STRBUF_INIT; + int err = 0; + int have_lock = 0; + int lock_file_fd = 0; + int compact_count = last - first + 1; + char **listp = NULL; + char **delete_on_success = + reftable_calloc(sizeof(char *) * (compact_count + 1)); + char **subtable_locks = + reftable_calloc(sizeof(char *) * (compact_count + 1)); + int i = 0; + int j = 0; + int is_empty_table = 0; + + if (first > last || (!expiry && first == last)) { + err = 0; + goto done; + } + + st->stats.attempts++; + + strbuf_reset(&lock_file_name); + strbuf_addstr(&lock_file_name, st->list_file); + strbuf_addstr(&lock_file_name, ".lock"); + + lock_file_fd = + open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644); + if (lock_file_fd < 0) { + if (errno == EEXIST) { + err = 1; + } else { + err = REFTABLE_IO_ERROR; + } + goto done; + } + /* Don't want to write to the lock for now. */ + close(lock_file_fd); + lock_file_fd = 0; + + have_lock = 1; + err = stack_uptodate(st); + if (err != 0) + goto done; + + for (i = first, j = 0; i <= last; i++) { + struct strbuf subtab_file_name = STRBUF_INIT; + struct strbuf subtab_lock = STRBUF_INIT; + int sublock_file_fd = -1; + + stack_filename(&subtab_file_name, st, + reader_name(st->readers[i])); + + strbuf_reset(&subtab_lock); + strbuf_addbuf(&subtab_lock, &subtab_file_name); + strbuf_addstr(&subtab_lock, ".lock"); + + sublock_file_fd = open(subtab_lock.buf, + O_EXCL | O_CREAT | O_WRONLY, 0644); + if (sublock_file_fd > 0) { + close(sublock_file_fd); + } else if (sublock_file_fd < 0) { + if (errno == EEXIST) { + err = 1; + } else { + err = REFTABLE_IO_ERROR; + } + } + + subtable_locks[j] = subtab_lock.buf; + delete_on_success[j] = subtab_file_name.buf; + j++; + + if (err != 0) + goto done; + } + + err = unlink(lock_file_name.buf); + if (err < 0) + goto done; + have_lock = 0; + + err = stack_compact_locked(st, first, last, &temp_tab_file_name, + expiry); + /* Compaction + tombstones can create an empty table out of non-empty + * tables. */ + is_empty_table = (err == REFTABLE_EMPTY_TABLE_ERROR); + if (is_empty_table) { + err = 0; + } + if (err < 0) + goto done; + + lock_file_fd = + open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0644); + if (lock_file_fd < 0) { + if (errno == EEXIST) { + err = 1; + } else { + err = REFTABLE_IO_ERROR; + } + goto done; + } + have_lock = 1; + + format_name(&new_table_name, st->readers[first]->min_update_index, + st->readers[last]->max_update_index); + strbuf_addstr(&new_table_name, ".ref"); + + stack_filename(&new_table_path, st, new_table_name.buf); + + if (!is_empty_table) { + /* retry? */ + err = rename(temp_tab_file_name.buf, new_table_path.buf); + if (err < 0) { + err = REFTABLE_IO_ERROR; + goto done; + } + } + + for (i = 0; i < first; i++) { + strbuf_addstr(&ref_list_contents, st->readers[i]->name); + strbuf_addstr(&ref_list_contents, "\n"); + } + if (!is_empty_table) { + strbuf_addbuf(&ref_list_contents, &new_table_name); + strbuf_addstr(&ref_list_contents, "\n"); + } + for (i = last + 1; i < st->merged->stack_len; i++) { + strbuf_addstr(&ref_list_contents, st->readers[i]->name); + strbuf_addstr(&ref_list_contents, "\n"); + } + + err = write(lock_file_fd, ref_list_contents.buf, ref_list_contents.len); + if (err < 0) { + err = REFTABLE_IO_ERROR; + unlink(new_table_path.buf); + goto done; + } + err = close(lock_file_fd); + lock_file_fd = 0; + if (err < 0) { + err = REFTABLE_IO_ERROR; + unlink(new_table_path.buf); + goto done; + } + + err = rename(lock_file_name.buf, st->list_file); + if (err < 0) { + err = REFTABLE_IO_ERROR; + unlink(new_table_path.buf); + goto done; + } + have_lock = 0; + + /* Reload the stack before deleting. On windows, we can only delete the + files after we closed them. + */ + err = reftable_stack_reload_maybe_reuse(st, first < last); + + listp = delete_on_success; + while (*listp) { + if (strcmp(*listp, new_table_path.buf)) { + unlink(*listp); + } + listp++; + } + +done: + free_names(delete_on_success); + + listp = subtable_locks; + while (*listp) { + unlink(*listp); + listp++; + } + free_names(subtable_locks); + if (lock_file_fd > 0) { + close(lock_file_fd); + lock_file_fd = 0; + } + if (have_lock) { + unlink(lock_file_name.buf); + } + strbuf_release(&new_table_name); + strbuf_release(&new_table_path); + strbuf_release(&ref_list_contents); + strbuf_release(&temp_tab_file_name); + strbuf_release(&lock_file_name); + return err; +} + +int reftable_stack_compact_all(struct reftable_stack *st, + struct reftable_log_expiry_config *config) +{ + return stack_compact_range(st, 0, st->merged->stack_len - 1, config); +} + +static int stack_compact_range_stats(struct reftable_stack *st, int first, + int last, + struct reftable_log_expiry_config *config) +{ + int err = stack_compact_range(st, first, last, config); + if (err > 0) { + st->stats.failures++; + } + return err; +} + +static int segment_size(struct segment *s) +{ + return s->end - s->start; +} + +int fastlog2(uint64_t sz) +{ + int l = 0; + if (sz == 0) + return 0; + for (; sz; sz /= 2) { + l++; + } + return l - 1; +} + +struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n) +{ + struct segment *segs = reftable_calloc(sizeof(struct segment) * n); + int next = 0; + struct segment cur = { 0 }; + int i = 0; + + if (n == 0) { + *seglen = 0; + return segs; + } + for (i = 0; i < n; i++) { + int log = fastlog2(sizes[i]); + if (cur.log != log && cur.bytes > 0) { + struct segment fresh = { + .start = i, + }; + + segs[next++] = cur; + cur = fresh; + } + + cur.log = log; + cur.end = i + 1; + cur.bytes += sizes[i]; + } + segs[next++] = cur; + *seglen = next; + return segs; +} + +struct segment suggest_compaction_segment(uint64_t *sizes, int n) +{ + int seglen = 0; + struct segment *segs = sizes_to_segments(&seglen, sizes, n); + struct segment min_seg = { + .log = 64, + }; + int i = 0; + for (i = 0; i < seglen; i++) { + if (segment_size(&segs[i]) == 1) { + continue; + } + + if (segs[i].log < min_seg.log) { + min_seg = segs[i]; + } + } + + while (min_seg.start > 0) { + int prev = min_seg.start - 1; + if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev])) { + break; + } + + min_seg.start = prev; + min_seg.bytes += sizes[prev]; + } + + reftable_free(segs); + return min_seg; +} + +static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st) +{ + uint64_t *sizes = + reftable_calloc(sizeof(uint64_t) * st->merged->stack_len); + int version = (st->config.hash_id == GIT_SHA1_FORMAT_ID) ? 1 : 2; + int overhead = header_size(version) - 1; + int i = 0; + for (i = 0; i < st->merged->stack_len; i++) { + sizes[i] = st->readers[i]->size - overhead; + } + return sizes; +} + +int reftable_stack_auto_compact(struct reftable_stack *st) +{ + uint64_t *sizes = stack_table_sizes_for_compaction(st); + struct segment seg = + suggest_compaction_segment(sizes, st->merged->stack_len); + reftable_free(sizes); + if (segment_size(&seg) > 0) + return stack_compact_range_stats(st, seg.start, seg.end - 1, + NULL); + + return 0; +} + +struct reftable_compaction_stats * +reftable_stack_compaction_stats(struct reftable_stack *st) +{ + return &st->stats; +} + +int reftable_stack_read_ref(struct reftable_stack *st, const char *refname, + struct reftable_ref_record *ref) +{ + struct reftable_table tab = { NULL }; + reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st)); + return reftable_table_read_ref(&tab, refname, ref); +} + +int reftable_stack_read_log(struct reftable_stack *st, const char *refname, + struct reftable_log_record *log) +{ + struct reftable_iterator it = { NULL }; + struct reftable_merged_table *mt = reftable_stack_merged_table(st); + int err = reftable_merged_table_seek_log(mt, &it, refname); + if (err) + goto done; + + err = reftable_iterator_next_log(&it, log); + if (err) + goto done; + + if (strcmp(log->refname, refname) || + reftable_log_record_is_deletion(log)) { + err = 1; + goto done; + } + +done: + if (err) { + reftable_log_record_release(log); + } + reftable_iterator_destroy(&it); + return err; +} + +static int stack_check_addition(struct reftable_stack *st, + const char *new_tab_name) +{ + int err = 0; + struct reftable_block_source src = { NULL }; + struct reftable_reader *rd = NULL; + struct reftable_table tab = { NULL }; + struct reftable_ref_record *refs = NULL; + struct reftable_iterator it = { NULL }; + int cap = 0; + int len = 0; + int i = 0; + + if (st->config.skip_name_check) + return 0; + + err = reftable_block_source_from_file(&src, new_tab_name); + if (err < 0) + goto done; + + err = reftable_new_reader(&rd, &src, new_tab_name); + if (err < 0) + goto done; + + err = reftable_reader_seek_ref(rd, &it, ""); + if (err > 0) { + err = 0; + goto done; + } + if (err < 0) + goto done; + + while (1) { + struct reftable_ref_record ref = { NULL }; + err = reftable_iterator_next_ref(&it, &ref); + if (err > 0) { + break; + } + if (err < 0) + goto done; + + if (len >= cap) { + cap = 2 * cap + 1; + refs = reftable_realloc(refs, cap * sizeof(refs[0])); + } + + refs[len++] = ref; + } + + reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st)); + + err = validate_ref_record_addition(tab, refs, len); + +done: + for (i = 0; i < len; i++) { + reftable_ref_record_release(&refs[i]); + } + + free(refs); + reftable_iterator_destroy(&it); + reftable_reader_free(rd); + return err; +} + +static int is_table_name(const char *s) +{ + const char *dot = strrchr(s, '.'); + return dot && !strcmp(dot, ".ref"); +} + +static void remove_maybe_stale_table(struct reftable_stack *st, uint64_t max, + const char *name) +{ + int err = 0; + uint64_t update_idx = 0; + struct reftable_block_source src = { NULL }; + struct reftable_reader *rd = NULL; + struct strbuf table_path = STRBUF_INIT; + stack_filename(&table_path, st, name); + + err = reftable_block_source_from_file(&src, table_path.buf); + if (err < 0) + goto done; + + err = reftable_new_reader(&rd, &src, name); + if (err < 0) + goto done; + + update_idx = reftable_reader_max_update_index(rd); + reftable_reader_free(rd); + + if (update_idx <= max) { + unlink(table_path.buf); + } +done: + strbuf_release(&table_path); +} + +static int reftable_stack_clean_locked(struct reftable_stack *st) +{ + uint64_t max = reftable_merged_table_max_update_index( + reftable_stack_merged_table(st)); + DIR *dir = opendir(st->reftable_dir); + struct dirent *d = NULL; + if (!dir) { + return REFTABLE_IO_ERROR; + } + + while ((d = readdir(dir))) { + int i = 0; + int found = 0; + if (!is_table_name(d->d_name)) + continue; + + for (i = 0; !found && i < st->readers_len; i++) { + found = !strcmp(reader_name(st->readers[i]), d->d_name); + } + if (found) + continue; + + remove_maybe_stale_table(st, max, d->d_name); + } + + closedir(dir); + return 0; +} + +int reftable_stack_clean(struct reftable_stack *st) +{ + struct reftable_addition *add = NULL; + int err = reftable_stack_new_addition(&add, st); + if (err < 0) { + goto done; + } + + err = reftable_stack_reload(st); + if (err < 0) { + goto done; + } + + err = reftable_stack_clean_locked(st); + +done: + reftable_addition_destroy(add); + return err; +} + +int reftable_stack_print_directory(const char *stackdir, uint32_t hash_id) +{ + struct reftable_stack *stack = NULL; + struct reftable_write_options cfg = { .hash_id = hash_id }; + struct reftable_merged_table *merged = NULL; + struct reftable_table table = { NULL }; + + int err = reftable_new_stack(&stack, stackdir, cfg); + if (err < 0) + goto done; + + merged = reftable_stack_merged_table(stack); + reftable_table_from_merged_table(&table, merged); + err = reftable_table_print(&table); +done: + if (stack) + reftable_stack_destroy(stack); + return err; +} diff --git a/reftable/stack.h b/reftable/stack.h new file mode 100644 index 00000000000..f57005846e5 --- /dev/null +++ b/reftable/stack.h @@ -0,0 +1,41 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#ifndef STACK_H +#define STACK_H + +#include "system.h" +#include "reftable-writer.h" +#include "reftable-stack.h" + +struct reftable_stack { + char *list_file; + char *reftable_dir; + int disable_auto_compact; + + struct reftable_write_options config; + + struct reftable_reader **readers; + size_t readers_len; + struct reftable_merged_table *merged; + struct reftable_compaction_stats stats; +}; + +int read_lines(const char *filename, char ***lines); + +struct segment { + int start, end; + int log; + uint64_t bytes; +}; + +int fastlog2(uint64_t sz); +struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n); +struct segment suggest_compaction_segment(uint64_t *sizes, int n); + +#endif diff --git a/reftable/stack_test.c b/reftable/stack_test.c new file mode 100644 index 00000000000..7a4641ab601 --- /dev/null +++ b/reftable/stack_test.c @@ -0,0 +1,949 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "stack.h" + +#include "system.h" + +#include "reftable-reader.h" +#include "merged.h" +#include "basics.h" +#include "constants.h" +#include "record.h" +#include "test_framework.h" +#include "reftable-tests.h" + +#include +#include + +static void clear_dir(const char *dirname) +{ + struct strbuf path = STRBUF_INIT; + strbuf_addstr(&path, dirname); + remove_dir_recursively(&path, 0); + strbuf_release(&path); +} + +static int count_dir_entries(const char *dirname) +{ + DIR *dir = opendir(dirname); + int len = 0; + struct dirent *d; + if (dir == NULL) + return 0; + + while ((d = readdir(dir))) { + if (!strcmp(d->d_name, "..") || !strcmp(d->d_name, ".")) + continue; + len++; + } + closedir(dir); + return len; +} + +// Work linenumber into the tempdir, so we can see which tests forget to +// cleanup. +static char *get_tmp_template(int linenumber) +{ + const char *tmp = getenv("TMPDIR"); + static char template[1024]; + snprintf(template, sizeof(template) - 1, "%s/stack_test-%d.XXXXXX", + tmp ? tmp : "/tmp", linenumber); + return template; +} + +static char *get_tmp_dir(int linenumber) +{ + char *dir = get_tmp_template(linenumber); + EXPECT(mkdtemp(dir)); + return dir; +} + +static void test_read_file(void) +{ + char *fn = get_tmp_template(__LINE__); + int fd = mkstemp(fn); + char out[1024] = "line1\n\nline2\nline3"; + int n, err; + char **names = NULL; + char *want[] = { "line1", "line2", "line3" }; + int i = 0; + + EXPECT(fd > 0); + n = write(fd, out, strlen(out)); + EXPECT(n == strlen(out)); + err = close(fd); + EXPECT(err >= 0); + + err = read_lines(fn, &names); + EXPECT_ERR(err); + + for (i = 0; names[i]; i++) { + EXPECT(0 == strcmp(want[i], names[i])); + } + free_names(names); + remove(fn); +} + +static void test_parse_names(void) +{ + char buf[] = "line\n"; + char **names = NULL; + parse_names(buf, strlen(buf), &names); + + EXPECT(NULL != names[0]); + EXPECT(0 == strcmp(names[0], "line")); + EXPECT(NULL == names[1]); + free_names(names); +} + +static void test_names_equal(void) +{ + char *a[] = { "a", "b", "c", NULL }; + char *b[] = { "a", "b", "d", NULL }; + char *c[] = { "a", "b", NULL }; + + EXPECT(names_equal(a, a)); + EXPECT(!names_equal(a, b)); + EXPECT(!names_equal(a, c)); +} + +static int write_test_ref(struct reftable_writer *wr, void *arg) +{ + struct reftable_ref_record *ref = arg; + reftable_writer_set_limits(wr, ref->update_index, ref->update_index); + return reftable_writer_add_ref(wr, ref); +} + +struct write_log_arg { + struct reftable_log_record *log; + uint64_t update_index; +}; + +static int write_test_log(struct reftable_writer *wr, void *arg) +{ + struct write_log_arg *wla = arg; + + reftable_writer_set_limits(wr, wla->update_index, wla->update_index); + return reftable_writer_add_log(wr, wla->log); +} + +static void test_reftable_stack_add_one(void) +{ + char *dir = get_tmp_dir(__LINE__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + struct reftable_ref_record ref = { + .refname = "HEAD", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + struct reftable_ref_record dest = { NULL }; + + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT_ERR(err); + + err = reftable_stack_read_ref(st, ref.refname, &dest); + EXPECT_ERR(err); + EXPECT(0 == strcmp("master", dest.value.symref)); + + printf("testing print functionality:\n"); + err = reftable_stack_print_directory(dir, GIT_SHA1_FORMAT_ID); + EXPECT_ERR(err); + + err = reftable_stack_print_directory(dir, GIT_SHA256_FORMAT_ID); + EXPECT(err == REFTABLE_FORMAT_ERROR); + + reftable_ref_record_release(&dest); + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_uptodate(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st1 = NULL; + struct reftable_stack *st2 = NULL; + char *dir = get_tmp_dir(__LINE__); + + int err; + struct reftable_ref_record ref1 = { + .refname = "HEAD", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + struct reftable_ref_record ref2 = { + .refname = "branch2", + .update_index = 2, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + + + /* simulate multi-process access to the same stack + by creating two stacks for the same directory. + */ + err = reftable_new_stack(&st1, dir, cfg); + EXPECT_ERR(err); + + err = reftable_new_stack(&st2, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st1, &write_test_ref, &ref1); + EXPECT_ERR(err); + + err = reftable_stack_add(st2, &write_test_ref, &ref2); + EXPECT(err == REFTABLE_LOCK_ERROR); + + err = reftable_stack_reload(st2); + EXPECT_ERR(err); + + err = reftable_stack_add(st2, &write_test_ref, &ref2); + EXPECT_ERR(err); + reftable_stack_destroy(st1); + reftable_stack_destroy(st2); + clear_dir(dir); +} + +static void test_reftable_stack_transaction_api(void) +{ + char *dir = get_tmp_dir(__LINE__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + struct reftable_addition *add = NULL; + + struct reftable_ref_record ref = { + .refname = "HEAD", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + struct reftable_ref_record dest = { NULL }; + + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + reftable_addition_destroy(add); + + err = reftable_stack_new_addition(&add, st); + EXPECT_ERR(err); + + err = reftable_addition_add(add, &write_test_ref, &ref); + EXPECT_ERR(err); + + err = reftable_addition_commit(add); + EXPECT_ERR(err); + + reftable_addition_destroy(add); + + err = reftable_stack_read_ref(st, ref.refname, &dest); + EXPECT_ERR(err); + EXPECT(REFTABLE_REF_SYMREF == dest.value_type); + EXPECT(0 == strcmp("master", dest.value.symref)); + + reftable_ref_record_release(&dest); + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_validate_refname(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + char *dir = get_tmp_dir(__LINE__); + + int i; + struct reftable_ref_record ref = { + .refname = "a/b", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + char *additions[] = { "a", "a/b/c" }; + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT_ERR(err); + + for (i = 0; i < ARRAY_SIZE(additions); i++) { + struct reftable_ref_record ref = { + .refname = additions[i], + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT(err == REFTABLE_NAME_CONFLICT); + } + + reftable_stack_destroy(st); + clear_dir(dir); +} + +static int write_error(struct reftable_writer *wr, void *arg) +{ + return *((int *)arg); +} + +static void test_reftable_stack_update_index_check(void) +{ + char *dir = get_tmp_dir(__LINE__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + struct reftable_ref_record ref1 = { + .refname = "name1", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + struct reftable_ref_record ref2 = { + .refname = "name2", + .update_index = 1, + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref1); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref2); + EXPECT(err == REFTABLE_API_ERROR); + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_lock_failure(void) +{ + char *dir = get_tmp_dir(__LINE__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err, i; + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + for (i = -1; i != REFTABLE_EMPTY_TABLE_ERROR; i--) { + err = reftable_stack_add(st, &write_error, &i); + EXPECT(err == i); + } + + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_add(void) +{ + int i = 0; + int err = 0; + struct reftable_write_options cfg = { + .exact_log_message = 1, + }; + struct reftable_stack *st = NULL; + char *dir = get_tmp_dir(__LINE__); + + struct reftable_ref_record refs[2] = { { NULL } }; + struct reftable_log_record logs[2] = { { NULL } }; + int N = ARRAY_SIZE(refs); + + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + st->disable_auto_compact = 1; + + for (i = 0; i < N; i++) { + char buf[256]; + snprintf(buf, sizeof(buf), "branch%02d", i); + refs[i].refname = xstrdup(buf); + refs[i].update_index = i + 1; + refs[i].value_type = REFTABLE_REF_VAL1; + refs[i].value.val1 = reftable_malloc(GIT_SHA1_RAWSZ); + set_test_hash(refs[i].value.val1, i); + + logs[i].refname = xstrdup(buf); + logs[i].update_index = N + i + 1; + logs[i].value_type = REFTABLE_LOG_UPDATE; + + logs[i].value.update.new_hash = reftable_malloc(GIT_SHA1_RAWSZ); + logs[i].value.update.email = xstrdup("identity@invalid"); + set_test_hash(logs[i].value.update.new_hash, i); + } + + for (i = 0; i < N; i++) { + int err = reftable_stack_add(st, &write_test_ref, &refs[i]); + EXPECT_ERR(err); + } + + for (i = 0; i < N; i++) { + struct write_log_arg arg = { + .log = &logs[i], + .update_index = reftable_stack_next_update_index(st), + }; + int err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + } + + err = reftable_stack_compact_all(st, NULL); + EXPECT_ERR(err); + + for (i = 0; i < N; i++) { + struct reftable_ref_record dest = { NULL }; + + int err = reftable_stack_read_ref(st, refs[i].refname, &dest); + EXPECT_ERR(err); + EXPECT(reftable_ref_record_equal(&dest, refs + i, + GIT_SHA1_RAWSZ)); + reftable_ref_record_release(&dest); + } + + for (i = 0; i < N; i++) { + struct reftable_log_record dest = { NULL }; + int err = reftable_stack_read_log(st, refs[i].refname, &dest); + EXPECT_ERR(err); + EXPECT(reftable_log_record_equal(&dest, logs + i, + GIT_SHA1_RAWSZ)); + reftable_log_record_release(&dest); + } + + /* cleanup */ + reftable_stack_destroy(st); + for (i = 0; i < N; i++) { + reftable_ref_record_release(&refs[i]); + reftable_log_record_release(&logs[i]); + } + clear_dir(dir); +} + +static void test_reftable_stack_log_normalize(void) +{ + int err = 0; + struct reftable_write_options cfg = { + 0, + }; + struct reftable_stack *st = NULL; + char *dir = get_tmp_dir(__LINE__); + + uint8_t h1[GIT_SHA1_RAWSZ] = { 0x01 }, h2[GIT_SHA1_RAWSZ] = { 0x02 }; + + struct reftable_log_record input = { .refname = "branch", + .update_index = 1, + .value_type = REFTABLE_LOG_UPDATE, + .value = { .update = { + .new_hash = h1, + .old_hash = h2, + } } }; + struct reftable_log_record dest = { + .update_index = 0, + }; + struct write_log_arg arg = { + .log = &input, + .update_index = 1, + }; + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + input.value.update.message = "one\ntwo"; + err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT(err == REFTABLE_API_ERROR); + + input.value.update.message = "one"; + err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + + err = reftable_stack_read_log(st, input.refname, &dest); + EXPECT_ERR(err); + EXPECT(0 == strcmp(dest.value.update.message, "one\n")); + + input.value.update.message = "two\n"; + arg.update_index = 2; + err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + err = reftable_stack_read_log(st, input.refname, &dest); + EXPECT_ERR(err); + EXPECT(0 == strcmp(dest.value.update.message, "two\n")); + + /* cleanup */ + reftable_stack_destroy(st); + reftable_log_record_release(&dest); + clear_dir(dir); +} + +static void test_reftable_stack_tombstone(void) +{ + int i = 0; + char *dir = get_tmp_dir(__LINE__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + struct reftable_ref_record refs[2] = { { NULL } }; + struct reftable_log_record logs[2] = { { NULL } }; + int N = ARRAY_SIZE(refs); + struct reftable_ref_record dest = { NULL }; + struct reftable_log_record log_dest = { NULL }; + + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + /* even entries add the refs, odd entries delete them. */ + for (i = 0; i < N; i++) { + const char *buf = "branch"; + refs[i].refname = xstrdup(buf); + refs[i].update_index = i + 1; + if (i % 2 == 0) { + refs[i].value_type = REFTABLE_REF_VAL1; + refs[i].value.val1 = reftable_malloc(GIT_SHA1_RAWSZ); + set_test_hash(refs[i].value.val1, i); + } + + logs[i].refname = xstrdup(buf); + /* update_index is part of the key. */ + logs[i].update_index = 42; + if (i % 2 == 0) { + logs[i].value_type = REFTABLE_LOG_UPDATE; + logs[i].value.update.new_hash = + reftable_malloc(GIT_SHA1_RAWSZ); + set_test_hash(logs[i].value.update.new_hash, i); + logs[i].value.update.email = + xstrdup("identity@invalid"); + } + } + for (i = 0; i < N; i++) { + int err = reftable_stack_add(st, &write_test_ref, &refs[i]); + EXPECT_ERR(err); + } + + for (i = 0; i < N; i++) { + struct write_log_arg arg = { + .log = &logs[i], + .update_index = reftable_stack_next_update_index(st), + }; + int err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + } + + err = reftable_stack_read_ref(st, "branch", &dest); + EXPECT(err == 1); + reftable_ref_record_release(&dest); + + err = reftable_stack_read_log(st, "branch", &log_dest); + EXPECT(err == 1); + reftable_log_record_release(&log_dest); + + err = reftable_stack_compact_all(st, NULL); + EXPECT_ERR(err); + + err = reftable_stack_read_ref(st, "branch", &dest); + EXPECT(err == 1); + + err = reftable_stack_read_log(st, "branch", &log_dest); + EXPECT(err == 1); + reftable_ref_record_release(&dest); + reftable_log_record_release(&log_dest); + + /* cleanup */ + reftable_stack_destroy(st); + for (i = 0; i < N; i++) { + reftable_ref_record_release(&refs[i]); + reftable_log_record_release(&logs[i]); + } + clear_dir(dir); +} + +static void test_reftable_stack_hash_id(void) +{ + char *dir = get_tmp_dir(__LINE__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + + struct reftable_ref_record ref = { + .refname = "master", + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "target", + .update_index = 1, + }; + struct reftable_write_options cfg32 = { .hash_id = GIT_SHA256_FORMAT_ID }; + struct reftable_stack *st32 = NULL; + struct reftable_write_options cfg_default = { 0 }; + struct reftable_stack *st_default = NULL; + struct reftable_ref_record dest = { NULL }; + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT_ERR(err); + + /* can't read it with the wrong hash ID. */ + err = reftable_new_stack(&st32, dir, cfg32); + EXPECT(err == REFTABLE_FORMAT_ERROR); + + /* check that we can read it back with default config too. */ + err = reftable_new_stack(&st_default, dir, cfg_default); + EXPECT_ERR(err); + + err = reftable_stack_read_ref(st_default, "master", &dest); + EXPECT_ERR(err); + + EXPECT(reftable_ref_record_equal(&ref, &dest, GIT_SHA1_RAWSZ)); + reftable_ref_record_release(&dest); + reftable_stack_destroy(st); + reftable_stack_destroy(st_default); + clear_dir(dir); +} + +static void test_log2(void) +{ + EXPECT(1 == fastlog2(3)); + EXPECT(2 == fastlog2(4)); + EXPECT(2 == fastlog2(5)); +} + +static void test_sizes_to_segments(void) +{ + uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 }; + /* .................0 1 2 3 4 5 */ + + int seglen = 0; + struct segment *segs = + sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes)); + EXPECT(segs[2].log == 3); + EXPECT(segs[2].start == 5); + EXPECT(segs[2].end == 6); + + EXPECT(segs[1].log == 2); + EXPECT(segs[1].start == 2); + EXPECT(segs[1].end == 5); + reftable_free(segs); +} + +static void test_sizes_to_segments_empty(void) +{ + int seglen = 0; + struct segment *segs = sizes_to_segments(&seglen, NULL, 0); + EXPECT(seglen == 0); + reftable_free(segs); +} + +static void test_sizes_to_segments_all_equal(void) +{ + uint64_t sizes[] = { 5, 5 }; + + int seglen = 0; + struct segment *segs = + sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes)); + EXPECT(seglen == 1); + EXPECT(segs[0].start == 0); + EXPECT(segs[0].end == 2); + reftable_free(segs); +} + +static void test_suggest_compaction_segment(void) +{ + uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 }; + /* .................0 1 2 3 4 5 6 */ + struct segment min = + suggest_compaction_segment(sizes, ARRAY_SIZE(sizes)); + EXPECT(min.start == 2); + EXPECT(min.end == 7); +} + +static void test_suggest_compaction_segment_nothing(void) +{ + uint64_t sizes[] = { 64, 32, 16, 8, 4, 2 }; + struct segment result = + suggest_compaction_segment(sizes, ARRAY_SIZE(sizes)); + EXPECT(result.start == result.end); +} + +static void test_reflog_expire(void) +{ + char *dir = get_tmp_dir(__LINE__); + + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + struct reftable_log_record logs[20] = { { NULL } }; + int N = ARRAY_SIZE(logs) - 1; + int i = 0; + int err; + struct reftable_log_expiry_config expiry = { + .time = 10, + }; + struct reftable_log_record log = { NULL }; + + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + for (i = 1; i <= N; i++) { + char buf[256]; + snprintf(buf, sizeof(buf), "branch%02d", i); + + logs[i].refname = xstrdup(buf); + logs[i].update_index = i; + logs[i].value_type = REFTABLE_LOG_UPDATE; + logs[i].value.update.time = i; + logs[i].value.update.new_hash = reftable_malloc(GIT_SHA1_RAWSZ); + logs[i].value.update.email = xstrdup("identity@invalid"); + set_test_hash(logs[i].value.update.new_hash, i); + } + + for (i = 1; i <= N; i++) { + struct write_log_arg arg = { + .log = &logs[i], + .update_index = reftable_stack_next_update_index(st), + }; + int err = reftable_stack_add(st, &write_test_log, &arg); + EXPECT_ERR(err); + } + + err = reftable_stack_compact_all(st, NULL); + EXPECT_ERR(err); + + err = reftable_stack_compact_all(st, &expiry); + EXPECT_ERR(err); + + err = reftable_stack_read_log(st, logs[9].refname, &log); + EXPECT(err == 1); + + err = reftable_stack_read_log(st, logs[11].refname, &log); + EXPECT_ERR(err); + + expiry.min_update_index = 15; + err = reftable_stack_compact_all(st, &expiry); + EXPECT_ERR(err); + + err = reftable_stack_read_log(st, logs[14].refname, &log); + EXPECT(err == 1); + + err = reftable_stack_read_log(st, logs[16].refname, &log); + EXPECT_ERR(err); + + /* cleanup */ + reftable_stack_destroy(st); + for (i = 0; i <= N; i++) { + reftable_log_record_release(&logs[i]); + } + clear_dir(dir); + reftable_log_record_release(&log); +} + +static int write_nothing(struct reftable_writer *wr, void *arg) +{ + reftable_writer_set_limits(wr, 1, 1); + return 0; +} + +static void test_empty_add(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + int err; + char *dir = get_tmp_dir(__LINE__); + + struct reftable_stack *st2 = NULL; + + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_add(st, &write_nothing, NULL); + EXPECT_ERR(err); + + err = reftable_new_stack(&st2, dir, cfg); + EXPECT_ERR(err); + clear_dir(dir); + reftable_stack_destroy(st); + reftable_stack_destroy(st2); +} + +static void test_reftable_stack_auto_compaction(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st = NULL; + char *dir = get_tmp_dir(__LINE__); + + int err, i; + int N = 100; + + err = reftable_new_stack(&st, dir, cfg); + EXPECT_ERR(err); + + for (i = 0; i < N; i++) { + char name[100]; + struct reftable_ref_record ref = { + .refname = name, + .update_index = reftable_stack_next_update_index(st), + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + snprintf(name, sizeof(name), "branch%04d", i); + + err = reftable_stack_add(st, &write_test_ref, &ref); + EXPECT_ERR(err); + + EXPECT(i < 3 || st->merged->stack_len < 2 * fastlog2(i)); + } + + EXPECT(reftable_stack_compaction_stats(st)->entries_written < + (uint64_t)(N * fastlog2(N))); + + reftable_stack_destroy(st); + clear_dir(dir); +} + +static void test_reftable_stack_compaction_concurrent(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st1 = NULL, *st2 = NULL; + char *dir = get_tmp_dir(__LINE__); + + int err, i; + int N = 3; + + err = reftable_new_stack(&st1, dir, cfg); + EXPECT_ERR(err); + + for (i = 0; i < N; i++) { + char name[100]; + struct reftable_ref_record ref = { + .refname = name, + .update_index = reftable_stack_next_update_index(st1), + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + snprintf(name, sizeof(name), "branch%04d", i); + + err = reftable_stack_add(st1, &write_test_ref, &ref); + EXPECT_ERR(err); + } + + err = reftable_new_stack(&st2, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_compact_all(st1, NULL); + EXPECT_ERR(err); + + reftable_stack_destroy(st1); + reftable_stack_destroy(st2); + + EXPECT(count_dir_entries(dir) == 2); + clear_dir(dir); +} + +static void unclean_stack_close(struct reftable_stack *st) +{ + // break abstraction boundary to simulate unclean shutdown. + int i = 0; + for (; i < st->readers_len; i++) { + reftable_reader_free(st->readers[i]); + } + st->readers_len = 0; + FREE_AND_NULL(st->readers); +} + +static void test_reftable_stack_compaction_concurrent_clean(void) +{ + struct reftable_write_options cfg = { 0 }; + struct reftable_stack *st1 = NULL, *st2 = NULL, *st3 = NULL; + char *dir = get_tmp_dir(__LINE__); + + int err, i; + int N = 3; + + err = reftable_new_stack(&st1, dir, cfg); + EXPECT_ERR(err); + + for (i = 0; i < N; i++) { + char name[100]; + struct reftable_ref_record ref = { + .refname = name, + .update_index = reftable_stack_next_update_index(st1), + .value_type = REFTABLE_REF_SYMREF, + .value.symref = "master", + }; + snprintf(name, sizeof(name), "branch%04d", i); + + err = reftable_stack_add(st1, &write_test_ref, &ref); + EXPECT_ERR(err); + } + + err = reftable_new_stack(&st2, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_compact_all(st1, NULL); + EXPECT_ERR(err); + + unclean_stack_close(st1); + unclean_stack_close(st2); + + err = reftable_new_stack(&st3, dir, cfg); + EXPECT_ERR(err); + + err = reftable_stack_clean(st3); + EXPECT_ERR(err); + EXPECT(count_dir_entries(dir) == 2); + + reftable_stack_destroy(st1); + reftable_stack_destroy(st2); + reftable_stack_destroy(st3); + + clear_dir(dir); +} + +int stack_test_main(int argc, const char *argv[]) +{ + RUN_TEST(test_empty_add); + RUN_TEST(test_log2); + RUN_TEST(test_names_equal); + RUN_TEST(test_parse_names); + RUN_TEST(test_read_file); + RUN_TEST(test_reflog_expire); + RUN_TEST(test_reftable_stack_add); + RUN_TEST(test_reftable_stack_add_one); + RUN_TEST(test_reftable_stack_auto_compaction); + RUN_TEST(test_reftable_stack_compaction_concurrent); + RUN_TEST(test_reftable_stack_compaction_concurrent_clean); + RUN_TEST(test_reftable_stack_hash_id); + RUN_TEST(test_reftable_stack_lock_failure); + RUN_TEST(test_reftable_stack_log_normalize); + RUN_TEST(test_reftable_stack_tombstone); + RUN_TEST(test_reftable_stack_transaction_api); + RUN_TEST(test_reftable_stack_update_index_check); + RUN_TEST(test_reftable_stack_uptodate); + RUN_TEST(test_reftable_stack_validate_refname); + RUN_TEST(test_sizes_to_segments); + RUN_TEST(test_sizes_to_segments_all_equal); + RUN_TEST(test_sizes_to_segments_empty); + RUN_TEST(test_suggest_compaction_segment); + RUN_TEST(test_suggest_compaction_segment_nothing); + return 0; +} diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index c8db6852c35..996da85f7b5 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -10,6 +10,7 @@ int cmd__reftable(int argc, const char **argv) record_test_main(argc, argv); refname_test_main(argc, argv); readwrite_test_main(argc, argv); + stack_test_main(argc, argv); tree_test_main(argc, argv); return 0; } From patchwork Mon Aug 30 14:57:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465425 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B5ACAC432BE for ; Mon, 30 Aug 2021 14:58:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9E72160E77 for ; Mon, 30 Aug 2021 14:58:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237357AbhH3O7T (ORCPT ); Mon, 30 Aug 2021 10:59:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55582 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237337AbhH3O7D (ORCPT ); Mon, 30 Aug 2021 10:59:03 -0400 Received: from mail-wm1-x336.google.com (mail-wm1-x336.google.com [IPv6:2a00:1450:4864:20::336]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6F63AC061760 for ; Mon, 30 Aug 2021 07:58:09 -0700 (PDT) Received: by mail-wm1-x336.google.com with SMTP id f9-20020a05600c1549b029025b0f5d8c6cso14934869wmg.4 for ; Mon, 30 Aug 2021 07:58:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=vRAQxRV4mWm5KzO2JMO1K4ORWwbDDHcdGl0emEa8Wlg=; b=CQVxCFdHMNtfYLEwhiAzel4ofhYO3Yww6NSsj631uwUi+4ftdw6+jPmK17vFPWUiVc BD40rtn+KE/Qd610D0F29Ur3DE0Az4dnKCXeIZDbRC7BL3osuDwwGEcu0pFXoj5IS6ub 7+fgAoomHUgh2L+S7J5LfcHmtD32LFlG7Jd5koTr+wh4soMmvtxWuep72TfSRMoad9/0 VZ70AN4IB5jFuBmI4DhQ/qxBJojj6h0HSmUJIaZKJXLhKOWW8v3uouUJlbMWx4p/fcUs kDeZyY/d+UikOr5rYZzqi/sa9tvIxE/Hqfasy6HDgU12ODsuFLol3Mm9B915Jvtcf+Uz 2BwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=vRAQxRV4mWm5KzO2JMO1K4ORWwbDDHcdGl0emEa8Wlg=; b=IrTIP1rcmbSmtFEom7xikrk/B3eYRjdEqqp0QpLi8AZ2QeiZe9Fo7MAtg1mjZIGePq cdD9Dg1jy8uyVZA4ZBMU8thxnkwSJ914ljUfSnEZ+nDPAgGG691KYE67GTEQs2Rvu/0u SfLAAcmHSRNIBcwKBG/6OUoTikFUIWNL0frSvUlg3Pw88IkV6Id7ZHrX11mo/8CA2rh+ QnopUbsiWDEK6ISLeGQ8kYbTAbq1EuX3UJSDvwR8sjuURD4ErlIfougvllFgC0HMb352 jbWktx6E6ufq60BVk2D0wOlFnquTUqg/5Uy1zlN5r46f5nxst9gyJH892gNXfk7Kgl00 kKJw== X-Gm-Message-State: AOAM533RD361FV2t8DHM4BlnKm50gBrsW+C69ToJbfiOMYF5vY1t/Ntq u5ZfNdpWwLdQ4Xm49kfaChn/NyVqXbw= X-Google-Smtp-Source: ABdhPJyWuP6GCi/D+GZyQgafevuacOhEieQCBrr/QnvZCEgRjFEMRA2+XnkTRGYgJERrulKNdqcyDQ== X-Received: by 2002:a1c:23c7:: with SMTP id j190mr33739577wmj.1.1630335488090; Mon, 30 Aug 2021 07:58:08 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c7sm13427072wmq.13.2021.08.30.07.58.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:07 -0700 (PDT) Message-Id: <0170f20b5e75fad4894d935eccb5eb0c0af31e77.1630335476.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:54 +0000 Subject: [PATCH 18/19] reftable: add dump utility MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys provide a command-line utility for inspecting individual tables, and inspecting a complete ref database Signed-off-by: Han-Wen Nienhuys Helped-by: Carlo Marcelo Arenas Belón --- reftable/dump.c | 107 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 107 insertions(+) create mode 100644 reftable/dump.c diff --git a/reftable/dump.c b/reftable/dump.c new file mode 100644 index 00000000000..155953d1b82 --- /dev/null +++ b/reftable/dump.c @@ -0,0 +1,107 @@ +/* +Copyright 2020 Google LLC + +Use of this source code is governed by a BSD-style +license that can be found in the LICENSE file or at +https://developers.google.com/open-source/licenses/bsd +*/ + +#include "git-compat-util.h" +#include "hash.h" + +#include "reftable-blocksource.h" +#include "reftable-error.h" +#include "reftable-merged.h" +#include "reftable-record.h" +#include "reftable-tests.h" +#include "reftable-writer.h" +#include "reftable-iterator.h" +#include "reftable-reader.h" +#include "reftable-stack.h" +#include "reftable-generic.h" + +#include +#include +#include +#include +#include + +static int compact_stack(const char *stackdir) +{ + struct reftable_stack *stack = NULL; + struct reftable_write_options cfg = { 0 }; + + int err = reftable_new_stack(&stack, stackdir, cfg); + if (err < 0) + goto done; + + err = reftable_stack_compact_all(stack, NULL); + if (err < 0) + goto done; +done: + if (stack) { + reftable_stack_destroy(stack); + } + return err; +} + +static void print_help(void) +{ + printf("usage: dump [-cst] arg\n\n" + "options: \n" + " -c compact\n" + " -t dump table\n" + " -s dump stack\n" + " -6 sha256 hash format\n" + " -h this help\n" + "\n"); +} + +int reftable_dump_main(int argc, char *const *argv) +{ + int err = 0; + int opt_dump_table = 0; + int opt_dump_stack = 0; + int opt_compact = 0; + uint32_t opt_hash_id = GIT_SHA1_FORMAT_ID; + const char *arg = NULL, *argv0 = argv[0]; + + for (; argc > 1; argv++, argc--) + if (*argv[1] != '-') + break; + else if (!strcmp("-t", argv[1])) + opt_dump_table = 1; + else if (!strcmp("-6", argv[1])) + opt_hash_id = GIT_SHA256_FORMAT_ID; + else if (!strcmp("-s", argv[1])) + opt_dump_stack = 1; + else if (!strcmp("-c", argv[1])) + opt_compact = 1; + else if (!strcmp("-?", argv[1]) || !strcmp("-h", argv[1])) { + print_help(); + return 2; + } + + if (argc != 2) { + fprintf(stderr, "need argument\n"); + print_help(); + return 2; + } + + arg = argv[1]; + + if (opt_dump_table) { + err = reftable_reader_print_file(arg); + } else if (opt_dump_stack) { + err = reftable_stack_print_directory(arg, opt_hash_id); + } else if (opt_compact) { + err = compact_stack(arg); + } + + if (err < 0) { + fprintf(stderr, "%s: %s: %s\n", argv0, arg, + reftable_error_str(err)); + return 1; + } + return 0; +} From patchwork Mon Aug 30 14:57:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Han-Wen Nienhuys X-Patchwork-Id: 12465427 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1ADBC4320E for ; Mon, 30 Aug 2021 14:58:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8A36060E97 for ; Mon, 30 Aug 2021 14:58:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237335AbhH3O7Y (ORCPT ); Mon, 30 Aug 2021 10:59:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55592 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237342AbhH3O7D (ORCPT ); Mon, 30 Aug 2021 10:59:03 -0400 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F24DEC06179A for ; Mon, 30 Aug 2021 07:58:09 -0700 (PDT) Received: by mail-wr1-x42c.google.com with SMTP id u9so22850099wrg.8 for ; Mon, 30 Aug 2021 07:58:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=1Tx5k9d8cN1SY5mIofDnF0yHom8a5/TcLCrLRdI22f0=; b=c3pfcNHh5hd69A6lrmjNL0ZAdkCB/Sngz3e5fR/6FedL9/DwL37p+gXvud+tgtUiP4 kbOzfJkk8zYhA3ro2FeccDc6kKYpHXrP/FUZ7nfwIpm0T2bPmG6fkWbhlTUeeYVSIEF5 J2g5wzDhj+KoDRbGaHN8kjud/X8TuaBCPpnmcvYkRBrXiZYi2+bEgHdkDIO/mlWIFJfI iERgyQHRu8GrDe3NWrPV+FGA/ALIGVtoE685x54kNGce5vauItw4I2L3C7TxS4IdQQ0K f0amVuQJ8/DDpY1RAOhJx7uIpNKGzYGdDQ+kQvTCgl8YBbrM+MHhS/t9Waffu36HG5vb J+gA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=1Tx5k9d8cN1SY5mIofDnF0yHom8a5/TcLCrLRdI22f0=; b=tLKoS2rdICNr8N1Q6LrjPQWw6RY8pc6A/y4GXkqQsERb9wYXLlS/Ywn9gyy5krIWlb A/CY4hritYdFUz844JTXvY3jl5CORNGm0Kqf/B2IPFl+mO3VaGnO0zt0fgHZKFVn1TDv FoEtNxACk2VT/vjXwpRjmTKIzpqqO7AaXqMLUZxNvXjSQB/JqHM/3v6MnuuD2BU2tX/s QomtXcA3bHjY6d0YE3+aP4wSMW/jENuNI7/+hwDKUGz7thkoarjTcoaci4iYF0z0WPhj XCAjACEL/IR5HO7InfeOH9T6DtX6nt01YkpvXAwZjyPZRjxSZtE32AwEGR9d81c1Qaua A46Q== X-Gm-Message-State: AOAM5333vnwwNkhg+sxi2uSe6o2u+6vCE+4M6ABpOxTPuIzwXnEI2PgL E2viZ2HoyKGQNsZCZaXIyzwv12Wm6mw= X-Google-Smtp-Source: ABdhPJw1889oiMMeJS6RdPqECnOjX9cDaNJejHYdMWeBKgnQLpEF8p8RKp4o0jbiBB7MKyRMoGL8nw== X-Received: by 2002:adf:82b0:: with SMTP id 45mr26927171wrc.161.1630335488681; Mon, 30 Aug 2021 07:58:08 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c15sm11400200wmr.8.2021.08.30.07.58.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Aug 2021 07:58:08 -0700 (PDT) Message-Id: <4fc6b5cbf1674907f9b36cc6bd678d289b83b48c.1630335476.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 30 Aug 2021 14:57:55 +0000 Subject: [PATCH 19/19] Add "test-tool dump-reftable" command. Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Han-Wen Nienhuys Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Han-Wen Nienhuys From: Han-Wen Nienhuys This command dumps individual tables or a stack of of tables. Signed-off-by: Han-Wen Nienhuys --- Makefile | 1 + t/helper/test-reftable.c | 5 +++++ t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + 4 files changed, 8 insertions(+) diff --git a/Makefile b/Makefile index 8be7e05ff99..3d80975c706 100644 --- a/Makefile +++ b/Makefile @@ -2474,6 +2474,7 @@ REFTABLE_OBJS += reftable/writer.o REFTABLE_TEST_OBJS += reftable/basics_test.o REFTABLE_TEST_OBJS += reftable/block_test.o +REFTABLE_TEST_OBJS += reftable/dump.o REFTABLE_TEST_OBJS += reftable/merged_test.o REFTABLE_TEST_OBJS += reftable/pq_test.o REFTABLE_TEST_OBJS += reftable/record_test.o diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 996da85f7b5..26b03d7b789 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -14,3 +14,8 @@ int cmd__reftable(int argc, const char **argv) tree_test_main(argc, argv); return 0; } + +int cmd__dump_reftable(int argc, const char **argv) +{ + return reftable_dump_main(argc, (char *const *)argv); +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index f7c888ffda7..338a57b104d 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -61,6 +61,7 @@ static struct test_cmd cmds[] = { { "read-midx", cmd__read_midx }, { "ref-store", cmd__ref_store }, { "reftable", cmd__reftable }, + { "dump-reftable", cmd__dump_reftable }, { "regex", cmd__regex }, { "repository", cmd__repository }, { "revision-walking", cmd__revision_walking }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index 25f77469146..48cee1f4a2d 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -19,6 +19,7 @@ int cmd__dump_cache_tree(int argc, const char **argv); int cmd__dump_fsmonitor(int argc, const char **argv); int cmd__dump_split_index(int argc, const char **argv); int cmd__dump_untracked_cache(int argc, const char **argv); +int cmd__dump_reftable(int argc, const char **argv); int cmd__example_decorate(int argc, const char **argv); int cmd__fast_rebase(int argc, const char **argv); int cmd__genrandom(int argc, const char **argv);