Message ID | bc0bf65553c8dd89bf5fcaa592fc3427507f1993.1715336798.git.ps@pks.im (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | reftable: expose write options as config | expand |
Patrick Steinhardt <ps@pks.im> writes: > + > +reftable.restartInterval:: > + The interval at which to create restart points. The reftable backend > + determines the restart points at file creation. The process is > + arbitrary, but every 16 or 64 records is recommended. Every 16 may be It is unclear what exactly "The process is arbitrary, but" wants to say, especially the use of the noun "process". The process the user uses to choose the inteval value is? The default value chosen by us was arbitrary and out of thin air? Just striking the whole sentence (or removing up to ", but" part and starting the sentence with "Every 16 or 64") may make the resulting paragraph easier to follow, I suspect. > + } else if (!strcmp(var, "reftable.restartinterval")) { > + unsigned long restart_interval = git_config_ulong(var, value, ctx->kvi); > + if (restart_interval > UINT16_MAX) > + die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX); OK.
On Fri, May 10, 2024 at 02:57:46PM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > + > > +reftable.restartInterval:: > > + The interval at which to create restart points. The reftable backend > > + determines the restart points at file creation. The process is > > + arbitrary, but every 16 or 64 records is recommended. Every 16 may be > > It is unclear what exactly "The process is arbitrary, but" wants to > say, especially the use of the noun "process". The process the user > uses to choose the inteval value is? The default value chosen by us > was arbitrary and out of thin air? The latter is what I wanted to say, but I agree that it's hard to parse. And honestly, I don't even know how arbitrary it is, so I should probably not claim something like this in the first place. > Just striking the whole sentence (or removing up to ", but" part and > starting the sentence with "Every 16 or 64") may make the resulting > paragraph easier to follow, I suspect. Will do. Patrick
diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt index fa7c4be014..16b915c75e 100644 --- a/Documentation/config/reftable.txt +++ b/Documentation/config/reftable.txt @@ -12,3 +12,22 @@ readers during access. + The largest block size is `16777215` bytes (15.99 MiB). The default value is `4096` bytes (4kB). A value of `0` will use the default value. + +reftable.restartInterval:: + The interval at which to create restart points. The reftable backend + determines the restart points at file creation. The process is + arbitrary, but every 16 or 64 records is recommended. Every 16 may be + more suitable for smaller block sizes (4k or 8k), every 64 for larger + block sizes (64k). ++ +More frequent restart points reduces prefix compression and increases +space consumed by the restart table, both of which increase file size. ++ +Less frequent restart points makes prefix compression more effective, +decreasing overall file size, with increased penalties for readers +walking through more records after the binary search step. ++ +A maximum of `65535` restart points per block is supported. ++ +The default value is to create restart points every 16 records. A value of `0` +will use the default value. diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index bd9999cefc..9972dfc1a3 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -242,6 +242,11 @@ static int reftable_be_config(const char *var, const char *value, if (block_size > 16777215) die("reftable block size cannot exceed 16MB"); opts->block_size = block_size; + } else if (!strcmp(var, "reftable.restartinterval")) { + unsigned long restart_interval = git_config_ulong(var, value, ctx->kvi); + if (restart_interval > UINT16_MAX) + die("reftable block size cannot exceed %u", (unsigned)UINT16_MAX); + opts->restart_interval = restart_interval; } return 0; diff --git a/t/t0613-reftable-write-options.sh b/t/t0613-reftable-write-options.sh index 8bdbc6ec70..e0a5b26f58 100755 --- a/t/t0613-reftable-write-options.sh +++ b/t/t0613-reftable-write-options.sh @@ -171,4 +171,47 @@ test_expect_success 'block size exceeding maximum supported size' ' ) ' +test_expect_success 'restart interval at every single record' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + for i in $(test_seq 10) + do + printf "update refs/heads/branch-%d HEAD\n" "$i" || + return 1 + done >input && + git update-ref --stdin <input && + git -c reftable.restartInterval=1 pack-refs && + + cat >expect <<-EOF && + header: + block_size: 4096 + ref: + - length: 566 + restarts: 13 + log: + - length: 1393 + restarts: 12 + EOF + test-tool dump-reftable -b .git/reftable/*.ref >actual && + test_cmp expect actual + ) +' + +test_expect_success 'restart interval exceeding maximum supported interval' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + cat >expect <<-EOF && + fatal: reftable block size cannot exceed 65535 + EOF + test_must_fail git -c reftable.restartInterval=65536 pack-refs 2>err && + test_cmp expect err + ) +' + test_done
Add a new option `reftable.restartInterval` that allows the user to control the restart interval when writing reftable records used by the reftable library. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- Documentation/config/reftable.txt | 19 ++++++++++++++ refs/reftable-backend.c | 5 ++++ t/t0613-reftable-write-options.sh | 43 +++++++++++++++++++++++++++++++ 3 files changed, 67 insertions(+)