diff mbox series

[v2] reftable/writer: ensure valid range for log's update_index

Message ID 20241206-424-reftable-writer-add-check-for-limits-v2-1-82ca350b10be@gmail.com (mailing list archive)
State New
Headers show
Series [v2] reftable/writer: ensure valid range for log's update_index | expand

Commit Message

karthik nayak Dec. 6, 2024, 1:13 p.m. UTC
Each reftable addition has an associated update_index. While writing
refs, the update_index is verified to be within the range of the
reftable writer, i.e. `writer.min_update_index <= ref.update_index` and
`writer.max_update_index => ref.update_index`.

The corresponding check for reflogs in `reftable_writer_add_log` is
however missing. Add a similar check, but only check for the upper
limit. This is because reflogs are treated a bit differently than refs.
Each reflog entry in reftable has an associated update_index and we also
allow expiring entries in the middle, which is done by simply writing a
new reflog entry with the same update_index. This means, writing reflog
entries with update_index lesser than the writer's update_index is an
expected scenario.

Add a new unit test to check for the limits and fix some of the existing
tests, which were setting arbitrary values for the update_index by
ensuring they stay within the now checked limits.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
Changes in v2:
- Update the commit message to have an inclusive rannge.
- Update the unit test to also include an update with index below range.
- Link to v1: https://lore.kernel.org/r/20241205-424-reftable-writer-add-check-for-limits-v1-1-b287b055204e@gmail.com
---
 reftable/writer.c                   | 12 ++++++++++
 t/unit-tests/t-reftable-readwrite.c | 47 +++++++++++++++++++++++++++++++++++--
 t/unit-tests/t-reftable-stack.c     |  8 +++++--
 3 files changed, 63 insertions(+), 4 deletions(-)


---

Range-diff versus v1:

1:  e30bdaeeee ! 1:  1c3cb5e92b reftable/writer: ensure valid range for log's update_index
    @@ Commit message
     
         Each reftable addition has an associated update_index. While writing
         refs, the update_index is verified to be within the range of the
    -    reftable writer, i.e. `writer.min_update_index < ref.update_index` and
    -    `writer.max_update_index > ref.update_index`.
    +    reftable writer, i.e. `writer.min_update_index <= ref.update_index` and
    +    `writer.max_update_index => ref.update_index`.
     
         The corresponding check for reflogs in `reftable_writer_add_log` is
         however missing. Add a similar check, but only check for the upper
    @@ t/unit-tests/t-reftable-readwrite.c: static void t_log_overflow(void)
     +	struct reftable_writer *w = t_reftable_strbuf_writer(&buf, &opts);
     +	struct reftable_log_record log = {
     +		.refname = (char *)"refs/head/master",
    -+		.update_index = 1,
    ++		.update_index = 0,
     +		.value_type = REFTABLE_LOG_UPDATE,
     +		.value = {
     +			.update = {
    @@ t/unit-tests/t-reftable-readwrite.c: static void t_log_overflow(void)
     +	};
     +	int err;
     +
    -+	reftable_writer_set_limits(w, 1, 2);
    ++	reftable_writer_set_limits(w, 1, 1);
     +
    ++	/* write with update_index (0) below set limits (1, 1) */
     +	err = reftable_writer_add_log(w, &log);
     +	check_int(err, ==, 0);
     +
    -+	log.update_index = 2;
    ++	/* write with update_index (1) in the set limits (1, 1) */
    ++	log.update_index = 1;
     +	err = reftable_writer_add_log(w, &log);
     +	check_int(err, ==, 0);
     +
    ++	/* write with update_index (3) above set limits (1, 1) */
     +	log.update_index = 3;
     +	err = reftable_writer_add_log(w, &log);
     +	check_int(err, ==, REFTABLE_API_ERROR);


--- 

base-commit: cc01bad4a9f566cf4453c7edd6b433851b0835e2
change-id: 20241120-424-reftable-writer-add-check-for-limits-01b1fb703528

Thanks
- Karthik

Comments

Junio C Hamano Dec. 6, 2024, 11:11 p.m. UTC | #1
Karthik Nayak <karthik.188@gmail.com> writes:

> The corresponding check for reflogs in `reftable_writer_add_log` is
> however missing. Add a similar check, but only check for the upper
> limit. This is because reflogs are treated a bit differently than refs.
> Each reflog entry in reftable has an associated update_index and we also
> allow expiring entries in the middle, which is done by simply writing a
> new reflog entry with the same update_index. This means, writing reflog
> entries with update_index lesser than the writer's update_index is an
> expected scenario.
>
> Add a new unit test to check for the limits and fix some of the existing
> tests, which were setting arbitrary values for the update_index by
> ensuring they stay within the now checked limits.

Interesting.  I am a little curious how this was found.  As long as
the other parts of the system (i.e. the callers) are not buggy, the
update pointer will stay within the range, and that is why I do not
think I can write an end-to-end test to trigger an existing bug that
would be caught by this "fix".  Fixing the existing unit tests that
feed a wrong input and expect some right outcome is good, but would
it be a good to also have a new unit test that validates that such
an incorrect precondition for calling is caught and the caller gets
the expected REFTABLE_API_ERROR as a result, I wonder?  Being able
to trigger a condition that is harder to do with the end-to-end test
is one good thing about having unit test framework, no?

Will queue.  Thanks.
karthik nayak Dec. 7, 2024, 9:54 a.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> Karthik Nayak <karthik.188@gmail.com> writes:
>
>> The corresponding check for reflogs in `reftable_writer_add_log` is
>> however missing. Add a similar check, but only check for the upper
>> limit. This is because reflogs are treated a bit differently than refs.
>> Each reflog entry in reftable has an associated update_index and we also
>> allow expiring entries in the middle, which is done by simply writing a
>> new reflog entry with the same update_index. This means, writing reflog
>> entries with update_index lesser than the writer's update_index is an
>> expected scenario.
>>
>> Add a new unit test to check for the limits and fix some of the existing
>> tests, which were setting arbitrary values for the update_index by
>> ensuring they stay within the now checked limits.
>
> Interesting.  I am a little curious how this was found.  As long as
> the other parts of the system (i.e. the callers) are not buggy, the
> update pointer will stay within the range, and that is why I do not
> think I can write an end-to-end test to trigger an existing bug that
> would be caught by this "fix".

I was reading the reftable code and noticed it. So mostly luck. Agreed
with what you're saying, I'd say this is mostly a 'safeguard'.

> Fixing the existing unit tests that
> feed a wrong input and expect some right outcome is good, but would
> it be a good to also have a new unit test that validates that such
> an incorrect precondition for calling is caught and the caller gets
> the expected REFTABLE_API_ERROR as a result, I wonder?

I'm confused, I did add a _new_ test in this patch to do exactly what
you're suggesting.

> Being able
> to trigger a condition that is harder to do with the end-to-end test
> is one good thing about having unit test framework, no?
>

Totally, these kind of specific changes are perfect for unit-tests. Plus
it was very simple to add one too.

> Will queue.  Thanks.

Thanks!

Karthik
Toon Claes Dec. 11, 2024, 1:10 p.m. UTC | #3
karthik nayak <karthik.188@gmail.com> writes:

> I was reading the reftable code and noticed it. So mostly luck. Agreed
> with what you're saying, I'd say this is mostly a 'safeguard'.

I was wondering as well, thanks for that info.

> Totally, these kind of specific changes are perfect for unit-tests. Plus
> it was very simple to add one too.

I appreciate you're doing this, that makes it easier to reason about
these changes.

>> Will queue.  Thanks.

Junio, thank you for the quick turnarond on this patch. For what it's
worth, I've given this a review and I approve these changes.
diff mbox series

Patch

diff --git a/reftable/writer.c b/reftable/writer.c
index fd136794d5a27b33b5017f36fbd6b095ab8dac5b..f87086777cd20a9890a63f10c5d6932310dd5610 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -412,6 +412,18 @@  int reftable_writer_add_log(struct reftable_writer *w,
 	if (log->value_type == REFTABLE_LOG_DELETION)
 		return reftable_writer_add_log_verbatim(w, log);
 
+	/*
+	 * Verify only the upper limit of the update_index. Each reflog entry
+	 * is tied to a specific update_index. Entries in the reflog can be
+	 * replaced by adding a new entry with the same update_index,
+	 * effectively canceling the old one.
+	 *
+	 * Consequently, reflog updates may include update_index values lower
+	 * than the writer's min_update_index.
+	 */
+	if (log->update_index > w->max_update_index)
+		return REFTABLE_API_ERROR;
+
 	if (!log->refname)
 		return REFTABLE_API_ERROR;
 
diff --git a/t/unit-tests/t-reftable-readwrite.c b/t/unit-tests/t-reftable-readwrite.c
index d279b86df0aeda11b3fb4d2c15803760ae394941..7ffbd78c6759fc85ab9d5a2909d4d20662fc56c7 100644
--- a/t/unit-tests/t-reftable-readwrite.c
+++ b/t/unit-tests/t-reftable-readwrite.c
@@ -90,7 +90,7 @@  static void t_log_buffer_size(void)
 	int i;
 	struct reftable_log_record
 		log = { .refname = (char *) "refs/heads/master",
-			.update_index = 0xa,
+			.update_index = update_index,
 			.value_type = REFTABLE_LOG_UPDATE,
 			.value = { .update = {
 					   .name = (char *) "Han-Wen Nienhuys",
@@ -127,7 +127,7 @@  static void t_log_overflow(void)
 	int err;
 	struct reftable_log_record log = {
 		.refname = (char *) "refs/heads/master",
-		.update_index = 0xa,
+		.update_index = update_index,
 		.value_type = REFTABLE_LOG_UPDATE,
 		.value = {
 			.update = {
@@ -151,6 +151,48 @@  static void t_log_overflow(void)
 	reftable_buf_release(&buf);
 }
 
+static void t_log_write_limits(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct reftable_buf buf = REFTABLE_BUF_INIT;
+	struct reftable_writer *w = t_reftable_strbuf_writer(&buf, &opts);
+	struct reftable_log_record log = {
+		.refname = (char *)"refs/head/master",
+		.update_index = 0,
+		.value_type = REFTABLE_LOG_UPDATE,
+		.value = {
+			.update = {
+				.old_hash = { 1 },
+				.new_hash = { 2 },
+				.name = (char *)"Han-Wen Nienhuys",
+				.email = (char *)"hanwen@google.com",
+				.tz_offset = 100,
+				.time = 0x5e430672,
+			},
+		},
+	};
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	/* write with update_index (0) below set limits (1, 1) */
+	err = reftable_writer_add_log(w, &log);
+	check_int(err, ==, 0);
+
+	/* write with update_index (1) in the set limits (1, 1) */
+	log.update_index = 1;
+	err = reftable_writer_add_log(w, &log);
+	check_int(err, ==, 0);
+
+	/* write with update_index (3) above set limits (1, 1) */
+	log.update_index = 3;
+	err = reftable_writer_add_log(w, &log);
+	check_int(err, ==, REFTABLE_API_ERROR);
+
+	reftable_writer_free(w);
+	reftable_buf_release(&buf);
+}
+
 static void t_log_write_read(void)
 {
 	struct reftable_write_options opts = {
@@ -917,6 +959,7 @@  int cmd_main(int argc UNUSED, const char *argv[] UNUSED)
 	TEST(t_corrupt_table_empty(), "read-write on an empty table");
 	TEST(t_log_buffer_size(), "buffer extension for log compression");
 	TEST(t_log_overflow(), "log overflow returns expected error");
+	TEST(t_log_write_limits(), "writer limits for writing log records");
 	TEST(t_log_write_read(), "read-write on log records");
 	TEST(t_log_zlib_corruption(), "reading corrupted log record returns expected error");
 	TEST(t_table_read_api(), "read on a table");
diff --git a/t/unit-tests/t-reftable-stack.c b/t/unit-tests/t-reftable-stack.c
index 72f6747064f62106e87c9a90e5fe315139d46e60..52b81475c36aa94ff09f3bf77a7d23992957deb4 100644
--- a/t/unit-tests/t-reftable-stack.c
+++ b/t/unit-tests/t-reftable-stack.c
@@ -770,8 +770,12 @@  static void t_reftable_stack_tombstone(void)
 		}
 
 		logs[i].refname = xstrdup(buf);
-		/* update_index is part of the key. */
-		logs[i].update_index = 42;
+		/*
+		 * update_index is part of the key so should be constant.
+		 * The value itself should be less than the writer's upper
+		 * limit.
+		 */
+		logs[i].update_index = 1;
 		if (i % 2 == 0) {
 			logs[i].value_type = REFTABLE_LOG_UPDATE;
 			t_reftable_set_hash(logs[i].value.update.new_hash, i,