mem-pool: fix big allocations

Message ID	fa89d269-1a23-4ed6-bebc-30c0b629f444@web.de (mailing list archive)
State	Superseded
Headers	show Received: from mout.web.de (mout.web.de [212.227.17.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AA67D7691D for <git@vger.kernel.org>; Thu, 21 Dec 2023 23:13:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=web.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=web.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=web.de header.i=l.s.r@web.de header.b="Vef+6wW/" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=web.de; s=s29768273; t=1703200397; x=1703805197; i=l.s.r@web.de; bh=2zLetmXicq0i2JGreYXYW4ybSs2WLr8SfUsTc9ndOkg=; h=X-UI-Sender-Class:Date:To:From:Subject:Cc; b=Vef+6wW/bt4NNYbfMQqC/hrSzIOJ1rKyTdx8RSaKChgGGTjxS24LB5vqWP/6cIBN bYrImJBNjVC79F96gXooNhf+JhHMyfSJiBJTh6DworXJFo7S1ZK0yOqccqixqJuuA dO1jmD+p0cI0Ksdt4DZK4QtLL8/Igd0qTWjDPeFz8D8sxRE1uC2yrrXtrWBdKsWB7 p/J6+XxVDZSXeNdawCCL/npco7HLyQ5UjQpcwTy1r0a6RA70Xh+7mhJCeX6xxjdOE S774PSrJZ+jgiNyvubi5nheF0DSj9LYSHnnCESUgGgZIOXPJfOL7tsKZFiC/M6srp 7B//4I1ZBt1nRKDOMA== X-UI-Sender-Class: 814a7b36-bfc1-4dae-8640-3722d8ec6cd6 Received: from [192.168.178.29] ([79.203.23.9]) by smtp.web.de (mrweb106 [213.165.67.124]) with ESMTPSA (Nemesis) id 1MK56y-1rZdXH0ku6-00LvBz; Fri, 22 Dec 2023 00:13:17 +0100 Message-ID: <fa89d269-1a23-4ed6-bebc-30c0b629f444@web.de> Date: Fri, 22 Dec 2023 00:13:16 +0100 Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: <git.vger.kernel.org> List-Subscribe: <mailto:git+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:git+unsubscribe@vger.kernel.org> MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Content-Language: en-US To: Git List <git@vger.kernel.org> From: =?utf-8?q?Ren=C3=A9_Scharfe?= <l.s.r@web.de> Subject: [PATCH] mem-pool: fix big allocations Cc: Jameson Miller <jamill@microsoft.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Provags-ID: V03:K1:5vaxK9jgSq6Dg2YAJerUpDueOxBaCrl1BsEr5/L7PrNmbBZYnAm 1cxFGmh4xKe4hYaC8YsU0m1tn1aYaOtiZN41Amv7uVZSVp6w3QSIkNCq9Cl2mvZ91zHT73y /zFrGmHhAgcWnoAsoYcW+dUNXig49PSyb+ofFfTcLw/SxQ+NG3Oc1l6ONZJ8L9lDa/C6O/S zSsaT8LYsakdE21ZEMlzg== X-Spam-Flag: NO UI-OutboundReport: notjunk:1;M01:P0:5d68egDDhLM=;eK2n/MMFCCtkfguQFoFacV1jG0x YdWNT5Z4uVRhOJJikjapY8yItUnltHWOk7/JmlfV0EKg1ZESIfYi+zST9fr21BVCeYs56bhTZ YItQHsSbP0mtbIS55F8Q7q/33bG+4B6N11i6Jjpn2a7jI3h1wuEbCv6KnQ8aDnZ4muo8XaBEl iz4Xsubw3LTBo6+JZWfLIqlB0VlAR6y9SjkhDKqb4efj6+P5NJW4YCtF+3Ssz7fq3KUCHis9s DkF5gtb0xj05oME9fIfhHUNb9EWPzRRQeHMHpPZ6HKSP7rRRSAm4QZOXpZFek2JQ20GXL3n9g MRsb9BVJzbSUMq9TTcDs+DnJ2l4GryOCKkP6fBgndxJbuPFHmu6NyF+Ks9ULlXU3H9fKRp5PP lKvlzL6OYVawRHgZlizRSCrkyUd65YWlSw1+3TbSUEycRKzyRPQR+cB4bUYbu0U3ZKmxDIqEb 7XB8LqZOFk1lshH63m9hMu2rZ/l+Kcp/LRLBzqlbBhBVA5mcBIjKhM5mXlY98xPzwFG/GqNuc xKszwBNuwC1beKqkOBy8XFWESR+qwzYAkvhjbqOPw30wHD9m61VszPnoV6d9bZ5R1i3IbCxC7 0Cedv+jgoY/jAB15INGEvgm6+kMRqcqnqWEBuXdZDqbafRXA3LED0AbcpSzW6dbjqU0skplqX vKI5O+bWIXxuzOHvSK9JSiZ+Xkd7L/obe5kLgHeLk3ORnO256Gz1R02kaq6P1sjgsNBV75HlU WtR2LDgv36X9o0Ooi6AQMXbdq/I1MrHyeACTZEqBwDDaowGV+5aQWBUC6ovM/lmoJSOHlCAMY GDxT6o5CZfpJsOSN9kbx74CAAv3IjUD64t/F+iQ4hE9GGmVM6pmmAcfs8rRSLx7tL2cARU64x f+119+LA8/F4/V531YLRutpB7Ft5NGii8zL4e2mkYeWrnZ30Lkx7n1bK1EyRqrVz0xZfX+pXQ Okn+EA==
Series	mem-pool: fix big allocations \| expand mem-pool: fix big allocations

René Scharfe Dec. 21, 2023, 11:13 p.m. UTC

Memory pool allocations that require a new block and would fill at
least half of it are handled specially.  Before 158dfeff3d (mem-pool:
add life cycle management functions, 2018-07-02) they used to be
allocated outside of the pool.  This patch made mem_pool_alloc() create
a bespoke block instead, to allow releasing it when the pool gets
discarded.

Unfortunately mem_pool_alloc() returns a pointer to the start of such a
bespoke block, i.e. to the struct mp_block at its top.  When the caller
writes to it, the management information gets corrupted.  This affects
mem_pool_discard() and -- if there are no other blocks in the pool --
also mem_pool_alloc().

Return the payload pointer of bespoke blocks, just like for smaller
allocations, to protect the management struct.

Also update next_free to mark the block as full.  This is only strictly
necessary for the first allocated block, because subsequent ones are
inserted after the current block and never considered for further
allocations, but it's easier to just do it in all cases.

Add a basic unit test to demonstate the issue by using mem_pool_calloc()
with a tiny block size, which forces the creation of a bespoke block.
Without the mem_pool_alloc() fix it reports zeroed pointers:

   ok 1 - mem_pool_calloc returns 100 zeroed bytes with big block
   # check "((pool->mp_block->next_free) != (((void*)0))) == 1" failed at t/unit-tests/t-mem-pool.c:22
   #    left: 0
   #   right: 1
   # check "((pool->mp_block->end) != (((void*)0))) == 1" failed at t/unit-tests/t-mem-pool.c:23
   #    left: 0
   #   right: 1
   not ok 2 - mem_pool_calloc returns 100 zeroed bytes with tiny block
   1..2

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 Makefile                  |  1 +
 mem-pool.c                |  6 +++---
 t/unit-tests/t-mem-pool.c | 34 ++++++++++++++++++++++++++++++++++
 3 files changed, 38 insertions(+), 3 deletions(-)
 create mode 100644 t/unit-tests/t-mem-pool.c

--
2.43.0

Elijah Newren Dec. 24, 2023, 3:11 a.m. UTC | #1

On Thu, Dec 21, 2023 at 3:13 PM René Scharfe <l.s.r@web.de> wrote:
>
> Memory pool allocations that require a new block and would fill at
> least half of it are handled specially.  Before 158dfeff3d (mem-pool:
> add life cycle management functions, 2018-07-02) they used to be
> allocated outside of the pool.  This patch made mem_pool_alloc() create
> a bespoke block instead, to allow releasing it when the pool gets
> discarded.
>
> Unfortunately mem_pool_alloc() returns a pointer to the start of such a
> bespoke block, i.e. to the struct mp_block at its top.  When the caller
> writes to it, the management information gets corrupted.  This affects
> mem_pool_discard() and -- if there are no other blocks in the pool --
> also mem_pool_alloc().
>
> Return the payload pointer of bespoke blocks, just like for smaller
> allocations, to protect the management struct.
>
> Also update next_free to mark the block as full.  This is only strictly
> necessary for the first allocated block, because subsequent ones are
> inserted after the current block and never considered for further
> allocations, but it's easier to just do it in all cases.
>
> Add a basic unit test to demonstate the issue by using mem_pool_calloc()

demonstrate (missing 'r')?

> with a tiny block size, which forces the creation of a bespoke block.
> Without the mem_pool_alloc() fix it reports zeroed pointers:
>
>    ok 1 - mem_pool_calloc returns 100 zeroed bytes with big block
>    # check "((pool->mp_block->next_free) != (((void*)0))) == 1" failed at t/unit-tests/t-mem-pool.c:22
>    #    left: 0
>    #   right: 1
>    # check "((pool->mp_block->end) != (((void*)0))) == 1" failed at t/unit-tests/t-mem-pool.c:23
>    #    left: 0
>    #   right: 1
>    not ok 2 - mem_pool_calloc returns 100 zeroed bytes with tiny block
>    1..2
>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  Makefile                  |  1 +
>  mem-pool.c                |  6 +++---
>  t/unit-tests/t-mem-pool.c | 34 ++++++++++++++++++++++++++++++++++
>  3 files changed, 38 insertions(+), 3 deletions(-)
>  create mode 100644 t/unit-tests/t-mem-pool.c
>
> diff --git a/Makefile b/Makefile
> index 88ba7a3c51..15990ff312 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1340,6 +1340,7 @@ THIRD_PARTY_SOURCES += sha1collisiondetection/%
>  THIRD_PARTY_SOURCES += sha1dc/%
>
>  UNIT_TEST_PROGRAMS += t-basic
> +UNIT_TEST_PROGRAMS += t-mem-pool
>  UNIT_TEST_PROGRAMS += t-strbuf
>  UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
>  UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
> diff --git a/mem-pool.c b/mem-pool.c
> index c34846d176..e8d976c3ee 100644
> --- a/mem-pool.c
> +++ b/mem-pool.c
> @@ -99,9 +99,9 @@ void *mem_pool_alloc(struct mem_pool *pool, size_t len)
>
>         if (!p) {
>                 if (len >= (pool->block_alloc / 2))
> -                       return mem_pool_alloc_block(pool, len, pool->mp_block);
> -
> -               p = mem_pool_alloc_block(pool, pool->block_alloc, NULL);
> +                       p = mem_pool_alloc_block(pool, len, pool->mp_block);
> +               else
> +                       p = mem_pool_alloc_block(pool, pool->block_alloc, NULL);
>         }
>
>         r = p->next_free;
> diff --git a/t/unit-tests/t-mem-pool.c b/t/unit-tests/t-mem-pool.c
> new file mode 100644
> index 0000000000..2295779b0b
> --- /dev/null
> +++ b/t/unit-tests/t-mem-pool.c
> @@ -0,0 +1,34 @@
> +#include "test-lib.h"
> +#include "mem-pool.h"
> +
> +#define check_ptr(a, op, b) check_int(((a) op (b)), ==, 1)
> +
> +static void setup_static(void (*f)(struct mem_pool *), size_t block_alloc)
> +{
> +       struct mem_pool pool = { .block_alloc = block_alloc };
> +       f(&pool);
> +       mem_pool_discard(&pool, 0);
> +}
> +
> +static void t_calloc_100(struct mem_pool *pool)
> +{
> +       size_t size = 100;
> +       char *buffer = mem_pool_calloc(pool, 1, size);
> +       for (size_t i = 0; i < size; i++)
> +               check_int(buffer[i], ==, 0);
> +       if (!check_ptr(pool->mp_block, !=, NULL))
> +               return;
> +       check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end);
> +       check_ptr(pool->mp_block->next_free, !=, NULL);
> +       check_ptr(pool->mp_block->end, !=, NULL);
> +}
> +
> +int cmd_main(int argc, const char **argv)
> +{
> +       TEST(setup_static(t_calloc_100, 1024 * 1024),
> +            "mem_pool_calloc returns 100 zeroed bytes with big block");
> +       TEST(setup_static(t_calloc_100, 1),
> +            "mem_pool_calloc returns 100 zeroed bytes with tiny block");
> +
> +       return test_done();
> +}
> --
> 2.43.0

Nice catch; looks good to me.  Out of curiosity, how'd you find the issue?

René Scharfe Dec. 24, 2023, 9:30 a.m. UTC | #2

Am 24.12.23 um 04:11 schrieb Elijah Newren:
> On Thu, Dec 21, 2023 at 3:13 PM René Scharfe <l.s.r@web.de> wrote:
>>
>> Add a basic unit test to demonstate the issue by using mem_pool_calloc()
>
> demonstrate (missing 'r')?

Yes.  I didn't intend to mention any demons or states, let alone a
demon state. *facepalm*

> Nice catch; looks good to me.  Out of curiosity, how'd you find the issue?

I was working on using a memory pool for name-rev.  I played with
different values for block_alloc, not fully sure why anymore -- either
for checking the performance impact or testing robustness.

René

Phillip Wood Dec. 28, 2023, 3:10 p.m. UTC | #3

Hi René

On 21/12/2023 23:13, René Scharfe wrote:
> +#define check_ptr(a, op, b) check_int(((a) op (b)), ==, 1)
> [...]
> +static void t_calloc_100(struct mem_pool *pool)
> +{
> +	size_t size = 100;
> +	char *buffer = mem_pool_calloc(pool, 1, size);
> +	for (size_t i = 0; i < size; i++)
> +		check_int(buffer[i], ==, 0);
> +	if (!check_ptr(pool->mp_block, !=, NULL))
> +		return;
> +	check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end);
> +	check_ptr(pool->mp_block->next_free, !=, NULL);
> +	check_ptr(pool->mp_block->end, !=, NULL);
> +}

It's great to see the unit test framework being used here. I wonder
though if it would be simpler just to use

	check(ptr != NULL)

as I'm not sure what the check_ptr() macro adds. The diff at the end of
this email shows a possible implementation of a check_ptr() macro for
the unit test library. I'm wary of adding it though because I'm not sure
printing the pointer values is actually very useful most of the
time. I'm also concerned that the rules around pointer arithmetic and
comparisons mean that many pointer tests such as

     check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end);

will be undefined if they fail. The documentation for check_ptr() below
tries to illustrate that concern. If the compiler can prove that a check
is undefined when that check fails it is at liberty to hard code the
test as passing. In practice I think most failing pointer comparisons
would fall into the category of "this is undefined but the compiler
can't prove it" but that doesn't really make me any happier.

Best Wishes

Phillip

---- >8 ----
diff --git a/t/unit-tests/test-lib.h b/t/unit-tests/test-lib.h
index a8f07ae0b7..ecd1fce17d 100644
--- a/t/unit-tests/test-lib.h
+++ b/t/unit-tests/test-lib.h
@@ -99,6 +99,39 @@ int check_int_loc(const char *loc, const char *check, int ok,
  int check_uint_loc(const char *loc, const char *check, int ok,
  		   uintmax_t a, uintmax_t b);
  
+/*
+ * Compare two pointers. Prints a message with the two values if the
+ * comparison fails. NB this is not thread safe.
+ *
+ * Use this with care. The rules around pointer arithmetic and comparison
+ * in C are quite strict and violating them results in undefined behavior
+ * To avoid a failing comparison resulting undefined behavior we compare
+ * the integer value of the pointers. While this avoids undefined
+ * behavior in the comparison in many cases a failing test will be the
+ * result of creating an invalid pointer in a way that violates the
+ * rules on pointer arithmetic. For example if `start` and `end` are
+ * pointers to the beginning and end of an allocation and `offset` is an
+ * integer then
+ *
+ *     check_ptr(start + offset, <=, end)
+ *
+ * is undefined when `offset` is larger than `end - start`. Rewriting the
+ * comparison as
+ *
+ *     check_uint(offset, <=, end - start)
+ *
+ * avoids undefined behavior when offset is too large, but is still
+ * undefined if there is a bug that means `start` and `end` do not point
+ * to the same allocation.
+ */
+#define check_ptr(a, op, b)						\
+	(test__tmp[0].p = (a), test__tmp[1].p = (b),			\
+	 check_ptr_loc(TEST_LOCATION(), #a" "#op" "#b,			\
+		       (uintptr_t)test__tmp[0].p op (uintptr_t)test__tmp[1].p,	\
+			test__tmp[0].p, test__tmp[1].p))
+
+int check_ptr_loc(const char *loc, const char *check, int ok, void *a, void *b);
+
  /*
   * Compare two chars. Prints a message with the two values if the
   * comparison fails. NB this is not thread safe.
@@ -133,6 +166,7 @@ int check_str_loc(const char *loc, const char *check,
  #define TEST__MAKE_LOCATION(line) __FILE__ ":" TEST__STR(line)
  
  union test__tmp {
+	void *p;
  	intmax_t i;
  	uintmax_t u;
  	char c;
diff --git a/t/unit-tests/test-lib.c b/t/unit-tests/test-lib.c
index 7bf9dfdb95..cb757edbd8 100644
--- a/t/unit-tests/test-lib.c
+++ b/t/unit-tests/test-lib.c
@@ -311,6 +311,18 @@ int check_uint_loc(const char *loc, const char *check, int ok,
  	return ret;
  }
  
+int check_ptr_loc(const char *loc, const char *check, int ok, void *a, void *b)
+{
+	int ret = test_assert(loc, check, ok);
+
+	if (!ret) {
+		test_msg("   left: %p", a);
+		test_msg("  right: %p", b);
+	}
+
+	return ret;
+}
+
  static void print_one_char(char ch, char quote)
  {
  	if ((unsigned char)ch < 0x20u || ch == 0x7f) {

René Scharfe Dec. 28, 2023, 4:05 p.m. UTC | #4

Am 28.12.23 um 16:10 schrieb Phillip Wood:
> Hi René
>
> On 21/12/2023 23:13, René Scharfe wrote:
>> +#define check_ptr(a, op, b) check_int(((a) op (b)), ==, 1)
>> [...]
>> +static void t_calloc_100(struct mem_pool *pool)
>> +{
>> +    size_t size = 100;
>> +    char *buffer = mem_pool_calloc(pool, 1, size);
>> +    for (size_t i = 0; i < size; i++)
>> +        check_int(buffer[i], ==, 0);
>> +    if (!check_ptr(pool->mp_block, !=, NULL))
>> +        return;
>> +    check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end);
>> +    check_ptr(pool->mp_block->next_free, !=, NULL);
>> +    check_ptr(pool->mp_block->end, !=, NULL);
>> +}
>
> It's great to see the unit test framework being used here. I wonder
> though if it would be simpler just to use
>
>     check(ptr != NULL)

Yes, that's better.

> as I'm not sure what the check_ptr() macro adds. The diff at the end of
> this email shows a possible implementation of a check_ptr() macro for
> the unit test library. I'm wary of adding it though because I'm not sure
> printing the pointer values is actually very useful most of the
> time. I'm also concerned that the rules around pointer arithmetic and
> comparisons mean that many pointer tests such as
>
>     check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end);
>
> will be undefined if they fail.

True, the compiler could legally emit mush when it finds out that the
pointers are for different objects.  And the error being fixed produces
such unrelated pointer pairs -- oops.

This check is not important here, we can just drop it.

mem_pool_contains() has the same problem, by the way.

Restricting ourselves to only equality comparisons for pointers prevents
some interesting sanity checks, though.  Casting to intptr_t or
uintptr_t would allow arbitrary comparisons without risk of undefined
behavior, though.  Perhaps that would make a check_ptr() macro viable
and useful.

René

Phillip Wood Dec. 28, 2023, 4:48 p.m. UTC | #5

On 28/12/2023 16:05, René Scharfe wrote:
> Am 28.12.23 um 16:10 schrieb Phillip Wood:
>> The diff at the end of
>> this email shows a possible implementation of a check_ptr() macro for
>> the unit test library. I'm wary of adding it though because I'm not sure
>> printing the pointer values is actually very useful most of the
>> time. I'm also concerned that the rules around pointer arithmetic and
>> comparisons mean that many pointer tests such as
>>
>>      check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end);
>>
>> will be undefined if they fail.
> 
> True, the compiler could legally emit mush when it finds out that the
> pointers are for different objects.  And the error being fixed produces
> such unrelated pointer pairs -- oops.
> 
> This check is not important here, we can just drop it.
> 
> mem_pool_contains() has the same problem, by the way.
> 
> Restricting ourselves to only equality comparisons for pointers prevents
> some interesting sanity checks, though.  Casting to intptr_t or
> uintptr_t would allow arbitrary comparisons without risk of undefined
> behavior, though.  Perhaps that would make a check_ptr() macro viable
> and useful.

That certainly helps and the check_ptr() macro in my previous email 
casts the pointers to uintptr_t before comparing them. Maybe I'm 
worrying too much, but my concern is that in a failing comparison it is 
likely one of the pointers is invalid (for example it is the result of 
some undefined pointer arithmetic) and the program is undefined from the 
point the invalid pointer is created. The documentation for check_ptr() 
in my previous mail contains the following example

     For example if `start` and `end` are pointers to the beginning and
     end of an allocation and `offset` is an integer then

         check_ptr(start + offset, <=, end)

     is undefined when `offset` is larger than `end - start`. Rewriting
     the comparison as

         check_uint(offset, <=, end - start)

     avoids undefined behavior when offset is too large, but is still
     undefined if there is a bug that means `start` and `end` do not
     point to the same allocation.

I agree it would be nice to allow arbitrary pointer comparisons but it 
would be good to do it in a way that does not expose us to undefined 
behavior. I'm not sure what the right balance is here.

Best Wishes

Phillip

René Scharfe Dec. 28, 2023, 6:56 p.m. UTC | #6

Am 28.12.23 um 17:48 schrieb phillip.wood123@gmail.com:
> On 28/12/2023 16:05, René Scharfe wrote:
>> Am 28.12.23 um 16:10 schrieb Phillip Wood:
>>> The diff at the end of
>>> this email shows a possible implementation of a check_ptr() macro for
>>> the unit test library. I'm wary of adding it though because I'm not sure
>>> printing the pointer values is actually very useful most of the
>>> time. I'm also concerned that the rules around pointer arithmetic and
>>> comparisons mean that many pointer tests such as
>>>
>>>      check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end);
>>>
>>> will be undefined if they fail.
>>
>> True, the compiler could legally emit mush when it finds out that the
>> pointers are for different objects.  And the error being fixed produces
>> such unrelated pointer pairs -- oops.
>>
>> This check is not important here, we can just drop it.
>>
>> mem_pool_contains() has the same problem, by the way.
>>
>> Restricting ourselves to only equality comparisons for pointers prevents
>> some interesting sanity checks, though.  Casting to intptr_t or
>> uintptr_t would allow arbitrary comparisons without risk of undefined
>> behavior, though.  Perhaps that would make a check_ptr() macro viable
>> and useful.
>
> That certainly helps and the check_ptr() macro in my previous email
> casts the pointers to uintptr_t before comparing them. Maybe I'm
> worrying too much, but my concern is that in a failing comparison it
> is likely one of the pointers is invalid (for example it is the
> result of some undefined pointer arithmetic) and the program is
> undefined from the point the invalid pointer is created.

There are no restrictions on integer comparisons.  So comparing after
casting to uintptr_t should not invoke undefined behavior.  If undefined
behavior was involved in calculating the pointers in the first place
then the compiler might still legally go crazy, but not due to the
comparison.  Right?

Whether the result of a uintptr_t-cast comparison of pointers to
different objects is meaningful is a different question.  Hopefully
range checks are possible.

> The
> documentation for check_ptr() in my previous mail contains the
> following example
>
>     For example if `start` and `end` are pointers to the beginning and
>     end of an allocation and `offset` is an integer then
>
>         check_ptr(start + offset, <=, end)
>
>     is undefined when `offset` is larger than `end - start`. Rewriting
>     the comparison as
>
>         check_uint(offset, <=, end - start)
>
>     avoids undefined behavior when offset is too large, but is still
>     undefined if there is a bug that means `start` and `end` do not
>     point to the same allocation.

True, but in such a unit test we'd need additional checks verifying
that start and end belong to the same object.  Or perhaps use a
numerical size instead of an end pointer.

René

Phillip Wood Dec. 28, 2023, 7:34 p.m. UTC | #7

On 28/12/2023 18:56, René Scharfe wrote:
> Am 28.12.23 um 17:48 schrieb phillip.wood123@gmail.com:
>> On 28/12/2023 16:05, René Scharfe wrote:
>>> Am 28.12.23 um 16:10 schrieb Phillip Wood:
>>>> The diff at the end of
>>>> this email shows a possible implementation of a check_ptr() macro for
>>>> the unit test library. I'm wary of adding it though because I'm not sure
>>>> printing the pointer values is actually very useful most of the
>>>> time. I'm also concerned that the rules around pointer arithmetic and
>>>> comparisons mean that many pointer tests such as
>>>>
>>>>       check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end);
>>>>
>>>> will be undefined if they fail.
>>>
>>> True, the compiler could legally emit mush when it finds out that the
>>> pointers are for different objects.  And the error being fixed produces
>>> such unrelated pointer pairs -- oops.
>>>
>>> This check is not important here, we can just drop it.
>>>
>>> mem_pool_contains() has the same problem, by the way.
>>>
>>> Restricting ourselves to only equality comparisons for pointers prevents
>>> some interesting sanity checks, though.  Casting to intptr_t or
>>> uintptr_t would allow arbitrary comparisons without risk of undefined
>>> behavior, though.  Perhaps that would make a check_ptr() macro viable
>>> and useful.
>>
>> That certainly helps and the check_ptr() macro in my previous email
>> casts the pointers to uintptr_t before comparing them. Maybe I'm
>> worrying too much, but my concern is that in a failing comparison it
>> is likely one of the pointers is invalid (for example it is the
>> result of some undefined pointer arithmetic) and the program is
>> undefined from the point the invalid pointer is created.
> 
> There are no restrictions on integer comparisons.  So comparing after
> casting to uintptr_t should not invoke undefined behavior.  If undefined
> behavior was involved in calculating the pointers in the first place
> then the compiler might still legally go crazy, but not due to the
> comparison.  Right?

Exactly, my worry is that if the comparison fails it is likely that 
there will have been undefined behavior involved in calculating the 
pointer before we get to the comparison in which case so casting to 
uintptr_t in the comparison does not help.

> Whether the result of a uintptr_t-cast comparison of pointers to
> different objects is meaningful is a different question.  Hopefully
> range checks are possible.
> 
>> The
>> documentation for check_ptr() in my previous mail contains the
>> following example
>>
>>      For example if `start` and `end` are pointers to the beginning and
>>      end of an allocation and `offset` is an integer then
>>
>>          check_ptr(start + offset, <=, end)
>>
>>      is undefined when `offset` is larger than `end - start`. Rewriting
>>      the comparison as
>>
>>          check_uint(offset, <=, end - start)
>>
>>      avoids undefined behavior when offset is too large, but is still
>>      undefined if there is a bug that means `start` and `end` do not
>>      point to the same allocation.
> 
> True, but in such a unit test we'd need additional checks verifying
> that start and end belong to the same object.  Or perhaps use a
> numerical size instead of an end pointer.

Agreed, but I think the implication is that there will be cases we 
should be using check_uint() as in the second comparison above rather 
than check_ptr() as in the first comparison above. I'm not opposed to 
adding check_ptr() if we think it will be useful but I am worried it is 
easy to misuse it. If we do add check_ptr() we should have some 
guidelines about when it makes sense to use it.

Best Wishes

Phillip

René Scharfe Dec. 29, 2023, 6:53 a.m. UTC | #8

Am 28.12.23 um 20:34 schrieb phillip.wood123@gmail.com:
>
> Exactly, my worry is that if the comparison fails it is likely that
> there will have been undefined behavior involved in calculating the
> pointer before we get to the comparison in which case so casting to
> uintptr_t in the comparison does not help.
If there's undefined behavior (UB) somewhere in the test or the tested
unit then the compiler could skip any checks and report success anyway.
Not adding UB in the test framework and tests is the least we can do.

Perhaps disabling link-time optimization would allow us to shield the
unit tests from UB in the tested code, in the sense that the compiler
would then not be able to skip checks.

René

mem-pool: fix big allocations

Commit Message

Comments

Patch