Message ID | fa89d269-1a23-4ed6-bebc-30c0b629f444@web.de (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | mem-pool: fix big allocations | expand |
On Thu, Dec 21, 2023 at 3:13 PM René Scharfe <l.s.r@web.de> wrote: > > Memory pool allocations that require a new block and would fill at > least half of it are handled specially. Before 158dfeff3d (mem-pool: > add life cycle management functions, 2018-07-02) they used to be > allocated outside of the pool. This patch made mem_pool_alloc() create > a bespoke block instead, to allow releasing it when the pool gets > discarded. > > Unfortunately mem_pool_alloc() returns a pointer to the start of such a > bespoke block, i.e. to the struct mp_block at its top. When the caller > writes to it, the management information gets corrupted. This affects > mem_pool_discard() and -- if there are no other blocks in the pool -- > also mem_pool_alloc(). > > Return the payload pointer of bespoke blocks, just like for smaller > allocations, to protect the management struct. > > Also update next_free to mark the block as full. This is only strictly > necessary for the first allocated block, because subsequent ones are > inserted after the current block and never considered for further > allocations, but it's easier to just do it in all cases. > > Add a basic unit test to demonstate the issue by using mem_pool_calloc() demonstrate (missing 'r')? > with a tiny block size, which forces the creation of a bespoke block. > Without the mem_pool_alloc() fix it reports zeroed pointers: > > ok 1 - mem_pool_calloc returns 100 zeroed bytes with big block > # check "((pool->mp_block->next_free) != (((void*)0))) == 1" failed at t/unit-tests/t-mem-pool.c:22 > # left: 0 > # right: 1 > # check "((pool->mp_block->end) != (((void*)0))) == 1" failed at t/unit-tests/t-mem-pool.c:23 > # left: 0 > # right: 1 > not ok 2 - mem_pool_calloc returns 100 zeroed bytes with tiny block > 1..2 > > Signed-off-by: René Scharfe <l.s.r@web.de> > --- > Makefile | 1 + > mem-pool.c | 6 +++--- > t/unit-tests/t-mem-pool.c | 34 ++++++++++++++++++++++++++++++++++ > 3 files changed, 38 insertions(+), 3 deletions(-) > create mode 100644 t/unit-tests/t-mem-pool.c > > diff --git a/Makefile b/Makefile > index 88ba7a3c51..15990ff312 100644 > --- a/Makefile > +++ b/Makefile > @@ -1340,6 +1340,7 @@ THIRD_PARTY_SOURCES += sha1collisiondetection/% > THIRD_PARTY_SOURCES += sha1dc/% > > UNIT_TEST_PROGRAMS += t-basic > +UNIT_TEST_PROGRAMS += t-mem-pool > UNIT_TEST_PROGRAMS += t-strbuf > UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS)) > UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS)) > diff --git a/mem-pool.c b/mem-pool.c > index c34846d176..e8d976c3ee 100644 > --- a/mem-pool.c > +++ b/mem-pool.c > @@ -99,9 +99,9 @@ void *mem_pool_alloc(struct mem_pool *pool, size_t len) > > if (!p) { > if (len >= (pool->block_alloc / 2)) > - return mem_pool_alloc_block(pool, len, pool->mp_block); > - > - p = mem_pool_alloc_block(pool, pool->block_alloc, NULL); > + p = mem_pool_alloc_block(pool, len, pool->mp_block); > + else > + p = mem_pool_alloc_block(pool, pool->block_alloc, NULL); > } > > r = p->next_free; > diff --git a/t/unit-tests/t-mem-pool.c b/t/unit-tests/t-mem-pool.c > new file mode 100644 > index 0000000000..2295779b0b > --- /dev/null > +++ b/t/unit-tests/t-mem-pool.c > @@ -0,0 +1,34 @@ > +#include "test-lib.h" > +#include "mem-pool.h" > + > +#define check_ptr(a, op, b) check_int(((a) op (b)), ==, 1) > + > +static void setup_static(void (*f)(struct mem_pool *), size_t block_alloc) > +{ > + struct mem_pool pool = { .block_alloc = block_alloc }; > + f(&pool); > + mem_pool_discard(&pool, 0); > +} > + > +static void t_calloc_100(struct mem_pool *pool) > +{ > + size_t size = 100; > + char *buffer = mem_pool_calloc(pool, 1, size); > + for (size_t i = 0; i < size; i++) > + check_int(buffer[i], ==, 0); > + if (!check_ptr(pool->mp_block, !=, NULL)) > + return; > + check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end); > + check_ptr(pool->mp_block->next_free, !=, NULL); > + check_ptr(pool->mp_block->end, !=, NULL); > +} > + > +int cmd_main(int argc, const char **argv) > +{ > + TEST(setup_static(t_calloc_100, 1024 * 1024), > + "mem_pool_calloc returns 100 zeroed bytes with big block"); > + TEST(setup_static(t_calloc_100, 1), > + "mem_pool_calloc returns 100 zeroed bytes with tiny block"); > + > + return test_done(); > +} > -- > 2.43.0 Nice catch; looks good to me. Out of curiosity, how'd you find the issue?
Am 24.12.23 um 04:11 schrieb Elijah Newren: > On Thu, Dec 21, 2023 at 3:13 PM René Scharfe <l.s.r@web.de> wrote: >> >> Add a basic unit test to demonstate the issue by using mem_pool_calloc() > > demonstrate (missing 'r')? Yes. I didn't intend to mention any demons or states, let alone a demon state. *facepalm* > Nice catch; looks good to me. Out of curiosity, how'd you find the issue? I was working on using a memory pool for name-rev. I played with different values for block_alloc, not fully sure why anymore -- either for checking the performance impact or testing robustness. René
Hi René On 21/12/2023 23:13, René Scharfe wrote: > +#define check_ptr(a, op, b) check_int(((a) op (b)), ==, 1) > [...] > +static void t_calloc_100(struct mem_pool *pool) > +{ > + size_t size = 100; > + char *buffer = mem_pool_calloc(pool, 1, size); > + for (size_t i = 0; i < size; i++) > + check_int(buffer[i], ==, 0); > + if (!check_ptr(pool->mp_block, !=, NULL)) > + return; > + check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end); > + check_ptr(pool->mp_block->next_free, !=, NULL); > + check_ptr(pool->mp_block->end, !=, NULL); > +} It's great to see the unit test framework being used here. I wonder though if it would be simpler just to use check(ptr != NULL) as I'm not sure what the check_ptr() macro adds. The diff at the end of this email shows a possible implementation of a check_ptr() macro for the unit test library. I'm wary of adding it though because I'm not sure printing the pointer values is actually very useful most of the time. I'm also concerned that the rules around pointer arithmetic and comparisons mean that many pointer tests such as check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end); will be undefined if they fail. The documentation for check_ptr() below tries to illustrate that concern. If the compiler can prove that a check is undefined when that check fails it is at liberty to hard code the test as passing. In practice I think most failing pointer comparisons would fall into the category of "this is undefined but the compiler can't prove it" but that doesn't really make me any happier. Best Wishes Phillip ---- >8 ---- diff --git a/t/unit-tests/test-lib.h b/t/unit-tests/test-lib.h index a8f07ae0b7..ecd1fce17d 100644 --- a/t/unit-tests/test-lib.h +++ b/t/unit-tests/test-lib.h @@ -99,6 +99,39 @@ int check_int_loc(const char *loc, const char *check, int ok, int check_uint_loc(const char *loc, const char *check, int ok, uintmax_t a, uintmax_t b); +/* + * Compare two pointers. Prints a message with the two values if the + * comparison fails. NB this is not thread safe. + * + * Use this with care. The rules around pointer arithmetic and comparison + * in C are quite strict and violating them results in undefined behavior + * To avoid a failing comparison resulting undefined behavior we compare + * the integer value of the pointers. While this avoids undefined + * behavior in the comparison in many cases a failing test will be the + * result of creating an invalid pointer in a way that violates the + * rules on pointer arithmetic. For example if `start` and `end` are + * pointers to the beginning and end of an allocation and `offset` is an + * integer then + * + * check_ptr(start + offset, <=, end) + * + * is undefined when `offset` is larger than `end - start`. Rewriting the + * comparison as + * + * check_uint(offset, <=, end - start) + * + * avoids undefined behavior when offset is too large, but is still + * undefined if there is a bug that means `start` and `end` do not point + * to the same allocation. + */ +#define check_ptr(a, op, b) \ + (test__tmp[0].p = (a), test__tmp[1].p = (b), \ + check_ptr_loc(TEST_LOCATION(), #a" "#op" "#b, \ + (uintptr_t)test__tmp[0].p op (uintptr_t)test__tmp[1].p, \ + test__tmp[0].p, test__tmp[1].p)) + +int check_ptr_loc(const char *loc, const char *check, int ok, void *a, void *b); + /* * Compare two chars. Prints a message with the two values if the * comparison fails. NB this is not thread safe. @@ -133,6 +166,7 @@ int check_str_loc(const char *loc, const char *check, #define TEST__MAKE_LOCATION(line) __FILE__ ":" TEST__STR(line) union test__tmp { + void *p; intmax_t i; uintmax_t u; char c; diff --git a/t/unit-tests/test-lib.c b/t/unit-tests/test-lib.c index 7bf9dfdb95..cb757edbd8 100644 --- a/t/unit-tests/test-lib.c +++ b/t/unit-tests/test-lib.c @@ -311,6 +311,18 @@ int check_uint_loc(const char *loc, const char *check, int ok, return ret; } +int check_ptr_loc(const char *loc, const char *check, int ok, void *a, void *b) +{ + int ret = test_assert(loc, check, ok); + + if (!ret) { + test_msg(" left: %p", a); + test_msg(" right: %p", b); + } + + return ret; +} + static void print_one_char(char ch, char quote) { if ((unsigned char)ch < 0x20u || ch == 0x7f) {
Am 28.12.23 um 16:10 schrieb Phillip Wood: > Hi René > > On 21/12/2023 23:13, René Scharfe wrote: >> +#define check_ptr(a, op, b) check_int(((a) op (b)), ==, 1) >> [...] >> +static void t_calloc_100(struct mem_pool *pool) >> +{ >> + size_t size = 100; >> + char *buffer = mem_pool_calloc(pool, 1, size); >> + for (size_t i = 0; i < size; i++) >> + check_int(buffer[i], ==, 0); >> + if (!check_ptr(pool->mp_block, !=, NULL)) >> + return; >> + check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end); >> + check_ptr(pool->mp_block->next_free, !=, NULL); >> + check_ptr(pool->mp_block->end, !=, NULL); >> +} > > It's great to see the unit test framework being used here. I wonder > though if it would be simpler just to use > > check(ptr != NULL) Yes, that's better. > as I'm not sure what the check_ptr() macro adds. The diff at the end of > this email shows a possible implementation of a check_ptr() macro for > the unit test library. I'm wary of adding it though because I'm not sure > printing the pointer values is actually very useful most of the > time. I'm also concerned that the rules around pointer arithmetic and > comparisons mean that many pointer tests such as > > check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end); > > will be undefined if they fail. True, the compiler could legally emit mush when it finds out that the pointers are for different objects. And the error being fixed produces such unrelated pointer pairs -- oops. This check is not important here, we can just drop it. mem_pool_contains() has the same problem, by the way. Restricting ourselves to only equality comparisons for pointers prevents some interesting sanity checks, though. Casting to intptr_t or uintptr_t would allow arbitrary comparisons without risk of undefined behavior, though. Perhaps that would make a check_ptr() macro viable and useful. René
On 28/12/2023 16:05, René Scharfe wrote: > Am 28.12.23 um 16:10 schrieb Phillip Wood: >> The diff at the end of >> this email shows a possible implementation of a check_ptr() macro for >> the unit test library. I'm wary of adding it though because I'm not sure >> printing the pointer values is actually very useful most of the >> time. I'm also concerned that the rules around pointer arithmetic and >> comparisons mean that many pointer tests such as >> >> check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end); >> >> will be undefined if they fail. > > True, the compiler could legally emit mush when it finds out that the > pointers are for different objects. And the error being fixed produces > such unrelated pointer pairs -- oops. > > This check is not important here, we can just drop it. > > mem_pool_contains() has the same problem, by the way. > > Restricting ourselves to only equality comparisons for pointers prevents > some interesting sanity checks, though. Casting to intptr_t or > uintptr_t would allow arbitrary comparisons without risk of undefined > behavior, though. Perhaps that would make a check_ptr() macro viable > and useful. That certainly helps and the check_ptr() macro in my previous email casts the pointers to uintptr_t before comparing them. Maybe I'm worrying too much, but my concern is that in a failing comparison it is likely one of the pointers is invalid (for example it is the result of some undefined pointer arithmetic) and the program is undefined from the point the invalid pointer is created. The documentation for check_ptr() in my previous mail contains the following example For example if `start` and `end` are pointers to the beginning and end of an allocation and `offset` is an integer then check_ptr(start + offset, <=, end) is undefined when `offset` is larger than `end - start`. Rewriting the comparison as check_uint(offset, <=, end - start) avoids undefined behavior when offset is too large, but is still undefined if there is a bug that means `start` and `end` do not point to the same allocation. I agree it would be nice to allow arbitrary pointer comparisons but it would be good to do it in a way that does not expose us to undefined behavior. I'm not sure what the right balance is here. Best Wishes Phillip
Am 28.12.23 um 17:48 schrieb phillip.wood123@gmail.com: > On 28/12/2023 16:05, René Scharfe wrote: >> Am 28.12.23 um 16:10 schrieb Phillip Wood: >>> The diff at the end of >>> this email shows a possible implementation of a check_ptr() macro for >>> the unit test library. I'm wary of adding it though because I'm not sure >>> printing the pointer values is actually very useful most of the >>> time. I'm also concerned that the rules around pointer arithmetic and >>> comparisons mean that many pointer tests such as >>> >>> check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end); >>> >>> will be undefined if they fail. >> >> True, the compiler could legally emit mush when it finds out that the >> pointers are for different objects. And the error being fixed produces >> such unrelated pointer pairs -- oops. >> >> This check is not important here, we can just drop it. >> >> mem_pool_contains() has the same problem, by the way. >> >> Restricting ourselves to only equality comparisons for pointers prevents >> some interesting sanity checks, though. Casting to intptr_t or >> uintptr_t would allow arbitrary comparisons without risk of undefined >> behavior, though. Perhaps that would make a check_ptr() macro viable >> and useful. > > That certainly helps and the check_ptr() macro in my previous email > casts the pointers to uintptr_t before comparing them. Maybe I'm > worrying too much, but my concern is that in a failing comparison it > is likely one of the pointers is invalid (for example it is the > result of some undefined pointer arithmetic) and the program is > undefined from the point the invalid pointer is created. There are no restrictions on integer comparisons. So comparing after casting to uintptr_t should not invoke undefined behavior. If undefined behavior was involved in calculating the pointers in the first place then the compiler might still legally go crazy, but not due to the comparison. Right? Whether the result of a uintptr_t-cast comparison of pointers to different objects is meaningful is a different question. Hopefully range checks are possible. > The > documentation for check_ptr() in my previous mail contains the > following example > > For example if `start` and `end` are pointers to the beginning and > end of an allocation and `offset` is an integer then > > check_ptr(start + offset, <=, end) > > is undefined when `offset` is larger than `end - start`. Rewriting > the comparison as > > check_uint(offset, <=, end - start) > > avoids undefined behavior when offset is too large, but is still > undefined if there is a bug that means `start` and `end` do not > point to the same allocation. True, but in such a unit test we'd need additional checks verifying that start and end belong to the same object. Or perhaps use a numerical size instead of an end pointer. René
On 28/12/2023 18:56, René Scharfe wrote: > Am 28.12.23 um 17:48 schrieb phillip.wood123@gmail.com: >> On 28/12/2023 16:05, René Scharfe wrote: >>> Am 28.12.23 um 16:10 schrieb Phillip Wood: >>>> The diff at the end of >>>> this email shows a possible implementation of a check_ptr() macro for >>>> the unit test library. I'm wary of adding it though because I'm not sure >>>> printing the pointer values is actually very useful most of the >>>> time. I'm also concerned that the rules around pointer arithmetic and >>>> comparisons mean that many pointer tests such as >>>> >>>> check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end); >>>> >>>> will be undefined if they fail. >>> >>> True, the compiler could legally emit mush when it finds out that the >>> pointers are for different objects. And the error being fixed produces >>> such unrelated pointer pairs -- oops. >>> >>> This check is not important here, we can just drop it. >>> >>> mem_pool_contains() has the same problem, by the way. >>> >>> Restricting ourselves to only equality comparisons for pointers prevents >>> some interesting sanity checks, though. Casting to intptr_t or >>> uintptr_t would allow arbitrary comparisons without risk of undefined >>> behavior, though. Perhaps that would make a check_ptr() macro viable >>> and useful. >> >> That certainly helps and the check_ptr() macro in my previous email >> casts the pointers to uintptr_t before comparing them. Maybe I'm >> worrying too much, but my concern is that in a failing comparison it >> is likely one of the pointers is invalid (for example it is the >> result of some undefined pointer arithmetic) and the program is >> undefined from the point the invalid pointer is created. > > There are no restrictions on integer comparisons. So comparing after > casting to uintptr_t should not invoke undefined behavior. If undefined > behavior was involved in calculating the pointers in the first place > then the compiler might still legally go crazy, but not due to the > comparison. Right? Exactly, my worry is that if the comparison fails it is likely that there will have been undefined behavior involved in calculating the pointer before we get to the comparison in which case so casting to uintptr_t in the comparison does not help. > Whether the result of a uintptr_t-cast comparison of pointers to > different objects is meaningful is a different question. Hopefully > range checks are possible. > >> The >> documentation for check_ptr() in my previous mail contains the >> following example >> >> For example if `start` and `end` are pointers to the beginning and >> end of an allocation and `offset` is an integer then >> >> check_ptr(start + offset, <=, end) >> >> is undefined when `offset` is larger than `end - start`. Rewriting >> the comparison as >> >> check_uint(offset, <=, end - start) >> >> avoids undefined behavior when offset is too large, but is still >> undefined if there is a bug that means `start` and `end` do not >> point to the same allocation. > > True, but in such a unit test we'd need additional checks verifying > that start and end belong to the same object. Or perhaps use a > numerical size instead of an end pointer. Agreed, but I think the implication is that there will be cases we should be using check_uint() as in the second comparison above rather than check_ptr() as in the first comparison above. I'm not opposed to adding check_ptr() if we think it will be useful but I am worried it is easy to misuse it. If we do add check_ptr() we should have some guidelines about when it makes sense to use it. Best Wishes Phillip
Am 28.12.23 um 20:34 schrieb phillip.wood123@gmail.com: > > Exactly, my worry is that if the comparison fails it is likely that > there will have been undefined behavior involved in calculating the > pointer before we get to the comparison in which case so casting to > uintptr_t in the comparison does not help. If there's undefined behavior (UB) somewhere in the test or the tested unit then the compiler could skip any checks and report success anyway. Not adding UB in the test framework and tests is the least we can do. Perhaps disabling link-time optimization would allow us to shield the unit tests from UB in the tested code, in the sense that the compiler would then not be able to skip checks. René
diff --git a/Makefile b/Makefile index 88ba7a3c51..15990ff312 100644 --- a/Makefile +++ b/Makefile @@ -1340,6 +1340,7 @@ THIRD_PARTY_SOURCES += sha1collisiondetection/% THIRD_PARTY_SOURCES += sha1dc/% UNIT_TEST_PROGRAMS += t-basic +UNIT_TEST_PROGRAMS += t-mem-pool UNIT_TEST_PROGRAMS += t-strbuf UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS)) UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS)) diff --git a/mem-pool.c b/mem-pool.c index c34846d176..e8d976c3ee 100644 --- a/mem-pool.c +++ b/mem-pool.c @@ -99,9 +99,9 @@ void *mem_pool_alloc(struct mem_pool *pool, size_t len) if (!p) { if (len >= (pool->block_alloc / 2)) - return mem_pool_alloc_block(pool, len, pool->mp_block); - - p = mem_pool_alloc_block(pool, pool->block_alloc, NULL); + p = mem_pool_alloc_block(pool, len, pool->mp_block); + else + p = mem_pool_alloc_block(pool, pool->block_alloc, NULL); } r = p->next_free; diff --git a/t/unit-tests/t-mem-pool.c b/t/unit-tests/t-mem-pool.c new file mode 100644 index 0000000000..2295779b0b --- /dev/null +++ b/t/unit-tests/t-mem-pool.c @@ -0,0 +1,34 @@ +#include "test-lib.h" +#include "mem-pool.h" + +#define check_ptr(a, op, b) check_int(((a) op (b)), ==, 1) + +static void setup_static(void (*f)(struct mem_pool *), size_t block_alloc) +{ + struct mem_pool pool = { .block_alloc = block_alloc }; + f(&pool); + mem_pool_discard(&pool, 0); +} + +static void t_calloc_100(struct mem_pool *pool) +{ + size_t size = 100; + char *buffer = mem_pool_calloc(pool, 1, size); + for (size_t i = 0; i < size; i++) + check_int(buffer[i], ==, 0); + if (!check_ptr(pool->mp_block, !=, NULL)) + return; + check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end); + check_ptr(pool->mp_block->next_free, !=, NULL); + check_ptr(pool->mp_block->end, !=, NULL); +} + +int cmd_main(int argc, const char **argv) +{ + TEST(setup_static(t_calloc_100, 1024 * 1024), + "mem_pool_calloc returns 100 zeroed bytes with big block"); + TEST(setup_static(t_calloc_100, 1), + "mem_pool_calloc returns 100 zeroed bytes with tiny block"); + + return test_done(); +}
Memory pool allocations that require a new block and would fill at least half of it are handled specially. Before 158dfeff3d (mem-pool: add life cycle management functions, 2018-07-02) they used to be allocated outside of the pool. This patch made mem_pool_alloc() create a bespoke block instead, to allow releasing it when the pool gets discarded. Unfortunately mem_pool_alloc() returns a pointer to the start of such a bespoke block, i.e. to the struct mp_block at its top. When the caller writes to it, the management information gets corrupted. This affects mem_pool_discard() and -- if there are no other blocks in the pool -- also mem_pool_alloc(). Return the payload pointer of bespoke blocks, just like for smaller allocations, to protect the management struct. Also update next_free to mark the block as full. This is only strictly necessary for the first allocated block, because subsequent ones are inserted after the current block and never considered for further allocations, but it's easier to just do it in all cases. Add a basic unit test to demonstate the issue by using mem_pool_calloc() with a tiny block size, which forces the creation of a bespoke block. Without the mem_pool_alloc() fix it reports zeroed pointers: ok 1 - mem_pool_calloc returns 100 zeroed bytes with big block # check "((pool->mp_block->next_free) != (((void*)0))) == 1" failed at t/unit-tests/t-mem-pool.c:22 # left: 0 # right: 1 # check "((pool->mp_block->end) != (((void*)0))) == 1" failed at t/unit-tests/t-mem-pool.c:23 # left: 0 # right: 1 not ok 2 - mem_pool_calloc returns 100 zeroed bytes with tiny block 1..2 Signed-off-by: René Scharfe <l.s.r@web.de> --- Makefile | 1 + mem-pool.c | 6 +++--- t/unit-tests/t-mem-pool.c | 34 ++++++++++++++++++++++++++++++++++ 3 files changed, 38 insertions(+), 3 deletions(-) create mode 100644 t/unit-tests/t-mem-pool.c -- 2.43.0