
[0/9] Clean up alloc_cache allocations

Message ID: 20241119012224.1698238-1-krisman@suse.de

Gabriel Krisman Bertazi Nov. 19, 2024, 1:22 a.m. UTC
Jens, Pavel,

The allocation paths that use alloc_cache duplicate the same code
pattern, sometimes in a quite convoluted way.  This series cleans up
that code by folding the allocation into the cache code itself, making
it just an allocator function, and keeping the cache policy invisible to
callers.  A bigger justification for doing this, beyond code simplicity,
is that it makes it trivial to test the impact of disabling the cache
and using slab directly, which I've used for slab improvement
experiments.  I think this is one step in the direction of eventually
lifting the alloc_cache out of io_uring into a proper magazine layer in
slab.

It survived the liburing test suite, and when microbenchmarking the
read-write path with mmtests and fio, I didn't observe any significant
performance variation (there was actually a 2% gain, but it was within
the variance of the test runs, so it is not significant and likely just
noise).

That said, if there are specific benchmarks you'd like me to run it
against, I'd be happy to do so.

Thanks,

Gabriel Krisman Bertazi (9):
  io_uring: Fold allocation into alloc_cache helper
  io_uring: Add generic helper to allocate async data
  io_uring/futex: Allocate ifd with generic alloc_cache helper
  io_uring/poll: Allocate apoll with generic alloc_cache helper
  io_uring/uring_cmd: Allocate async data through generic helper
  io_uring/net: Allocate msghdr async data through helper
  io_uring/rw: Allocate async data through helper
  io_uring: Move old async data allocation helper to header
  io_uring/msg_ring: Drop custom destructor

 io_uring/alloc_cache.h |  7 +++++++
 io_uring/futex.c       | 13 +------------
 io_uring/io_uring.c    | 17 ++---------------
 io_uring/io_uring.h    | 22 ++++++++++++++++++++++
 io_uring/msg_ring.c    |  7 -------
 io_uring/msg_ring.h    |  1 -
 io_uring/net.c         | 28 ++++++++++------------------
 io_uring/poll.c        | 13 +++++--------
 io_uring/rw.c          | 28 ++++++++--------------------
 io_uring/timeout.c     |  5 ++---
 io_uring/uring_cmd.c   | 20 ++------------------
 io_uring/waitid.c      |  4 ++--
 12 files changed, 61 insertions(+), 104 deletions(-)

Comments

Jens Axboe Nov. 19, 2024, 2:05 a.m. UTC | #1
On 11/18/24 6:22 PM, Gabriel Krisman Bertazi wrote:
> Jens, Pavel,
> 
> The allocation paths that use alloc_cache duplicate the same code
> pattern, sometimes in a quite convoluted way.  This series cleans up
> that code by folding the allocation into the cache code itself, making
> it just an allocator function, and keeping the cache policy invisible to
> callers.  A bigger justification for doing this, beyond code simplicity,
> is that it makes it trivial to test the impact of disabling the cache
> and using slab directly, which I've used for slab improvement
> experiments.  I think this is one step in the direction of eventually
> lifting the alloc_cache out of io_uring into a proper magazine layer in
> slab.

Nice!

Patchset looks good, from a quick look, even from just a cleanup
perspective. We're obviously inside the merge window right now, so it's
a 6.14 target at this point. I'll take some time to review it a bit
closer later this week.

> It survived the liburing test suite, and when microbenchmarking the
> read-write path with mmtests and fio, I didn't observe any significant
> performance variation (there was actually a 2% gain, but it was within
> the variance of the test runs, so it is not significant and likely just
> noise).
> 
> That said, if there are specific benchmarks you'd like me to run it
> against, I'd be happy to do so.

In general, running the liburing test suite with various types of files
and devices and with KASAN and LOCKDEP turned on is a good test of not
having messed something up, in a big way at least. But maybe you already
ran that too?
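A rough sketch of the kind of run Jens describes, as a config fragment
plus commands (exact option set, paths, and job counts are assumptions
and will vary by setup):

```shell
# In the kernel tree under test: enable KASAN and lock debugging
# (CONFIG_PROVE_LOCKING is what turns on LOCKDEP checking).
./scripts/config --enable CONFIG_KASAN \
                 --enable CONFIG_PROVE_LOCKING \
                 --enable CONFIG_DEBUG_LOCK_ALLOC
make olddefconfig && make -j"$(nproc)"

# Boot the resulting kernel, then build and run the liburing suite
# against various backing files/devices, watching dmesg for
# KASAN/LOCKDEP splats.
git clone https://github.com/axboe/liburing.git
cd liburing && ./configure && make -j"$(nproc)"
make runtests
```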
Gabriel Krisman Bertazi Nov. 19, 2024, 3:31 p.m. UTC | #2
Jens Axboe <axboe@kernel.dk> writes:

> On 11/18/24 6:22 PM, Gabriel Krisman Bertazi wrote:
>> Jens, Pavel,
>> 
>> The allocation paths that use alloc_cache duplicate the same code
>> pattern, sometimes in a quite convoluted way.  This series cleans up
>> that code by folding the allocation into the cache code itself, making
>> it just an allocator function, and keeping the cache policy invisible to
>> callers.  A bigger justification for doing this, beyond code simplicity,
>> is that it makes it trivial to test the impact of disabling the cache
>> and using slab directly, which I've used for slab improvement
>> experiments.  I think this is one step in the direction of eventually
>> lifting the alloc_cache out of io_uring into a proper magazine layer in
>> slab.
>
> Nice!
>
> Patchset looks good, from a quick look, even from just a cleanup
> perspective. We're obviously inside the merge window right now, so it's
> a 6.14 target at this point. I'll take some time to review it a bit
> closer later this week.
>
>> It survived the liburing test suite, and when microbenchmarking the
>> read-write path with mmtests and fio, I didn't observe any significant
>> performance variation (there was actually a 2% gain, but it was within
>> the variance of the test runs, so it is not significant and likely just
>> noise).
>> 
>> That said, if there are specific benchmarks you'd like me to run it
>> against, I'd be happy to do so.
>
> In general, running the liburing test suite with various types of files
> and devices and with KASAN and LOCKDEP turned on is a good test of not
> having messed something up, in a big way at least. But maybe you already
> ran that too?

I was thinking more of extra performance benchmarks, but I always forget
to enable KASAN.  I'll rerun to confirm it is fine. :)

thanks,