[2/2] iov_iter: optimise iter type checking

Message ID	9bc27cb3ef6ab49b6b2ccee3db6613838aee17af.1605799583.git.asml.silence@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-block-owner@kernel.org> From: Pavel Begunkov <asml.silence@gmail.com> To: Jens Axboe <axboe@kernel.dk>, linux-block@vger.kernel.org, Alexander Viro <viro@zeniv.linux.org.uk>, linux-kernel@vger.kernel.org Subject: [PATCH 2/2] iov_iter: optimise iter type checking Date: Thu, 19 Nov 2020 15:29:43 +0000 Message-Id: <9bc27cb3ef6ab49b6b2ccee3db6613838aee17af.1605799583.git.asml.silence@gmail.com> In-Reply-To: <cover.1605799583.git.asml.silence@gmail.com> References: <cover.1605799583.git.asml.silence@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	optimise iov_iter \| expand [0/2] optimise iov_iter [1/2] iov_iter: optimise iov_iter_npages for bvec [2/2] iov_iter: optimise iter type checking

Pavel Begunkov Nov. 19, 2020, 3:29 p.m. UTC

The problem here is that iov_iter_is_*() helpers check types for
equality, but all iterate_* helpers do bitwise ands. This confuses
a compiler, so even if some cases were handled separately with
iov_iter_is_*(), it can't eliminate and skip unreachable branches in
following iterate*().

E.g. iov_iter_npages() first handles bvecs and discards, but
iterate_all_kinds() still generate branches for them.

Making checks consistent solves that.

size lib/iov_iter.o
before:
   text    data     bss     dec     hex filename
  24409     805       0   25214    627e lib/iov_iter.o

after:
   text    data     bss     dec     hex filename
  23785     793       0   24578    6002 lib/iov_iter.o

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/uio.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Christoph Hellwig Nov. 19, 2020, 5:03 p.m. UTC | #1

On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote:
> The problem here is that iov_iter_is_*() helpers check types for
> equality, but all iterate_* helpers do bitwise ands. This confuses
> a compiler, so even if some cases were handled separately with
> iov_iter_is_*(), it can't eliminate and skip unreachable branches in
> following iterate*().

I think we need to kill the iov_iter_is_* helpers, renumber to not do
the pointless bitmask and just check for equality (might turn into a
bunch of nice switch statements actually).

Pavel Begunkov Nov. 19, 2020, 5:12 p.m. UTC | #2

On 19/11/2020 17:03, Christoph Hellwig wrote:
> On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote:
>> The problem here is that iov_iter_is_*() helpers check types for
>> equality, but all iterate_* helpers do bitwise ands. This confuses
>> a compiler, so even if some cases were handled separately with
>> iov_iter_is_*(), it can't eliminate and skip unreachable branches in
>> following iterate*().
> 
> I think we need to kill the iov_iter_is_* helpers, renumber to not do
> the pointless bitmask and just check for equality (might turn into a
> bunch of nice switch statements actually).

There are uses like below though, and that would also add some overhead
on iov_iter_type(), so it's not apparent to me which version would be
cleaner/faster in the end. But yeah, we can experiment after landing
this patch.

if (type & (ITER_BVEC|ITER_KVEC))

Al Viro Dec. 11, 2020, 2:01 a.m. UTC | #3

On Thu, Nov 19, 2020 at 05:12:44PM +0000, Pavel Begunkov wrote:
> On 19/11/2020 17:03, Christoph Hellwig wrote:
> > On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote:
> >> The problem here is that iov_iter_is_*() helpers check types for
> >> equality, but all iterate_* helpers do bitwise ands. This confuses
> >> a compiler, so even if some cases were handled separately with
> >> iov_iter_is_*(), it can't eliminate and skip unreachable branches in
> >> following iterate*().
> > 
> > I think we need to kill the iov_iter_is_* helpers, renumber to not do
> > the pointless bitmask and just check for equality (might turn into a
> > bunch of nice switch statements actually).
> 
> There are uses like below though, and that would also add some overhead
> on iov_iter_type(), so it's not apparent to me which version would be
> cleaner/faster in the end. But yeah, we can experiment after landing
> this patch.
> 
> if (type & (ITER_BVEC|ITER_KVEC))

There are exactly 3 such places, and all of them would've been just as well
with case ITER_BVEC: case ITER_KVEC: ... in a switch.

Hmm...  I wonder which would work better:

enum iter_type {
        ITER_IOVEC = 0,
        ITER_KVEC = 2,
        ITER_BVEC = 4,
        ITER_PIPE = 6,
        ITER_DISCARD = 8,
};
iov_iter_type(iter)	(((iter)->type) & ~1)
iov_iter_rw(iter)	(((iter)->type) & 1)

or

enum iter_type {
        ITER_IOVEC,
        ITER_KVEC,
        ITER_BVEC,
        ITER_PIPE,
        ITER_DISCARD,
};
iov_iter_type(iter)	(((iter)->type) & (~0U>>1))
// callers of iov_iter_rw() are almost all comparing with explicit READ or WRITE
iov_iter_rw(iter)	(((iter)->type) & ~(~0U>>1) ? WRITE : READ)
with places like iov_iter_kvec() doing
	i->type = ITER_KVEC | ((direction == WRITE) ? BIT(31) : 0);

Preferences?

Pavel Begunkov Dec. 13, 2020, 10:32 p.m. UTC | #4

On 11/12/2020 02:01, Al Viro wrote:
> On Thu, Nov 19, 2020 at 05:12:44PM +0000, Pavel Begunkov wrote:
>> On 19/11/2020 17:03, Christoph Hellwig wrote:
>>> On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote:
>>>> The problem here is that iov_iter_is_*() helpers check types for
>>>> equality, but all iterate_* helpers do bitwise ands. This confuses
>>>> a compiler, so even if some cases were handled separately with
>>>> iov_iter_is_*(), it can't eliminate and skip unreachable branches in
>>>> following iterate*().
>>>
>>> I think we need to kill the iov_iter_is_* helpers, renumber to not do
>>> the pointless bitmask and just check for equality (might turn into a
>>> bunch of nice switch statements actually).
>>
>> There are uses like below though, and that would also add some overhead
>> on iov_iter_type(), so it's not apparent to me which version would be
>> cleaner/faster in the end. But yeah, we can experiment after landing
>> this patch.
>>
>> if (type & (ITER_BVEC|ITER_KVEC))
> 
> There are exactly 3 such places, and all of them would've been just as well
> with case ITER_BVEC: case ITER_KVEC: ... in a switch.
> 
> Hmm...  I wonder which would work better:
> 
> enum iter_type {
>         ITER_IOVEC = 0,
>         ITER_KVEC = 2,
>         ITER_BVEC = 4,
>         ITER_PIPE = 6,
>         ITER_DISCARD = 8,
> };
> iov_iter_type(iter)	(((iter)->type) & ~1)
> iov_iter_rw(iter)	(((iter)->type) & 1)
> 
> or
> 
> enum iter_type {
>         ITER_IOVEC,
>         ITER_KVEC,
>         ITER_BVEC,
>         ITER_PIPE,
>         ITER_DISCARD,
> };
> iov_iter_type(iter)	(((iter)->type) & (~0U>>1))
> // callers of iov_iter_rw() are almost all comparing with explicit READ or WRITE
> iov_iter_rw(iter)	(((iter)->type) & ~(~0U>>1) ? WRITE : READ)
> with places like iov_iter_kvec() doing
> 	i->type = ITER_KVEC | ((direction == WRITE) ? BIT(31) : 0);
> 
> Preferences?

For the bitmask version (with this patch) we have most of
iov_iter_type() completely optimised out. E.g. identical

iov_iter_type(i) & ITER_IOVEC <=> iter->type & ITER_IOVEC

It's also nice to have iov_iter_rw() to be just
(type & 1), operations with which can be optimised in a handful of ways.

Unless the compiler would be able to heavily optimise switches,
e.g. to out-of-memory/calculation-based jump tables, that I doubt,
I'd personally leave it be. Though, not like it should matter much.

David Laight Dec. 14, 2020, 10:28 a.m. UTC | #5

From: Pavel Begunkov
> Sent: 13 December 2020 22:33
> 
> On 11/12/2020 02:01, Al Viro wrote:
> > On Thu, Nov 19, 2020 at 05:12:44PM +0000, Pavel Begunkov wrote:
> >> On 19/11/2020 17:03, Christoph Hellwig wrote:
> >>> On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote:
> >>>> The problem here is that iov_iter_is_*() helpers check types for
> >>>> equality, but all iterate_* helpers do bitwise ands. This confuses
> >>>> a compiler, so even if some cases were handled separately with
> >>>> iov_iter_is_*(), it can't eliminate and skip unreachable branches in
> >>>> following iterate*().
> >>>
> >>> I think we need to kill the iov_iter_is_* helpers, renumber to not do
> >>> the pointless bitmask and just check for equality (might turn into a
> >>> bunch of nice switch statements actually).
> >>
> >> There are uses like below though, and that would also add some overhead
> >> on iov_iter_type(), so it's not apparent to me which version would be
> >> cleaner/faster in the end. But yeah, we can experiment after landing
> >> this patch.
> >>
> >> if (type & (ITER_BVEC|ITER_KVEC))
> >
> > There are exactly 3 such places, and all of them would've been just as well
> > with case ITER_BVEC: case ITER_KVEC: ... in a switch.
> >
> > Hmm...  I wonder which would work better:
> >
> > enum iter_type {
> >         ITER_IOVEC = 0,
> >         ITER_KVEC = 2,
> >         ITER_BVEC = 4,
> >         ITER_PIPE = 6,
> >         ITER_DISCARD = 8,
> > };
> > iov_iter_type(iter)	(((iter)->type) & ~1)
> > iov_iter_rw(iter)	(((iter)->type) & 1)
> >
> > or
> >
> > enum iter_type {
> >         ITER_IOVEC,
> >         ITER_KVEC,
> >         ITER_BVEC,
> >         ITER_PIPE,
> >         ITER_DISCARD,
> > };
> > iov_iter_type(iter)	(((iter)->type) & (~0U>>1))
> > // callers of iov_iter_rw() are almost all comparing with explicit READ or WRITE
> > iov_iter_rw(iter)	(((iter)->type) & ~(~0U>>1) ? WRITE : READ)
> > with places like iov_iter_kvec() doing
> > 	i->type = ITER_KVEC | ((direction == WRITE) ? BIT(31) : 0);
> >
> > Preferences?
> 
> For the bitmask version (with this patch) we have most of
> iov_iter_type() completely optimised out. E.g. identical
> 
> iov_iter_type(i) & ITER_IOVEC <=> iter->type & ITER_IOVEC
> 
> It's also nice to have iov_iter_rw() to be just
> (type & 1), operations with which can be optimised in a handful of ways.
> 
> Unless the compiler would be able to heavily optimise switches,
> e.g. to out-of-memory/calculation-based jump tables, that I doubt,
> I'd personally leave it be. Though, not like it should matter much.

The advantage of the bit-masks is that the 'usual' options can
be tested for together. So the code can be (for example):
	if (likely(iter->type & (ITER_IOVEC | ITER_PIPE) {
		if (likely((iter->type & ITER_IOVEC)) {
			... code for iovec
		} else [
			... code for pipe
		}
	} else if (iter->type & ITER_BVEC) {
		... code for bvec
	} else if (iter->type & ITER_KVEC) {
		.. code for kvec
	} else {
		.. must be discard
	}

I'm not sure of the best order though.
You might want 3 bits in the first test.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Pavel Begunkov Dec. 14, 2020, 5:25 p.m. UTC | #6

On 14/12/2020 10:28, David Laight wrote:
> From: Pavel Begunkov
>> Sent: 13 December 2020 22:33
>>
>> On 11/12/2020 02:01, Al Viro wrote:
>>> On Thu, Nov 19, 2020 at 05:12:44PM +0000, Pavel Begunkov wrote:
>>>> On 19/11/2020 17:03, Christoph Hellwig wrote:
>>>>> On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote:
>>>>>> The problem here is that iov_iter_is_*() helpers check types for
>>>>>> equality, but all iterate_* helpers do bitwise ands. This confuses
>>>>>> a compiler, so even if some cases were handled separately with
>>>>>> iov_iter_is_*(), it can't eliminate and skip unreachable branches in
>>>>>> following iterate*().
>>>>>
>>>>> I think we need to kill the iov_iter_is_* helpers, renumber to not do
>>>>> the pointless bitmask and just check for equality (might turn into a
>>>>> bunch of nice switch statements actually).
>>>>
>>>> There are uses like below though, and that would also add some overhead
>>>> on iov_iter_type(), so it's not apparent to me which version would be
>>>> cleaner/faster in the end. But yeah, we can experiment after landing
>>>> this patch.
>>>>
>>>> if (type & (ITER_BVEC|ITER_KVEC))
>>>
>>> There are exactly 3 such places, and all of them would've been just as well
>>> with case ITER_BVEC: case ITER_KVEC: ... in a switch.
>>>
>>> Hmm...  I wonder which would work better:
>>>
>>> enum iter_type {
>>>         ITER_IOVEC = 0,
>>>         ITER_KVEC = 2,
>>>         ITER_BVEC = 4,
>>>         ITER_PIPE = 6,
>>>         ITER_DISCARD = 8,
>>> };
>>> iov_iter_type(iter)	(((iter)->type) & ~1)
>>> iov_iter_rw(iter)	(((iter)->type) & 1)
>>>
>>> or
>>>
>>> enum iter_type {
>>>         ITER_IOVEC,
>>>         ITER_KVEC,
>>>         ITER_BVEC,
>>>         ITER_PIPE,
>>>         ITER_DISCARD,
>>> };
>>> iov_iter_type(iter)	(((iter)->type) & (~0U>>1))
>>> // callers of iov_iter_rw() are almost all comparing with explicit READ or WRITE
>>> iov_iter_rw(iter)	(((iter)->type) & ~(~0U>>1) ? WRITE : READ)
>>> with places like iov_iter_kvec() doing
>>> 	i->type = ITER_KVEC | ((direction == WRITE) ? BIT(31) : 0);
>>>
>>> Preferences?
>>
>> For the bitmask version (with this patch) we have most of
>> iov_iter_type() completely optimised out. E.g. identical
>>
>> iov_iter_type(i) & ITER_IOVEC <=> iter->type & ITER_IOVEC
>>
>> It's also nice to have iov_iter_rw() to be just
>> (type & 1), operations with which can be optimised in a handful of ways.
>>
>> Unless the compiler would be able to heavily optimise switches,
>> e.g. to out-of-memory/calculation-based jump tables, that I doubt,
>> I'd personally leave it be. Though, not like it should matter much.
> 
> The advantage of the bit-masks is that the 'usual' options can
> be tested for together. So the code can be (for example):

Well, you can do that for the non-bitwise case as well.
In a simpler form but should be enough.

enum { ITER_IOVEC = 1, ITER_BVEC = 2, ... }
if (type <= ITER_BVEC) {
	if (iovec) ...
	if (bvec) ...
} else { ... }


> 	if (likely(iter->type & (ITER_IOVEC | ITER_PIPE) {
> 		if (likely((iter->type & ITER_IOVEC)) {
> 			... code for iovec
> 		} else [
> 			... code for pipe
> 		}
> 	} else if (iter->type & ITER_BVEC) {
> 		... code for bvec
> 	} else if (iter->type & ITER_KVEC) {
> 		.. code for kvec
> 	} else {
> 		.. must be discard
> 	}

[2/2] iov_iter: optimise iter type checking

Commit Message

Comments

Patch