diff mbox series

[2/2] iov_iter: optimise iter type checking

Message ID 9bc27cb3ef6ab49b6b2ccee3db6613838aee17af.1605799583.git.asml.silence@gmail.com (mailing list archive)
State New, archived
Headers show
Series optimise iov_iter | expand

Commit Message

Pavel Begunkov Nov. 19, 2020, 3:29 p.m. UTC
The problem here is that iov_iter_is_*() helpers check types for
equality, but all iterate_* helpers do bitwise ands. This confuses
a compiler, so even if some cases were handled separately with
iov_iter_is_*(), it can't eliminate and skip unreachable branches in
following iterate*().

E.g. iov_iter_npages() first handles bvecs and discards, but
iterate_all_kinds() still generate branches for them.

Making checks consistent solves that.

size lib/iov_iter.o
before:
   text    data     bss     dec     hex filename
  24409     805       0   25214    627e lib/iov_iter.o

after:
   text    data     bss     dec     hex filename
  23785     793       0   24578    6002 lib/iov_iter.o

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/uio.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Comments

Christoph Hellwig Nov. 19, 2020, 5:03 p.m. UTC | #1
On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote:
> The problem here is that iov_iter_is_*() helpers check types for
> equality, but all iterate_* helpers do bitwise ands. This confuses
> a compiler, so even if some cases were handled separately with
> iov_iter_is_*(), it can't eliminate and skip unreachable branches in
> following iterate*().

I think we need to kill the iov_iter_is_* helpers, renumber to not do
the pointless bitmask and just check for equality (might turn into a
bunch of nice switch statements actually).
Pavel Begunkov Nov. 19, 2020, 5:12 p.m. UTC | #2
On 19/11/2020 17:03, Christoph Hellwig wrote:
> On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote:
>> The problem here is that iov_iter_is_*() helpers check types for
>> equality, but all iterate_* helpers do bitwise ands. This confuses
>> a compiler, so even if some cases were handled separately with
>> iov_iter_is_*(), it can't eliminate and skip unreachable branches in
>> following iterate*().
> 
> I think we need to kill the iov_iter_is_* helpers, renumber to not do
> the pointless bitmask and just check for equality (might turn into a
> bunch of nice switch statements actually).

There are uses like below though, and that would also add some overhead
on iov_iter_type(), so it's not apparent to me which version would be
cleaner/faster in the end. But yeah, we can experiment after landing
this patch.

if (type & (ITER_BVEC|ITER_KVEC))
Al Viro Dec. 11, 2020, 2:01 a.m. UTC | #3
On Thu, Nov 19, 2020 at 05:12:44PM +0000, Pavel Begunkov wrote:
> On 19/11/2020 17:03, Christoph Hellwig wrote:
> > On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote:
> >> The problem here is that iov_iter_is_*() helpers check types for
> >> equality, but all iterate_* helpers do bitwise ands. This confuses
> >> a compiler, so even if some cases were handled separately with
> >> iov_iter_is_*(), it can't eliminate and skip unreachable branches in
> >> following iterate*().
> > 
> > I think we need to kill the iov_iter_is_* helpers, renumber to not do
> > the pointless bitmask and just check for equality (might turn into a
> > bunch of nice switch statements actually).
> 
> There are uses like below though, and that would also add some overhead
> on iov_iter_type(), so it's not apparent to me which version would be
> cleaner/faster in the end. But yeah, we can experiment after landing
> this patch.
> 
> if (type & (ITER_BVEC|ITER_KVEC))

There are exactly 3 such places, and all of them would've been just as well
with case ITER_BVEC: case ITER_KVEC: ... in a switch.

Hmm...  I wonder which would work better:

enum iter_type {
        ITER_IOVEC = 0,
        ITER_KVEC = 2,
        ITER_BVEC = 4,
        ITER_PIPE = 6,
        ITER_DISCARD = 8,
};
iov_iter_type(iter)	(((iter)->type) & ~1)
iov_iter_rw(iter)	(((iter)->type) & 1)

or

enum iter_type {
        ITER_IOVEC,
        ITER_KVEC,
        ITER_BVEC,
        ITER_PIPE,
        ITER_DISCARD,
};
iov_iter_type(iter)	(((iter)->type) & (~0U>>1))
// callers of iov_iter_rw() are almost all comparing with explicit READ or WRITE
iov_iter_rw(iter)	(((iter)->type) & ~(~0U>>1) ? WRITE : READ)
with places like iov_iter_kvec() doing
	i->type = ITER_KVEC | ((direction == WRITE) ? BIT(31) : 0);

Preferences?
Pavel Begunkov Dec. 13, 2020, 10:32 p.m. UTC | #4
On 11/12/2020 02:01, Al Viro wrote:
> On Thu, Nov 19, 2020 at 05:12:44PM +0000, Pavel Begunkov wrote:
>> On 19/11/2020 17:03, Christoph Hellwig wrote:
>>> On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote:
>>>> The problem here is that iov_iter_is_*() helpers check types for
>>>> equality, but all iterate_* helpers do bitwise ands. This confuses
>>>> a compiler, so even if some cases were handled separately with
>>>> iov_iter_is_*(), it can't eliminate and skip unreachable branches in
>>>> following iterate*().
>>>
>>> I think we need to kill the iov_iter_is_* helpers, renumber to not do
>>> the pointless bitmask and just check for equality (might turn into a
>>> bunch of nice switch statements actually).
>>
>> There are uses like below though, and that would also add some overhead
>> on iov_iter_type(), so it's not apparent to me which version would be
>> cleaner/faster in the end. But yeah, we can experiment after landing
>> this patch.
>>
>> if (type & (ITER_BVEC|ITER_KVEC))
> 
> There are exactly 3 such places, and all of them would've been just as well
> with case ITER_BVEC: case ITER_KVEC: ... in a switch.
> 
> Hmm...  I wonder which would work better:
> 
> enum iter_type {
>         ITER_IOVEC = 0,
>         ITER_KVEC = 2,
>         ITER_BVEC = 4,
>         ITER_PIPE = 6,
>         ITER_DISCARD = 8,
> };
> iov_iter_type(iter)	(((iter)->type) & ~1)
> iov_iter_rw(iter)	(((iter)->type) & 1)
> 
> or
> 
> enum iter_type {
>         ITER_IOVEC,
>         ITER_KVEC,
>         ITER_BVEC,
>         ITER_PIPE,
>         ITER_DISCARD,
> };
> iov_iter_type(iter)	(((iter)->type) & (~0U>>1))
> // callers of iov_iter_rw() are almost all comparing with explicit READ or WRITE
> iov_iter_rw(iter)	(((iter)->type) & ~(~0U>>1) ? WRITE : READ)
> with places like iov_iter_kvec() doing
> 	i->type = ITER_KVEC | ((direction == WRITE) ? BIT(31) : 0);
> 
> Preferences?

For the bitmask version (with this patch) we have most of
iov_iter_type() completely optimised out. E.g. identical

iov_iter_type(i) & ITER_IOVEC <=> iter->type & ITER_IOVEC

It's also nice to have iov_iter_rw() to be just
(type & 1), operations with which can be optimised in a handful of ways.

Unless the compiler would be able to heavily optimise switches,
e.g. to out-of-memory/calculation-based jump tables, that I doubt,
I'd personally leave it be. Though, not like it should matter much.
David Laight Dec. 14, 2020, 10:28 a.m. UTC | #5
From: Pavel Begunkov
> Sent: 13 December 2020 22:33
> 
> On 11/12/2020 02:01, Al Viro wrote:
> > On Thu, Nov 19, 2020 at 05:12:44PM +0000, Pavel Begunkov wrote:
> >> On 19/11/2020 17:03, Christoph Hellwig wrote:
> >>> On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote:
> >>>> The problem here is that iov_iter_is_*() helpers check types for
> >>>> equality, but all iterate_* helpers do bitwise ands. This confuses
> >>>> a compiler, so even if some cases were handled separately with
> >>>> iov_iter_is_*(), it can't eliminate and skip unreachable branches in
> >>>> following iterate*().
> >>>
> >>> I think we need to kill the iov_iter_is_* helpers, renumber to not do
> >>> the pointless bitmask and just check for equality (might turn into a
> >>> bunch of nice switch statements actually).
> >>
> >> There are uses like below though, and that would also add some overhead
> >> on iov_iter_type(), so it's not apparent to me which version would be
> >> cleaner/faster in the end. But yeah, we can experiment after landing
> >> this patch.
> >>
> >> if (type & (ITER_BVEC|ITER_KVEC))
> >
> > There are exactly 3 such places, and all of them would've been just as well
> > with case ITER_BVEC: case ITER_KVEC: ... in a switch.
> >
> > Hmm...  I wonder which would work better:
> >
> > enum iter_type {
> >         ITER_IOVEC = 0,
> >         ITER_KVEC = 2,
> >         ITER_BVEC = 4,
> >         ITER_PIPE = 6,
> >         ITER_DISCARD = 8,
> > };
> > iov_iter_type(iter)	(((iter)->type) & ~1)
> > iov_iter_rw(iter)	(((iter)->type) & 1)
> >
> > or
> >
> > enum iter_type {
> >         ITER_IOVEC,
> >         ITER_KVEC,
> >         ITER_BVEC,
> >         ITER_PIPE,
> >         ITER_DISCARD,
> > };
> > iov_iter_type(iter)	(((iter)->type) & (~0U>>1))
> > // callers of iov_iter_rw() are almost all comparing with explicit READ or WRITE
> > iov_iter_rw(iter)	(((iter)->type) & ~(~0U>>1) ? WRITE : READ)
> > with places like iov_iter_kvec() doing
> > 	i->type = ITER_KVEC | ((direction == WRITE) ? BIT(31) : 0);
> >
> > Preferences?
> 
> For the bitmask version (with this patch) we have most of
> iov_iter_type() completely optimised out. E.g. identical
> 
> iov_iter_type(i) & ITER_IOVEC <=> iter->type & ITER_IOVEC
> 
> It's also nice to have iov_iter_rw() to be just
> (type & 1), operations with which can be optimised in a handful of ways.
> 
> Unless the compiler would be able to heavily optimise switches,
> e.g. to out-of-memory/calculation-based jump tables, that I doubt,
> I'd personally leave it be. Though, not like it should matter much.

The advantage of the bit-masks is that the 'usual' options can
be tested for together. So the code can be (for example):
	if (likely(iter->type & (ITER_IOVEC | ITER_PIPE) {
		if (likely((iter->type & ITER_IOVEC)) {
			... code for iovec
		} else [
			... code for pipe
		}
	} else if (iter->type & ITER_BVEC) {
		... code for bvec
	} else if (iter->type & ITER_KVEC) {
		.. code for kvec
	} else {
		.. must be discard
	}

I'm not sure of the best order though.
You might want 3 bits in the first test.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Pavel Begunkov Dec. 14, 2020, 5:25 p.m. UTC | #6
On 14/12/2020 10:28, David Laight wrote:
> From: Pavel Begunkov
>> Sent: 13 December 2020 22:33
>>
>> On 11/12/2020 02:01, Al Viro wrote:
>>> On Thu, Nov 19, 2020 at 05:12:44PM +0000, Pavel Begunkov wrote:
>>>> On 19/11/2020 17:03, Christoph Hellwig wrote:
>>>>> On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote:
>>>>>> The problem here is that iov_iter_is_*() helpers check types for
>>>>>> equality, but all iterate_* helpers do bitwise ands. This confuses
>>>>>> a compiler, so even if some cases were handled separately with
>>>>>> iov_iter_is_*(), it can't eliminate and skip unreachable branches in
>>>>>> following iterate*().
>>>>>
>>>>> I think we need to kill the iov_iter_is_* helpers, renumber to not do
>>>>> the pointless bitmask and just check for equality (might turn into a
>>>>> bunch of nice switch statements actually).
>>>>
>>>> There are uses like below though, and that would also add some overhead
>>>> on iov_iter_type(), so it's not apparent to me which version would be
>>>> cleaner/faster in the end. But yeah, we can experiment after landing
>>>> this patch.
>>>>
>>>> if (type & (ITER_BVEC|ITER_KVEC))
>>>
>>> There are exactly 3 such places, and all of them would've been just as well
>>> with case ITER_BVEC: case ITER_KVEC: ... in a switch.
>>>
>>> Hmm...  I wonder which would work better:
>>>
>>> enum iter_type {
>>>         ITER_IOVEC = 0,
>>>         ITER_KVEC = 2,
>>>         ITER_BVEC = 4,
>>>         ITER_PIPE = 6,
>>>         ITER_DISCARD = 8,
>>> };
>>> iov_iter_type(iter)	(((iter)->type) & ~1)
>>> iov_iter_rw(iter)	(((iter)->type) & 1)
>>>
>>> or
>>>
>>> enum iter_type {
>>>         ITER_IOVEC,
>>>         ITER_KVEC,
>>>         ITER_BVEC,
>>>         ITER_PIPE,
>>>         ITER_DISCARD,
>>> };
>>> iov_iter_type(iter)	(((iter)->type) & (~0U>>1))
>>> // callers of iov_iter_rw() are almost all comparing with explicit READ or WRITE
>>> iov_iter_rw(iter)	(((iter)->type) & ~(~0U>>1) ? WRITE : READ)
>>> with places like iov_iter_kvec() doing
>>> 	i->type = ITER_KVEC | ((direction == WRITE) ? BIT(31) : 0);
>>>
>>> Preferences?
>>
>> For the bitmask version (with this patch) we have most of
>> iov_iter_type() completely optimised out. E.g. identical
>>
>> iov_iter_type(i) & ITER_IOVEC <=> iter->type & ITER_IOVEC
>>
>> It's also nice to have iov_iter_rw() to be just
>> (type & 1), operations with which can be optimised in a handful of ways.
>>
>> Unless the compiler would be able to heavily optimise switches,
>> e.g. to out-of-memory/calculation-based jump tables, that I doubt,
>> I'd personally leave it be. Though, not like it should matter much.
> 
> The advantage of the bit-masks is that the 'usual' options can
> be tested for together. So the code can be (for example):

Well, you can do that for the non-bitwise case as well.
In a simpler form but should be enough.

enum { ITER_IOVEC = 1, ITER_BVEC = 2, ... }
if (type <= ITER_BVEC) {
	if (iovec) ...
	if (bvec) ...
} else { ... }


> 	if (likely(iter->type & (ITER_IOVEC | ITER_PIPE) {
> 		if (likely((iter->type & ITER_IOVEC)) {
> 			... code for iovec
> 		} else [
> 			... code for pipe
> 		}
> 	} else if (iter->type & ITER_BVEC) {
> 		... code for bvec
> 	} else if (iter->type & ITER_KVEC) {
> 		.. code for kvec
> 	} else {
> 		.. must be discard
> 	}
diff mbox series

Patch

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 72d88566694e..c5970b2d3307 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -57,27 +57,27 @@  static inline enum iter_type iov_iter_type(const struct iov_iter *i)
 
 static inline bool iter_is_iovec(const struct iov_iter *i)
 {
-	return iov_iter_type(i) == ITER_IOVEC;
+	return iov_iter_type(i) & ITER_IOVEC;
 }
 
 static inline bool iov_iter_is_kvec(const struct iov_iter *i)
 {
-	return iov_iter_type(i) == ITER_KVEC;
+	return iov_iter_type(i) & ITER_KVEC;
 }
 
 static inline bool iov_iter_is_bvec(const struct iov_iter *i)
 {
-	return iov_iter_type(i) == ITER_BVEC;
+	return iov_iter_type(i) & ITER_BVEC;
 }
 
 static inline bool iov_iter_is_pipe(const struct iov_iter *i)
 {
-	return iov_iter_type(i) == ITER_PIPE;
+	return iov_iter_type(i) & ITER_PIPE;
 }
 
 static inline bool iov_iter_is_discard(const struct iov_iter *i)
 {
-	return iov_iter_type(i) == ITER_DISCARD;
+	return iov_iter_type(i) & ITER_DISCARD;
 }
 
 static inline unsigned char iov_iter_rw(const struct iov_iter *i)