Message ID | 9bc27cb3ef6ab49b6b2ccee3db6613838aee17af.1605799583.git.asml.silence@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | optimise iov_iter | expand |
On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote: > The problem here is that iov_iter_is_*() helpers check types for > equality, but all iterate_* helpers do bitwise ands. This confuses > a compiler, so even if some cases were handled separately with > iov_iter_is_*(), it can't eliminate and skip unreachable branches in > following iterate*(). I think we need to kill the iov_iter_is_* helpers, renumber to not do the pointless bitmask and just check for equality (might turn into a bunch of nice switch statements actually).
On 19/11/2020 17:03, Christoph Hellwig wrote: > On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote: >> The problem here is that iov_iter_is_*() helpers check types for >> equality, but all iterate_* helpers do bitwise ands. This confuses >> a compiler, so even if some cases were handled separately with >> iov_iter_is_*(), it can't eliminate and skip unreachable branches in >> following iterate*(). > > I think we need to kill the iov_iter_is_* helpers, renumber to not do > the pointless bitmask and just check for equality (might turn into a > bunch of nice switch statements actually). There are uses like below though, and that would also add some overhead on iov_iter_type(), so it's not apparent to me which version would be cleaner/faster in the end. But yeah, we can experiment after landing this patch. if (type & (ITER_BVEC|ITER_KVEC))
On Thu, Nov 19, 2020 at 05:12:44PM +0000, Pavel Begunkov wrote: > On 19/11/2020 17:03, Christoph Hellwig wrote: > > On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote: > >> The problem here is that iov_iter_is_*() helpers check types for > >> equality, but all iterate_* helpers do bitwise ands. This confuses > >> a compiler, so even if some cases were handled separately with > >> iov_iter_is_*(), it can't eliminate and skip unreachable branches in > >> following iterate*(). > > > > I think we need to kill the iov_iter_is_* helpers, renumber to not do > > the pointless bitmask and just check for equality (might turn into a > > bunch of nice switch statements actually). > > There are uses like below though, and that would also add some overhead > on iov_iter_type(), so it's not apparent to me which version would be > cleaner/faster in the end. But yeah, we can experiment after landing > this patch. > > if (type & (ITER_BVEC|ITER_KVEC)) There are exactly 3 such places, and all of them would've been just as well with case ITER_BVEC: case ITER_KVEC: ... in a switch. Hmm... I wonder which would work better: enum iter_type { ITER_IOVEC = 0, ITER_KVEC = 2, ITER_BVEC = 4, ITER_PIPE = 6, ITER_DISCARD = 8, }; iov_iter_type(iter) (((iter)->type) & ~1) iov_iter_rw(iter) (((iter)->type) & 1) or enum iter_type { ITER_IOVEC, ITER_KVEC, ITER_BVEC, ITER_PIPE, ITER_DISCARD, }; iov_iter_type(iter) (((iter)->type) & (~0U>>1)) // callers of iov_iter_rw() are almost all comparing with explicit READ or WRITE iov_iter_rw(iter) (((iter)->type) & ~(~0U>>1) ? WRITE : READ) with places like iov_iter_kvec() doing i->type = ITER_KVEC | ((direction == WRITE) ? BIT(31) : 0); Preferences?
On 11/12/2020 02:01, Al Viro wrote: > On Thu, Nov 19, 2020 at 05:12:44PM +0000, Pavel Begunkov wrote: >> On 19/11/2020 17:03, Christoph Hellwig wrote: >>> On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote: >>>> The problem here is that iov_iter_is_*() helpers check types for >>>> equality, but all iterate_* helpers do bitwise ands. This confuses >>>> a compiler, so even if some cases were handled separately with >>>> iov_iter_is_*(), it can't eliminate and skip unreachable branches in >>>> following iterate*(). >>> >>> I think we need to kill the iov_iter_is_* helpers, renumber to not do >>> the pointless bitmask and just check for equality (might turn into a >>> bunch of nice switch statements actually). >> >> There are uses like below though, and that would also add some overhead >> on iov_iter_type(), so it's not apparent to me which version would be >> cleaner/faster in the end. But yeah, we can experiment after landing >> this patch. >> >> if (type & (ITER_BVEC|ITER_KVEC)) > > There are exactly 3 such places, and all of them would've been just as well > with case ITER_BVEC: case ITER_KVEC: ... in a switch. > > Hmm... I wonder which would work better: > > enum iter_type { > ITER_IOVEC = 0, > ITER_KVEC = 2, > ITER_BVEC = 4, > ITER_PIPE = 6, > ITER_DISCARD = 8, > }; > iov_iter_type(iter) (((iter)->type) & ~1) > iov_iter_rw(iter) (((iter)->type) & 1) > > or > > enum iter_type { > ITER_IOVEC, > ITER_KVEC, > ITER_BVEC, > ITER_PIPE, > ITER_DISCARD, > }; > iov_iter_type(iter) (((iter)->type) & (~0U>>1)) > // callers of iov_iter_rw() are almost all comparing with explicit READ or WRITE > iov_iter_rw(iter) (((iter)->type) & ~(~0U>>1) ? WRITE : READ) > with places like iov_iter_kvec() doing > i->type = ITER_KVEC | ((direction == WRITE) ? BIT(31) : 0); > > Preferences? For the bitmask version (with this patch) we have most of iov_iter_type() completely optimised out. E.g. identical iov_iter_type(i) & ITER_IOVEC <=> iter->type & ITER_IOVEC It's also nice to have iov_iter_rw() to be just (type & 1), operations with which can be optimised in a handful of ways. Unless the compiler would be able to heavily optimise switches, e.g. to out-of-memory/calculation-based jump tables, that I doubt, I'd personally leave it be. Though, not like it should matter much.
From: Pavel Begunkov > Sent: 13 December 2020 22:33 > > On 11/12/2020 02:01, Al Viro wrote: > > On Thu, Nov 19, 2020 at 05:12:44PM +0000, Pavel Begunkov wrote: > >> On 19/11/2020 17:03, Christoph Hellwig wrote: > >>> On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote: > >>>> The problem here is that iov_iter_is_*() helpers check types for > >>>> equality, but all iterate_* helpers do bitwise ands. This confuses > >>>> a compiler, so even if some cases were handled separately with > >>>> iov_iter_is_*(), it can't eliminate and skip unreachable branches in > >>>> following iterate*(). > >>> > >>> I think we need to kill the iov_iter_is_* helpers, renumber to not do > >>> the pointless bitmask and just check for equality (might turn into a > >>> bunch of nice switch statements actually). > >> > >> There are uses like below though, and that would also add some overhead > >> on iov_iter_type(), so it's not apparent to me which version would be > >> cleaner/faster in the end. But yeah, we can experiment after landing > >> this patch. > >> > >> if (type & (ITER_BVEC|ITER_KVEC)) > > > > There are exactly 3 such places, and all of them would've been just as well > > with case ITER_BVEC: case ITER_KVEC: ... in a switch. > > > > Hmm... I wonder which would work better: > > > > enum iter_type { > > ITER_IOVEC = 0, > > ITER_KVEC = 2, > > ITER_BVEC = 4, > > ITER_PIPE = 6, > > ITER_DISCARD = 8, > > }; > > iov_iter_type(iter) (((iter)->type) & ~1) > > iov_iter_rw(iter) (((iter)->type) & 1) > > > > or > > > > enum iter_type { > > ITER_IOVEC, > > ITER_KVEC, > > ITER_BVEC, > > ITER_PIPE, > > ITER_DISCARD, > > }; > > iov_iter_type(iter) (((iter)->type) & (~0U>>1)) > > // callers of iov_iter_rw() are almost all comparing with explicit READ or WRITE > > iov_iter_rw(iter) (((iter)->type) & ~(~0U>>1) ? WRITE : READ) > > with places like iov_iter_kvec() doing > > i->type = ITER_KVEC | ((direction == WRITE) ? BIT(31) : 0); > > > > Preferences? > > For the bitmask version (with this patch) we have most of > iov_iter_type() completely optimised out. E.g. identical > > iov_iter_type(i) & ITER_IOVEC <=> iter->type & ITER_IOVEC > > It's also nice to have iov_iter_rw() to be just > (type & 1), operations with which can be optimised in a handful of ways. > > Unless the compiler would be able to heavily optimise switches, > e.g. to out-of-memory/calculation-based jump tables, that I doubt, > I'd personally leave it be. Though, not like it should matter much. The advantage of the bit-masks is that the 'usual' options can be tested for together. So the code can be (for example): if (likely(iter->type & (ITER_IOVEC | ITER_PIPE) { if (likely((iter->type & ITER_IOVEC)) { ... code for iovec } else [ ... code for pipe } } else if (iter->type & ITER_BVEC) { ... code for bvec } else if (iter->type & ITER_KVEC) { .. code for kvec } else { .. must be discard } I'm not sure of the best order though. You might want 3 bits in the first test. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On 14/12/2020 10:28, David Laight wrote: > From: Pavel Begunkov >> Sent: 13 December 2020 22:33 >> >> On 11/12/2020 02:01, Al Viro wrote: >>> On Thu, Nov 19, 2020 at 05:12:44PM +0000, Pavel Begunkov wrote: >>>> On 19/11/2020 17:03, Christoph Hellwig wrote: >>>>> On Thu, Nov 19, 2020 at 03:29:43PM +0000, Pavel Begunkov wrote: >>>>>> The problem here is that iov_iter_is_*() helpers check types for >>>>>> equality, but all iterate_* helpers do bitwise ands. This confuses >>>>>> a compiler, so even if some cases were handled separately with >>>>>> iov_iter_is_*(), it can't eliminate and skip unreachable branches in >>>>>> following iterate*(). >>>>> >>>>> I think we need to kill the iov_iter_is_* helpers, renumber to not do >>>>> the pointless bitmask and just check for equality (might turn into a >>>>> bunch of nice switch statements actually). >>>> >>>> There are uses like below though, and that would also add some overhead >>>> on iov_iter_type(), so it's not apparent to me which version would be >>>> cleaner/faster in the end. But yeah, we can experiment after landing >>>> this patch. >>>> >>>> if (type & (ITER_BVEC|ITER_KVEC)) >>> >>> There are exactly 3 such places, and all of them would've been just as well >>> with case ITER_BVEC: case ITER_KVEC: ... in a switch. >>> >>> Hmm... I wonder which would work better: >>> >>> enum iter_type { >>> ITER_IOVEC = 0, >>> ITER_KVEC = 2, >>> ITER_BVEC = 4, >>> ITER_PIPE = 6, >>> ITER_DISCARD = 8, >>> }; >>> iov_iter_type(iter) (((iter)->type) & ~1) >>> iov_iter_rw(iter) (((iter)->type) & 1) >>> >>> or >>> >>> enum iter_type { >>> ITER_IOVEC, >>> ITER_KVEC, >>> ITER_BVEC, >>> ITER_PIPE, >>> ITER_DISCARD, >>> }; >>> iov_iter_type(iter) (((iter)->type) & (~0U>>1)) >>> // callers of iov_iter_rw() are almost all comparing with explicit READ or WRITE >>> iov_iter_rw(iter) (((iter)->type) & ~(~0U>>1) ? WRITE : READ) >>> with places like iov_iter_kvec() doing >>> i->type = ITER_KVEC | ((direction == WRITE) ? BIT(31) : 0); >>> >>> Preferences? >> >> For the bitmask version (with this patch) we have most of >> iov_iter_type() completely optimised out. E.g. identical >> >> iov_iter_type(i) & ITER_IOVEC <=> iter->type & ITER_IOVEC >> >> It's also nice to have iov_iter_rw() to be just >> (type & 1), operations with which can be optimised in a handful of ways. >> >> Unless the compiler would be able to heavily optimise switches, >> e.g. to out-of-memory/calculation-based jump tables, that I doubt, >> I'd personally leave it be. Though, not like it should matter much. > > The advantage of the bit-masks is that the 'usual' options can > be tested for together. So the code can be (for example): Well, you can do that for the non-bitwise case as well. In a simpler form but should be enough. enum { ITER_IOVEC = 1, ITER_BVEC = 2, ... } if (type <= ITER_BVEC) { if (iovec) ... if (bvec) ... } else { ... } > if (likely(iter->type & (ITER_IOVEC | ITER_PIPE) { > if (likely((iter->type & ITER_IOVEC)) { > ... code for iovec > } else [ > ... code for pipe > } > } else if (iter->type & ITER_BVEC) { > ... code for bvec > } else if (iter->type & ITER_KVEC) { > .. code for kvec > } else { > .. must be discard > }
diff --git a/include/linux/uio.h b/include/linux/uio.h index 72d88566694e..c5970b2d3307 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -57,27 +57,27 @@ static inline enum iter_type iov_iter_type(const struct iov_iter *i) static inline bool iter_is_iovec(const struct iov_iter *i) { - return iov_iter_type(i) == ITER_IOVEC; + return iov_iter_type(i) & ITER_IOVEC; } static inline bool iov_iter_is_kvec(const struct iov_iter *i) { - return iov_iter_type(i) == ITER_KVEC; + return iov_iter_type(i) & ITER_KVEC; } static inline bool iov_iter_is_bvec(const struct iov_iter *i) { - return iov_iter_type(i) == ITER_BVEC; + return iov_iter_type(i) & ITER_BVEC; } static inline bool iov_iter_is_pipe(const struct iov_iter *i) { - return iov_iter_type(i) == ITER_PIPE; + return iov_iter_type(i) & ITER_PIPE; } static inline bool iov_iter_is_discard(const struct iov_iter *i) { - return iov_iter_type(i) == ITER_DISCARD; + return iov_iter_type(i) & ITER_DISCARD; } static inline unsigned char iov_iter_rw(const struct iov_iter *i)
The problem here is that iov_iter_is_*() helpers check types for equality, but all iterate_* helpers do bitwise ands. This confuses a compiler, so even if some cases were handled separately with iov_iter_is_*(), it can't eliminate and skip unreachable branches in following iterate*(). E.g. iov_iter_npages() first handles bvecs and discards, but iterate_all_kinds() still generate branches for them. Making checks consistent solves that. size lib/iov_iter.o before: text data bss dec hex filename 24409 805 0 25214 627e lib/iov_iter.o after: text data bss dec hex filename 23785 793 0 24578 6002 lib/iov_iter.o Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> --- include/linux/uio.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)