Message ID | 20221107175610.349807-1-kbusch@meta.com (mailing list archive) |
---|---|
Series | io_uring: use ITER_UBUF |
On Mon, Nov 07, 2022 at 09:56:06AM -0800, Keith Busch wrote:
> 1. io_uring will always prefer using the _iter versions of read/write
>    callbacks if file_operations implement both, whereas the generic
>    syscalls will use .read/.write (if implemented) for non-vectored IO.

There are very few file operations that have both, and for those the
difference matters, e.g. the strange vectors semantics for the sound
code. I would strongly suggest to mirror what the normal read/write path
does here.

> 2. io_uring will use the ITER_UBUF representation for single vector
>    readv/writev, but the generic syscalls currently use ITER_IOVEC for
>    these.

Same here. It might be worth to use ITER_UBUF for single vector
readv/writev, but this should be the same for all interfaces. I'd
suggest to drop this for now and do a separate series with careful
review from Al for this.
On Mon, Nov 07, 2022 at 10:54:06PM -0800, Christoph Hellwig wrote:
> On Mon, Nov 07, 2022 at 09:56:06AM -0800, Keith Busch wrote:
> > 1. io_uring will always prefer using the _iter versions of read/write
> >    callbacks if file_operations implement both, whereas the generic
> >    syscalls will use .read/.write (if implemented) for non-vectored IO.
>
> There are very few file operations that have both, and for those
> the difference matters, e.g. the strange vectors semantics for the
> sound code.

Yes, thankfully there are not many. Other than the two mentioned
file_operations, the only other fops I find implementing both are
'null_ops' and 'zero_ops'; those are fine. And one other implements just
.write/.write_iter: trace_events_user.c, which is also fine.

> I would strongly suggest to mirror what the normal
> read/write path does here.

I don't think we can change that now. io_uring has always used the
.{read,write}_iter callbacks if available ever since it introduced
non-vectored read/write (3a6820f2bb8a0). Altering the io_uring op's ABI
to align with the read/write syscalls seems risky.

But I don't think there are any real use cases affected by this series
anyway.

> > 2. io_uring will use the ITER_UBUF representation for single vector
> >    readv/writev, but the generic syscalls currently use ITER_IOVEC for
> >    these.
>
> Same here. It might be worth to use ITER_UBUF for single vector
> readv/writev, but this should be the same for all interfaces. I'd
> suggest to drop this for now and do a separate series with careful
> review from Al for this.

I feel like that's a worthy longer term goal, but I'll start looking
into it now.
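As a rough illustration of point 2 in this thread, the following userspace sketch models how an import step might choose between a single-buffer representation and a segment-array representation. It is not kernel code: `seg_model`, `iter_model`, `import_model()` and the `MODEL_ITER_*` constants are hypothetical names made up for this example, and the real iov_iter structure and import helpers differ in detail. The sketch models the io_uring behaviour described in the series (one segment becomes a ubuf); per the cover letter, the generic readv/writev syscalls currently keep ITER_IOVEC even for one segment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iovec: one user buffer segment. */
struct seg_model {
    void *base;
    size_t len;
};

/* Hypothetical stand-ins for the ITER_IOVEC / ITER_UBUF iterator types. */
enum iter_kind_model { MODEL_ITER_IOVEC, MODEL_ITER_UBUF };

/* Hypothetical, heavily simplified iterator: either one flat buffer
 * (ubuf) or a pointer to an array of segments. */
struct iter_model {
    enum iter_kind_model kind;
    size_t count;                  /* total bytes described */
    void *ubuf;                    /* valid when kind == MODEL_ITER_UBUF */
    const struct seg_model *segs;  /* valid when kind == MODEL_ITER_IOVEC */
    size_t nr_segs;
};

/* Models the io_uring choice from the series: a single-vector request is
 * represented as one flat buffer, anything else as a segment array. */
static struct iter_model import_model(const struct seg_model *segs,
                                      size_t nr_segs)
{
    struct iter_model it = { MODEL_ITER_IOVEC, 0, NULL, segs, nr_segs };
    size_t i;

    assert(segs != NULL && nr_segs > 0);

    if (nr_segs == 1) {
        /* Single range: no need to walk a segment array later. */
        it.kind = MODEL_ITER_UBUF;
        it.ubuf = segs[0].base;
        it.count = segs[0].len;
        it.segs = NULL;
        it.nr_segs = 0;
        return it;
    }

    /* Multiple ranges: keep the array and sum the lengths. */
    for (i = 0; i < nr_segs; i++)
        it.count += segs[i].len;
    return it;
}
```

Under these assumptions, a driver that only understands the segment-array form would mishandle the single-buffer case, which is the kind of breakage the cover letter attributes to snd_pcm_f_ops.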
From: Keith Busch <kbusch@kernel.org>

ITER_UBUF is a more efficient representation when using single vector
buffers, providing small optimizations in the fast path. Most of this
series came from Jens; I just ported them forward to the current release
and tested against various filesystems and devices.

Usage for this new iter type has been extensively exercised via
read/write syscall interface for some time now, so I don't expect
surprises from supporting this with io_uring. There are, however, a
couple of differences between the two interfaces:

1. io_uring will always prefer using the _iter versions of read/write
   callbacks if file_operations implement both, whereas the generic
   syscalls will use .read/.write (if implemented) for non-vectored IO.

2. io_uring will use the ITER_UBUF representation for single vector
   readv/writev, but the generic syscalls currently use ITER_IOVEC for
   these.

That should mean, then, that the only potential problem areas are
file_operations that implement both .read/.read_iter or
.write/.write_iter. Fortunately there are very few that do that, and I
found only two of them that won't readily work: qib_file_ops, and
snd_pcm_f_ops. The former is already broken with io_uring before this
series, and the latter's vectored read/write only works with ITER_IOVEC,
so that will break, but I don't think anyone is using io_uring to talk
to a sound card driver.

Jens Axboe (3):
  iov: add import_ubuf()
  io_uring: switch network send/recv to ITER_UBUF
  io_uring: use ubuf for single range imports for read/write

Keith Busch (1):
  iov_iter: move iter_ubuf check inside restore WARN

 include/linux/uio.h |  1 +
 io_uring/net.c      | 13 ++++---------
 io_uring/rw.c       |  9 ++++++---
 lib/iov_iter.c      | 15 +++++++++++++--
 4 files changed, 24 insertions(+), 14 deletions(-)
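Difference 1 in the cover letter comes down to the order in which the two callbacks are tried. The sketch below models that ordering in plain userspace C. It is an illustration only: `fops_model`, `read_path`, `syscall_read_path()` and `io_uring_read_path()` are hypothetical names invented here, and the real kernel paths involve considerably more than a two-way check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file_operations: records only whether
 * each read callback is implemented. */
struct fops_model {
    int has_read;       /* .read is implemented */
    int has_read_iter;  /* .read_iter is implemented */
};

enum read_path { PATH_NONE, PATH_READ, PATH_READ_ITER };

/* Generic non-vectored read syscall, as described in the cover letter:
 * .read is used if implemented, otherwise .read_iter. */
static enum read_path syscall_read_path(const struct fops_model *f)
{
    assert(f != NULL);
    if (f->has_read)
        return PATH_READ;
    if (f->has_read_iter)
        return PATH_READ_ITER;
    return PATH_NONE;
}

/* io_uring, as described in the cover letter: .read_iter is preferred,
 * with .read as the fallback. */
static enum read_path io_uring_read_path(const struct fops_model *f)
{
    assert(f != NULL);
    if (f->has_read_iter)
        return PATH_READ_ITER;
    if (f->has_read)
        return PATH_READ;
    return PATH_NONE;
}

/* The two interfaces can only disagree when both callbacks exist. */
static int paths_differ(const struct fops_model *f)
{
    return syscall_read_path(f) != io_uring_read_path(f);
}
```

In this model `paths_differ()` is true only for a file that sets both flags, which matches the cover letter's reasoning that the risk is confined to file_operations implementing both variants.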