Message ID | 20241017160937.2283225-6-kbusch@meta.com (mailing list archive) |
---|---|
State | New |
Series | write hints for nvme fdp |
Same hint vs write stream thing here as well.

> +	if (ddir == ITER_SOURCE &&
> +	    req->file->f_op->fop_flags & FOP_PER_IO_HINTS)
> +		rw->kiocb.ki_write_hint = READ_ONCE(sqe->write_hint);
> +	else
> +		rw->kiocb.ki_write_hint = WRITE_LIFE_NOT_SET;

WRITE_LIFE_NOT_SET is in the wrong namespace vs the separate streams.
Either use 0 directly or add a separate constant for it.
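As a rough illustration of the second option (a separate constant), the sketch below models the prep-time branch in plain userspace C. The name IO_WRITE_STREAM_NONE and the helper prep_write_stream() are hypothetical; neither appears in the patch or in the kernel, and the booleans merely stand in for the ddir == ITER_SOURCE and FOP_PER_IO_HINTS checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel, invented for this sketch: no per-I/O stream given. */
#define IO_WRITE_STREAM_NONE 0

/*
 * Userspace stand-in for the branch added to io_prep_rw().
 * is_write models ddir == ITER_SOURCE; per_io_hints models the
 * FOP_PER_IO_HINTS test on the file operations.
 */
static uint16_t prep_write_stream(bool is_write, bool per_io_hints,
				  uint16_t sqe_write_hint)
{
	if (is_write && per_io_hints)
		return sqe_write_hint;
	return IO_WRITE_STREAM_NONE;
}

/* Small self-check exercising the three cases. */
void prep_write_stream_selftest(void)
{
	assert(prep_write_stream(true, true, 5) == 5);
	assert(prep_write_stream(false, true, 5) == IO_WRITE_STREAM_NONE);
	assert(prep_write_stream(true, false, 5) == IO_WRITE_STREAM_NONE);
}
```

The point of the dedicated name is only that the "nothing set" value no longer borrows an identifier from the lifetime-hint enum; the numeric value stays 0 either way.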
On 10/17/24 18:09, Keith Busch wrote:
> From: Kanchan Joshi <joshi.k@samsung.com>
>
> With F_SET_RW_HINT fcntl, user can set a hint on the file inode, and
> all the subsequent writes on the file pass that hint value down. This
> can be limiting for block device as all the writes can be tagged with
> only one lifetime hint value. Concurrent writes (with different hint
> values) are hard to manage. Per-IO hinting solves that problem.
>
> Allow userspace to pass additional metadata in the SQE.
>
> 	__u16 write_hint;
>
> This accepts all hint values that the file allows.
>
> The write handlers (io_prep_rw, io_write) send the hint value to
> lower-layer using kiocb. This is good for supporting direct IO, but not
> when kiocb is not available (e.g., buffered IO).
>
> When per-io hints are not passed, the per-inode hint values are set in
> the kiocb (as before). Otherwise, per-io hints take the precedence over
> per-inode hints.
>
> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  include/uapi/linux/io_uring.h |  4 ++++
>  io_uring/rw.c                 | 11 +++++++++--
>  2 files changed, 13 insertions(+), 2 deletions(-)
>
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 86cb385fe0b53..bd9acc0053318 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -92,6 +92,10 @@ struct io_uring_sqe {
 			__u16	addr_len;
 			__u16	__pad3[1];
 		};
+		struct {
+			__u16	write_hint;
+			__u16	__pad4[1];
+		};
 	};
 	union {
 		struct {
diff --git a/io_uring/rw.c b/io_uring/rw.c
index ffd637ca0bd17..9a6d3ba76af4f 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -279,7 +279,11 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		rw->kiocb.ki_ioprio = get_current_ioprio();
 	}
 	rw->kiocb.dio_complete = NULL;
-
+	if (ddir == ITER_SOURCE &&
+	    req->file->f_op->fop_flags & FOP_PER_IO_HINTS)
+		rw->kiocb.ki_write_hint = READ_ONCE(sqe->write_hint);
+	else
+		rw->kiocb.ki_write_hint = WRITE_LIFE_NOT_SET;
 	rw->addr = READ_ONCE(sqe->addr);
 	rw->len = READ_ONCE(sqe->len);
 	rw->flags = READ_ONCE(sqe->rw_flags);
@@ -1027,7 +1031,10 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	if (unlikely(ret))
 		return ret;
 	req->cqe.res = iov_iter_count(&io->iter);
-	rw->kiocb.ki_write_hint = file_write_hint(rw->kiocb.ki_filp);
+
+	/* Use per-file hint only if per-io hint is not set. */
+	if (rw->kiocb.ki_write_hint == WRITE_LIFE_NOT_SET)
+		rw->kiocb.ki_write_hint = file_write_hint(rw->kiocb.ki_filp);
 
 	if (force_nonblock) {
 		/* If the file doesn't support async, just async punt */
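The two hunks in io_uring/rw.c together implement a two-stage precedence rule: the prep step records the per-I/O hint when the file opts in, and the issue step falls back to the per-inode hint only when nothing was recorded. The following is a minimal userspace model of that flow, not kernel code. struct model_file, model_prep() and model_issue() are hypothetical names made up for the sketch; MODEL_HINT_NOT_SET stands in for WRITE_LIFE_NOT_SET, which has the value 0 in the kernel's enum rw_hint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for WRITE_LIFE_NOT_SET (value 0 in enum rw_hint). */
#define MODEL_HINT_NOT_SET 0

/* Hypothetical model of the bits of file state the patch consults. */
struct model_file {
	bool per_io_hints;	/* models FOP_PER_IO_HINTS in fop_flags */
	uint16_t inode_hint;	/* models what file_write_hint() returns */
};

/* Models the io_prep_rw() hunk: take the SQE hint only for opted-in writes. */
static uint16_t model_prep(const struct model_file *f, bool is_write,
			   uint16_t sqe_write_hint)
{
	if (is_write && f->per_io_hints)
		return sqe_write_hint;
	return MODEL_HINT_NOT_SET;
}

/* Models the io_write() hunk: per-inode hint is used only as a fallback. */
static uint16_t model_issue(const struct model_file *f, uint16_t kiocb_hint)
{
	if (kiocb_hint == MODEL_HINT_NOT_SET)
		return f->inode_hint;
	return kiocb_hint;
}

/* Convenience wrapper: the hint a write would finally carry. */
uint16_t model_effective_hint(const struct model_file *f,
			      uint16_t sqe_write_hint)
{
	return model_issue(f, model_prep(f, true, sqe_write_hint));
}
```

In this model an SQE hint of 0 is indistinguishable from "no per-I/O hint", so the per-inode value is used in that case, which matches the fallback test in the io_write() hunk.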