Message ID | 20190315145938.21516-4-axboe@kernel.dk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | io_uring improvements | expand |
On 3/15/19 8:59 AM, Jens Axboe wrote: > Add hint on whether a read was served out of the page cache, or if it > hit media. This is useful for buffered async IO, O_DIRECT reads would > never have this set (for obvious reasons). > > If the read hit page cache, cqe->flags will have IOCQE_FLAG_CACHEHIT > set. Linus, curious on your opinion on this one. I had this as part of the original series, but removed it from the pull request due to the mincore etc discussions. Leaving the patch intact for you. This is part of a small series of improvements, which sit on top of a series of fixes for this release. > diff --git a/fs/io_uring.c b/fs/io_uring.c > index f66a4a5daf35..513df722702e 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -635,10 +635,14 @@ static void kiocb_end_write(struct kiocb *kiocb) > static void io_complete_rw(struct kiocb *kiocb, long res, long res2) > { > struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw); > + unsigned ev_flags = 0; > > kiocb_end_write(kiocb); > > - io_cqring_add_event(req->ctx, req->user_data, res, 0); > + if (res > 0 && (req->flags & REQ_F_FORCE_NONBLOCK)) > + ev_flags = IOCQE_FLAG_CACHEHIT; > + > + io_cqring_add_event(req->ctx, req->user_data, res, ev_flags); > io_put_req(req); > } > > diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h > index e23408692118..24906e99fdc7 100644 > --- a/include/uapi/linux/io_uring.h > +++ b/include/uapi/linux/io_uring.h > @@ -69,6 +69,11 @@ struct io_uring_cqe { > __u32 flags; > }; > > +/* > + * io_uring_event->flags > + */ > +#define IOCQE_FLAG_CACHEHIT (1U << 0) /* IO did not hit media */ > + > /* > * Magic offsets for the application to mmap the data it needs > */ >
On Fri, Mar 15, 2019 at 9:27 AM Jens Axboe <axboe@kernel.dk> wrote: > > Linus, curious on your opinion on this one. I had this as part of the > original series, but removed it from the pull request due to the > mincore etc discussions. I'd rather not have new ways to leak cache information, and in fact already the IOCB_NOWAIT is a bit questionable for that. But afaik, the non-O_DIRECT paths don't even support it to begin with for some filesystems. Wasn't the whole point of this io_ring that we'd move *away* from direct block access and O_DIRECT? I'm seeing a lot of stuff that looks like just "more of the same old O_DIRECT garbage" that seems to be all about thinking that IO doesn't cache. Haven't we gotten over that already? It was one of the arguments you used for io_ring to begin with. Linus
On 3/15/19 7:34 PM, Linus Torvalds wrote: > On Fri, Mar 15, 2019 at 9:27 AM Jens Axboe <axboe@kernel.dk> wrote: >> >> Linus, curious on your opinion on this one. I had this as part of the >> original series, but removed it from the pull request due to the >> mincore etc discussions. > > I'd rather not have new ways to leak cache information, and in fact > already the IOCB_NOWAIT is a bit questionable for that. But afaik, the > non-O_DIRECT paths don't even support it to begin with for some > filesystems. For the most popular it works fine, though. For buffered async IO it'd be nice to have it be fully reliable, so we can ensure we punt when we have to. Currently for writes, we just ignore it and punt to async, since it doesn't work at all there. The latter should be fixable. On the hint, I hear ya, that's what I figured. I'll drop the patch for now. > Wasn't the whole point of this io_ring that we'd move *away* from > direct block access and O_DIRECT? > > I'm seeing a lot of stuff that looks like just "more of the same old > O_DIRECT garbage" that seems to be all about thinking that IO doesn't > cache. > > Haven't we gotten over that already? It was one of the arguments you > used for io_ring to begin with. There are still folks that use it, even if O_DIRECT is nasty. If you want to get peak performance, there's no way around it. That doesn't mean that io_uring is in any way centered around O_DIRECT at all, it isn't. At its core, it's "just" an efficient delivery mechanism for submissions and completions. Not sure what "stuff" you are referring to, the patches in this short series? That's not centered around block devices at all, works just fine for files. The bvec iterator is what io_uring uses for fixed buffers, it's not exclusive to block devices at all.
diff --git a/fs/io_uring.c b/fs/io_uring.c index f66a4a5daf35..513df722702e 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -635,10 +635,14 @@ static void kiocb_end_write(struct kiocb *kiocb) static void io_complete_rw(struct kiocb *kiocb, long res, long res2) { struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw); + unsigned ev_flags = 0; kiocb_end_write(kiocb); - io_cqring_add_event(req->ctx, req->user_data, res, 0); + if (res > 0 && (req->flags & REQ_F_FORCE_NONBLOCK)) + ev_flags = IOCQE_FLAG_CACHEHIT; + + io_cqring_add_event(req->ctx, req->user_data, res, ev_flags); io_put_req(req); } diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index e23408692118..24906e99fdc7 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -69,6 +69,11 @@ struct io_uring_cqe { __u32 flags; }; +/* + * io_uring_event->flags + */ +#define IOCQE_FLAG_CACHEHIT (1U << 0) /* IO did not hit media */ + /* * Magic offsets for the application to mmap the data it needs */
Add hint on whether a read was served out of the page cache, or if it hit media. This is useful for buffered async IO, O_DIRECT reads would never have this set (for obvious reasons). If the read hit page cache, cqe->flags will have IOCQE_FLAG_CACHEHIT set. Signed-off-by: Jens Axboe <axboe@kernel.dk> --- fs/io_uring.c | 6 +++++- include/uapi/linux/io_uring.h | 5 +++++ 2 files changed, 10 insertions(+), 1 deletion(-)