Message ID | cedfa436-47a3-7cbc-1948-75d0e28cfdc5@kernel.dk (mailing list archive)
---|---
State | New, archived
Series | pipe: honor IOCB_NOWAIT
On Mon, Sep 07, 2020 at 09:21:02AM -0600, Jens Axboe wrote:
> Pipe only looks at O_NONBLOCK for non-blocking operation, which means that
> io_uring can't easily poll for it or attempt non-blocking issues. Check for
> IOCB_NOWAIT in locking the pipe for reads and writes, and ditto when we
> decide on whether or not to block or return -EAGAIN.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>
> ---
>
> If this is acceptable, then I can add S_ISFIFO to the whitelist on file
> descriptors we can IOCB_NOWAIT try for, then poll if we get -EAGAIN
> instead of using thread offload.

Will check. In the meanwhile, blacklist eventpoll again, because your
attempts at "nonblocking" there are both ugly as hell *AND* fail to
prevent blocking. And frankly, I'm very tempted to rip that crap out
entirely. Seriously, *look* at the code you've modified in
do_epoll_ctl(). And tell me why the hell grabbing ->mtx in that function
needs to be infested with trylocks, while the exact same mutex taken in
loop_check_proc(), called under those, is fine with mutex_lock(). Ditto
for calls of vfs_poll() inside ep_insert(), GFP_KERNEL allocations in
ep_ptable_queue_proc(), synchronize_rcu() callable from ep_modify() (from
the same function), et sodding cetera.

No, this is _not_ an invitation to spread the same crap over even more
places in there; I just want to understand where that kind of voodoo
approach comes from. And that's directly relevant for this patch, because
it looks like the same kind of thing.

What are your semantics for IOCB_NOWAIT? What should and what should
_not_ be waited for?
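To make the inconsistency Al is pointing at concrete: a trylock at the outer
entry point buys nothing if a path reached *under* that lock takes a mutex of
the same kind unconditionally. The sketch below is a userspace pthread
illustration only, not the actual fs/eventpoll.c code; outer_ctl() and
inner_check() are made-up stand-ins for do_epoll_ctl() and loop_check_proc().

#include <errno.h>
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t outer_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t inner_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Stands in for loop_check_proc(): takes a mutex unconditionally. */
static void inner_check(void)
{
	pthread_mutex_lock(&inner_mtx);   /* can sleep -- defeats the trylock below */
	/* ... walk nested epolls, vfs_poll(), GFP_KERNEL allocations ... */
	pthread_mutex_unlock(&inner_mtx);
}

/* Stands in for do_epoll_ctl() with the "nonblocking" trylock. */
static int outer_ctl(void)
{
	if (pthread_mutex_trylock(&outer_mtx) != 0)
		return -EAGAIN;           /* "non-blocking"... */
	inner_check();                    /* ...except it can still block here */
	pthread_mutex_unlock(&outer_mtx);
	return 0;
}

int main(void)
{
	printf("outer_ctl() = %d\n", outer_ctl());
	return 0;
}

The trylock only guards the first step; every subsequent blocking operation
under it makes the whole call blocking anyway, which is Al's point.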
On 9/10/20 7:12 PM, Al Viro wrote:
> On Mon, Sep 07, 2020 at 09:21:02AM -0600, Jens Axboe wrote:
>> Pipe only looks at O_NONBLOCK for non-blocking operation, which means that
>> io_uring can't easily poll for it or attempt non-blocking issues. Check for
>> IOCB_NOWAIT in locking the pipe for reads and writes, and ditto when we
>> decide on whether or not to block or return -EAGAIN.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>
>> ---
>>
>> If this is acceptable, then I can add S_ISFIFO to the whitelist on file
>> descriptors we can IOCB_NOWAIT try for, then poll if we get -EAGAIN
>> instead of using thread offload.
>
> Will check.

Thanks!

> In the meanwhile, blacklist eventpoll again, because your
> attempts at "nonblocking" there are both ugly as hell *AND* fail to
> prevent blocking. And frankly, I'm very tempted to rip that crap out
> entirely. Seriously, *look* at the code you've modified in
> do_epoll_ctl(). And tell me why the hell grabbing ->mtx in that function
> needs to be infested with trylocks, while the exact same mutex taken in
> loop_check_proc(), called under those, is fine with mutex_lock(). Ditto
> for calls of vfs_poll() inside ep_insert(), GFP_KERNEL allocations in
> ep_ptable_queue_proc(), synchronize_rcu() callable from ep_modify() (from
> the same function), et sodding cetera.

Ugh, I missed the loop_check_proc() part :/ The original patch wasn't that
pretty; in my defense, the whole thing is pretty ugly to begin with. It'd
need serious rewriting to become palatable. If you want to yank the patch,
I've got no problem with that.

> No, this is _not_ an invitation to spread the same crap over even more
> places in there; I just want to understand where that kind of voodoo
> approach comes from. And that's directly relevant for this patch, because
> it looks like the same kind of thing.

The pipe patch is way cleaner and pretty much to the point. I don't think
that's a fair comparison.

> What are your semantics for IOCB_NOWAIT? What should and what should
> _not_ be waited for?

Basically: don't wait for space on a write if the pipe is full, and don't
wait for data on a read if it's empty. It's O_NONBLOCK for the single
operation, instead of as a file state. The locking isn't strictly required,
and it's basically impossible to avoid various deeper-down items like
memory allocations potentially blocking, so...
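For readers less familiar with the distinction Jens draws here: O_NONBLOCK is
file-wide state set with fcntl(), while RWF_NOWAIT (the per-call flag that
becomes IOCB_NOWAIT inside the kernel) applies to a single read or write. A
minimal userspace illustration follows; note that whether a pipe honors
RWF_NOWAIT at all depends on this patch being applied, so on kernels without
it the second call may fail with EINVAL or simply block.

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

int main(void)
{
	int fds[2];
	char buf[64];
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };

	if (pipe(fds) < 0)
		return 1;

	/* File-wide: every read on fds[0] is now non-blocking. */
	fcntl(fds[0], F_SETFL, O_NONBLOCK);
	if (read(fds[0], buf, sizeof(buf)) < 0)
		printf("O_NONBLOCK read: %s\n", strerror(errno));

	/* Per-operation: clear O_NONBLOCK, ask for it on this call only. */
	fcntl(fds[0], F_SETFL, 0);
	if (preadv2(fds[0], &iov, 1, -1, RWF_NOWAIT) < 0)
		printf("RWF_NOWAIT read: %s\n", strerror(errno));

	return 0;
}

On an empty pipe with the patch applied, both reads should report EAGAIN; the
difference is that the second leaves the file's blocking behavior untouched
for every other user of the descriptor.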
diff --git a/fs/pipe.c b/fs/pipe.c
index 60dbee457143..3cee28e35985 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -82,6 +82,13 @@ void pipe_unlock(struct pipe_inode_info *pipe)
 }
 EXPORT_SYMBOL(pipe_unlock);
 
+static inline bool __pipe_lock_nonblock(struct pipe_inode_info *pipe)
+{
+	if (mutex_trylock(&pipe->mutex))
+		return true;
+	return false;
+}
+
 static inline void __pipe_lock(struct pipe_inode_info *pipe)
 {
 	mutex_lock_nested(&pipe->mutex, I_MUTEX_PARENT);
@@ -244,7 +251,12 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
 		return 0;
 
 	ret = 0;
-	__pipe_lock(pipe);
+	if (iocb->ki_flags & IOCB_NOWAIT) {
+		if (!__pipe_lock_nonblock(pipe))
+			return -EAGAIN;
+	} else {
+		__pipe_lock(pipe);
+	}
 
 	/*
 	 * We only wake up writers if the pipe was full when we started
@@ -344,7 +356,8 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
 			break;
 		if (ret)
 			break;
-		if (filp->f_flags & O_NONBLOCK) {
+		if ((filp->f_flags & O_NONBLOCK) ||
+		    (iocb->ki_flags & IOCB_NOWAIT)) {
 			ret = -EAGAIN;
 			break;
 		}
@@ -432,7 +445,12 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from)
 	if (unlikely(total_len == 0))
 		return 0;
 
-	__pipe_lock(pipe);
+	if (iocb->ki_flags & IOCB_NOWAIT) {
+		if (!__pipe_lock_nonblock(pipe))
+			return -EAGAIN;
+	} else {
+		__pipe_lock(pipe);
+	}
 
 	if (!pipe->readers) {
 		send_sig(SIGPIPE, current, 0);
@@ -554,7 +572,8 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from)
 			continue;
 
 		/* Wait for buffer space to become available. */
-		if (filp->f_flags & O_NONBLOCK) {
+		if ((filp->f_flags & O_NONBLOCK) ||
+		    (iocb->ki_flags & IOCB_NOWAIT)) {
 			if (!ret)
 				ret = -EAGAIN;
 			break;
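One stylistic note on the helper introduced above: mutex_trylock() already
returns 1 on success and 0 on contention, so the if/return-true/return-false
body could collapse to a single return. A sketch, functionally identical to
the version in the diff:

static inline bool __pipe_lock_nonblock(struct pipe_inode_info *pipe)
{
	/* mutex_trylock() returns 1 on success, 0 if contended. */
	return mutex_trylock(&pipe->mutex);
}

Either way, note that the trylock path skips the mutex_lock_nested()
I_MUTEX_PARENT annotation that __pipe_lock() uses, which is fine for lockdep
since trylocks cannot themselves deadlock.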
Pipe only looks at O_NONBLOCK for non-blocking operation, which means that
io_uring can't easily poll for it or attempt non-blocking issues. Check for
IOCB_NOWAIT in locking the pipe for reads and writes, and ditto when we
decide on whether or not to block or return -EAGAIN.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

---

If this is acceptable, then I can add S_ISFIFO to the whitelist on file
descriptors we can IOCB_NOWAIT try for, then poll if we get -EAGAIN
instead of using thread offload.
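The S_ISFIFO whitelist follow-up described in the cover letter might look
roughly like the sketch below. This is hypothetical: io_file_supports_nowait()
is a made-up name for whatever io_uring's file-capability check is called,
and the surrounding checks are illustrative, not the actual fs/io_uring.c
code.

/* Hypothetical sketch: let io_uring attempt IOCB_NOWAIT I/O on FIFOs,
 * falling back to poll (rather than thread offload) when it gets -EAGAIN.
 */
static bool io_file_supports_nowait(struct file *file)
{
	umode_t mode = file_inode(file)->i_mode;

	/* With the pipe patch applied, FIFOs honor IOCB_NOWAIT too. */
	if (S_ISFIFO(mode))
		return true;
	/* ... existing checks for other file types would remain ... */
	return false;
}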