From patchwork Sat Sep 24 03:59:51 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Al Viro X-Patchwork-Id: 9349023 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 95681607F0 for ; Sat, 24 Sep 2016 04:00:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 889E12A145 for ; Sat, 24 Sep 2016 04:00:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7CD482A14B; Sat, 24 Sep 2016 04:00:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=unavailable version=3.3.1 Received: from oss.sgi.com (oss.sgi.com [192.48.182.195]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id E48542A145 for ; Sat, 24 Sep 2016 03:59:59 +0000 (UTC) Received: from oss.sgi.com (localhost [IPv6:::1]) by oss.sgi.com (Postfix) with ESMTP id 396307CA7; Fri, 23 Sep 2016 22:59:59 -0500 (CDT) X-Original-To: xfs@oss.sgi.com Delivered-To: xfs@oss.sgi.com Received: from relay.sgi.com (relay2.corp.sgi.com [137.38.102.29]) by oss.sgi.com (Postfix) with ESMTP id 32C6C7CA0 for ; Fri, 23 Sep 2016 22:59:57 -0500 (CDT) Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by relay2.corp.sgi.com (Postfix) with ESMTP id 03F3C304039 for ; Fri, 23 Sep 2016 20:59:56 -0700 (PDT) X-ASG-Debug-ID: 1474689594-0bf57c18ca3cfa0001-NocioJ Received: from ZenIV.linux.org.uk (zeniv.linux.org.uk [195.92.253.2]) by cuda.sgi.com with ESMTP id wMH4zxvvoHNl7Pgs (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Fri, 23 Sep 2016 20:59:54 -0700 (PDT) X-Barracuda-Envelope-From: viro@ftp.linux.org.uk X-Barracuda-Effective-Source-IP: zeniv.linux.org.uk[195.92.253.2] X-Barracuda-Apparent-Source-IP: 195.92.253.2 Received: from viro by ZenIV.linux.org.uk with local (Exim 4.86_2 #1 (Red Hat Linux)) id 1bne7z-0005EN-6d; Sat, 24 Sep 2016 03:59:51 +0000 Date: Sat, 24 Sep 2016 04:59:51 +0100 From: Al Viro To: Linus Torvalds Subject: [PATCH 04/12] splice: lift pipe_lock out of splice_to_pipe() Message-ID: <20160924035951.GN2356@ZenIV.linux.org.uk> X-ASG-Orig-Subj: [PATCH 04/12] splice: lift pipe_lock out of splice_to_pipe() References: <20160914031648.GB2356@ZenIV.linux.org.uk> <20160914042559.GC2356@ZenIV.linux.org.uk> <20160917082007.GA6489@ZenIV.linux.org.uk> <20160917190023.GA8039@ZenIV.linux.org.uk> <20160923190032.GA25771@ZenIV.linux.org.uk> <20160923190326.GB2356@ZenIV.linux.org.uk> <20160923201025.GJ2356@ZenIV.linux.org.uk> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.6.1 (2016-04-27) X-Barracuda-Connect: zeniv.linux.org.uk[195.92.253.2] X-Barracuda-Start-Time: 1474689594 X-Barracuda-Encrypted: ECDHE-RSA-AES256-GCM-SHA384 X-Barracuda-URL: https://192.48.176.15:443/cgi-mod/mark.cgi X-Barracuda-Scan-Msg-Size: 6747 X-Virus-Scanned: by bsmtpd at sgi.com X-Barracuda-BRTS-Status: 1 X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=1000.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.7 tests=BSF_SC0_MISMATCH_TO X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.3.33176 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 BSF_SC0_MISMATCH_TO Envelope rcpt doesn't match header Cc: Jens Axboe , CAI Qian , Nick Piggin , xfs@oss.sgi.com, linux-xfs , linux-fsdevel X-BeenThere: xfs@oss.sgi.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: XFS Filesystem from SGI List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: xfs-bounces@oss.sgi.com Sender: xfs-bounces@oss.sgi.com X-Virus-Scanned: ClamAV using ClamSMTP * splice_to_pipe() stops at pipe overflow and does *not* take pipe_lock * ->splice_read() instances do the same * vmsplice_to_pipe() and do_splice() (ultimate callers of splice_to_pipe()) arrange for waiting, looping, etc. themselves. That should make pipe_lock the outermost one. Unfortunately, existing rules for the amount passed by vmsplice_to_pipe() and do_splice() are quite ugly _and_ userland code can be easily broken by changing those. It's not even "no more than the maximal capacity of this pipe" - it's "once we'd fed pipe->nr_buffers pages into the pipe, leave instead of waiting". Considering how poorly these rules are documented, let's try "wait for some space to appear, unless given SPLICE_F_NONBLOCK, then push into pipe and if we run into overflow, we are done". Signed-off-by: Al Viro --- fs/fuse/dev.c | 2 - fs/splice.c | 138 +++++++++++++++++++++++++++------------------------------- 2 files changed, 63 insertions(+), 77 deletions(-) diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index a94d2ed..eaf56c6 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -1364,7 +1364,6 @@ static ssize_t fuse_dev_splice_read(struct file *in, loff_t *ppos, goto out; ret = 0; - pipe_lock(pipe); if (!pipe->readers) { send_sig(SIGPIPE, current, 0); @@ -1400,7 +1399,6 @@ static ssize_t fuse_dev_splice_read(struct file *in, loff_t *ppos, } out_unlock: - pipe_unlock(pipe); if (do_wakeup) { smp_mb(); diff --git a/fs/splice.c b/fs/splice.c index 31c52e0..02daa61 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -183,79 +183,41 @@ ssize_t splice_to_pipe(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd) { unsigned int spd_pages = spd->nr_pages; - int ret, do_wakeup, page_nr; + int ret = 0, page_nr = 0; if (!spd_pages) return 0; - ret = 0; - do_wakeup = 0; - page_nr = 0; - - pipe_lock(pipe); - - for (;;) { - if (!pipe->readers) { - send_sig(SIGPIPE, current, 0); - if (!ret) - ret = -EPIPE; - break; - } - - if (pipe->nrbufs < pipe->buffers) { - int newbuf = (pipe->curbuf + pipe->nrbufs) & (pipe->buffers - 1); - struct pipe_buffer *buf = pipe->bufs + newbuf; - - buf->page = spd->pages[page_nr]; - buf->offset = spd->partial[page_nr].offset; - buf->len = spd->partial[page_nr].len; - buf->private = spd->partial[page_nr].private; - buf->ops = spd->ops; - if (spd->flags & SPLICE_F_GIFT) - buf->flags |= PIPE_BUF_FLAG_GIFT; - - pipe->nrbufs++; - page_nr++; - ret += buf->len; - - if (pipe->files) - do_wakeup = 1; + if (unlikely(!pipe->readers)) { + send_sig(SIGPIPE, current, 0); + ret = -EPIPE; + goto out; + } - if (!--spd->nr_pages) - break; - if (pipe->nrbufs < pipe->buffers) - continue; + while (pipe->nrbufs < pipe->buffers) { + int newbuf = (pipe->curbuf + pipe->nrbufs) & (pipe->buffers - 1); + struct pipe_buffer *buf = pipe->bufs + newbuf; - break; - } + buf->page = spd->pages[page_nr]; + buf->offset = spd->partial[page_nr].offset; + buf->len = spd->partial[page_nr].len; + buf->private = spd->partial[page_nr].private; + buf->ops = spd->ops; + if (spd->flags & SPLICE_F_GIFT) + buf->flags |= PIPE_BUF_FLAG_GIFT; - if (spd->flags & SPLICE_F_NONBLOCK) { - if (!ret) - ret = -EAGAIN; - break; - } + pipe->nrbufs++; + page_nr++; + ret += buf->len; - if (signal_pending(current)) { - if (!ret) - ret = -ERESTARTSYS; + if (!--spd->nr_pages) break; - } - - if (do_wakeup) { - wakeup_pipe_readers(pipe); - do_wakeup = 0; - } - - pipe->waiting_writers++; - pipe_wait(pipe); - pipe->waiting_writers--; } - pipe_unlock(pipe); - - if (do_wakeup) - wakeup_pipe_readers(pipe); + if (!ret) + ret = -EAGAIN; +out: while (page_nr < spd_pages) spd->spd_release(spd, page_nr++); @@ -1339,6 +1301,20 @@ long do_splice_direct(struct file *in, loff_t *ppos, struct file *out, } EXPORT_SYMBOL(do_splice_direct); +static int wait_for_space(struct pipe_inode_info *pipe, unsigned flags) +{ + while (pipe->nrbufs == pipe->buffers) { + if (flags & SPLICE_F_NONBLOCK) + return -EAGAIN; + if (signal_pending(current)) + return -ERESTARTSYS; + pipe->waiting_writers++; + pipe_wait(pipe); + pipe->waiting_writers--; + } + return 0; +} + static int splice_pipe_to_pipe(struct pipe_inode_info *ipipe, struct pipe_inode_info *opipe, size_t len, unsigned int flags); @@ -1421,8 +1397,13 @@ static long do_splice(struct file *in, loff_t __user *off_in, offset = in->f_pos; } - ret = do_splice_to(in, &offset, opipe, len, flags); - + pipe_lock(opipe); + ret = wait_for_space(opipe, flags); + if (!ret) + ret = do_splice_to(in, &offset, opipe, len, flags); + pipe_unlock(opipe); + if (ret > 0) + wakeup_pipe_readers(opipe); if (!off_in) in->f_pos = offset; else if (copy_to_user(off_in, &offset, sizeof(loff_t))) @@ -1434,22 +1415,23 @@ static long do_splice(struct file *in, loff_t __user *off_in, return -EINVAL; } -static int get_iovec_page_array(struct iov_iter *from, +static int get_iovec_page_array(const struct iov_iter *from, struct page **pages, struct partial_page *partial, unsigned int pipe_buffers) { + struct iov_iter i = *from; int buffers = 0; - while (iov_iter_count(from)) { + while (iov_iter_count(&i)) { ssize_t copied; size_t start; - copied = iov_iter_get_pages(from, pages + buffers, ~0UL, + copied = iov_iter_get_pages(&i, pages + buffers, ~0UL, pipe_buffers - buffers, &start); if (copied <= 0) return buffers ? buffers : copied; - iov_iter_advance(from, copied); + iov_iter_advance(&i, copied); while (copied) { int size = min_t(int, copied, PAGE_SIZE - start); partial[buffers].offset = start; @@ -1546,14 +1528,20 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *uiov, return -ENOMEM; } - spd.nr_pages = get_iovec_page_array(&from, spd.pages, - spd.partial, - spd.nr_pages_max); - if (spd.nr_pages <= 0) - ret = spd.nr_pages; - else - ret = splice_to_pipe(pipe, &spd); - + pipe_lock(pipe); + ret = wait_for_space(pipe, flags); + if (!ret) { + spd.nr_pages = get_iovec_page_array(&from, spd.pages, + spd.partial, + spd.nr_pages_max); + if (spd.nr_pages <= 0) + ret = spd.nr_pages; + else + ret = splice_to_pipe(pipe, &spd); + pipe_unlock(pipe); + if (ret > 0) + wakeup_pipe_readers(pipe); + } splice_shrink_spd(&spd); kfree(iov); return ret;