From patchwork Wed Feb 15 12:00:38 2023
X-Patchwork-Submitter: Konrad Dybcio
X-Patchwork-Id: 13141570
From: Konrad Dybcio
Date: Wed, 15 Feb 2023 13:00:38 +0100
Subject: [PATCH 1/2] Revert "iov_iter: Kill ITER_PIPE"
Message-Id: <20230215-topic-next-20230214-revert-v1-1-c58cd87b9086@linaro.org>
References: <20230215-topic-next-20230214-revert-v1-0-c58cd87b9086@linaro.org>
In-Reply-To: <20230215-topic-next-20230214-revert-v1-0-c58cd87b9086@linaro.org>
To: Alexander Viro, Steve French, Paulo Alcantara, Ronnie Sahlberg,
    Shyam Prasad N, Tom Talpey, "Matthew Wilcox (Oracle)", Andrew Morton
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-cifs@vger.kernel.org, samba-technical@lists.samba.org,
    linux-mm@kvack.org, Konrad Dybcio
X-Mailer: b4 0.12.1
Reverting this is a dependency for reverting commit d9722a475711 ("splice:
Do splice read from a buffered file without using ITER_PIPE").

This reverts commit 82cf0207bed44feb0b3b8b17a4c351fdde34a97b.
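For context, this is roughly how the splice path consumes the interface being
restored here (a minimal sketch of the read side, mirroring the
generic_file_splice_read() code brought back in patch 2/2; the O_DIRECT branch
and most error handling are left out):

	struct iov_iter to;
	struct kiocb kiocb;
	int ret;

	/* describe the pipe as the destination of a buffered read */
	iov_iter_pipe(&to, ITER_DEST, pipe, len);
	init_sync_kiocb(&kiocb, in);
	kiocb.ki_pos = *ppos;

	/* ->read_iter() fills pipe buffers through the ITER_PIPE helpers */
	ret = call_read_iter(in, &kiocb, &to);
	if (ret > 0) {
		*ppos = kiocb.ki_pos;
		file_accessed(in);
	} else if (ret < 0) {
		/* free whatever was already emitted into the pipe */
		pipe_discard_from(pipe, to.start_head);
	}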
Signed-off-by: Konrad Dybcio --- fs/cifs/file.c | 8 +- include/linux/uio.h | 14 ++ lib/iov_iter.c | 435 +++++++++++++++++++++++++++++++++++++++++++++++++++- mm/filemap.c | 3 +- 4 files changed, 455 insertions(+), 5 deletions(-) diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 9e0c03be032b..052609172690 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -3833,7 +3833,13 @@ cifs_readdata_to_iov(struct cifs_readdata *rdata, struct iov_iter *iter) size_t copy = min_t(size_t, remaining, PAGE_SIZE); size_t written; - written = copy_page_to_iter(page, 0, copy, iter); + if (unlikely(iov_iter_is_pipe(iter))) { + void *addr = kmap_atomic(page); + + written = copy_to_iter(addr, copy, iter); + kunmap_atomic(addr); + } else + written = copy_page_to_iter(page, 0, copy, iter); remaining -= written; if (written < copy && iov_iter_count(iter) > 0) break; diff --git a/include/linux/uio.h b/include/linux/uio.h index 74598426edb4..27e3fd942960 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -11,6 +11,7 @@ #include struct page; +struct pipe_inode_info; typedef unsigned int __bitwise iov_iter_extraction_t; @@ -24,6 +25,7 @@ enum iter_type { ITER_IOVEC, ITER_KVEC, ITER_BVEC, + ITER_PIPE, ITER_XARRAY, ITER_DISCARD, ITER_UBUF, @@ -53,10 +55,15 @@ struct iov_iter { const struct kvec *kvec; const struct bio_vec *bvec; struct xarray *xarray; + struct pipe_inode_info *pipe; void __user *ubuf; }; union { unsigned long nr_segs; + struct { + unsigned int head; + unsigned int start_head; + }; loff_t xarray_start; }; }; @@ -94,6 +101,11 @@ static inline bool iov_iter_is_bvec(const struct iov_iter *i) return iov_iter_type(i) == ITER_BVEC; } +static inline bool iov_iter_is_pipe(const struct iov_iter *i) +{ + return iov_iter_type(i) == ITER_PIPE; +} + static inline bool iov_iter_is_discard(const struct iov_iter *i) { return iov_iter_type(i) == ITER_DISCARD; @@ -235,6 +247,8 @@ void iov_iter_kvec(struct iov_iter *i, unsigned int direction, const struct kvec unsigned long nr_segs, size_t count); void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_vec *bvec, unsigned long nr_segs, size_t count); +void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe, + size_t count); void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count); void iov_iter_xarray(struct iov_iter *i, unsigned int direction, struct xarray *xarray, loff_t start, size_t count); diff --git a/lib/iov_iter.c b/lib/iov_iter.c index fad95e4cf372..02a73ca52207 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -14,6 +14,8 @@ #include #include +#define PIPE_PARANOIA /* for now */ + /* covers ubuf and kbuf alike */ #define iterate_buf(i, n, base, len, off, __p, STEP) { \ size_t __maybe_unused off = 0; \ @@ -184,6 +186,156 @@ static int copyin(void *to, const void __user *from, size_t n) return res; } +static inline struct pipe_buffer *pipe_buf(const struct pipe_inode_info *pipe, + unsigned int slot) +{ + return &pipe->bufs[slot & (pipe->ring_size - 1)]; +} + +#ifdef PIPE_PARANOIA +static bool sanity(const struct iov_iter *i) +{ + struct pipe_inode_info *pipe = i->pipe; + unsigned int p_head = pipe->head; + unsigned int p_tail = pipe->tail; + unsigned int p_occupancy = pipe_occupancy(p_head, p_tail); + unsigned int i_head = i->head; + unsigned int idx; + + if (i->last_offset) { + struct pipe_buffer *p; + if (unlikely(p_occupancy == 0)) + goto Bad; // pipe must be non-empty + if (unlikely(i_head != p_head - 1)) + goto Bad; // must be at the last buffer... 
+ + p = pipe_buf(pipe, i_head); + if (unlikely(p->offset + p->len != abs(i->last_offset))) + goto Bad; // ... at the end of segment + } else { + if (i_head != p_head) + goto Bad; // must be right after the last buffer + } + return true; +Bad: + printk(KERN_ERR "idx = %d, offset = %d\n", i_head, i->last_offset); + printk(KERN_ERR "head = %d, tail = %d, buffers = %d\n", + p_head, p_tail, pipe->ring_size); + for (idx = 0; idx < pipe->ring_size; idx++) + printk(KERN_ERR "[%p %p %d %d]\n", + pipe->bufs[idx].ops, + pipe->bufs[idx].page, + pipe->bufs[idx].offset, + pipe->bufs[idx].len); + WARN_ON(1); + return false; +} +#else +#define sanity(i) true +#endif + +static struct page *push_anon(struct pipe_inode_info *pipe, unsigned size) +{ + struct page *page = alloc_page(GFP_USER); + if (page) { + struct pipe_buffer *buf = pipe_buf(pipe, pipe->head++); + *buf = (struct pipe_buffer) { + .ops = &default_pipe_buf_ops, + .page = page, + .offset = 0, + .len = size + }; + } + return page; +} + +static void push_page(struct pipe_inode_info *pipe, struct page *page, + unsigned int offset, unsigned int size) +{ + struct pipe_buffer *buf = pipe_buf(pipe, pipe->head++); + *buf = (struct pipe_buffer) { + .ops = &page_cache_pipe_buf_ops, + .page = page, + .offset = offset, + .len = size + }; + get_page(page); +} + +static inline int last_offset(const struct pipe_buffer *buf) +{ + if (buf->ops == &default_pipe_buf_ops) + return buf->len; // buf->offset is 0 for those + else + return -(buf->offset + buf->len); +} + +static struct page *append_pipe(struct iov_iter *i, size_t size, + unsigned int *off) +{ + struct pipe_inode_info *pipe = i->pipe; + int offset = i->last_offset; + struct pipe_buffer *buf; + struct page *page; + + if (offset > 0 && offset < PAGE_SIZE) { + // some space in the last buffer; add to it + buf = pipe_buf(pipe, pipe->head - 1); + size = min_t(size_t, size, PAGE_SIZE - offset); + buf->len += size; + i->last_offset += size; + i->count -= size; + *off = offset; + return buf->page; + } + // OK, we need a new buffer + *off = 0; + size = min_t(size_t, size, PAGE_SIZE); + if (pipe_full(pipe->head, pipe->tail, pipe->max_usage)) + return NULL; + page = push_anon(pipe, size); + if (!page) + return NULL; + i->head = pipe->head - 1; + i->last_offset = size; + i->count -= size; + return page; +} + +static size_t copy_page_to_iter_pipe(struct page *page, size_t offset, size_t bytes, + struct iov_iter *i) +{ + struct pipe_inode_info *pipe = i->pipe; + unsigned int head = pipe->head; + + if (unlikely(bytes > i->count)) + bytes = i->count; + + if (unlikely(!bytes)) + return 0; + + if (!sanity(i)) + return 0; + + if (offset && i->last_offset == -offset) { // could we merge it? 
+ struct pipe_buffer *buf = pipe_buf(pipe, head - 1); + if (buf->page == page) { + buf->len += bytes; + i->last_offset -= bytes; + i->count -= bytes; + return bytes; + } + } + if (pipe_full(pipe->head, pipe->tail, pipe->max_usage)) + return 0; + + push_page(pipe, page, offset, bytes); + i->last_offset = -(offset + bytes); + i->head = head; + i->count -= bytes; + return bytes; +} + /* * fault_in_iov_iter_readable - fault in iov iterator for reading * @i: iterator @@ -287,6 +439,46 @@ void iov_iter_init(struct iov_iter *i, unsigned int direction, } EXPORT_SYMBOL(iov_iter_init); +// returns the offset in partial buffer (if any) +static inline unsigned int pipe_npages(const struct iov_iter *i, int *npages) +{ + struct pipe_inode_info *pipe = i->pipe; + int used = pipe->head - pipe->tail; + int off = i->last_offset; + + *npages = max((int)pipe->max_usage - used, 0); + + if (off > 0 && off < PAGE_SIZE) { // anon and not full + (*npages)++; + return off; + } + return 0; +} + +static size_t copy_pipe_to_iter(const void *addr, size_t bytes, + struct iov_iter *i) +{ + unsigned int off, chunk; + + if (unlikely(bytes > i->count)) + bytes = i->count; + if (unlikely(!bytes)) + return 0; + + if (!sanity(i)) + return 0; + + for (size_t n = bytes; n; n -= chunk) { + struct page *page = append_pipe(i, n, &off); + chunk = min_t(size_t, n, PAGE_SIZE - off); + if (!page) + return bytes - n; + memcpy_to_page(page, off, addr, chunk); + addr += chunk; + } + return bytes; +} + static __wsum csum_and_memcpy(void *to, const void *from, size_t len, __wsum sum, size_t off) { @@ -294,10 +486,44 @@ static __wsum csum_and_memcpy(void *to, const void *from, size_t len, return csum_block_add(sum, next, off); } +static size_t csum_and_copy_to_pipe_iter(const void *addr, size_t bytes, + struct iov_iter *i, __wsum *sump) +{ + __wsum sum = *sump; + size_t off = 0; + unsigned int chunk, r; + + if (unlikely(bytes > i->count)) + bytes = i->count; + if (unlikely(!bytes)) + return 0; + + if (!sanity(i)) + return 0; + + while (bytes) { + struct page *page = append_pipe(i, bytes, &r); + char *p; + + if (!page) + break; + chunk = min_t(size_t, bytes, PAGE_SIZE - r); + p = kmap_local_page(page); + sum = csum_and_memcpy(p + r, addr + off, chunk, sum, off); + kunmap_local(p); + off += chunk; + bytes -= chunk; + } + *sump = sum; + return off; +} + size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i) { if (WARN_ON_ONCE(i->data_source)) return 0; + if (unlikely(iov_iter_is_pipe(i))) + return copy_pipe_to_iter(addr, bytes, i); if (user_backed_iter(i)) might_fault(); iterate_and_advance(i, bytes, base, len, off, @@ -319,6 +545,42 @@ static int copyout_mc(void __user *to, const void *from, size_t n) return n; } +static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes, + struct iov_iter *i) +{ + size_t xfer = 0; + unsigned int off, chunk; + + if (unlikely(bytes > i->count)) + bytes = i->count; + if (unlikely(!bytes)) + return 0; + + if (!sanity(i)) + return 0; + + while (bytes) { + struct page *page = append_pipe(i, bytes, &off); + unsigned long rem; + char *p; + + if (!page) + break; + chunk = min_t(size_t, bytes, PAGE_SIZE - off); + p = kmap_local_page(page); + rem = copy_mc_to_kernel(p + off, addr + xfer, chunk); + chunk -= rem; + kunmap_local(p); + xfer += chunk; + bytes -= chunk; + if (rem) { + iov_iter_revert(i, rem); + break; + } + } + return xfer; +} + /** * _copy_mc_to_iter - copy to iter with source memory error exception handling * @addr: source kernel address @@ -338,8 +600,9 @@ static int 
copyout_mc(void __user *to, const void *from, size_t n) * alignment and poison alignment assumptions to avoid re-triggering * hardware exceptions. * - * * ITER_KVEC and ITER_BVEC can return short copies. Compare to - * copy_to_iter() where only ITER_IOVEC attempts might return a short copy. + * * ITER_KVEC, ITER_PIPE, and ITER_BVEC can return short copies. + * Compare to copy_to_iter() where only ITER_IOVEC attempts might return + * a short copy. * * Return: number of bytes copied (may be %0) */ @@ -347,6 +610,8 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i) { if (WARN_ON_ONCE(i->data_source)) return 0; + if (unlikely(iov_iter_is_pipe(i))) + return copy_mc_pipe_to_iter(addr, bytes, i); if (user_backed_iter(i)) might_fault(); __iterate_and_advance(i, bytes, base, len, off, @@ -452,6 +717,8 @@ size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes, return 0; if (WARN_ON_ONCE(i->data_source)) return 0; + if (unlikely(iov_iter_is_pipe(i))) + return copy_page_to_iter_pipe(page, offset, bytes, i); page += offset / PAGE_SIZE; // first subpage offset %= PAGE_SIZE; while (1) { @@ -500,8 +767,36 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes, } EXPORT_SYMBOL(copy_page_from_iter); +static size_t pipe_zero(size_t bytes, struct iov_iter *i) +{ + unsigned int chunk, off; + + if (unlikely(bytes > i->count)) + bytes = i->count; + if (unlikely(!bytes)) + return 0; + + if (!sanity(i)) + return 0; + + for (size_t n = bytes; n; n -= chunk) { + struct page *page = append_pipe(i, n, &off); + char *p; + + if (!page) + return bytes - n; + chunk = min_t(size_t, n, PAGE_SIZE - off); + p = kmap_local_page(page); + memset(p + off, 0, chunk); + kunmap_local(p); + } + return bytes; +} + size_t iov_iter_zero(size_t bytes, struct iov_iter *i) { + if (unlikely(iov_iter_is_pipe(i))) + return pipe_zero(bytes, i); iterate_and_advance(i, bytes, base, len, count, clear_user(base, len), memset(base, 0, len) @@ -532,6 +827,32 @@ size_t copy_page_from_iter_atomic(struct page *page, unsigned offset, size_t byt } EXPORT_SYMBOL(copy_page_from_iter_atomic); +static void pipe_advance(struct iov_iter *i, size_t size) +{ + struct pipe_inode_info *pipe = i->pipe; + int off = i->last_offset; + + if (!off && !size) { + pipe_discard_from(pipe, i->start_head); // discard everything + return; + } + i->count -= size; + while (1) { + struct pipe_buffer *buf = pipe_buf(pipe, i->head); + if (off) /* make it relative to the beginning of buffer */ + size += abs(off) - buf->offset; + if (size <= buf->len) { + buf->len = size; + i->last_offset = last_offset(buf); + break; + } + size -= buf->len; + i->head++; + off = 0; + } + pipe_discard_from(pipe, i->head + 1); // discard everything past this one +} + static void iov_iter_bvec_advance(struct iov_iter *i, size_t size) { const struct bio_vec *bvec, *end; @@ -583,6 +904,8 @@ void iov_iter_advance(struct iov_iter *i, size_t size) iov_iter_iovec_advance(i, size); } else if (iov_iter_is_bvec(i)) { iov_iter_bvec_advance(i, size); + } else if (iov_iter_is_pipe(i)) { + pipe_advance(i, size); } else if (iov_iter_is_discard(i)) { i->count -= size; } @@ -596,6 +919,26 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll) if (WARN_ON(unroll > MAX_RW_COUNT)) return; i->count += unroll; + if (unlikely(iov_iter_is_pipe(i))) { + struct pipe_inode_info *pipe = i->pipe; + unsigned int head = pipe->head; + + while (head > i->start_head) { + struct pipe_buffer *b = pipe_buf(pipe, --head); + if (unroll < b->len) { + b->len -= unroll; + 
i->last_offset = last_offset(b); + i->head = head; + return; + } + unroll -= b->len; + pipe_buf_release(pipe, b); + pipe->head--; + } + i->last_offset = 0; + i->head = head; + return; + } if (unlikely(iov_iter_is_discard(i))) return; if (unroll <= i->iov_offset) { @@ -683,6 +1026,24 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction, } EXPORT_SYMBOL(iov_iter_bvec); +void iov_iter_pipe(struct iov_iter *i, unsigned int direction, + struct pipe_inode_info *pipe, + size_t count) +{ + BUG_ON(direction != READ); + WARN_ON(pipe_full(pipe->head, pipe->tail, pipe->ring_size)); + *i = (struct iov_iter){ + .iter_type = ITER_PIPE, + .data_source = false, + .pipe = pipe, + .head = pipe->head, + .start_head = pipe->head, + .last_offset = 0, + .count = count + }; +} +EXPORT_SYMBOL(iov_iter_pipe); + /** * iov_iter_xarray - Initialise an I/O iterator to use the pages in an xarray * @i: The iterator to initialise. @@ -807,6 +1168,19 @@ bool iov_iter_is_aligned(const struct iov_iter *i, unsigned addr_mask, if (iov_iter_is_bvec(i)) return iov_iter_aligned_bvec(i, addr_mask, len_mask); + if (iov_iter_is_pipe(i)) { + size_t size = i->count; + + if (size & len_mask) + return false; + if (size && i->last_offset > 0) { + if (i->last_offset & addr_mask) + return false; + } + + return true; + } + if (iov_iter_is_xarray(i)) { if (i->count & len_mask) return false; @@ -876,6 +1250,14 @@ unsigned long iov_iter_alignment(const struct iov_iter *i) if (iov_iter_is_bvec(i)) return iov_iter_alignment_bvec(i); + if (iov_iter_is_pipe(i)) { + size_t size = i->count; + + if (size && i->last_offset > 0) + return size | i->last_offset; + return size; + } + if (iov_iter_is_xarray(i)) return (i->xarray_start + i->iov_offset) | i->count; @@ -927,6 +1309,36 @@ static int want_pages_array(struct page ***res, size_t size, return count; } +static ssize_t pipe_get_pages(struct iov_iter *i, + struct page ***pages, size_t maxsize, unsigned maxpages, + size_t *start) +{ + unsigned int npages, count, off, chunk; + struct page **p; + size_t left; + + if (!sanity(i)) + return -EFAULT; + + *start = off = pipe_npages(i, &npages); + if (!npages) + return -EFAULT; + count = want_pages_array(pages, maxsize, off, min(npages, maxpages)); + if (!count) + return -ENOMEM; + p = *pages; + for (npages = 0, left = maxsize ; npages < count; npages++, left -= chunk) { + struct page *page = append_pipe(i, left, &off); + if (!page) + break; + chunk = min_t(size_t, left, PAGE_SIZE - off); + get_page(*p++ = page); + } + if (!npages) + return -EFAULT; + return maxsize - left; +} + static ssize_t iter_xarray_populate_pages(struct page **pages, struct xarray *xa, pgoff_t index, unsigned int nr_pages) { @@ -1076,6 +1488,8 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i, } return maxsize; } + if (iov_iter_is_pipe(i)) + return pipe_get_pages(i, pages, maxsize, maxpages, start); if (iov_iter_is_xarray(i)) return iter_xarray_get_pages(i, pages, maxsize, maxpages, start); return -EFAULT; @@ -1165,7 +1579,9 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *_csstate, } sum = csum_shift(csstate->csum, csstate->off); - iterate_and_advance(i, bytes, base, len, off, ({ + if (unlikely(iov_iter_is_pipe(i))) + bytes = csum_and_copy_to_pipe_iter(addr, bytes, i, &sum); + else iterate_and_advance(i, bytes, base, len, off, ({ next = csum_and_copy_to_user(addr + off, base, len); sum = csum_block_add(sum, next, off); next ? 
0 : len; @@ -1250,6 +1666,15 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages) return iov_npages(i, maxpages); if (iov_iter_is_bvec(i)) return bvec_npages(i, maxpages); + if (iov_iter_is_pipe(i)) { + int npages; + + if (!sanity(i)) + return 0; + + pipe_npages(i, &npages); + return min(npages, maxpages); + } if (iov_iter_is_xarray(i)) { unsigned offset = (i->xarray_start + i->iov_offset) % PAGE_SIZE; int npages = DIV_ROUND_UP(offset + i->count, PAGE_SIZE); @@ -1262,6 +1687,10 @@ EXPORT_SYMBOL(iov_iter_npages); const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags) { *new = *old; + if (unlikely(iov_iter_is_pipe(new))) { + WARN_ON(1); + return NULL; + } if (iov_iter_is_bvec(new)) return new->bvec = kmemdup(new->bvec, new->nr_segs * sizeof(struct bio_vec), diff --git a/mm/filemap.c b/mm/filemap.c index d647b2e7cd51..9cf984f7c25e 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2692,7 +2692,8 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter, if (unlikely(iocb->ki_pos >= i_size_read(inode))) break; - error = filemap_get_pages(iocb, iter->count, &fbatch, false); + error = filemap_get_pages(iocb, iter->count, &fbatch, + iov_iter_is_pipe(iter)); if (error < 0) break;

From patchwork Wed Feb 15 12:00:39 2023
X-Patchwork-Submitter: Konrad Dybcio
X-Patchwork-Id: 13141571
From: Konrad Dybcio
Date: Wed, 15 Feb 2023 13:00:39 +0100
Subject: [PATCH 2/2] Revert "splice: Do splice read from a buffered file without using ITER_PIPE"
Message-Id: <20230215-topic-next-20230214-revert-v1-2-c58cd87b9086@linaro.org>
References: <20230215-topic-next-20230214-revert-v1-0-c58cd87b9086@linaro.org>
In-Reply-To: <20230215-topic-next-20230214-revert-v1-0-c58cd87b9086@linaro.org>
To: Alexander Viro, Steve French, Paulo Alcantara, Ronnie Sahlberg,
    Shyam Prasad N, Tom Talpey, "Matthew Wilcox (Oracle)", Andrew Morton
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-cifs@vger.kernel.org, samba-technical@lists.samba.org,
    linux-mm@kvack.org, Konrad Dybcio
X-Mailer: b4 0.12.1

next-20230213 introduced commit d9722a475711 ("splice: Do splice read from a
buffered file without using ITER_PIPE"), which broke booting on any Qualcomm
ARM64 device I grabbed, dereferencing a null pointer in
generic_file_splice_read+0xf8/0x598. Revert it to make the devices bootable
again.

This reverts commit d9722a47571104f7fa1eeb5ec59044d3607c6070.
Signed-off-by: Konrad Dybcio --- fs/splice.c | 159 +++++++++--------------------------------------------------- 1 file changed, 24 insertions(+), 135 deletions(-) diff --git a/fs/splice.c b/fs/splice.c index fa82dfee1ed0..10b258250868 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -22,7 +22,6 @@ #include #include #include -#include #include #include #include @@ -378,135 +377,6 @@ static ssize_t generic_file_direct_splice_read(struct file *in, loff_t *ppos, return ret; } -/* - * Splice subpages from a folio into a pipe. - */ -static size_t splice_folio_into_pipe(struct pipe_inode_info *pipe, - struct folio *folio, - loff_t fpos, size_t size) -{ - struct page *page; - size_t spliced = 0, offset = offset_in_folio(folio, fpos); - - page = folio_page(folio, offset / PAGE_SIZE); - size = min(size, folio_size(folio) - offset); - offset %= PAGE_SIZE; - - while (spliced < size && - !pipe_full(pipe->head, pipe->tail, pipe->max_usage)) { - struct pipe_buffer *buf = &pipe->bufs[pipe->head & (pipe->ring_size - 1)]; - size_t part = min_t(size_t, PAGE_SIZE - offset, size - spliced); - - *buf = (struct pipe_buffer) { - .ops = &page_cache_pipe_buf_ops, - .page = page, - .offset = offset, - .len = part, - }; - folio_get(folio); - pipe->head++; - page++; - spliced += part; - offset = 0; - } - - return spliced; -} - -/* - * Splice folios from the pagecache of a buffered (ie. non-O_DIRECT) file into - * a pipe. - */ -static ssize_t generic_file_buffered_splice_read(struct file *in, loff_t *ppos, - struct pipe_inode_info *pipe, - size_t len, - unsigned int flags) -{ - struct folio_batch fbatch; - size_t total_spliced = 0, used, npages; - loff_t isize, end_offset; - bool writably_mapped; - int i, error = 0; - - struct kiocb iocb = { - .ki_filp = in, - .ki_pos = *ppos, - }; - - /* Work out how much data we can actually add into the pipe */ - used = pipe_occupancy(pipe->head, pipe->tail); - npages = max_t(ssize_t, pipe->max_usage - used, 0); - len = min_t(size_t, len, npages * PAGE_SIZE); - - folio_batch_init(&fbatch); - - do { - cond_resched(); - - if (*ppos >= i_size_read(file_inode(in))) - break; - - iocb.ki_pos = *ppos; - error = filemap_get_pages(&iocb, len, &fbatch, true); - if (error < 0) - break; - - /* - * i_size must be checked after we know the pages are Uptodate. - * - * Checking i_size after the check allows us to calculate - * the correct value for "nr", which means the zero-filled - * part of the page is not copied back to userspace (unless - * another truncate extends the file - this is desired though). - */ - isize = i_size_read(file_inode(in)); - if (unlikely(*ppos >= isize)) - break; - end_offset = min_t(loff_t, isize, *ppos + len); - - /* - * Once we start copying data, we don't want to be touching any - * cachelines that might be contended: - */ - writably_mapped = mapping_writably_mapped(in->f_mapping); - - for (i = 0; i < folio_batch_count(&fbatch); i++) { - struct folio *folio = fbatch.folios[i]; - size_t n; - - if (folio_pos(folio) >= end_offset) - goto out; - folio_mark_accessed(folio); - - /* - * If users can be writing to this folio using arbitrary - * virtual addresses, take care of potential aliasing - * before reading the folio on the kernel side. 
- */ - if (writably_mapped) - flush_dcache_folio(folio); - - n = splice_folio_into_pipe(pipe, folio, *ppos, len); - if (!n) - goto out; - len -= n; - total_spliced += n; - *ppos += n; - in->f_ra.prev_pos = *ppos; - if (pipe_full(pipe->head, pipe->tail, pipe->max_usage)) - goto out; - } - - folio_batch_release(&fbatch); - } while (len); - -out: - folio_batch_release(&fbatch); - file_accessed(in); - - return total_spliced ? total_spliced : error; -} - /** * generic_file_splice_read - splice data from file to a pipe * @in: file to splice from @@ -524,13 +394,32 @@ ssize_t generic_file_splice_read(struct file *in, loff_t *ppos, struct pipe_inode_info *pipe, size_t len, unsigned int flags) { - if (unlikely(*ppos >= file_inode(in)->i_sb->s_maxbytes)) - return 0; - if (unlikely(!len)) - return 0; + struct iov_iter to; + struct kiocb kiocb; + int ret; + if (in->f_flags & O_DIRECT) return generic_file_direct_splice_read(in, ppos, pipe, len, flags); - return generic_file_buffered_splice_read(in, ppos, pipe, len, flags); + + iov_iter_pipe(&to, ITER_DEST, pipe, len); + init_sync_kiocb(&kiocb, in); + kiocb.ki_pos = *ppos; + ret = call_read_iter(in, &kiocb, &to); + if (ret > 0) { + *ppos = kiocb.ki_pos; + file_accessed(in); + } else if (ret < 0) { + /* free what was emitted */ + pipe_discard_from(pipe, to.start_head); + /* + * callers of ->splice_read() expect -EAGAIN on + * "can't put anything in there", rather than -EFAULT. + */ + if (ret == -EFAULT) + ret = -EAGAIN; + } + + return ret; } EXPORT_SYMBOL(generic_file_splice_read);