From: David Howells
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
    David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton,
    linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Christoph Hellwig, John Hubbard
Subject: [PATCH v14 02/17] splice: Add a func to do a splice from a buffered file without ITER_PIPE
Date: Tue, 14 Feb 2023 17:13:15 +0000
Message-Id: <20230214171330.2722188-3-dhowells@redhat.com>
In-Reply-To: <20230214171330.2722188-1-dhowells@redhat.com>
References: <20230214171330.2722188-1-dhowells@redhat.com>
X-Patchwork-Id: 13140599

Provide a function to do splice read from a buffered file, pulling the
folios out of the pagecache directly by calling filemap_get_pages() to do
any required reading and then pasting the returned folios into the pipe.

A helper function is provided to do the actual folio pasting and will
handle multipage folios by splicing as many of the relevant subpages as
will fit into the pipe.

The code is loosely based on filemap_read() and might belong in
mm/filemap.c with that as it needs to use filemap_get_pages().

Signed-off-by: David Howells
cc: Jens Axboe
cc: Christoph Hellwig
cc: Al Viro
cc: David Hildenbrand
cc: John Hubbard
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---

Notes:
    ver #14)
     - Rename to filemap_splice_read().
     - Create a helper, pipe_head_buf(), to get the head buffer.
     - Use init_sync_kiocb().
     - Move to mm/filemap.c.
     - Split the implementation of filemap_splice_read() from the patch to
       make generic_file_splice_read() use it and direct_splice_read().

 include/linux/fs.h |   3 ++
 mm/filemap.c       | 128 +++++++++++++++++++++++++++++++++++++++++++++
 mm/internal.h      |   6 +++
 3 files changed, 137 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index c1769a2c5d70..28743e38df91 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3163,6 +3163,9 @@ ssize_t vfs_iocb_iter_write(struct file *file, struct kiocb *iocb,
 			    struct iov_iter *iter);
 
 /* fs/splice.c */
+ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
+			    struct pipe_inode_info *pipe,
+			    size_t len, unsigned int flags);
 extern ssize_t generic_file_splice_read(struct file *, loff_t *,
 		struct pipe_inode_info *, size_t, unsigned int);
 extern ssize_t iter_file_splice_write(struct pipe_inode_info *,
diff --git a/mm/filemap.c b/mm/filemap.c
index 876e77278d2a..8c7b135c8e23 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -42,6 +42,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include "internal.h"
@@ -2842,6 +2844,132 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 }
 EXPORT_SYMBOL(generic_file_read_iter);
 
+/*
+ * Splice subpages from a folio into a pipe.
+ */
+size_t splice_folio_into_pipe(struct pipe_inode_info *pipe,
+			      struct folio *folio, loff_t fpos, size_t size)
+{
+	struct page *page;
+	size_t spliced = 0, offset = offset_in_folio(folio, fpos);
+
+	page = folio_page(folio, offset / PAGE_SIZE);
+	size = min(size, folio_size(folio) - offset);
+	offset %= PAGE_SIZE;
+
+	while (spliced < size &&
+	       !pipe_full(pipe->head, pipe->tail, pipe->max_usage)) {
+		struct pipe_buffer *buf = pipe_head_buf(pipe);
+		size_t part = min_t(size_t, PAGE_SIZE - offset, size - spliced);
+
+		*buf = (struct pipe_buffer) {
+			.ops	= &page_cache_pipe_buf_ops,
+			.page	= page,
+			.offset	= offset,
+			.len	= part,
+		};
+		folio_get(folio);
+		pipe->head++;
+		page++;
+		spliced += part;
+		offset = 0;
+	}
+
+	return spliced;
+}
+
+/*
+ * Splice folios from the pagecache of a buffered (ie. non-O_DIRECT) file into
+ * a pipe.
+ */
+ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
+			    struct pipe_inode_info *pipe,
+			    size_t len, unsigned int flags)
+{
+	struct folio_batch fbatch;
+	struct kiocb iocb;
+	size_t total_spliced = 0, used, npages;
+	loff_t isize, end_offset;
+	bool writably_mapped;
+	int i, error = 0;
+
+	init_sync_kiocb(&iocb, in);
+	iocb.ki_pos = *ppos;
+
+	/* Work out how much data we can actually add into the pipe */
+	used = pipe_occupancy(pipe->head, pipe->tail);
+	npages = max_t(ssize_t, pipe->max_usage - used, 0);
+	len = min_t(size_t, len, npages * PAGE_SIZE);
+
+	folio_batch_init(&fbatch);
+
+	do {
+		cond_resched();
+
+		if (*ppos >= i_size_read(file_inode(in)))
+			break;
+
+		iocb.ki_pos = *ppos;
+		error = filemap_get_pages(&iocb, len, &fbatch, true);
+		if (error < 0)
+			break;
+
+		/*
+		 * i_size must be checked after we know the pages are Uptodate.
+		 *
+		 * Checking i_size after the check allows us to calculate
+		 * the correct value for "nr", which means the zero-filled
+		 * part of the page is not copied back to userspace (unless
+		 * another truncate extends the file - this is desired though).
+		 */
+		isize = i_size_read(file_inode(in));
+		if (unlikely(*ppos >= isize))
+			break;
+		end_offset = min_t(loff_t, isize, *ppos + len);
+
+		/*
+		 * Once we start copying data, we don't want to be touching any
+		 * cachelines that might be contended:
+		 */
+		writably_mapped = mapping_writably_mapped(in->f_mapping);
+
+		for (i = 0; i < folio_batch_count(&fbatch); i++) {
+			struct folio *folio = fbatch.folios[i];
+			size_t n;
+
+			if (folio_pos(folio) >= end_offset)
+				goto out;
+			folio_mark_accessed(folio);
+
+			/*
+			 * If users can be writing to this folio using arbitrary
+			 * virtual addresses, take care of potential aliasing
+			 * before reading the folio on the kernel side.
+			 */
+			if (writably_mapped)
+				flush_dcache_folio(folio);
+
+			n = splice_folio_into_pipe(pipe, folio, *ppos, len);
+			if (!n)
+				goto out;
+			len -= n;
+			total_spliced += n;
+			*ppos += n;
+			in->f_ra.prev_pos = *ppos;
+			if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
+				goto out;
+		}
+
+		folio_batch_release(&fbatch);
+	} while (len);
+
+out:
+	folio_batch_release(&fbatch);
+	file_accessed(in);
+
+	return total_spliced ? total_spliced : error;
+}
+
 static inline loff_t folio_seek_hole_data(struct xa_state *xas,
 		struct address_space *mapping, struct folio *folio,
 		loff_t start, loff_t end, bool seek_data)
diff --git a/mm/internal.h b/mm/internal.h
index bcf75a8b032d..6d4ca98f3844 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -794,6 +794,12 @@ struct migration_target_control {
 	gfp_t gfp_mask;
 };
 
+/*
+ * mm/filemap.c
+ */
+size_t splice_folio_into_pipe(struct pipe_inode_info *pipe,
+			      struct folio *folio, loff_t fpos, size_t size);
+
 /*
  * mm/vmalloc.c
  */
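
For readers following the subpage arithmetic in splice_folio_into_pipe(),
here is a rough user-space sketch of the same splitting logic. It is not
kernel code and not part of the patch: every sketch_* name, the fixed
4096-byte SKETCH_PAGE_SIZE and the 16-slot ring are assumptions made purely
for illustration, and page references (folio_get()) are not modelled at all.
Pages are represented by their index within the folio.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, illustration only */
#define SKETCH_RING_SIZE 16u	/* assumed ring size, must be a power of two */

/* Hypothetical stand-in for a pipe buffer: which subpage, where, how much. */
struct sketch_buf {
	size_t page_index;
	size_t offset;
	size_t len;
};

/* Hypothetical stand-in for the pipe ring. */
struct sketch_pipe {
	unsigned int head, tail, max_usage;
	struct sketch_buf bufs[SKETCH_RING_SIZE];
};

/* Occupancy is head - tail; unsigned arithmetic keeps it right across wrap. */
static unsigned int sketch_occupancy(unsigned int head, unsigned int tail)
{
	return head - tail;
}

static int sketch_full(unsigned int head, unsigned int tail, unsigned int limit)
{
	return sketch_occupancy(head, tail) >= limit;
}

/*
 * Split [offset, offset + size) of a folio of folio_size bytes into
 * page-sized parts, one ring slot per part, until the range is consumed
 * or the ring is full.  Returns the number of bytes placed in the ring.
 */
static size_t sketch_splice_folio(struct sketch_pipe *pipe, size_t folio_size,
				  size_t offset, size_t size)
{
	size_t spliced = 0;
	size_t page_index;

	assert(offset < folio_size);	/* caller passes an in-folio offset */

	page_index = offset / SKETCH_PAGE_SIZE;
	if (size > folio_size - offset)
		size = folio_size - offset;
	offset %= SKETCH_PAGE_SIZE;

	while (spliced < size &&
	       !sketch_full(pipe->head, pipe->tail, pipe->max_usage)) {
		size_t room = SKETCH_PAGE_SIZE - offset;
		size_t part = room < size - spliced ? room : size - spliced;
		struct sketch_buf *buf =
			&pipe->bufs[pipe->head & (SKETCH_RING_SIZE - 1)];

		*buf = (struct sketch_buf) {
			.page_index	= page_index,
			.offset		= offset,
			.len		= part,
		};
		pipe->head++;
		page_index++;
		spliced += part;
		offset = 0;	/* only the first part starts mid-page */
	}

	return spliced;
}
```

With a 16KiB folio, a starting offset of 5000 and a request for 10000 bytes,
the sketch fills three slots: subpage 1 from offset 904 for 3192 bytes,
subpage 2 in full, and the first 2712 bytes of subpage 3. If max_usage is
lowered to 2, it stops after the second slot and returns a short count, which
is the case the caller handles by checking pipe_full() after each folio.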