From patchwork Thu Feb 16 21:47:37 2023
X-Patchwork-Submitter: David Howells <dhowells@redhat.com>
X-Patchwork-Id: 13143994
From: David Howells <dhowells@redhat.com>
To: Steve French
Cc: David Howells <dhowells@redhat.com>, Jens Axboe, Al Viro,
    Shyam Prasad N, Rohith Surabattula, Tom Talpey, Stefan Metzmacher,
    Christoph Hellwig, Matthew Wilcox, Jeff Layton,
    linux-cifs@vger.kernel.org, linux-block@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Steve French, linux-cachefs@redhat.com
Subject: [PATCH 09/17] netfs: Add a function to extract an iterator into a
 scatterlist
Date: Thu, 16 Feb 2023 21:47:37 +0000
Message-Id: <20230216214745.3985496-10-dhowells@redhat.com>
In-Reply-To: <20230216214745.3985496-1-dhowells@redhat.com>
References: <20230216214745.3985496-1-dhowells@redhat.com>

Provide a function for filling in a scatterlist from the list of pages
contained in an iterator.  If the iterator is UBUF- or IOVEC-type, the
pages have a pin taken on them (as FOLL_PIN).  If the iterator is BVEC-,
KVEC- or XARRAY-type, no pin is taken on the pages and it is left to the
caller to manage their lifetime.

It cannot be assumed that a ref can be validly taken, particularly in the
case of a KVEC iterator.
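To illustrate the cleanup rule this implies for callers, here is an
editorial sketch (not part of the patch): the example_* name is invented,
and it assumes iov_iter_extract_will_pin() from earlier in this series and
an unchained scatterlist.

/*
 * Sketch only: release the pages after I/O completes.  Only user-backed
 * (UBUF/IOVEC) sources had a FOLL_PIN taken by the extractor; BVEC, KVEC
 * and XARRAY sources hold neither a pin nor a ref, so their lifetime is
 * the caller's problem and nothing is done here.
 */
static void example_release_sg_pages(struct iov_iter *iter,
				     struct sg_table *sgtable)
{
	unsigned int i;

	if (!iov_iter_extract_will_pin(iter))
		return;

	/* Assumes an unchained table: entries are a simple array. */
	for (i = 0; i < sgtable->nents; i++)
		unpin_user_page(sg_page(&sgtable->sgl[i]));
}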
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton
cc: Steve French
cc: Shyam Prasad N
cc: Rohith Surabattula
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/iterator.c   | 268 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/netfs.h |   4 +
 mm/vmalloc.c          |   1 +
 3 files changed, 273 insertions(+)

diff --git a/fs/netfs/iterator.c b/fs/netfs/iterator.c
index 6f0d79080abc..80d7ff440cac 100644
--- a/fs/netfs/iterator.c
+++ b/fs/netfs/iterator.c
@@ -7,7 +7,9 @@
 #include <linux/export.h>
 #include <linux/slab.h>
 #include <linux/mm.h>
+#include <linux/scatterlist.h>
 #include <linux/uio.h>
+#include <linux/vmalloc.h>
 #include <linux/netfs.h>
 #include "internal.h"
 
@@ -101,3 +103,269 @@ ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len,
 	return npages;
 }
 EXPORT_SYMBOL_GPL(netfs_extract_user_iter);
+
+/*
+ * Extract and pin a list of up to sg_max pages from UBUF- or IOVEC-class
+ * iterators, and add them to the scatterlist.
+ */
+static ssize_t netfs_extract_user_to_sg(struct iov_iter *iter,
+					ssize_t maxsize,
+					struct sg_table *sgtable,
+					unsigned int sg_max,
+					iov_iter_extraction_t extraction_flags)
+{
+	struct scatterlist *sg = sgtable->sgl + sgtable->nents;
+	struct page **pages;
+	unsigned int npages;
+	ssize_t ret = 0, res;
+	size_t len, off;
+
+	/* We decant the page list into the tail of the scatterlist */
+	pages = (void *)sgtable->sgl + array_size(sg_max, sizeof(struct scatterlist));
+	pages -= sg_max;
+
+	do {
+		res = iov_iter_extract_pages(iter, &pages, maxsize, sg_max,
+					     extraction_flags, &off);
+		if (res < 0)
+			goto failed;
+
+		len = res;
+		maxsize -= len;
+		ret += len;
+		npages = DIV_ROUND_UP(off + len, PAGE_SIZE);
+		sg_max -= npages;
+
+		for (; npages > 0; npages--) {
+			struct page *page = *pages;
+			size_t seg = min_t(size_t, PAGE_SIZE - off, len);
+
+			*pages++ = NULL;
+			sg_set_page(sg, page, seg, off);
+			sgtable->nents++;
+			sg++;
+			len -= seg;
+			off = 0;
+		}
+	} while (maxsize > 0 && sg_max > 0);
+
+	return ret;
+
+failed:
+	while (sgtable->nents > sgtable->orig_nents)
+		unpin_user_page(sg_page(&sgtable->sgl[--sgtable->nents]));
+	return res;
+}
+
+/*
+ * Extract up to sg_max pages from a BVEC-type iterator and add them to the
+ * scatterlist.  The pages are not pinned.
+ */
+static ssize_t netfs_extract_bvec_to_sg(struct iov_iter *iter,
+					ssize_t maxsize,
+					struct sg_table *sgtable,
+					unsigned int sg_max,
+					iov_iter_extraction_t extraction_flags)
+{
+	const struct bio_vec *bv = iter->bvec;
+	struct scatterlist *sg = sgtable->sgl + sgtable->nents;
+	unsigned long start = iter->iov_offset;
+	unsigned int i;
+	ssize_t ret = 0;
+
+	for (i = 0; i < iter->nr_segs; i++) {
+		size_t off, len;
+
+		len = bv[i].bv_len;
+		if (start >= len) {
+			start -= len;
+			continue;
+		}
+
+		len = min_t(size_t, maxsize, len - start);
+		off = bv[i].bv_offset + start;
+
+		sg_set_page(sg, bv[i].bv_page, len, off);
+		sgtable->nents++;
+		sg++;
+		sg_max--;
+
+		ret += len;
+		maxsize -= len;
+		if (maxsize <= 0 || sg_max == 0)
+			break;
+		start = 0;
+	}
+
+	if (ret > 0)
+		iov_iter_advance(iter, ret);
+	return ret;
+}
+
+/*
+ * Extract up to sg_max pages from a KVEC-type iterator and add them to the
+ * scatterlist.  This can deal with vmalloc'd buffers as well as kmalloc'd or
+ * static buffers.  The pages are not pinned.
+ */
+static ssize_t netfs_extract_kvec_to_sg(struct iov_iter *iter,
+					ssize_t maxsize,
+					struct sg_table *sgtable,
+					unsigned int sg_max,
+					iov_iter_extraction_t extraction_flags)
+{
+	const struct kvec *kv = iter->kvec;
+	struct scatterlist *sg = sgtable->sgl + sgtable->nents;
+	unsigned long start = iter->iov_offset;
+	unsigned int i;
+	ssize_t ret = 0;
+
+	for (i = 0; i < iter->nr_segs; i++) {
+		struct page *page;
+		unsigned long kaddr;
+		size_t off, len, seg;
+
+		len = kv[i].iov_len;
+		if (start >= len) {
+			start -= len;
+			continue;
+		}
+
+		kaddr = (unsigned long)kv[i].iov_base + start;
+		off = kaddr & ~PAGE_MASK;
+		len = min_t(size_t, maxsize, len - start);
+		kaddr &= PAGE_MASK;
+
+		maxsize -= len;
+		ret += len;
+		do {
+			seg = min_t(size_t, len, PAGE_SIZE - off);
+			if (is_vmalloc_or_module_addr((void *)kaddr))
+				page = vmalloc_to_page((void *)kaddr);
+			else
+				page = virt_to_page(kaddr);
+
+			sg_set_page(sg, page, seg, off);
+			sgtable->nents++;
+			sg++;
+			sg_max--;
+
+			len -= seg;
+			kaddr += PAGE_SIZE;
+			off = 0;
+		} while (len > 0 && sg_max > 0);
+
+		if (maxsize <= 0 || sg_max == 0)
+			break;
+		start = 0;
+	}
+
+	if (ret > 0)
+		iov_iter_advance(iter, ret);
+	return ret;
+}
+
+/*
+ * Extract up to sg_max folios from an XARRAY-type iterator and add them to
+ * the scatterlist.  The pages are not pinned.
+ */
+static ssize_t netfs_extract_xarray_to_sg(struct iov_iter *iter,
+					  ssize_t maxsize,
+					  struct sg_table *sgtable,
+					  unsigned int sg_max,
+					  iov_iter_extraction_t extraction_flags)
+{
+	struct scatterlist *sg = sgtable->sgl + sgtable->nents;
+	struct xarray *xa = iter->xarray;
+	struct folio *folio;
+	loff_t start = iter->xarray_start + iter->iov_offset;
+	pgoff_t index = start / PAGE_SIZE;
+	ssize_t ret = 0;
+	size_t offset, len;
+	XA_STATE(xas, xa, index);
+
+	rcu_read_lock();
+
+	xas_for_each(&xas, folio, ULONG_MAX) {
+		if (xas_retry(&xas, folio))
+			continue;
+		if (WARN_ON(xa_is_value(folio)))
+			break;
+		if (WARN_ON(folio_test_hugetlb(folio)))
+			break;
+
+		offset = offset_in_folio(folio, start);
+		len = min_t(size_t, maxsize, folio_size(folio) - offset);
+
+		sg_set_page(sg, folio_page(folio, 0), len, offset);
+		sgtable->nents++;
+		sg++;
+		sg_max--;
+
+		maxsize -= len;
+		ret += len;
+		if (maxsize <= 0 || sg_max == 0)
+			break;
+	}
+
+	rcu_read_unlock();
+	if (ret > 0)
+		iov_iter_advance(iter, ret);
+	return ret;
+}
+
+/**
+ * netfs_extract_iter_to_sg - Extract pages from an iterator and add to an sglist
+ * @iter: The iterator to extract from
+ * @maxsize: The amount of iterator to copy
+ * @sgtable: The scatterlist table to fill in
+ * @sg_max: Maximum number of elements in @sgtable that may be filled
+ * @extraction_flags: Flags to qualify the request
+ *
+ * Extract the page fragments from the given amount of the source iterator and
+ * add them to a scatterlist that refers to all of those bits, to a maximum
+ * addition of @sg_max elements.
+ *
+ * The pages referred to by UBUF- and IOVEC-type iterators are extracted and
+ * pinned; BVEC-, KVEC- and XARRAY-type are extracted but aren't pinned; PIPE-
+ * and DISCARD-type are not supported.
+ *
+ * No end mark is placed on the scatterlist; that's left to the caller.
+ *
+ * @extraction_flags can have ITER_ALLOW_P2PDMA set to request peer-to-peer DMA
+ * be allowed on the pages extracted.
+ *
+ * If successful, @sgtable->nents is updated to include the number of elements
+ * added and the number of bytes added is returned.  @sgtable->orig_nents is
+ * left unaltered.
+ *
+ * The iov_iter_extract_mode() function should be used to query how cleanup
+ * should be performed.
+ */
+ssize_t netfs_extract_iter_to_sg(struct iov_iter *iter, size_t maxsize,
+				 struct sg_table *sgtable, unsigned int sg_max,
+				 iov_iter_extraction_t extraction_flags)
+{
+	if (maxsize == 0)
+		return 0;
+
+	switch (iov_iter_type(iter)) {
+	case ITER_UBUF:
+	case ITER_IOVEC:
+		return netfs_extract_user_to_sg(iter, maxsize, sgtable, sg_max,
+						extraction_flags);
+	case ITER_BVEC:
+		return netfs_extract_bvec_to_sg(iter, maxsize, sgtable, sg_max,
+						extraction_flags);
+	case ITER_KVEC:
+		return netfs_extract_kvec_to_sg(iter, maxsize, sgtable, sg_max,
+						extraction_flags);
+	case ITER_XARRAY:
+		return netfs_extract_xarray_to_sg(iter, maxsize, sgtable, sg_max,
+						  extraction_flags);
+	default:
+		pr_err("%s(%u) unsupported\n", __func__, iov_iter_type(iter));
+		WARN_ON_ONCE(1);
+		return -EIO;
+	}
+}
+EXPORT_SYMBOL_GPL(netfs_extract_iter_to_sg);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index b11a84f6c32b..a1f3522daa69 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -300,6 +300,10 @@ void netfs_stats_show(struct seq_file *);
 ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len,
 				struct iov_iter *new,
 				iov_iter_extraction_t extraction_flags);
+struct sg_table;
+ssize_t netfs_extract_iter_to_sg(struct iov_iter *iter, size_t len,
+				 struct sg_table *sgtable, unsigned int sg_max,
+				 iov_iter_extraction_t extraction_flags);
 
 /**
  * netfs_inode - Get the netfs inode context from the inode
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ca71de7c9d77..61f5bec0f2b6 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -656,6 +656,7 @@ int is_vmalloc_or_module_addr(const void *x)
 #endif
 	return is_vmalloc_addr(x);
 }
+EXPORT_SYMBOL_GPL(is_vmalloc_or_module_addr);
 
 /*
  * Walk a vmap address to the struct page it maps. Huge vmap mappings will
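
For reference, a minimal caller of the new API might look like the
following hypothetical sketch (not taken from the series; example_iter_to_sg
is an invented name).  It assumes the caller really allocated @sg_max
scatterlist entries, since netfs_extract_user_to_sg() borrows the tail of
that same array as scratch space for the extracted page pointers.

#include <linux/netfs.h>
#include <linux/scatterlist.h>
#include <linux/uio.h>

/* Hypothetical caller: describe the first @len bytes of @iter in @sgl. */
static ssize_t example_iter_to_sg(struct iov_iter *iter, size_t len,
				  struct scatterlist *sgl, unsigned int sg_max)
{
	struct sg_table sgtable = { .sgl = sgl };
	ssize_t ret;

	sg_init_table(sgl, sg_max);

	/* Fills sgtable.nents entries; pins pages for UBUF/IOVEC sources. */
	ret = netfs_extract_iter_to_sg(iter, len, &sgtable, sg_max, 0);
	if (ret <= 0)
		return ret;

	/* The extractor places no end mark; that is left to the caller. */
	sg_mark_end(&sgtable.sgl[sgtable.nents - 1]);
	return ret;
}

Once the I/O completes, the pages would be released as described in the
commit message: unpinned for user-backed iterators, left alone otherwise.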