From patchwork Tue Feb 7 17:12:56 2023
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13131879
From: David Howells
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, syzbot+a440341a59e3b7142895@syzkaller.appspotmail.com, Christoph Hellwig, John Hubbard
Subject: [PATCH v12 01/10] vfs, iomap: Fix generic_file_splice_read() to avoid reversion of ITER_PIPE
Date: Tue, 7 Feb 2023 17:12:56 +0000
Message-Id: <20230207171305.3716974-2-dhowells@redhat.com>
In-Reply-To: <20230207171305.3716974-1-dhowells@redhat.com>
References: <20230207171305.3716974-1-dhowells@redhat.com>

With the new iov_iter_extract_pages() function (added in a later patch),
pages extracted from a non-user-backed iterator, such as ITER_PIPE, aren't
pinned.
__iomap_dio_rw(), however, calls iov_iter_revert() to shorten the
iterator to just the data it is going to use - which causes the pipe
buffers to be freed, even though they're attached to a bio and may get
written to by DMA (thanks to Hillf Danton for spotting this[1]).

This then causes massive memory corruption that is particularly noticeable
when the syzbot test[2] is run.  The test boils down to:

	out = creat(argv[1], 0666);
	ftruncate(out, 0x800);
	lseek(out, 0x200, SEEK_SET);
	in = open(argv[1], O_RDONLY | O_DIRECT | O_NOFOLLOW);
	sendfile(out, in, NULL, 0x1dd00);

run repeatedly in parallel.  What I think is happening is that ftruncate()
occasionally shortens the DIO read that's about to be made by sendfile's
splice core by reducing i_size.

Fix this by replacing the use of an ITER_PIPE iterator with an ITER_BVEC
iterator for which reversion won't free the buffers.  Bulk allocate all the
buffers we think we're going to use in advance, do the read synchronously
and only then trim the buffer down.  The pages we did use get pushed into
the pipe.

This is more efficient by virtue of doing a bulk page allocation, but
slightly less efficient by ignoring any partial page in the pipe.

Note that this removes the only user of ITER_PIPE.
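
The sizing and trimming arithmetic described above can be sketched in
userspace C.  This is only an illustration, not kernel code: the names
sketch_plan_pages(), sketch_chunk() and sketch_pages_kept() and the
SKETCH_PAGE_SIZE constant are hypothetical, and a 4096-byte page is assumed.
It shows how the request is capped to the free pipe slots, how each bvec
segment is sized, and how many pages stay in use once the read has returned.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for this sketch; the kernel code uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Cap *len to what the free pipe slots can hold and return the number of
 * pages needed for the capped length (hypothetical helper).
 */
static size_t sketch_plan_pages(size_t *len, size_t max_usage, size_t used)
{
	size_t free_slots = max_usage > used ? max_usage - used : 0;

	if (*len > free_slots * SKETCH_PAGE_SIZE)
		*len = free_slots * SKETCH_PAGE_SIZE;
	return (*len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Size of the next segment when 'remain' bytes are still to be covered. */
static size_t sketch_chunk(size_t remain)
{
	return remain < SKETCH_PAGE_SIZE ? remain : SKETCH_PAGE_SIZE;
}

/*
 * After a read that returned 'ret', work out how many of the 'npages'
 * preallocated pages still hold data; wholly untouched pages are dropped
 * from the end, as in the "free any pages that didn't get touched" loop.
 */
static size_t sketch_pages_kept(size_t npages, long ret)
{
	size_t reclaim = npages * SKETCH_PAGE_SIZE;

	if (ret > 0) {
		/* A read cannot return more than the buffers can hold. */
		assert((size_t)ret <= reclaim);
		reclaim -= (size_t)ret;
	}
	for (; reclaim >= SKETCH_PAGE_SIZE; reclaim -= SKETCH_PAGE_SIZE)
		npages--;
	return npages;
}
```

For instance, with an empty 16-slot pipe and the 0x1dd00-byte request from
the reproducer, the plan is capped to 16 pages; if the read then returns
only 5000 bytes, two pages are kept (4096 + 904 bytes) and the other 14 are
freed.  A zero or negative return keeps no pages at all.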

Fixes: 920756a3306a ("block: Convert bio_iov_iter_get_pages to use iov_iter_extract_pages")
Reported-by: syzbot+a440341a59e3b7142895@syzkaller.appspotmail.com
Signed-off-by: David Howells
cc: Jens Axboe
cc: Christoph Hellwig
cc: Al Viro
cc: David Hildenbrand
cc: John Hubbard
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20230207094731.1390-1-hdanton@sina.com/ [1]
Link: https://lore.kernel.org/r/000000000000b0b3c005f3a09383@google.com/ [2]
Signed-off-by: David Howells
---
 fs/splice.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 68 insertions(+), 8 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 5969b7a1d353..51778437f31f 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -295,24 +295,62 @@ void splice_shrink_spd(struct splice_pipe_desc *spd)
  * used as long as it has more or less sane ->read_iter().
  *
  */
-ssize_t generic_file_splice_read(struct file *in, loff_t *ppos,
+ssize_t generic_file_splice_read(struct file *file, loff_t *ppos,
 				 struct pipe_inode_info *pipe, size_t len,
 				 unsigned int flags)
 {
+	LIST_HEAD(pages);
 	struct iov_iter to;
+	struct bio_vec *bv;
 	struct kiocb kiocb;
-	int ret;
+	struct page *page;
+	unsigned int head;
+	ssize_t ret;
+	size_t used, npages, chunk, remain, reclaim;
+	int i;
+
+	/* Work out how much data we can actually add into the pipe */
+	used = pipe_occupancy(pipe->head, pipe->tail);
+	npages = max_t(ssize_t, pipe->max_usage - used, 0);
+	len = min_t(size_t, len, npages * PAGE_SIZE);
+	npages = DIV_ROUND_UP(len, PAGE_SIZE);
+
+	bv = kmalloc(array_size(npages, sizeof(bv[0])), GFP_KERNEL);
+	if (!bv)
+		return -ENOMEM;
+
+	npages = alloc_pages_bulk_list(GFP_USER, npages, &pages);
+	if (!npages) {
+		kfree(bv);
+		return -ENOMEM;
+	}
 
-	iov_iter_pipe(&to, ITER_DEST, pipe, len);
-	init_sync_kiocb(&kiocb, in);
+	remain = len = min_t(size_t, len, npages * PAGE_SIZE);
+
+	for (i = 0; i < npages; i++) {
+		chunk = min_t(size_t, PAGE_SIZE, remain);
+		page = list_first_entry(&pages, struct page, lru);
+		list_del_init(&page->lru);
+		bv[i].bv_page = page;
+		bv[i].bv_offset = 0;
+		bv[i].bv_len = chunk;
+		remain -= chunk;
+	}
+
+	/* Do the I/O */
+	iov_iter_bvec(&to, ITER_DEST, bv, npages, len);
+	init_sync_kiocb(&kiocb, file);
 	kiocb.ki_pos = *ppos;
-	ret = call_read_iter(in, &kiocb, &to);
+	ret = call_read_iter(file, &kiocb, &to);
+
+	reclaim = npages * PAGE_SIZE;
+	remain = 0;
 	if (ret > 0) {
+		reclaim -= ret;
+		remain = ret;
 		*ppos = kiocb.ki_pos;
-		file_accessed(in);
+		file_accessed(file);
 	} else if (ret < 0) {
-		/* free what was emitted */
-		pipe_discard_from(pipe, to.start_head);
 		/*
 		 * callers of ->splice_read() expect -EAGAIN on
 		 * "can't put anything in there", rather than -EFAULT.
@@ -321,6 +359,28 @@ ssize_t generic_file_splice_read(struct file *in, loff_t *ppos,
 			ret = -EAGAIN;
 	}
 
+	/* Free any pages that didn't get touched at all. */
+	for (; reclaim >= PAGE_SIZE; reclaim -= PAGE_SIZE)
+		__free_page(bv[--npages].bv_page);
+
+	/* Push the remaining pages into the pipe. */
+	head = pipe->head;
+	for (i = 0; i < npages; i++) {
+		struct pipe_buffer *buf = &pipe->bufs[head & (pipe->ring_size - 1)];
+
+		chunk = min_t(size_t, remain, PAGE_SIZE);
+		*buf = (struct pipe_buffer) {
+			.ops	= &default_pipe_buf_ops,
+			.page	= bv[i].bv_page,
+			.offset	= 0,
+			.len	= chunk,
+		};
+		head++;
+		remain -= chunk;
+	}
+	pipe->head = head;
+
+	kfree(bv);
 	return ret;
 }
 EXPORT_SYMBOL(generic_file_splice_read);

From patchwork Tue Feb 7 17:12:57 2023
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13131880
From: David Howells
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig, John Hubbard
Subject: [PATCH v12 02/10] iov_iter: Kill ITER_PIPE
Date: Tue, 7 Feb 2023 17:12:57 +0000
Message-Id: <20230207171305.3716974-3-dhowells@redhat.com>
In-Reply-To: <20230207171305.3716974-1-dhowells@redhat.com>
References: <20230207171305.3716974-1-dhowells@redhat.com>

The ITER_PIPE-type iterator was only used for generic_file_splice_read(),
but that has now been switched to using ITER_BVEC instead, leaving
ITER_PIPE unused - so get rid of it.

Signed-off-by: David Howells
cc: Jens Axboe
cc: Christoph Hellwig
cc: Al Viro
cc: David Hildenbrand
cc: John Hubbard
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 fs/cifs/file.c      |   8 +-
 include/linux/uio.h |  14 --
 lib/iov_iter.c      | 435 +-------------------------------------------
 mm/filemap.c        |   3 -
 4 files changed, 4 insertions(+), 456 deletions(-)

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 22dfc1f8b4f1..57ca4eea69dd 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -3806,13 +3806,7 @@ cifs_readdata_to_iov(struct cifs_readdata *rdata, struct iov_iter *iter)
 		size_t copy = min_t(size_t, remaining, PAGE_SIZE);
 		size_t written;
 
-		if (unlikely(iov_iter_is_pipe(iter))) {
-			void *addr = kmap_atomic(page);
-
-			written = copy_to_iter(addr, copy, iter);
-			kunmap_atomic(addr);
-		} else
-			written = copy_page_to_iter(page, 0, copy, iter);
+		written = copy_page_to_iter(page, 0, copy, iter);
 		remaining -= written;
 		if (written < copy && iov_iter_count(iter) > 0)
 			break;
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 9f158238edba..dcc0ca5ef491 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -11,7 +11,6 @@
 #include
 
 struct page;
-struct pipe_inode_info;
 
 struct kvec {
 	void *iov_base; /* and that should *never* hold a userland pointer */
@@ -23,7 +22,6 @@ enum iter_type {
 	ITER_IOVEC,
 	ITER_KVEC,
 	ITER_BVEC,
-	ITER_PIPE,
 	ITER_XARRAY,
 	ITER_DISCARD,
 	ITER_UBUF,
@@ -53,15 +51,10 @@ struct iov_iter {
 		const struct kvec *kvec;
 		const struct bio_vec *bvec;
 		struct xarray *xarray;
-		struct pipe_inode_info *pipe;
 		void __user *ubuf;
 	};
 	union {
 		unsigned long nr_segs;
-		struct {
-			unsigned int head;
-			unsigned int start_head;
-		};
 		loff_t xarray_start;
 	};
 };
@@ -99,11 +92,6 @@ static inline bool iov_iter_is_bvec(const struct iov_iter *i)
 	return iov_iter_type(i) == ITER_BVEC;
 }
 
-static inline bool iov_iter_is_pipe(const struct iov_iter *i)
-{
-	return iov_iter_type(i) == ITER_PIPE;
-}
-
 static inline bool iov_iter_is_discard(const struct iov_iter *i)
 {
 	return iov_iter_type(i) == ITER_DISCARD;
@@ -245,8 +233,6 @@ void iov_iter_kvec(struct iov_iter *i, unsigned int direction, const struct kvec
 			unsigned long nr_segs, size_t count);
 void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_vec *bvec,
 			unsigned long nr_segs, size_t count);
-void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe,
-			size_t count);
 void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count);
 void iov_iter_xarray(struct iov_iter *i, unsigned int direction, struct xarray *xarray,
 		     loff_t start, size_t count);
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index f9a3ff37ecd1..adc5e8aa8ae8 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -14,8 +14,6 @@
 #include
 #include
 
-#define PIPE_PARANOIA /* for now */
-
 /* covers ubuf and kbuf alike */
 #define iterate_buf(i, n, base, len, off, __p, STEP) {		\
 	size_t __maybe_unused off = 0;				\
@@ -186,156 +184,6 @@ static int copyin(void *to, const void __user *from, size_t n)
 	return res;
 }
 
-static inline struct pipe_buffer *pipe_buf(const struct pipe_inode_info *pipe,
-					   unsigned int slot)
-{
-	return &pipe->bufs[slot & (pipe->ring_size - 1)];
-}
-
-#ifdef PIPE_PARANOIA
-static bool sanity(const struct iov_iter *i)
-{
-	struct pipe_inode_info *pipe = i->pipe;
-	unsigned int p_head = pipe->head;
-	unsigned int p_tail = pipe->tail;
-	unsigned int p_occupancy = pipe_occupancy(p_head, p_tail);
-	unsigned int i_head = i->head;
-	unsigned int idx;
-
-	if (i->last_offset) {
-		struct pipe_buffer *p;
-		if (unlikely(p_occupancy == 0))
-			goto Bad; // pipe must be non-empty
-		if (unlikely(i_head != p_head - 1))
-			goto Bad; // must be at the last buffer...
-
-		p = pipe_buf(pipe, i_head);
-		if (unlikely(p->offset + p->len != abs(i->last_offset)))
-			goto Bad; // ... at the end of segment
-	} else {
-		if (i_head != p_head)
-			goto Bad; // must be right after the last buffer
-	}
-	return true;
-Bad:
-	printk(KERN_ERR "idx = %d, offset = %d\n", i_head, i->last_offset);
-	printk(KERN_ERR "head = %d, tail = %d, buffers = %d\n",
-			p_head, p_tail, pipe->ring_size);
-	for (idx = 0; idx < pipe->ring_size; idx++)
-		printk(KERN_ERR "[%p %p %d %d]\n",
-			pipe->bufs[idx].ops,
-			pipe->bufs[idx].page,
-			pipe->bufs[idx].offset,
-			pipe->bufs[idx].len);
-	WARN_ON(1);
-	return false;
-}
-#else
-#define sanity(i) true
-#endif
-
-static struct page *push_anon(struct pipe_inode_info *pipe, unsigned size)
-{
-	struct page *page = alloc_page(GFP_USER);
-	if (page) {
-		struct pipe_buffer *buf = pipe_buf(pipe, pipe->head++);
-		*buf = (struct pipe_buffer) {
-			.ops = &default_pipe_buf_ops,
-			.page = page,
-			.offset = 0,
-			.len = size
-		};
-	}
-	return page;
-}
-
-static void push_page(struct pipe_inode_info *pipe, struct page *page,
-			unsigned int offset, unsigned int size)
-{
-	struct pipe_buffer *buf = pipe_buf(pipe, pipe->head++);
-	*buf = (struct pipe_buffer) {
-		.ops = &page_cache_pipe_buf_ops,
-		.page = page,
-		.offset = offset,
-		.len = size
-	};
-	get_page(page);
-}
-
-static inline int last_offset(const struct pipe_buffer *buf)
-{
-	if (buf->ops == &default_pipe_buf_ops)
-		return buf->len; // buf->offset is 0 for those
-	else
-		return -(buf->offset + buf->len);
-}
-
-static struct page *append_pipe(struct iov_iter *i, size_t size,
-				unsigned int *off)
-{
-	struct pipe_inode_info *pipe = i->pipe;
-	int offset = i->last_offset;
-	struct pipe_buffer *buf;
-	struct page *page;
-
-	if (offset > 0 && offset < PAGE_SIZE) {
-		// some space in the last buffer; add to it
-		buf = pipe_buf(pipe, pipe->head - 1);
-		size = min_t(size_t, size, PAGE_SIZE - offset);
-		buf->len += size;
-		i->last_offset += size;
-		i->count -= size;
-		*off = offset;
-		return buf->page;
-	}
-	// OK, we need a new buffer
-	*off = 0;
-	size = min_t(size_t, size, PAGE_SIZE);
-	if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
-		return NULL;
-	page = push_anon(pipe, size);
-	if (!page)
-		return NULL;
-	i->head = pipe->head - 1;
-	i->last_offset = size;
-	i->count -= size;
-	return page;
-}
-
-static size_t copy_page_to_iter_pipe(struct page *page, size_t offset, size_t bytes,
-			 struct iov_iter *i)
-{
-	struct pipe_inode_info *pipe = i->pipe;
-	unsigned int head = pipe->head;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-
-	if (unlikely(!bytes))
-		return 0;
-
-	if (!sanity(i))
-		return 0;
-
-	if (offset && i->last_offset == -offset) { // could we merge it?
-		struct pipe_buffer *buf = pipe_buf(pipe, head - 1);
-		if (buf->page == page) {
-			buf->len += bytes;
-			i->last_offset -= bytes;
-			i->count -= bytes;
-			return bytes;
-		}
-	}
-	if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
-		return 0;
-
-	push_page(pipe, page, offset, bytes);
-	i->last_offset = -(offset + bytes);
-	i->head = head;
-	i->count -= bytes;
-	return bytes;
-}
-
 /*
  * fault_in_iov_iter_readable - fault in iov iterator for reading
  * @i: iterator
@@ -439,46 +287,6 @@ void iov_iter_init(struct iov_iter *i, unsigned int direction,
 }
 EXPORT_SYMBOL(iov_iter_init);
 
-// returns the offset in partial buffer (if any)
-static inline unsigned int pipe_npages(const struct iov_iter *i, int *npages)
-{
-	struct pipe_inode_info *pipe = i->pipe;
-	int used = pipe->head - pipe->tail;
-	int off = i->last_offset;
-
-	*npages = max((int)pipe->max_usage - used, 0);
-
-	if (off > 0 && off < PAGE_SIZE) { // anon and not full
-		(*npages)++;
-		return off;
-	}
-	return 0;
-}
-
-static size_t copy_pipe_to_iter(const void *addr, size_t bytes,
-				struct iov_iter *i)
-{
-	unsigned int off, chunk;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-	if (unlikely(!bytes))
-		return 0;
-
-	if (!sanity(i))
-		return 0;
-
-	for (size_t n = bytes; n; n -= chunk) {
-		struct page *page = append_pipe(i, n, &off);
-		chunk = min_t(size_t, n, PAGE_SIZE - off);
-		if (!page)
-			return bytes - n;
-		memcpy_to_page(page, off, addr, chunk);
-		addr += chunk;
-	}
-	return bytes;
-}
-
 static __wsum csum_and_memcpy(void *to, const void *from, size_t len,
 			      __wsum sum, size_t off)
 {
@@ -486,44 +294,10 @@ static __wsum csum_and_memcpy(void *to, const void *from, size_t len,
 	return csum_block_add(sum, next, off);
 }
 
-static size_t csum_and_copy_to_pipe_iter(const void *addr, size_t bytes,
-					 struct iov_iter *i, __wsum *sump)
-{
-	__wsum sum = *sump;
-	size_t off = 0;
-	unsigned int chunk, r;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-	if (unlikely(!bytes))
-		return 0;
-
-	if (!sanity(i))
-		return 0;
-
-	while (bytes) {
-		struct page *page = append_pipe(i, bytes, &r);
-		char *p;
-
-		if (!page)
-			break;
-		chunk = min_t(size_t, bytes, PAGE_SIZE - r);
-		p = kmap_local_page(page);
-		sum = csum_and_memcpy(p + r, addr + off, chunk, sum, off);
-		kunmap_local(p);
-		off += chunk;
-		bytes -= chunk;
-	}
-	*sump = sum;
-	return off;
-}
-
 size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 {
 	if (WARN_ON_ONCE(i->data_source))
 		return 0;
-	if (unlikely(iov_iter_is_pipe(i)))
-		return copy_pipe_to_iter(addr, bytes, i);
 	if (user_backed_iter(i))
 		might_fault();
 	iterate_and_advance(i, bytes, base, len, off,
@@ -545,42 +319,6 @@ static int copyout_mc(void __user *to, const void *from, size_t n)
 	return n;
 }
 
-static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
-				struct iov_iter *i)
-{
-	size_t xfer = 0;
-	unsigned int off, chunk;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-	if (unlikely(!bytes))
-		return 0;
-
-	if (!sanity(i))
-		return 0;
-
-	while (bytes) {
-		struct page *page = append_pipe(i, bytes, &off);
-		unsigned long rem;
-		char *p;
-
-		if (!page)
-			break;
-		chunk = min_t(size_t, bytes, PAGE_SIZE - off);
-		p = kmap_local_page(page);
-		rem = copy_mc_to_kernel(p + off, addr + xfer, chunk);
-		chunk -= rem;
-		kunmap_local(p);
-		xfer += chunk;
-		bytes -= chunk;
-		if (rem) {
-			iov_iter_revert(i, rem);
-			break;
-		}
-	}
-	return xfer;
-}
-
 /**
  * _copy_mc_to_iter - copy to iter with source memory error exception handling
  * @addr: source kernel address
@@ -600,9 +338,8 @@ static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
  *   alignment and poison alignment assumptions to avoid re-triggering
  *   hardware exceptions.
  *
- * * ITER_KVEC, ITER_PIPE, and ITER_BVEC can return short copies.
- *   Compare to copy_to_iter() where only ITER_IOVEC attempts might return
- *   a short copy.
+ * * ITER_KVEC and ITER_BVEC can return short copies.  Compare to
+ *   copy_to_iter() where only ITER_IOVEC attempts might return a short copy.
  *
  * Return: number of bytes copied (may be %0)
  */
@@ -610,8 +347,6 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 {
 	if (WARN_ON_ONCE(i->data_source))
 		return 0;
-	if (unlikely(iov_iter_is_pipe(i)))
-		return copy_mc_pipe_to_iter(addr, bytes, i);
 	if (user_backed_iter(i))
 		might_fault();
 	__iterate_and_advance(i, bytes, base, len, off,
@@ -717,8 +452,6 @@ size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
 		return 0;
 	if (WARN_ON_ONCE(i->data_source))
 		return 0;
-	if (unlikely(iov_iter_is_pipe(i)))
-		return copy_page_to_iter_pipe(page, offset, bytes, i);
 	page += offset / PAGE_SIZE; // first subpage
 	offset %= PAGE_SIZE;
 	while (1) {
@@ -767,36 +500,8 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
 }
 EXPORT_SYMBOL(copy_page_from_iter);
 
-static size_t pipe_zero(size_t bytes, struct iov_iter *i)
-{
-	unsigned int chunk, off;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-	if (unlikely(!bytes))
-		return 0;
-
-	if (!sanity(i))
-		return 0;
-
-	for (size_t n = bytes; n; n -= chunk) {
-		struct page *page = append_pipe(i, n, &off);
-		char *p;
-
-		if (!page)
-			return bytes - n;
-		chunk = min_t(size_t, n, PAGE_SIZE - off);
-		p = kmap_local_page(page);
-		memset(p + off, 0, chunk);
-		kunmap_local(p);
-	}
-	return bytes;
-}
-
 size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
 {
-	if (unlikely(iov_iter_is_pipe(i)))
-		return pipe_zero(bytes, i);
 	iterate_and_advance(i, bytes, base, len, count,
 		clear_user(base, len),
 		memset(base, 0, len)
@@ -827,32 +532,6 @@ size_t copy_page_from_iter_atomic(struct page *page, unsigned offset, size_t byt
 }
 EXPORT_SYMBOL(copy_page_from_iter_atomic);
 
-static void pipe_advance(struct iov_iter *i, size_t size)
-{
-	struct pipe_inode_info *pipe = i->pipe;
-	int off = i->last_offset;
-
-	if (!off && !size) {
-		pipe_discard_from(pipe, i->start_head); // discard everything
-		return;
-	}
-	i->count -= size;
-	while (1) {
-		struct pipe_buffer *buf = pipe_buf(pipe, i->head);
-		if (off) /* make it relative to the beginning of buffer */
-			size += abs(off) - buf->offset;
-		if (size <= buf->len) {
-			buf->len = size;
-			i->last_offset = last_offset(buf);
-			break;
-		}
-		size -= buf->len;
-		i->head++;
-		off = 0;
-	}
-	pipe_discard_from(pipe, i->head + 1); // discard everything past this one
-}
-
 static void iov_iter_bvec_advance(struct iov_iter *i, size_t size)
 {
 	const struct bio_vec *bvec, *end;
@@ -904,8 +583,6 @@ void iov_iter_advance(struct iov_iter *i, size_t size)
 		iov_iter_iovec_advance(i, size);
 	} else if (iov_iter_is_bvec(i)) {
 		iov_iter_bvec_advance(i, size);
-	} else if (iov_iter_is_pipe(i)) {
-		pipe_advance(i, size);
 	} else if (iov_iter_is_discard(i)) {
 		i->count -= size;
 	}
@@ -919,26 +596,6 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll)
 	if (WARN_ON(unroll > MAX_RW_COUNT))
 		return;
 	i->count += unroll;
-	if (unlikely(iov_iter_is_pipe(i))) {
-		struct pipe_inode_info *pipe = i->pipe;
-		unsigned int head = pipe->head;
-
-		while (head > i->start_head) {
-			struct pipe_buffer *b = pipe_buf(pipe, --head);
-			if (unroll < b->len) {
-				b->len -= unroll;
-				i->last_offset = last_offset(b);
-				i->head = head;
-				return;
-			}
-			unroll -= b->len;
-			pipe_buf_release(pipe, b);
-			pipe->head--;
-		}
-		i->last_offset = 0;
-		i->head = head;
-		return;
-	}
 	if (unlikely(iov_iter_is_discard(i)))
 		return;
 	if (unroll <= i->iov_offset) {
@@ -1026,24 +683,6 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction,
 }
 EXPORT_SYMBOL(iov_iter_bvec);
 
-void iov_iter_pipe(struct iov_iter *i, unsigned int direction,
-			struct pipe_inode_info *pipe,
-			size_t count)
-{
-	BUG_ON(direction != READ);
-	WARN_ON(pipe_full(pipe->head, pipe->tail, pipe->ring_size));
-	*i = (struct iov_iter){
-		.iter_type = ITER_PIPE,
-		.data_source = false,
-		.pipe = pipe,
-		.head = pipe->head,
-		.start_head = pipe->head,
-		.last_offset = 0,
-		.count = count
-	};
-}
-EXPORT_SYMBOL(iov_iter_pipe);
-
 /**
  * iov_iter_xarray - Initialise an I/O iterator to use the pages in an xarray
  * @i: The iterator to initialise.
@@ -1168,19 +807,6 @@ bool iov_iter_is_aligned(const struct iov_iter *i, unsigned addr_mask,
 	if (iov_iter_is_bvec(i))
 		return iov_iter_aligned_bvec(i, addr_mask, len_mask);
 
-	if (iov_iter_is_pipe(i)) {
-		size_t size = i->count;
-
-		if (size & len_mask)
-			return false;
-		if (size && i->last_offset > 0) {
-			if (i->last_offset & addr_mask)
-				return false;
-		}
-
-		return true;
-	}
-
 	if (iov_iter_is_xarray(i)) {
 		if (i->count & len_mask)
 			return false;
@@ -1250,14 +876,6 @@ unsigned long iov_iter_alignment(const struct iov_iter *i)
 	if (iov_iter_is_bvec(i))
 		return iov_iter_alignment_bvec(i);
 
-	if (iov_iter_is_pipe(i)) {
-		size_t size = i->count;
-
-		if (size && i->last_offset > 0)
-			return size | i->last_offset;
-		return size;
-	}
-
 	if (iov_iter_is_xarray(i))
 		return (i->xarray_start + i->iov_offset) | i->count;
 
@@ -1309,36 +927,6 @@ static int want_pages_array(struct page ***res, size_t size,
 	return count;
 }
 
-static ssize_t pipe_get_pages(struct iov_iter *i,
-		   struct page ***pages, size_t maxsize, unsigned maxpages,
-		   size_t *start)
-{
-	unsigned int npages, count, off, chunk;
-	struct page **p;
-	size_t left;
-
-	if (!sanity(i))
-		return -EFAULT;
-
-	*start = off = pipe_npages(i, &npages);
-	if (!npages)
-		return -EFAULT;
-	count = want_pages_array(pages, maxsize, off, min(npages, maxpages));
-	if (!count)
-		return -ENOMEM;
-	p =
*pages; - for (npages = 0, left = maxsize ; npages < count; npages++, left -= chunk) { - struct page *page = append_pipe(i, left, &off); - if (!page) - break; - chunk = min_t(size_t, left, PAGE_SIZE - off); - get_page(*p++ = page); - } - if (!npages) - return -EFAULT; - return maxsize - left; -} - static ssize_t iter_xarray_populate_pages(struct page **pages, struct xarray *xa, pgoff_t index, unsigned int nr_pages) { @@ -1486,8 +1074,6 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i, } return maxsize; } - if (iov_iter_is_pipe(i)) - return pipe_get_pages(i, pages, maxsize, maxpages, start); if (iov_iter_is_xarray(i)) return iter_xarray_get_pages(i, pages, maxsize, maxpages, start); return -EFAULT; @@ -1577,9 +1163,7 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *_csstate, } sum = csum_shift(csstate->csum, csstate->off); - if (unlikely(iov_iter_is_pipe(i))) - bytes = csum_and_copy_to_pipe_iter(addr, bytes, i, &sum); - else iterate_and_advance(i, bytes, base, len, off, ({ + iterate_and_advance(i, bytes, base, len, off, ({ next = csum_and_copy_to_user(addr + off, base, len); sum = csum_block_add(sum, next, off); next ? 
0 : len; @@ -1664,15 +1248,6 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages) return iov_npages(i, maxpages); if (iov_iter_is_bvec(i)) return bvec_npages(i, maxpages); - if (iov_iter_is_pipe(i)) { - int npages; - - if (!sanity(i)) - return 0; - - pipe_npages(i, &npages); - return min(npages, maxpages); - } if (iov_iter_is_xarray(i)) { unsigned offset = (i->xarray_start + i->iov_offset) % PAGE_SIZE; int npages = DIV_ROUND_UP(offset + i->count, PAGE_SIZE); @@ -1685,10 +1260,6 @@ EXPORT_SYMBOL(iov_iter_npages); const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags) { *new = *old; - if (unlikely(iov_iter_is_pipe(new))) { - WARN_ON(1); - return NULL; - } if (iov_iter_is_bvec(new)) return new->bvec = kmemdup(new->bvec, new->nr_segs * sizeof(struct bio_vec), diff --git a/mm/filemap.c b/mm/filemap.c index c4d4ace9cc70..f72e4875bfcb 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2446,9 +2446,6 @@ static bool filemap_range_uptodate(struct address_space *mapping, if (folio_test_uptodate(folio)) return true; - /* pipes can't handle partially uptodate pages */ - if (iov_iter_is_pipe(iter)) - return false; if (!mapping->a_ops->is_partially_uptodate) return false; if (mapping->host->i_blkbits >= folio_shift(folio)) From patchwork Tue Feb 7 17:12:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13131881
From: David Howells To: Jens Axboe, Al Viro, Christoph Hellwig Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig, John Hubbard Subject: [PATCH v12 03/10] iov_iter: Define flags to qualify page extraction.
Date: Tue, 7 Feb 2023 17:12:58 +0000 Message-Id: <20230207171305.3716974-4-dhowells@redhat.com> In-Reply-To: <20230207171305.3716974-1-dhowells@redhat.com> References: <20230207171305.3716974-1-dhowells@redhat.com> MIME-Version: 1.0
X-Loop: owner-majordomo@kvack.org List-ID: Define flags to qualify page extraction to pass into iov_iter_*_pages*() rather than passing in FOLL_* flags. For now only a flag to allow peer-to-peer DMA is supported. Signed-off-by: David Howells Reviewed-by: Christoph Hellwig Reviewed-by: John Hubbard cc: Al Viro cc: Jens Axboe cc: Logan Gunthorpe cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org --- Notes: ver #12) - Use __bitwise for the extraction flags typedef. ver #11) - Use __bitwise for the extraction flags. ver #9) - Change extract_flags to extraction_flags. ver #7) - Don't use FOLL_* as a parameter, but rather define constants specifically to use with iov_iter_*_pages*(). - Drop the I/O direction constants for now. block/bio.c | 6 +++--- block/blk-map.c | 8 ++++---- include/linux/uio.h | 10 ++++++++-- lib/iov_iter.c | 14 ++++++++------ 4 files changed, 23 insertions(+), 15 deletions(-) diff --git a/block/bio.c b/block/bio.c index ab59a491a883..b97f3991c904 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1245,11 +1245,11 @@ static int bio_iov_add_zone_append_page(struct bio *bio, struct page *page, */ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) { + iov_iter_extraction_t extraction_flags = 0; unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt; unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt; struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt; struct page **pages = (struct page **)bv; - unsigned int gup_flags = 0; ssize_t size, left; unsigned len, i = 0; size_t offset, trim; @@ -1264,7 +1264,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) pages += entries_left * (PAGE_PTRS_PER_BVEC - 1); if (bio->bi_bdev && blk_queue_pci_p2pdma(bio->bi_bdev->bd_disk->queue)) - gup_flags |= FOLL_PCI_P2PDMA; + extraction_flags |= ITER_ALLOW_P2PDMA; /* * Each segment in the iov is required to be a block size multiple. 
@@ -1275,7 +1275,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) */ size = iov_iter_get_pages(iter, pages, UINT_MAX - bio->bi_iter.bi_size, - nr_pages, &offset, gup_flags); + nr_pages, &offset, extraction_flags); if (unlikely(size <= 0)) return size ? size : -EFAULT; diff --git a/block/blk-map.c b/block/blk-map.c index 19940c978c73..080dd60485be 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -265,9 +265,9 @@ static struct bio *blk_rq_map_bio_alloc(struct request *rq, static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, gfp_t gfp_mask) { + iov_iter_extraction_t extraction_flags = 0; unsigned int max_sectors = queue_max_hw_sectors(rq->q); unsigned int nr_vecs = iov_iter_npages(iter, BIO_MAX_VECS); - unsigned int gup_flags = 0; struct bio *bio; int ret; int j; @@ -280,7 +280,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, return -ENOMEM; if (blk_queue_pci_p2pdma(rq->q)) - gup_flags |= FOLL_PCI_P2PDMA; + extraction_flags |= ITER_ALLOW_P2PDMA; while (iov_iter_count(iter)) { struct page **pages, *stack_pages[UIO_FASTIOV]; @@ -291,10 +291,10 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, if (nr_vecs <= ARRAY_SIZE(stack_pages)) { pages = stack_pages; bytes = iov_iter_get_pages(iter, pages, LONG_MAX, - nr_vecs, &offs, gup_flags); + nr_vecs, &offs, extraction_flags); } else { bytes = iov_iter_get_pages_alloc(iter, &pages, - LONG_MAX, &offs, gup_flags); + LONG_MAX, &offs, extraction_flags); } if (unlikely(bytes <= 0)) { ret = bytes ? 
bytes : -EFAULT; diff --git a/include/linux/uio.h b/include/linux/uio.h index dcc0ca5ef491..af70e4c9ea27 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -12,6 +12,8 @@ struct page; +typedef unsigned int __bitwise iov_iter_extraction_t; + struct kvec { void *iov_base; /* and that should *never* hold a userland pointer */ size_t iov_len; @@ -238,12 +240,12 @@ void iov_iter_xarray(struct iov_iter *i, unsigned int direction, struct xarray * loff_t start, size_t count); ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages, size_t maxsize, unsigned maxpages, size_t *start, - unsigned gup_flags); + iov_iter_extraction_t extraction_flags); ssize_t iov_iter_get_pages2(struct iov_iter *i, struct page **pages, size_t maxsize, unsigned maxpages, size_t *start); ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages, size_t maxsize, size_t *start, - unsigned gup_flags); + iov_iter_extraction_t extraction_flags); ssize_t iov_iter_get_pages_alloc2(struct iov_iter *i, struct page ***pages, size_t maxsize, size_t *start); int iov_iter_npages(const struct iov_iter *i, int maxpages); @@ -346,4 +348,8 @@ static inline void iov_iter_ubuf(struct iov_iter *i, unsigned int direction, }; } +/* Flags for iov_iter_get/extract_pages*() */ +/* Allow P2PDMA on the extracted pages */ +#define ITER_ALLOW_P2PDMA ((__force iov_iter_extraction_t)0x01) + #endif diff --git a/lib/iov_iter.c b/lib/iov_iter.c index adc5e8aa8ae8..34ee3764d0fa 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -1020,9 +1020,9 @@ static struct page *first_bvec_segment(const struct iov_iter *i, static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages, size_t maxsize, unsigned int maxpages, size_t *start, - unsigned int gup_flags) + iov_iter_extraction_t extraction_flags) { - unsigned int n; + unsigned int n, gup_flags = 0; if (maxsize > i->count) maxsize = i->count; @@ -1030,6 +1030,8 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i, 
return 0; if (maxsize > MAX_RW_COUNT) maxsize = MAX_RW_COUNT; + if (extraction_flags & ITER_ALLOW_P2PDMA) + gup_flags |= FOLL_PCI_P2PDMA; if (likely(user_backed_iter(i))) { unsigned long addr; @@ -1081,14 +1083,14 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i, ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages, size_t maxsize, unsigned maxpages, - size_t *start, unsigned gup_flags) + size_t *start, iov_iter_extraction_t extraction_flags) { if (!maxpages) return 0; BUG_ON(!pages); return __iov_iter_get_pages_alloc(i, &pages, maxsize, maxpages, - start, gup_flags); + start, extraction_flags); } EXPORT_SYMBOL_GPL(iov_iter_get_pages); @@ -1101,14 +1103,14 @@ EXPORT_SYMBOL(iov_iter_get_pages2); ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages, size_t maxsize, - size_t *start, unsigned gup_flags) + size_t *start, iov_iter_extraction_t extraction_flags) { ssize_t len; *pages = NULL; len = __iov_iter_get_pages_alloc(i, pages, maxsize, ~0U, start, - gup_flags); + extraction_flags); if (len <= 0) { kvfree(*pages); *pages = NULL; From patchwork Tue Feb 7 17:12:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13131882
From: David Howells To: Jens Axboe, Al Viro, Christoph Hellwig Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig, John Hubbard Subject: [PATCH v12 04/10] iov_iter: Add a function to extract a page list from an iterator Date: Tue, 7 Feb 2023 17:12:59 +0000 Message-Id: <20230207171305.3716974-5-dhowells@redhat.com> In-Reply-To: <20230207171305.3716974-1-dhowells@redhat.com> References:
<20230207171305.3716974-1-dhowells@redhat.com> MIME-Version: 1.0 Add a
function, iov_iter_extract_pages(), to extract a list of pages from an iterator. The pages may be returned with a pin added or nothing, depending on the type of iterator. Add a second function, iov_iter_extract_will_pin(), to determine how the cleanup should be done. There are two cases: (1) ITER_IOVEC or ITER_UBUF iterator. Extracted pages will have pins (FOLL_PIN) obtained on them so that a concurrent fork() will forcibly copy the page so that DMA is done to/from the parent's buffer and is unavailable to/unaffected by the child process. iov_iter_extract_will_pin() will return true for this case. The caller should use something like unpin_user_page() to dispose of the page. (2) Any other sort of iterator. No refs or pins are obtained on the page, the assumption is made that the caller will manage page retention. iov_iter_extract_will_pin() will return false. The pages don't need additional disposal. Signed-off-by: David Howells Reviewed-by: Christoph Hellwig cc: Al Viro cc: John Hubbard cc: David Hildenbrand cc: Matthew Wilcox cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- Notes: ver #12) - ITER_PIPE is gone, so drop related bits. - Don't specify FOLL_PIN as that's implied by pin_user_pages_fast(). ver #11) - Fix iov_iter_extract_kvec_pages() to include the offset into the page in the returned starting offset. - Use __bitwise for the extraction flags ver #10) - Fix use of i->kvec in iov_iter_extract_bvec_pages() to be i->bvec. ver #9) - Rename iov_iter_extract_mode() to iov_iter_extract_will_pin() and make it return true/false not FOLL_PIN/0 as FOLL_PIN is going to be made private to mm/. - Change extract_flags to extraction_flags. ver #8) - It seems that all DIO is supposed to be done under FOLL_PIN now, and not FOLL_GET, so switch to only using pin_user_pages() for user-backed iters. - Wrap an argument in brackets in the iov_iter_extract_mode() macro. - Drop the extract_flags argument to iov_iter_extract_mode() for now [hch]. 
ver #7) - Switch to passing in iter-specific flags rather than FOLL_* flags. - Drop the direction flags for now. - Use ITER_ALLOW_P2PDMA to request FOLL_PCI_P2PDMA. - Disallow use of ITER_ALLOW_P2PDMA with non-user-backed iter. - Add support for extraction from KVEC-type iters. - Use iov_iter_advance() rather than open-coding it. - Make BVEC- and KVEC-type skip over initial empty vectors. ver #6) - Add back the function to indicate the cleanup mode. - Drop the cleanup_mode return arg to iov_iter_extract_pages(). - Pass FOLL_SOURCE/DEST_BUF in gup_flags. Check this against the iter data_source. ver #4) - Use ITER_SOURCE/DEST instead of WRITE/READ. - Allow additional FOLL_* flags, such as FOLL_PCI_P2PDMA to be passed in. ver #3) - Switch to using EXPORT_SYMBOL_GPL to prevent indirect 3rd-party access to get/pin_user_pages_fast()[1]. include/linux/uio.h | 27 ++++- lib/iov_iter.c | 264 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 290 insertions(+), 1 deletion(-) diff --git a/include/linux/uio.h b/include/linux/uio.h index af70e4c9ea27..cf6658066736 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -347,9 +347,34 @@ static inline void iov_iter_ubuf(struct iov_iter *i, unsigned int direction, .count = count }; } - /* Flags for iov_iter_get/extract_pages*() */ /* Allow P2PDMA on the extracted pages */ #define ITER_ALLOW_P2PDMA ((__force iov_iter_extraction_t)0x01) +ssize_t iov_iter_extract_pages(struct iov_iter *i, struct page ***pages, + size_t maxsize, unsigned int maxpages, + iov_iter_extraction_t extraction_flags, + size_t *offset0); + +/** + * iov_iter_extract_will_pin - Indicate how pages from the iterator will be retained + * @iter: The iterator + * + * Examine the iterator and indicate by returning true or false as to how, if + * at all, pages extracted from the iterator will be retained by the extraction + * function. + * + * %true indicates that the pages will have a pin placed in them that the + * caller must unpin. 
This must be done for DMA/async DIO to force fork() + * to copy a page for the child (the parent must retain the original + * page). + * + * %false indicates that no measures are taken and that it's up to the caller + * to retain the pages. + */ +static inline bool iov_iter_extract_will_pin(const struct iov_iter *iter) +{ + return user_backed_iter(iter); +} + #endif diff --git a/lib/iov_iter.c b/lib/iov_iter.c index 34ee3764d0fa..8d34b6552179 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -1487,3 +1487,267 @@ void iov_iter_restore(struct iov_iter *i, struct iov_iter_state *state) i->iov -= state->nr_segs - i->nr_segs; i->nr_segs = state->nr_segs; } + +/* + * Extract a list of contiguous pages from an ITER_XARRAY iterator. This does not + * get references on the pages, nor does it get a pin on them. + */ +static ssize_t iov_iter_extract_xarray_pages(struct iov_iter *i, + struct page ***pages, size_t maxsize, + unsigned int maxpages, + iov_iter_extraction_t extraction_flags, + size_t *offset0) +{ + struct page *page, **p; + unsigned int nr = 0, offset; + loff_t pos = i->xarray_start + i->iov_offset; + pgoff_t index = pos >> PAGE_SHIFT; + XA_STATE(xas, i->xarray, index); + + offset = pos & ~PAGE_MASK; + *offset0 = offset; + + maxpages = want_pages_array(pages, maxsize, offset, maxpages); + if (!maxpages) + return -ENOMEM; + p = *pages; + + rcu_read_lock(); + for (page = xas_load(&xas); page; page = xas_next(&xas)) { + if (xas_retry(&xas, page)) + continue; + + /* Has the page moved or been split? */ + if (unlikely(page != xas_reload(&xas))) { + xas_reset(&xas); + continue; + } + + p[nr++] = find_subpage(page, xas.xa_index); + if (nr == maxpages) + break; + } + rcu_read_unlock(); + + maxsize = min_t(size_t, nr * PAGE_SIZE - offset, maxsize); + iov_iter_advance(i, maxsize); + return maxsize; +} + +/* + * Extract a list of contiguous pages from an ITER_BVEC iterator. This does + * not get references on the pages, nor does it get a pin on them.
+ */ +static ssize_t iov_iter_extract_bvec_pages(struct iov_iter *i, + struct page ***pages, size_t maxsize, + unsigned int maxpages, + iov_iter_extraction_t extraction_flags, + size_t *offset0) +{ + struct page **p, *page; + size_t skip = i->iov_offset, offset; + int k; + + for (;;) { + if (i->nr_segs == 0) + return 0; + maxsize = min(maxsize, i->bvec->bv_len - skip); + if (maxsize) + break; + i->iov_offset = 0; + i->nr_segs--; + i->bvec++; + skip = 0; + } + + skip += i->bvec->bv_offset; + page = i->bvec->bv_page + skip / PAGE_SIZE; + offset = skip % PAGE_SIZE; + *offset0 = offset; + + maxpages = want_pages_array(pages, maxsize, offset, maxpages); + if (!maxpages) + return -ENOMEM; + p = *pages; + for (k = 0; k < maxpages; k++) + p[k] = page + k; + + maxsize = min_t(size_t, maxsize, maxpages * PAGE_SIZE - offset); + iov_iter_advance(i, maxsize); + return maxsize; +} + +/* + * Extract a list of virtually contiguous pages from an ITER_KVEC iterator. + * This does not get references on the pages, nor does it get a pin on them. 
+ */ +static ssize_t iov_iter_extract_kvec_pages(struct iov_iter *i, + struct page ***pages, size_t maxsize, + unsigned int maxpages, + iov_iter_extraction_t extraction_flags, + size_t *offset0) +{ + struct page **p, *page; + const void *kaddr; + size_t skip = i->iov_offset, offset, len; + int k; + + for (;;) { + if (i->nr_segs == 0) + return 0; + maxsize = min(maxsize, i->kvec->iov_len - skip); + if (maxsize) + break; + i->iov_offset = 0; + i->nr_segs--; + i->kvec++; + skip = 0; + } + + kaddr = i->kvec->iov_base + skip; + offset = (unsigned long)kaddr & ~PAGE_MASK; + *offset0 = offset; + + maxpages = want_pages_array(pages, maxsize, offset, maxpages); + if (!maxpages) + return -ENOMEM; + p = *pages; + + kaddr -= offset; + len = offset + maxsize; + for (k = 0; k < maxpages; k++) { + size_t seg = min_t(size_t, len, PAGE_SIZE); + + if (is_vmalloc_or_module_addr(kaddr)) + page = vmalloc_to_page(kaddr); + else + page = virt_to_page(kaddr); + + p[k] = page; + len -= seg; + kaddr += PAGE_SIZE; + } + + maxsize = min_t(size_t, maxsize, maxpages * PAGE_SIZE - offset); + iov_iter_advance(i, maxsize); + return maxsize; +} + +/* + * Extract a list of contiguous pages from a user iterator and get a pin on + * each of them. This should only be used if the iterator is user-backed + * (IOVEC/UBUF). + * + * It does not get refs on the pages, but the pages must be unpinned by the + * caller once the transfer is complete. + * + * This is safe to be used where background IO/DMA *is* going to be modifying + * the buffer; using a pin rather than a ref forces fork() to give the + * child a copy of the page.
+ */
+static ssize_t iov_iter_extract_user_pages(struct iov_iter *i,
+					   struct page ***pages,
+					   size_t maxsize,
+					   unsigned int maxpages,
+					   iov_iter_extraction_t extraction_flags,
+					   size_t *offset0)
+{
+	unsigned long addr;
+	unsigned int gup_flags = 0;
+	size_t offset;
+	int res;
+
+	if (i->data_source == ITER_DEST)
+		gup_flags |= FOLL_WRITE;
+	if (extraction_flags & ITER_ALLOW_P2PDMA)
+		gup_flags |= FOLL_PCI_P2PDMA;
+	if (i->nofault)
+		gup_flags |= FOLL_NOFAULT;
+
+	addr = first_iovec_segment(i, &maxsize);
+	*offset0 = offset = addr % PAGE_SIZE;
+	addr &= PAGE_MASK;
+	maxpages = want_pages_array(pages, maxsize, offset, maxpages);
+	if (!maxpages)
+		return -ENOMEM;
+	res = pin_user_pages_fast(addr, maxpages, gup_flags, *pages);
+	if (unlikely(res <= 0))
+		return res;
+	maxsize = min_t(size_t, maxsize, res * PAGE_SIZE - offset);
+	iov_iter_advance(i, maxsize);
+	return maxsize;
+}
+
+/**
+ * iov_iter_extract_pages - Extract a list of contiguous pages from an iterator
+ * @i: The iterator to extract from
+ * @pages: Where to return the list of pages
+ * @maxsize: The maximum amount of iterator to extract
+ * @maxpages: The maximum size of the list of pages
+ * @extraction_flags: Flags to qualify request
+ * @offset0: Where to return the starting offset into (*@pages)[0]
+ *
+ * Extract a list of contiguous pages from the current point of the iterator,
+ * advancing the iterator.  The maximum number of pages and the maximum amount
+ * of page contents can be set.
+ *
+ * If *@pages is NULL, a page list will be allocated to the required size and
+ * *@pages will be set to its base.  If *@pages is not NULL, it will be assumed
+ * that the caller allocated a page list at least @maxpages in size and this
+ * will be filled in.
+ *
+ * @extraction_flags can have ITER_ALLOW_P2PDMA set to request peer-to-peer DMA
+ * be allowed on the pages extracted.
+ *
+ * The iov_iter_extract_will_pin() function can be used to query how cleanup
+ * should be performed.
+ *
+ * Extra refs or pins on the pages may be obtained as follows:
+ *
+ *  (*) If the iterator is user-backed (ITER_IOVEC/ITER_UBUF), pins will be
+ *      added to the pages, but refs will not be taken.
+ *      iov_iter_extract_will_pin() will return true.
+ *
+ *  (*) If the iterator is ITER_KVEC, ITER_BVEC or ITER_XARRAY, the pages are
+ *      merely listed; no extra refs or pins are obtained.
+ *      iov_iter_extract_will_pin() will return 0.
+ *
+ * Note also:
+ *
+ *  (*) Use with ITER_DISCARD is not supported as that has no content.
+ *
+ * On success, the function sets *@pages to the new pagelist, if allocated, and
+ * sets *offset0 to the offset into the first page.
+ *
+ * It may also return -ENOMEM and -EFAULT.
+ */
+ssize_t iov_iter_extract_pages(struct iov_iter *i,
+			       struct page ***pages,
+			       size_t maxsize,
+			       unsigned int maxpages,
+			       iov_iter_extraction_t extraction_flags,
+			       size_t *offset0)
+{
+	maxsize = min_t(size_t, min_t(size_t, maxsize, i->count), MAX_RW_COUNT);
+	if (!maxsize)
+		return 0;
+
+	if (likely(user_backed_iter(i)))
+		return iov_iter_extract_user_pages(i, pages, maxsize,
+						   maxpages, extraction_flags,
+						   offset0);
+	if (iov_iter_is_kvec(i))
+		return iov_iter_extract_kvec_pages(i, pages, maxsize,
+						   maxpages, extraction_flags,
+						   offset0);
+	if (iov_iter_is_bvec(i))
+		return iov_iter_extract_bvec_pages(i, pages, maxsize,
+						   maxpages, extraction_flags,
+						   offset0);
+	if (iov_iter_is_xarray(i))
+		return iov_iter_extract_xarray_pages(i, pages, maxsize,
+						     maxpages, extraction_flags,
+						     offset0);
+	return -EFAULT;
+}
+EXPORT_SYMBOL_GPL(iov_iter_extract_pages);

From patchwork Tue Feb 7 17:13:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13131883
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with
ESMTP id C655BC636D4 for ; Tue, 7 Feb 2023 17:13:33 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 870766B0128; Tue, 7 Feb 2023 12:13:32 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 820A06B0126; Tue, 7 Feb 2023 12:13:32 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 69A0D6B0128; Tue, 7 Feb 2023 12:13:32 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 3E7C46B0126 for ; Tue, 7 Feb 2023 12:13:32 -0500 (EST) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 228064019C for ; Tue, 7 Feb 2023 17:13:32 +0000 (UTC) X-FDA: 80441142264.04.D246263 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf26.hostedemail.com (Postfix) with ESMTP id 4CA40140012 for ; Tue, 7 Feb 2023 17:13:30 +0000 (UTC) Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=hm40Q+Qd; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf26.hostedemail.com: domain of dhowells@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1675790010; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=AA3SDxqAw6tV+jeEhwhhXCCPQx7iw6NiGu/cAPT0aQk=; b=Q+FUYK6fhrfVNg1BJIo7otHKRJAmV9BBHO8DBt53xzTrV5x0DcOz3Ogsv+Fw08aT3kZ5iX TS7wSQcU0aEg4Z7qnQCLpUVuPvyh4kLHffVaAgYaSSDxw1CueRauyHuprhYD2a0ACBO2EO 35eMBAMu5m/wG8eTRGjohqFehbBUaHY= ARC-Authentication-Results: i=1; 
imf26.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=hm40Q+Qd; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf26.hostedemail.com: domain of dhowells@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1675790010; a=rsa-sha256; cv=none; b=Cc+ySncIGS9Ltc9td9CkloTYAdq/HeEBCp5mgcpD5lIcTbq6LcSwu2d2CrznuuQxePFgiJ 8SY5L6xs5p066odzf5jM4GDtdvT4r4FSoh9WubWfmEADPnhYtDm41BJKok8WD9phGVGyS8 98fX2M11gX6X3ppl3955Uh2EuK2VKho= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675790009; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AA3SDxqAw6tV+jeEhwhhXCCPQx7iw6NiGu/cAPT0aQk=; b=hm40Q+QdgwM89yEvm002WXQ6H3T99yQ4SuNl1Iba7t0fdPXUOwyT+B+vWVoJKSdJ1cxoho iLQt9+vqn8g+qgNON59uQFG8ZqWKkbjMz1XPcebS5TpE2cKtPrSaWOSE+uSGBOhR5oFElB RXaGNrqAAn84KTMZ6M9+Tv3tycq8UGk= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-452-wF4wE6smMp-jExteuoiB4Q-1; Tue, 07 Feb 2023 12:13:26 -0500 X-MC-Unique: wF4wE6smMp-jExteuoiB4Q-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 7ECF4858F0E; Tue, 7 Feb 2023 17:13:25 +0000 (UTC) Received: from warthog.procyon.org.uk.com (unknown [10.33.36.97]) by smtp.corp.redhat.com (Postfix) with ESMTP id A6C1918EC5; Tue, 7 Feb 2023 17:13:23 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , 
Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, John Hubbard Subject: [PATCH v12 05/10] iomap: Don't get an reference on ZERO_PAGE for direct I/O block zeroing Date: Tue, 7 Feb 2023 17:13:00 +0000 Message-Id: <20230207171305.3716974-6-dhowells@redhat.com> In-Reply-To: <20230207171305.3716974-1-dhowells@redhat.com> References: <20230207171305.3716974-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 4CA40140012 X-Stat-Signature: bwgwr1kfset5nh61szu3q8sydierx1gh X-HE-Tag: 1675790010-936481 X-HE-Meta: U2FsdGVkX18RnznOTc66+Q5kALajmm6sJD0LC1LfeLeX+XSH8+7sdK8nbzWk1+VwcuQ2YEc0ei+5y92n7spqmfMmxli7cDN8GIuEHskS4FJmCJnRGUCFGB5KS+QssSNxhqxgbMSLF/1waXh8sswmWCtNHpgEty/fCIHadjBFXBiG4X3v4eBJ3x4hZT80I0Q2rdA5t3OlV7OtB4nH9MrHxWs6kDEAKDc6HlI5oxTN63TDkysTytD+HAwaNz079JfVz0y/mTb+JVtqnC4YpR8fpkpxRxaod9vzjmiY0VUxEiKt5agWHmeey/KmQyjJ918lCiQQ9hNIDABoCZ9bjEj5g9sBcfAWQi/u227Q/W0bH3jjgtzLStp31QUaqIgG5qmxcCoDmaiahE5pmZPY28IZkF5RkmU4f503i1pe5nOV3XlhjjWf2/zKGj5a6zzyDqnGGqliztZgoBNeVU+dzX2P6g3uOArFECYuhpxqySaPv16Vn+wNLC9jOpiYQNgrtKcuH4AyLi8rgHtRa5vY1A8gTpkRHtNULgoid4aqea2vV8rYsmJLAjLZCtdl3fNt6TqRPGDS4t9mTioVtZSdR5yUpFByx+WqUMU5tbbpnPgcY67aUUgjVcYK797hvYFJvh/C3a+VzCJ/VILKcRFLqIZPgwbKmFTWW2mwdi8QVja81JnPxDCSzS3FSG8g/7suKoTnvlDOB8s41WbAlyv4qw0ceAiTU6l1Agh64kYF9Bgu0U9w4WlNYaEiPcJu75Aoi/Vxk3L8plQTRkq2q90356jczi7GxqYvdRKl6OUK3XkXVvCG8XCymJVTtPt8Z65Fyt3lcH2XRSUy7TSExX30MpfQfaBq010moQNTZOAYw+jUVctpKQAi5Fhwic+jfsjXR2qzfF7Rfa7Jod5d2DvmN4j73vwnYbXHOL7ThSVwKJiU9bRQu1JAeOFLISW2AmXI8FlNAHUyGlUGUNmnpNya+tx V3qKBLzd 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

ZERO_PAGE can't go away, no need to hold an extra reference.

Signed-off-by: David Howells
Reviewed-by: David Hildenbrand
Reviewed-by: John Hubbard
cc: Al Viro
cc: David Hildenbrand
cc: linux-fsdevel@vger.kernel.org
---
 fs/iomap/direct-io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 9804714b1751..47db4ead1e74 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -202,7 +202,7 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
 	bio->bi_private = dio;
 	bio->bi_end_io = iomap_dio_bio_end_io;
 
-	get_page(page);
+	bio_set_flag(bio, BIO_NO_PAGE_REF);
 	__bio_add_page(bio, page, len, 0);
 	iomap_dio_submit_bio(iter, dio, bio, pos);
 }

From patchwork Tue Feb 7 17:13:01 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13131884
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id ADA04C636CC for ; Tue, 7 Feb 2023 17:13:35 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id 4048E8E0002; Tue, 7 Feb 2023 12:13:35 -0500 (EST)
Received: by kanga.kvack.org (Postfix, from userid 40) id 38CAF8E0001; Tue, 7 Feb 2023 12:13:35 -0500 (EST)
X-Delivered-To:
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 256168E0002; Tue, 7 Feb 2023 12:13:35 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 0305B8E0001 for ; Tue, 7 Feb 2023 12:13:35 -0500 (EST) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id C96E0140A5F for ; Tue, 7 Feb 2023 17:13:34 +0000 (UTC) X-FDA: 80441142348.09.671CCA5 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf09.hostedemail.com (Postfix) with ESMTP id 0B4C9140012 for ; Tue, 7 Feb 2023 17:13:32 +0000 (UTC) Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=OXjFoh9n; spf=pass (imf09.hostedemail.com: domain of dhowells@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1675790013; a=rsa-sha256; cv=none; b=by0KY0Ug/WpGzuA7FN3wzQxATuK/1gIZdEJW9JNckqkISp5FNrZNWZ2N2m1YhzOiDkgMxv UtgO43JRIABIA17qimrqdV6wgRvC44uBaxBAxZD9GoGWzPmjIyFIg18I7HqziMQutIrtcW TeHY525CpSmzWFvr5GlpX3c1AHKVbDc= ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=OXjFoh9n; spf=pass (imf09.hostedemail.com: domain of dhowells@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1675790013; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references:dkim-signature; bh=9PNg3WPaFWzxoLjSof2qU8psox3R+qpqOMDOMyjiLi8=; b=8Bt0E9inDJohaZLhFSXlVRRQOAHhg5lVampuG5wXNsb/ZdOLKY9Th+5lWcdVYlNyqFfvfe XDsXyyt9iE0Vz2FwOOlvDBmBpCxO8dLWIJDPt381GIkH/zRwi8j2ASoNxXk1N3pshG39cS p4RChGMVlE34AkplkplrL8jEekpIyZA= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675790012; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9PNg3WPaFWzxoLjSof2qU8psox3R+qpqOMDOMyjiLi8=; b=OXjFoh9nvkJWzn2s3f36/MBW66MoMQyLFuYUtYqK5YeAjPn45CRUQ6e/sIZc36xi2agvRg OrEtyDiYa4YhJP0UhmzNVoWJysZvzWbKSw6dQLEl84+uY+Tcy+iZDGTeOddk2s2pCkLYYn 9m2UGDspFkd2ayhfDslzIktjS90K/y4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-83-P1yZu5qUNLmrLp6Co-SkqQ-1; Tue, 07 Feb 2023 12:13:29 -0500 X-MC-Unique: P1yZu5qUNLmrLp6Co-SkqQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 2C89E85A588; Tue, 7 Feb 2023 17:13:28 +0000 (UTC) Received: from warthog.procyon.org.uk.com (unknown [10.33.36.97]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4176F2166B29; Tue, 7 Feb 2023 17:13:26 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v12 06/10] block: Fix bio_flagged() so that gcc can better 
optimise it Date: Tue, 7 Feb 2023 17:13:01 +0000 Message-Id: <20230207171305.3716974-7-dhowells@redhat.com> In-Reply-To: <20230207171305.3716974-1-dhowells@redhat.com> References: <20230207171305.3716974-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Rspam-User: X-Rspamd-Queue-Id: 0B4C9140012 X-Rspamd-Server: rspam01 X-Stat-Signature: 5opdnbt61ksi1iqu3g9i5yjmy1qsa6iw X-HE-Tag: 1675790012-483426 X-HE-Meta: U2FsdGVkX19YwPb+SYDlfg6rEl+XjjAJIhHs4iAGFm+StZPTQamzlN3q9bF7/m9VOil6qzc9qhUNoFROOn5Cp/lulnS+ct+eS5Perd4j//hjgTQH4qT9bf5q9Oc74c0nGIyIdi5EOB5fPRYMFdAshBdmut7iGoBFgPCQNcoczFWyRn8d7sCJrwWfbEw5Wy4rP0o2XCIuY2cPIrPloYcqlqmEyKGiP2HZwZaRp7GguLuCwKOUasIwivvv7jLw8phl5G9CgKM3UN7es7ZnsMNo62JWLjFddb+BcD7pXlZ7tDNaTPKPLWzRD1ur0EudmtKjRFQjWFHwW9affbJJRPwksYlZbZLtIypcuzJKID+jwWn5qlOYGme6Pv99upLftliRUO0NAFxqEi3RgIC31D6siV3yxHzs/GhvhwZTbvUsAIpSJhF48D8pZcbX2wUx3lvk9iEGfZkXZ6HBHGzp7Gy+pNL/JEdNluGJ8YKEJI1ZeM3W8PQKbFyjSJSwbWUiNO1kei2PYlCRz2c3y8Zbp99MsLc2m5OKj4ct6hFn8WQsVlqTT8XvYlxUTLM5jBvnZ1oZWSG05Tbi5LQ5PfjprnCozk4Xke8bBTPQ1c0sUWc2L4IJGG9yJQz9JbUmaEnL6ylsSr+oKWx8KdqR6NYq1X0m9wMFq2lEH4Ahtm+kUSq0D9BT15wwnLJYi8YZ9a2mZCTSqhb8m6e2mlOKoA+xpRiLU8wWgxN4OcO3yuzvXNxxrisCto21Co8W3fHYtL4zCdb1b9jqaKrGGPltnwa/q4Ipyu4xrj1iBf1hDy7ydiLQ+9JrLmAbyuUc+SsR5PXX9VLq+WI/6k7s//mQYYoBwkYIYsQp5/q5IT+nv/azFgwXJNEwZP0qy+3i6qk2YFXwXjbsDhsKhJjXAxQ7PH+33ilmS519vDmkmof1E2c5XNfKT1metA7kYAjfj1ZTvPd0/58kLCajAiqP/mmy2f5H38V x9Y2uWXA SdU6VdA5Ev6WBqbL6+D8PgWyw9nU++inpeUfLkZzSQxI9+bF8uNB0cj5t+6A6zk1FYiDCXHHAQC8TrgKacxCCfTkfa5dESvqYIvGBeeN7Yf5d2MQGzEnl5hzRHGnDSdVGaTcMKba+glxp+SbPsakF6w0g5XhCWP/fmQUhUVhuTrLVURW8Wwkv98UPqLX9c5ee2kb+Lo3uNZa1NYSXzfwacUuXe18SYNrRCSBpTiqNgjJ/IezjzswHbjJSgDMoOJljlQrRr9p2gQalzsQEU4RBEPF2dvHTjHt/kEUNRs2U1ciuvHVE2yGch1pQGhmKnizvqO2PwNeQVLpdcjdmK8XxSUHyUmhcgED1TB+mDFumSa1eC11f9PHNeMNuBgirTImsjPtEfHqpRfQrE2A1biYxagOLPt9mtUGJvOczNp4J1AdDH1/OucQ9kEC3wvcYDUwkgCec X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org 
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

Fix bio_flagged() so that multiple instances of it, such as:

	if (bio_flagged(bio, BIO_PAGE_REFFED) ||
	    bio_flagged(bio, BIO_PAGE_PINNED))

can be combined by the gcc optimiser into a single test in assembly
(arguably, this is a compiler optimisation issue[1]).

The missed optimisation stems from bio_flagged() comparing the result of
the bitwise-AND to zero.  This results in an out-of-line bio_release_pages()
being compiled to something like:

   <+0>:     mov    0x14(%rdi),%eax
   <+3>:     test   $0x1,%al
   <+5>:     jne    0xffffffff816dac53
   <+7>:     test   $0x2,%al
   <+9>:     je     0xffffffff816dac5c
   <+11>:    movzbl %sil,%esi
   <+15>:    jmp    0xffffffff816daba1 <__bio_release_pages>
   <+20>:    jmp    0xffffffff81d0b800 <__x86_return_thunk>

However, the comparison with zero is superfluous: the return type is bool,
so any non-zero result of the bitwise-AND is already converted to true.
Removing it results in:

   <+0>:     testb  $0x3,0x14(%rdi)
   <+4>:     je     0xffffffff816e4af4
   <+6>:     movzbl %sil,%esi
   <+10>:    jmp    0xffffffff816dab7c <__bio_release_pages>
   <+15>:    jmp    0xffffffff81d0b7c0 <__x86_return_thunk>

instead.

Also, the MOVZBL instruction looks unnecessary[2] - I think it's just
're-booling' the mark_dirty parameter.
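The equivalence argued above can be checked in userspace with a small sketch.
Everything named fake_* or FAKE_* below is a hypothetical stand-in and not part
of the kernel; the sketch only assumes that bool is C's _Bool, so that any
non-zero value converts to true:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct bio; only the flags word matters here. */
struct fake_bio {
	unsigned short bi_flags;
};

/* Illustrative bit numbers, chosen to match the 0x1 and 0x2 masks above. */
enum { FAKE_PAGE_REFFED, FAKE_PAGE_PINNED };

/* Old form: explicit comparison of the masked value with zero. */
static inline bool flagged_cmp(const struct fake_bio *bio, unsigned int bit)
{
	return (bio->bi_flags & (1U << bit)) != 0;
}

/* New form: rely on the implicit conversion of the masked value to bool. */
static inline bool flagged_conv(const struct fake_bio *bio, unsigned int bit)
{
	return bio->bi_flags & (1U << bit);
}

/* The single combined mask test that two OR-ed flag checks reduce to. */
static inline bool either_flag(const struct fake_bio *bio)
{
	return bio->bi_flags &
	       ((1U << FAKE_PAGE_REFFED) | (1U << FAKE_PAGE_PINNED));
}

/* Assert that all forms agree for every combination of the low three bits. */
void check_equivalence(void)
{
	for (unsigned int flags = 0; flags < 8; flags++) {
		struct fake_bio bio = { .bi_flags = (unsigned short)flags };
		bool two_tests = flagged_conv(&bio, FAKE_PAGE_REFFED) ||
				 flagged_conv(&bio, FAKE_PAGE_PINNED);

		assert(flagged_cmp(&bio, FAKE_PAGE_REFFED) ==
		       flagged_conv(&bio, FAKE_PAGE_REFFED));
		assert(flagged_cmp(&bio, FAKE_PAGE_PINNED) ==
		       flagged_conv(&bio, FAKE_PAGE_PINNED));
		assert(two_tests == either_flag(&bio));
	}
}
```

This only shows that the two source forms are semantically identical; whether
a given compiler actually merges the two tests into one still depends on the
optimiser, as the bug reports linked in this patch discuss.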
Signed-off-by: David Howells
Reviewed-by: Christoph Hellwig
Reviewed-by: John Hubbard
cc: Jens Axboe
cc: linux-block@vger.kernel.org
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108370 [1]
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108371 [2]
Link: https://lore.kernel.org/r/167391056756.2311931.356007731815807265.stgit@warthog.procyon.org.uk/ # v6
---
 include/linux/bio.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index c1da63f6c808..10366b8bdb13 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -227,7 +227,7 @@ static inline void bio_cnt_set(struct bio *bio, unsigned int count)
 
 static inline bool bio_flagged(struct bio *bio, unsigned int bit)
 {
-	return (bio->bi_flags & (1U << bit)) != 0;
+	return bio->bi_flags & (1U << bit);
 }
 
 static inline void bio_set_flag(struct bio *bio, unsigned int bit)

From patchwork Tue Feb 7 17:13:02 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13131885
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0991AC636D3 for ; Tue, 7 Feb 2023 17:13:38 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id 898988E0003; Tue, 7 Feb 2023 12:13:37 -0500 (EST)
Received: by kanga.kvack.org (Postfix, from userid 40) id 84A6B8E0001; Tue, 7 Feb 2023 12:13:37 -0500 (EST)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 711DD8E0003; Tue, 7 Feb 2023 12:13:37 -0500 (EST)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 5EB378E0001 for ; Tue, 7 Feb 2023 12:13:37 -0500 (EST)
Received: from smtpin21.hostedemail.com
(a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 3D7411C61A9 for ; Tue, 7 Feb 2023 17:13:37 +0000 (UTC) X-FDA: 80441142474.21.AAB18B6 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf04.hostedemail.com (Postfix) with ESMTP id 87FE440006 for ; Tue, 7 Feb 2023 17:13:35 +0000 (UTC) Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=cKBuGgV7; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf04.hostedemail.com: domain of dhowells@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1675790015; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=dqniJi1fcm9DuVb2jGUzwy7ATEE/wnT//KjRosSsKEA=; b=PajRf+oADjnh1Pk4j3tfVwj65uHfnESMEnAUCQi3nPmxQnpZK7VVw2/aDtp7XWasPQ/TrT G+8UoyF/Ngn+q0rKL4W6ND0YEPd6Xazn0d/Sc2TxhlfgO0cadaPQovYFKlqWtyh5lKbGiG EriCFP20dS67hDTgK6KcUtX+uIPyU58= ARC-Authentication-Results: i=1; imf04.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=cKBuGgV7; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf04.hostedemail.com: domain of dhowells@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1675790015; a=rsa-sha256; cv=none; b=zgVcljwDjGhRfORZlxOvKASzElglbjhKwPCftysQyEW9EoOOWi7cbXlOTj/Um5Pojpf7hr G0wY03mTweddM96/v22GwJPIWvmM7NMEARE+kvsGxd7G0mCtCFt4f+MwnEcIbKR5uLDcZF /0dQM/ALyOfD1LjJ9QegkbCsQhrUmAA= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675790014; 
h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dqniJi1fcm9DuVb2jGUzwy7ATEE/wnT//KjRosSsKEA=; b=cKBuGgV71HCMev9JrdAkqwmAArLqlDnkl0PtunVchNdTXNTTGso1rSTi/+Q88bNDA6TbVE bOwAWu2D/vXpKReruHgAbu4IsrMr8QwpMv6sXyeZzh+eaKhNj4bn9TmhcmDRtseem7KTCP CiuavA2OWFgUvkIe7wxwZvoF6hM/TkE= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-513-8zu-9WyXOrWwYqBDb1-JBQ-1; Tue, 07 Feb 2023 12:13:31 -0500 X-MC-Unique: 8zu-9WyXOrWwYqBDb1-JBQ-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B4B0387B2A3; Tue, 7 Feb 2023 17:13:30 +0000 (UTC) Received: from warthog.procyon.org.uk.com (unknown [10.33.36.97]) by smtp.corp.redhat.com (Postfix) with ESMTP id BDE58492B21; Tue, 7 Feb 2023 17:13:28 +0000 (UTC) From: David Howells To: Jens Axboe , Al Viro , Christoph Hellwig Cc: David Howells , Matthew Wilcox , Jan Kara , Jeff Layton , David Hildenbrand , Jason Gunthorpe , Logan Gunthorpe , Hillf Danton , linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig , John Hubbard Subject: [PATCH v12 07/10] block: Replace BIO_NO_PAGE_REF with BIO_PAGE_REFFED with inverted logic Date: Tue, 7 Feb 2023 17:13:02 +0000 Message-Id: <20230207171305.3716974-8-dhowells@redhat.com> In-Reply-To: <20230207171305.3716974-1-dhowells@redhat.com> References: <20230207171305.3716974-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 X-Rspamd-Queue-Id: 87FE440006 X-Rspamd-Server: rspam09 X-Rspam-User: X-Stat-Signature: 
wccj9rr94q95m7numzo3pxgngib8ktr3
X-HE-Tag: 1675790015-707548
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Christoph Hellwig

Replace BIO_NO_PAGE_REF with a BIO_PAGE_REFFED flag that has the inverted
meaning: it is only set when a page reference has been acquired that needs
to be released by bio_release_pages().
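As a rough illustration of the inverted logic, the userspace sketch below
models a bio whose pages are released only if the submitter explicitly marked
them as referenced. All fake_* and FAKE_* names are hypothetical and exist only
for this example; the pages_released counter stands in for the real work done
by __bio_release_pages():

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit number standing in for BIO_PAGE_REFFED. */
enum { FAKE_BIO_PAGE_REFFED };

/* Hypothetical stand-in for struct bio. */
struct fake_bio {
	unsigned short bi_flags;
	int pages_released;	/* counts releases, for demonstration only */
};

static inline bool fake_bio_flagged(const struct fake_bio *bio,
				    unsigned int bit)
{
	return bio->bi_flags & (1U << bit);
}

static inline void fake_bio_set_flag(struct fake_bio *bio, unsigned int bit)
{
	bio->bi_flags |= (1U << bit);
}

/*
 * Mirrors the shape of bio_release_pages() after this patch: nothing is
 * released unless the flag says a reference was actually taken.
 */
static inline void fake_bio_release_pages(struct fake_bio *bio)
{
	if (fake_bio_flagged(bio, FAKE_BIO_PAGE_REFFED))
		bio->pages_released++;
}
```

With this polarity a zero-initialised bio is left alone on completion, and only
paths that took page references need to set the flag, which is how the hunks in
this patch add bio_set_flag(bio, BIO_PAGE_REFFED) where references are taken.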
Signed-off-by: Christoph Hellwig
Signed-off-by: David Howells
Reviewed-by: John Hubbard
cc: Al Viro
cc: Jens Axboe
cc: Jan Kara
cc: Matthew Wilcox
cc: Logan Gunthorpe
cc: linux-block@vger.kernel.org
---
Notes:
    ver #8)
     - Split out from another patch [hch].
     - Don't default to BIO_PAGE_REFFED [hch].

    ver #5)
     - Split from patch that uses iov_iter_extract_pages().

 block/bio.c               | 2 +-
 block/blk-map.c           | 1 +
 fs/direct-io.c            | 2 ++
 fs/iomap/direct-io.c      | 1 -
 include/linux/bio.h       | 2 +-
 include/linux/blk_types.h | 2 +-
 6 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index b97f3991c904..bf9bf53232be 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1198,7 +1198,6 @@ void bio_iov_bvec_set(struct bio *bio, struct iov_iter *iter)
 	bio->bi_io_vec = (struct bio_vec *)iter->bvec;
 	bio->bi_iter.bi_bvec_done = iter->iov_offset;
 	bio->bi_iter.bi_size = size;
-	bio_set_flag(bio, BIO_NO_PAGE_REF);
 	bio_set_flag(bio, BIO_CLONED);
 }
 
@@ -1343,6 +1342,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		return 0;
 	}
 
+	bio_set_flag(bio, BIO_PAGE_REFFED);
 	do {
 		ret = __bio_iov_iter_get_pages(bio, iter);
 	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
diff --git a/block/blk-map.c b/block/blk-map.c
index 080dd60485be..f1f70b50388d 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -282,6 +282,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 	if (blk_queue_pci_p2pdma(rq->q))
 		extraction_flags |= ITER_ALLOW_P2PDMA;
 
+	bio_set_flag(bio, BIO_PAGE_REFFED);
 	while (iov_iter_count(iter)) {
 		struct page **pages, *stack_pages[UIO_FASTIOV];
 		ssize_t bytes;
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 03d381377ae1..07810465fc9d 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -403,6 +403,8 @@ dio_bio_alloc(struct dio *dio, struct dio_submit *sdio,
 		bio->bi_end_io = dio_bio_end_aio;
 	else
 		bio->bi_end_io = dio_bio_end_io;
+	/* for now require references for all pages */
+	bio_set_flag(bio, BIO_PAGE_REFFED);
 	sdio->bio = bio;
 	sdio->logical_offset_in_bio = sdio->cur_page_fs_offset;
 }
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 47db4ead1e74..c0e75900e754 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -202,7 +202,6 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
 	bio->bi_private = dio;
 	bio->bi_end_io = iomap_dio_bio_end_io;
 
-	bio_set_flag(bio, BIO_NO_PAGE_REF);
 	__bio_add_page(bio, page, len, 0);
 	iomap_dio_submit_bio(iter, dio, bio, pos);
 }
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 10366b8bdb13..805957c99147 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -484,7 +484,7 @@ void zero_fill_bio(struct bio *bio);
 
 static inline void bio_release_pages(struct bio *bio, bool mark_dirty)
 {
-	if (!bio_flagged(bio, BIO_NO_PAGE_REF))
+	if (bio_flagged(bio, BIO_PAGE_REFFED))
 		__bio_release_pages(bio, mark_dirty);
 }
 
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 99be590f952f..7daa261f4f98 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -318,7 +318,7 @@ struct bio {
  * bio flags
  */
 enum {
-	BIO_NO_PAGE_REF,	/* don't put release vec pages */
+	BIO_PAGE_REFFED,	/* put pages in bio_release_pages() */
 	BIO_CLONED,		/* doesn't own data */
 	BIO_BOUNCED,		/* bio is a bounce bio */
 	BIO_QUIET,		/* Make BIO Quiet */

From patchwork Tue Feb 7 17:13:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13131886
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04820C636D3 for ; Tue, 7 Feb 2023 17:13:41 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id 8999D8E0001; Tue, 7 Feb 2023 12:13:40 -0500 (EST)
Received: by kanga.kvack.org (Postfix, from userid 40) id 8484C900002; Tue, 7 Feb 2023
From: David Howells
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
 David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton,
 linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig,
 John Hubbard
Subject: [PATCH v12 08/10] block: Add BIO_PAGE_PINNED and associated infrastructure
Date: Tue, 7 Feb 2023 17:13:03 +0000
Message-Id: <20230207171305.3716974-9-dhowells@redhat.com>
In-Reply-To: <20230207171305.3716974-1-dhowells@redhat.com>
References: <20230207171305.3716974-1-dhowells@redhat.com>

Add BIO_PAGE_PINNED to indicate that the pages in a bio are pinned (FOLL_PIN)
and that the pin will need removing.

Signed-off-by: David Howells
Reviewed-by: Christoph Hellwig
Reviewed-by: John Hubbard
cc: Al Viro
cc: Jens Axboe
cc: Jan Kara
cc: Matthew Wilcox
cc: Logan Gunthorpe
cc: linux-block@vger.kernel.org
---

Notes:
    ver #10)
     - Drop bio_set_cleanup_mode(), open coding it instead.

    ver #9)
     - Only consider pinning in bio_set_cleanup_mode(). Ref'ing pages in
       struct bio is going away.
     - page_put_unpin() is removed; call unpin_user_page() and put_page()
       directly.
     - Use bio_release_page() in __bio_release_pages().
     - BIO_PAGE_PINNED and BIO_PAGE_REFFED can't both be set, so use if-else
       when testing both of them.

    ver #8)
     - Move the infrastructure to clean up pinned pages to this patch [hch].
     - Put BIO_PAGE_PINNED before BIO_PAGE_REFFED as the latter should
       probably be removed at some point. FOLL_PIN can then be renumbered
       first.

 block/bio.c               |  6 +++---
 block/blk.h               | 12 ++++++++++++
 include/linux/bio.h       |  3 ++-
 include/linux/blk_types.h |  1 +
 4 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index bf9bf53232be..547e38883934 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1176,7 +1176,7 @@ void __bio_release_pages(struct bio *bio, bool mark_dirty)
 	bio_for_each_segment_all(bvec, bio, iter_all) {
 		if (mark_dirty && !PageCompound(bvec->bv_page))
 			set_page_dirty_lock(bvec->bv_page);
-		put_page(bvec->bv_page);
+		bio_release_page(bio, bvec->bv_page);
 	}
 }
 EXPORT_SYMBOL_GPL(__bio_release_pages);
@@ -1496,8 +1496,8 @@ void bio_set_pages_dirty(struct bio *bio)
  * the BIO and re-dirty the pages in process context.
  *
  * It is expected that bio_check_pages_dirty() will wholly own the BIO from
- * here on. It will run one put_page() against each page and will run one
- * bio_put() against the BIO.
+ * here on. It will unpin each page and will run one bio_put() against the
+ * BIO.
  */
 
 static void bio_dirty_fn(struct work_struct *work);
diff --git a/block/blk.h b/block/blk.h
index 4c3b3325219a..f02381405311 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -425,6 +425,18 @@ int bio_add_hw_page(struct request_queue *q, struct bio *bio,
 		struct page *page, unsigned int len, unsigned int offset,
 		unsigned int max_sectors, bool *same_page);
 
+/*
+ * Clean up a page appropriately, where the page may be pinned, may have a
+ * ref taken on it or neither.
+ */
+static inline void bio_release_page(struct bio *bio, struct page *page)
+{
+	if (bio_flagged(bio, BIO_PAGE_PINNED))
+		unpin_user_page(page);
+	else if (bio_flagged(bio, BIO_PAGE_REFFED))
+		put_page(page);
+}
+
 struct request_queue *blk_alloc_queue(int node_id);
 
 int disk_scan_partitions(struct gendisk *disk, fmode_t mode, void *owner);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 805957c99147..b2c09997d79c 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -484,7 +484,8 @@ void zero_fill_bio(struct bio *bio);
 
 static inline void bio_release_pages(struct bio *bio, bool mark_dirty)
 {
-	if (bio_flagged(bio, BIO_PAGE_REFFED))
+	if (bio_flagged(bio, BIO_PAGE_REFFED) ||
+	    bio_flagged(bio, BIO_PAGE_PINNED))
 		__bio_release_pages(bio, mark_dirty);
 }
 
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 7daa261f4f98..a0e339ff3d09 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -318,6 +318,7 @@ struct bio {
  * bio flags
  */
 enum {
+	BIO_PAGE_PINNED,	/* Unpin pages in bio_release_pages() */
 	BIO_PAGE_REFFED,	/* put pages in bio_release_pages() */
 	BIO_CLONED,		/* doesn't own data */
 	BIO_BOUNCED,		/* bio is a bounce bio */

From patchwork Tue Feb 7 17:13:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13131887
From: David Howells
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
 David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton,
 linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig,
 John Hubbard
Subject: [PATCH v12 09/10] block: Convert bio_iov_iter_get_pages to use iov_iter_extract_pages
Date: Tue, 7 Feb 2023 17:13:04 +0000
Message-Id: <20230207171305.3716974-10-dhowells@redhat.com>
In-Reply-To: <20230207171305.3716974-1-dhowells@redhat.com>
References: <20230207171305.3716974-1-dhowells@redhat.com>

This will pin pages or leave them unaltered rather than getting a ref on
them as appropriate to the iterator.

The pages need to be pinned for DIO rather than having refs taken on them to
prevent VM copy-on-write from malfunctioning during a concurrent fork() (the
result of the I/O could otherwise end up being affected by/visible to the
child process).

Signed-off-by: David Howells
Reviewed-by: Christoph Hellwig
Reviewed-by: John Hubbard
cc: Al Viro
cc: Jens Axboe
cc: Jan Kara
cc: Matthew Wilcox
cc: Logan Gunthorpe
cc: linux-block@vger.kernel.org
---

Notes:
    ver #10)
     - Drop bio_set_cleanup_mode(), open coding it instead.

    ver #8)
     - Split the patch up a bit [hch].
     - We should only be using pinned/non-pinned pages and not ref'd pages,
       so adjust the comments appropriately.

    ver #7)
     - Don't treat BIO_PAGE_REFFED/PINNED as being the same as FOLL_GET/PIN.

    ver #5)
     - Transcribe the FOLL_* flags returned by iov_iter_extract_pages() to
       BIO_* flags and got rid of bi_cleanup_mode.
     - Replaced BIO_NO_PAGE_REF to BIO_PAGE_REFFED in the preceding patch.
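
Illustration only, not part of the patch: the cleanup rule this series relies
on (unpin if BIO_PAGE_PINNED, put if BIO_PAGE_REFFED, otherwise leave the page
alone) can be modelled in plain userspace C. This is a rough sketch under
stated assumptions. Every mock_* name below is a hypothetical stand-in made up
for the example, the counters only imitate what unpin_user_page() and
put_page() do to a page, and the boolean argument stands in for the result of
iov_iter_extract_will_pin().

```c
/*
 * Userspace model of the bio page cleanup dispatch.  NOT kernel code:
 * all mock_* identifiers are hypothetical and exist only for this sketch.
 */
#include <assert.h>
#include <stdbool.h>

enum mock_bio_flag {
	MOCK_BIO_PAGE_PINNED,	/* models BIO_PAGE_PINNED */
	MOCK_BIO_PAGE_REFFED,	/* models BIO_PAGE_REFFED */
};

struct mock_page {
	int refs;	/* ordinary references, as dropped by put_page() */
	int pins;	/* FOLL_PIN-style pins, as dropped by unpin_user_page() */
};

struct mock_bio {
	unsigned int flags;
};

static inline void mock_bio_set_flag(struct mock_bio *bio,
				     enum mock_bio_flag bit)
{
	bio->flags |= 1U << bit;
}

static inline bool mock_bio_flagged(const struct mock_bio *bio,
				    enum mock_bio_flag bit)
{
	return (bio->flags & (1U << bit)) != 0;
}

/* Same if / else-if shape as bio_release_page() in block/blk.h. */
static inline void mock_bio_release_page(struct mock_bio *bio,
					 struct mock_page *page)
{
	if (mock_bio_flagged(bio, MOCK_BIO_PAGE_PINNED))
		page->pins--;
	else if (mock_bio_flagged(bio, MOCK_BIO_PAGE_REFFED))
		page->refs--;
	/* Neither flag set: the bio holds nothing on the page. */
}

/*
 * Models the flag choice made before extracting pages: the pinned flag is
 * set only when the iterator type would pin (assumed to be passed in here
 * as a plain boolean).
 */
static inline void mock_bio_prepare(struct mock_bio *bio, bool iter_will_pin)
{
	bio->flags = 0;
	if (iter_will_pin)
		mock_bio_set_flag(bio, MOCK_BIO_PAGE_PINNED);
}
```

With a page that starts with one ref and one pin, preparing the model bio with
iter_will_pin set and releasing the page drops only the pin. Preparing it with
iter_will_pin clear and releasing the page changes neither counter, which is
the "leave them unaltered" case described in the commit message.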
 block/bio.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 547e38883934..fc57f0aa098e 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1212,7 +1212,7 @@ static int bio_iov_add_page(struct bio *bio, struct page *page,
 	}
 
 	if (same_page)
-		put_page(page);
+		bio_release_page(bio, page);
 	return 0;
 }
 
@@ -1226,7 +1226,7 @@ static int bio_iov_add_zone_append_page(struct bio *bio, struct page *page,
 			queue_max_zone_append_sectors(q), &same_page) != len)
 		return -EINVAL;
 	if (same_page)
-		put_page(page);
+		bio_release_page(bio, page);
 	return 0;
 }
 
@@ -1237,10 +1237,10 @@ static int bio_iov_add_zone_append_page(struct bio *bio, struct page *page,
  * @bio: bio to add pages to
  * @iter: iov iterator describing the region to be mapped
  *
- * Pins pages from *iter and appends them to @bio's bvec array. The
- * pages will have to be released using put_page() when done.
- * For multi-segment *iter, this function only adds pages from the
- * next non-empty segment of the iov iterator.
+ * Extracts pages from *iter and appends them to @bio's bvec array. The pages
+ * will have to be cleaned up in the way indicated by the BIO_PAGE_PINNED flag.
+ * For a multi-segment *iter, this function only adds pages from the next
+ * non-empty segment of the iov iterator.
  */
 static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 {
@@ -1272,9 +1272,9 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	 * result to ensure the bio's total size is correct. The remainder of
 	 * the iov data will be picked up in the next bio iteration.
 	 */
-	size = iov_iter_get_pages(iter, pages,
-				  UINT_MAX - bio->bi_iter.bi_size,
-				  nr_pages, &offset, extraction_flags);
+	size = iov_iter_extract_pages(iter, &pages,
+				      UINT_MAX - bio->bi_iter.bi_size,
+				      nr_pages, extraction_flags, &offset);
 	if (unlikely(size <= 0))
 		return size ? size : -EFAULT;
 
@@ -1307,7 +1307,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	iov_iter_revert(iter, left);
 out:
 	while (i < nr_pages)
-		put_page(pages[i++]);
+		bio_release_page(bio, pages[i++]);
 
 	return ret;
 }
@@ -1342,7 +1342,8 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		return 0;
 	}
 
-	bio_set_flag(bio, BIO_PAGE_REFFED);
+	if (iov_iter_extract_will_pin(iter))
+		bio_set_flag(bio, BIO_PAGE_PINNED);
 	do {
 		ret = __bio_iov_iter_get_pages(bio, iter);
 	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));

From patchwork Tue Feb 7 17:13:05 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13131888
From: David Howells
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
 David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton,
 linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig,
 John Hubbard
Subject: [PATCH v12 10/10] block: convert bio_map_user_iov to use iov_iter_extract_pages
Date: Tue, 7 Feb 2023 17:13:05 +0000
Message-Id: <20230207171305.3716974-11-dhowells@redhat.com>
In-Reply-To: <20230207171305.3716974-1-dhowells@redhat.com>
References: <20230207171305.3716974-1-dhowells@redhat.com>

This will pin pages or leave them unaltered rather than getting a ref on
them as appropriate to the iterator.
The pages need to be pinned for DIO rather than having refs taken on them to
prevent VM copy-on-write from malfunctioning during a concurrent fork() (the
result of the I/O could otherwise end up being visible to/affected by the
child process).

Signed-off-by: David Howells
Reviewed-by: Christoph Hellwig
Reviewed-by: John Hubbard
cc: Al Viro
cc: Jens Axboe
cc: Jan Kara
cc: Matthew Wilcox
cc: Logan Gunthorpe
cc: linux-block@vger.kernel.org
---

Notes:
    ver #10)
     - Drop bio_set_cleanup_mode(), open coding it instead.

    ver #8)
     - Split the patch up a bit [hch].
     - We should only be using pinned/non-pinned pages and not ref'd pages,
       so adjust the comments appropriately.

    ver #7)
     - Don't treat BIO_PAGE_REFFED/PINNED as being the same as FOLL_GET/PIN.

    ver #5)
     - Transcribe the FOLL_* flags returned by iov_iter_extract_pages() to
       BIO_* flags and got rid of bi_cleanup_mode.
     - Replaced BIO_NO_PAGE_REF to BIO_PAGE_REFFED in the preceding patch.

 block/blk-map.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index f1f70b50388d..0f1593e144da 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -281,22 +281,21 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 
 	if (blk_queue_pci_p2pdma(rq->q))
 		extraction_flags |= ITER_ALLOW_P2PDMA;
+	if (iov_iter_extract_will_pin(iter))
+		bio_set_flag(bio, BIO_PAGE_PINNED);
 
-	bio_set_flag(bio, BIO_PAGE_REFFED);
 	while (iov_iter_count(iter)) {
-		struct page **pages, *stack_pages[UIO_FASTIOV];
+		struct page *stack_pages[UIO_FASTIOV];
+		struct page **pages = stack_pages;
 		ssize_t bytes;
 		size_t offs;
 		int npages;
 
-		if (nr_vecs <= ARRAY_SIZE(stack_pages)) {
-			pages = stack_pages;
-			bytes = iov_iter_get_pages(iter, pages, LONG_MAX,
-						   nr_vecs, &offs, extraction_flags);
-		} else {
-			bytes = iov_iter_get_pages_alloc(iter, &pages,
-						LONG_MAX, &offs, extraction_flags);
-		}
+		if (nr_vecs > ARRAY_SIZE(stack_pages))
+			pages = NULL;
+
+		bytes = iov_iter_extract_pages(iter, &pages, LONG_MAX,
+					       nr_vecs, extraction_flags, &offs);
 		if (unlikely(bytes <= 0)) {
 			ret = bytes ? bytes : -EFAULT;
 			goto out_unmap;
@@ -318,7 +317,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 				if (!bio_add_hw_page(rq->q, bio, page, n, offs,
 						     max_sectors, &same_page)) {
 					if (same_page)
-						put_page(page);
+						bio_release_page(bio, page);
 					break;
 				}
 
@@ -330,7 +329,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 		 * release the pages we didn't map into the bio, if any
 		 */
 		while (j < npages)
-			put_page(pages[j++]);
+			bio_release_page(bio, pages[j++]);
 		if (pages != stack_pages)
 			kvfree(pages);
 		/* couldn't stuff something into bio? */