From: David Howells
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
    David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton,
    Christian Brauner, linux-fsdevel@vger.kernel.org,
    linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Dominique Martinet, Eric Van Hensbergen,
    Latchesar Ionkov, Christian Schoenebeck,
    v9fs-developer@lists.sourceforge.net
Subject: [RFC PATCH 10/11] 9p: Pin pages rather than ref'ing if appropriate
Date: Fri, 30 Jun 2023 16:25:23 +0100
Message-ID: <20230630152524.661208-11-dhowells@redhat.com>
In-Reply-To: <20230630152524.661208-1-dhowells@redhat.com>
References: <20230630152524.661208-1-dhowells@redhat.com>
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13298361

Convert the 9p filesystem to use iov_iter_extract_pages() instead of
iov_iter_get_pages().  This will pin pages or leave them unaltered rather
than getting a ref on them as appropriate to the iterator.

The pages need to be pinned for DIO-read rather than having refs taken on
them to prevent VM copy-on-write from malfunctioning during a concurrent
fork() (the result of the I/O would otherwise end up only visible to the
child process and not the parent).

Signed-off-by: David Howells
cc: Dominique Martinet
cc: Eric Van Hensbergen
cc: Latchesar Ionkov
cc: Christian Schoenebeck
cc: v9fs-developer@lists.sourceforge.net
---
 net/9p/trans_common.c |  8 ++--
 net/9p/trans_common.h |  2 +-
 net/9p/trans_virtio.c | 92 ++++++++++++++-----------------------------
 3 files changed, 34 insertions(+), 68 deletions(-)

diff --git a/net/9p/trans_common.c b/net/9p/trans_common.c
index c827f694551c..4342de18f08b 100644
--- a/net/9p/trans_common.c
+++ b/net/9p/trans_common.c
@@ -9,16 +9,16 @@
 #include "trans_common.h"
 
 /**
- * p9_release_pages - Release pages after the transaction.
+ * p9_unpin_pages - Unpin pages after the transaction.
  * @pages: array of pages to be put
  * @nr_pages: size of array
  */
-void p9_release_pages(struct page **pages, int nr_pages)
+void p9_unpin_pages(struct page **pages, int nr_pages)
 {
 	int i;
 
 	for (i = 0; i < nr_pages; i++)
 		if (pages[i])
-			put_page(pages[i]);
+			unpin_user_page(pages[i]);
 }
-EXPORT_SYMBOL(p9_release_pages);
+EXPORT_SYMBOL(p9_unpin_pages);
diff --git a/net/9p/trans_common.h b/net/9p/trans_common.h
index 32134db6abf3..fd94c48aba5b 100644
--- a/net/9p/trans_common.h
+++ b/net/9p/trans_common.h
@@ -4,4 +4,4 @@
  * Author Venkateswararao Jujjuri
  */
 
-void p9_release_pages(struct page **pages, int nr_pages);
+void p9_unpin_pages(struct page **pages, int nr_pages);
diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index 3c27ffb781e3..93569de2bdba 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -310,71 +310,35 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
 			       struct iov_iter *data,
 			       int count,
 			       size_t *offs,
-			       int *need_drop)
+			       bool *need_unpin,
+			       iov_iter_extraction_t extraction_flags)
 {
 	int nr_pages;
 	int err;
+	int n;
 
 	if (!iov_iter_count(data))
 		return 0;
 
-	if (!iov_iter_is_kvec(data)) {
-		int n;
-		/*
-		 * We allow only p9_max_pages pinned. We wait for the
-		 * Other zc request to finish here
-		 */
-		if (atomic_read(&vp_pinned) >= chan->p9_max_pages) {
-			err = wait_event_killable(vp_wq,
-			      (atomic_read(&vp_pinned) < chan->p9_max_pages));
-			if (err == -ERESTARTSYS)
-				return err;
-		}
-		n = iov_iter_get_pages_alloc2(data, pages, count, offs);
-		if (n < 0)
-			return n;
-		*need_drop = 1;
-		nr_pages = DIV_ROUND_UP(n + *offs, PAGE_SIZE);
-		atomic_add(nr_pages, &vp_pinned);
-		return n;
-	} else {
-		/* kernel buffer, no need to pin pages */
-		int index;
-		size_t len;
-		void *p;
-
-		/* we'd already checked that it's non-empty */
-		while (1) {
-			len = iov_iter_single_seg_count(data);
-			if (likely(len)) {
-				p = data->kvec->iov_base + data->iov_offset;
-				break;
-			}
-			iov_iter_advance(data, 0);
-		}
-		if (len > count)
-			len = count;
-
-		nr_pages = DIV_ROUND_UP((unsigned long)p + len, PAGE_SIZE) -
-			   (unsigned long)p / PAGE_SIZE;
-
-		*pages = kmalloc_array(nr_pages, sizeof(struct page *),
-				       GFP_NOFS);
-		if (!*pages)
-			return -ENOMEM;
-
-		*need_drop = 0;
-		p -= (*offs = offset_in_page(p));
-		for (index = 0; index < nr_pages; index++) {
-			if (is_vmalloc_addr(p))
-				(*pages)[index] = vmalloc_to_page(p);
-			else
-				(*pages)[index] = kmap_to_page(p);
-			p += PAGE_SIZE;
-		}
-		iov_iter_advance(data, len);
-		return len;
+	/*
+	 * We allow only p9_max_pages pinned. We wait for the
+	 * Other zc request to finish here
+	 */
+	if (atomic_read(&vp_pinned) >= chan->p9_max_pages) {
+		err = wait_event_killable(vp_wq,
+		      (atomic_read(&vp_pinned) < chan->p9_max_pages));
+		if (err == -ERESTARTSYS)
+			return err;
 	}
+
+	n = iov_iter_extract_pages(data, pages, count, INT_MAX,
+				   extraction_flags, offs);
+	if (n < 0)
+		return n;
+	*need_unpin = iov_iter_extract_will_pin(data);
+	nr_pages = DIV_ROUND_UP(n + *offs, PAGE_SIZE);
+	atomic_add(nr_pages, &vp_pinned);
+	return n;
 }
 
 static void handle_rerror(struct p9_req_t *req, int in_hdr_len,
@@ -429,7 +393,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	struct virtio_chan *chan = client->trans;
 	struct scatterlist *sgs[4];
 	size_t offs;
-	int need_drop = 0;
+	bool need_unpin;
 	int kicked = 0;
 
 	p9_debug(P9_DEBUG_TRANS, "virtio request\n");
@@ -437,7 +401,8 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	if (uodata) {
 		__le32 sz;
 		int n = p9_get_mapped_pages(chan, &out_pages, uodata,
-					    outlen, &offs, &need_drop);
+					    outlen, &offs, &need_unpin,
+					    WRITE_FROM_ITER);
 		if (n < 0) {
 			err = n;
 			goto err_out;
@@ -456,7 +421,8 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 		memcpy(&req->tc.sdata[0], &sz, sizeof(sz));
 	} else if (uidata) {
 		int n = p9_get_mapped_pages(chan, &in_pages, uidata,
-					    inlen, &offs, &need_drop);
+					    inlen, &offs, &need_unpin,
+					    READ_INTO_ITER);
 		if (n < 0) {
 			err = n;
 			goto err_out;
@@ -542,13 +508,13 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	 * Non kernel buffers are pinned, unpin them
 	 */
 err_out:
-	if (need_drop) {
+	if (need_unpin) {
 		if (in_pages) {
-			p9_release_pages(in_pages, in_nr_pages);
+			p9_unpin_pages(in_pages, in_nr_pages);
 			atomic_sub(in_nr_pages, &vp_pinned);
 		}
 		if (out_pages) {
-			p9_release_pages(out_pages, out_nr_pages);
+			p9_unpin_pages(out_pages, out_nr_pages);
 			atomic_sub(out_nr_pages, &vp_pinned);
 		}
 		/* wakeup anybody waiting for slots to pin pages */