From patchwork Sat Aug 22 04:20:55 2020
X-Patchwork-Submitter: John Hubbard
X-Patchwork-Id: 11730817
From: John Hubbard
To: Andrew Morton
CC: Alexander Viro, Christoph Hellwig, Ilya Dryomov, Jens Axboe, Jeff Layton, LKML, John Hubbard
Subject: [PATCH 1/5] iov_iter: introduce iov_iter_pin_user_pages*() routines
Date: Fri, 21 Aug 2020 21:20:55 -0700
Message-ID: <20200822042059.1805541-2-jhubbard@nvidia.com>
In-Reply-To: <20200822042059.1805541-1-jhubbard@nvidia.com>

The new routines are:

    iov_iter_pin_user_pages()
    iov_iter_pin_user_pages_alloc()

and those correspond to these pre-existing routines:

    iov_iter_get_pages()
    iov_iter_get_pages_alloc()

Unlike the iov_iter_get_pages*() routines, the iov_iter_pin_user_pages*()
routines assert that only ITER_IOVEC items are passed in. They then call
pin_user_pages_fast(), instead of get_user_pages_fast().

Why: in order to incrementally change Direct IO callers from calling
get_user_pages_fast() and put_page(), over to calling pin_user_pages_fast()
and unpin_user_page(), there need to be mid-level routines that specifically
call one system or the other, for both page acquisition and page release.
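As an illustration (not part of the patch), a caller that already knows it
holds an ITER_IOVEC iterator might pair the new pin routine with
unpin_user_page() roughly as follows; the helper name and the fixed-size
page array are hypothetical.

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/uio.h>

/* Hypothetical helper: pin up to 16 user pages described by an ITER_IOVEC
 * iterator, use them, then release them with unpin_user_page(). */
static ssize_t example_pin_then_unpin(struct iov_iter *iter)
{
    struct page *pages[16];
    size_t offset;
    ssize_t bytes;
    int i, npages;

    /* Non-IOVEC iterators are rejected with -EFAULT by the new routine. */
    bytes = iov_iter_pin_user_pages(iter, pages, 16 * PAGE_SIZE, 16, &offset);
    if (bytes <= 0)
        return bytes ? bytes : -EFAULT;

    npages = DIV_ROUND_UP(offset + bytes, PAGE_SIZE);

    /* ... do I/O against pages[0..npages - 1] ... */

    /* FOLL_PIN references must not be dropped with put_page(). */
    for (i = 0; i < npages; i++)
        unpin_user_page(pages[i]);

    return bytes;
}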
Signed-off-by: John Hubbard
---
 include/linux/uio.h |  5 +++
 lib/iov_iter.c      | 80 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 3835a8a8e9ea..29b0504a27cc 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -229,6 +229,11 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages);
 
 const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags);
 
+ssize_t iov_iter_pin_user_pages(struct iov_iter *i, struct page **pages,
+        size_t maxsize, unsigned int maxpages, size_t *start);
+ssize_t iov_iter_pin_user_pages_alloc(struct iov_iter *i, struct page ***pages,
+        size_t maxsize, size_t *start);
+
 static inline size_t iov_iter_count(const struct iov_iter *i)
 {
     return i->count;
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 5e40786c8f12..d818b16d136b 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1309,6 +1309,44 @@ static ssize_t pipe_get_pages(struct iov_iter *i,
     return __pipe_get_pages(i, min(maxsize, capacity), pages, iter_head, start);
 }
 
+ssize_t iov_iter_pin_user_pages(struct iov_iter *i,
+        struct page **pages, size_t maxsize, unsigned int maxpages,
+        size_t *start)
+{
+    size_t skip = i->iov_offset;
+    const struct iovec *iov;
+    struct iovec v;
+
+    if (WARN_ON_ONCE(!iter_is_iovec(i)))
+        return -EFAULT;
+
+    if (unlikely(!maxsize))
+        return 0;
+    maxsize = min(maxsize, i->count);
+
+    iterate_iovec(i, maxsize, v, iov, skip, ({
+        unsigned long addr = (unsigned long)v.iov_base;
+        size_t len = v.iov_len + (*start = addr & (PAGE_SIZE - 1));
+        int n;
+        int res;
+
+        if (len > maxpages * PAGE_SIZE)
+            len = maxpages * PAGE_SIZE;
+        addr &= ~(PAGE_SIZE - 1);
+        n = DIV_ROUND_UP(len, PAGE_SIZE);
+
+        res = pin_user_pages_fast(addr, n,
+                iov_iter_rw(i) != WRITE ? FOLL_WRITE : 0,
+                pages);
+        if (unlikely(res < 0))
+            return res;
+        return (res == n ? len : res * PAGE_SIZE) - *start;
+    0;
+    }))
+    return 0;
+}
+EXPORT_SYMBOL(iov_iter_pin_user_pages);
+
 ssize_t iov_iter_get_pages(struct iov_iter *i,
         struct page **pages, size_t maxsize, unsigned maxpages,
         size_t *start)
@@ -1388,6 +1426,48 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
     return n;
 }
 
+ssize_t iov_iter_pin_user_pages_alloc(struct iov_iter *i,
+        struct page ***pages, size_t maxsize,
+        size_t *start)
+{
+    struct page **p;
+    size_t skip = i->iov_offset;
+    const struct iovec *iov;
+    struct iovec v;
+
+    if (WARN_ON_ONCE(!iter_is_iovec(i)))
+        return -EFAULT;
+
+    if (unlikely(!maxsize))
+        return 0;
+    maxsize = min(maxsize, i->count);
+
+    iterate_iovec(i, maxsize, v, iov, skip, ({
+        unsigned long addr = (unsigned long)v.iov_base;
+        size_t len = v.iov_len + (*start = addr & (PAGE_SIZE - 1));
+        int n;
+        int res;
+
+        addr &= ~(PAGE_SIZE - 1);
+        n = DIV_ROUND_UP(len, PAGE_SIZE);
+        p = get_pages_array(n);
+        if (!p)
+            return -ENOMEM;
+
+        res = pin_user_pages_fast(addr, n,
+                iov_iter_rw(i) != WRITE ? FOLL_WRITE : 0, p);
+        if (unlikely(res < 0)) {
+            kvfree(p);
+            return res;
+        }
+        *pages = p;
+        return (res == n ? len : res * PAGE_SIZE) - *start;
+    0;
+    }))
+    return 0;
+}
+EXPORT_SYMBOL(iov_iter_pin_user_pages_alloc);
+
 ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
         struct page ***pages, size_t maxsize,
         size_t *start)

From patchwork Sat Aug 22 04:20:56 2020
X-Patchwork-Submitter: John Hubbard
X-Patchwork-Id: 11730813
From: John Hubbard
To: Andrew Morton
CC: Alexander Viro, Christoph Hellwig, Ilya Dryomov, Jens Axboe, Jeff Layton, LKML, John Hubbard
Subject: [PATCH 2/5] mm/gup: introduce pin_user_page()
Date: Fri, 21 Aug 2020 21:20:56 -0700
Message-ID: <20200822042059.1805541-3-jhubbard@nvidia.com>
In-Reply-To: <20200822042059.1805541-1-jhubbard@nvidia.com>

pin_user_page() is the FOLL_PIN equivalent of get_page().

This was always a missing piece of the pin/unpin API calls (early reviewers
of pin_user_pages() asked about it, in fact), but until now, it just wasn't
needed. Finally though, now that the Direct IO pieces in block/bio are about
to be converted to use FOLL_PIN, it turns out that there are some cases in
which get_page() and get_user_pages_fast() were both used. Converting those
sites requires a drop-in replacement for get_page(), which this patch
supplies.

[1] and [2] provide some background about pin_user_pages() in general.

[1] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/
[2] Documentation/core-api/pin_user_pages.rst

Signed-off-by: John Hubbard
---
 include/linux/mm.h |  2 ++
 mm/gup.c           | 30 ++++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1983e08f5906..bee26614f430 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1149,6 +1149,8 @@ static inline void get_page(struct page *page)
     page_ref_inc(page);
 }
 
+void pin_user_page(struct page *page);
+
 bool __must_check try_grab_page(struct page *page, unsigned int flags);
 
 static inline __must_check bool try_get_page(struct page *page)
diff --git a/mm/gup.c b/mm/gup.c
index ae096ea7583f..2cae5bbbc862 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -123,6 +123,36 @@ static __maybe_unused struct page *try_grab_compound_head(struct page *page,
     return NULL;
 }
 
+/*
+ * pin_user_page() - elevate the page refcount, and mark as FOLL_PIN
+ *
+ * This is the FOLL_PIN equivalent of get_page(). It is intended for use when
+ * the page will be released via unpin_user_page().
+ */
+void pin_user_page(struct page *page)
+{
+    int refs = 1;
+
+    page = compound_head(page);
+
+    VM_BUG_ON_PAGE(page_ref_count(page) <= 0, page);
+
+    if (hpage_pincount_available(page))
+        hpage_pincount_add(page, 1);
+    else
+        refs = GUP_PIN_COUNTING_BIAS;
+
+    /*
+     * Similar to try_grab_compound_head(): even if using the
+     * hpage_pincount_add/_sub() routines, be sure to
+     * *also* increment the normal page refcount field at least
+     * once, so that the page really is pinned.
+     */
+    page_ref_add(page, refs);
+
+    mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED, 1);
+}
+
 /**
  * try_grab_page() - elevate a page's refcount by a flag-dependent amount
  *
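As a quick illustration (not part of the patch), the kind of conversion this
enables looks roughly like the following; the function name is hypothetical.

#include <linux/mm.h>

/* Hypothetical call site: the page's original reference came from
 * pin_user_pages_fast(), so any extra short-term reference is taken with
 * pin_user_page() instead of get_page(), keeping every reference
 * releasable via unpin_user_page(). */
static void example_take_extra_pin(struct page *page)
{
    /* Previously: get_page(page); ... put_page(page); */
    pin_user_page(page);

    /* ... hand the second reference to a bio, a cache, etc. ... */

    unpin_user_page(page);
}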
From patchwork Sat Aug 22 04:20:57 2020
X-Patchwork-Submitter: John Hubbard
X-Patchwork-Id: 11730815

From: John Hubbard
To: Andrew Morton
CC: Alexander Viro, Christoph Hellwig, Ilya Dryomov, Jens Axboe, Jeff Layton, LKML, John Hubbard
Subject: [PATCH 3/5] bio: convert get_user_pages_fast() --> pin_user_pages_fast()
Date: Fri, 21 Aug 2020 21:20:57 -0700
Message-ID: <20200822042059.1805541-4-jhubbard@nvidia.com>
In-Reply-To: <20200822042059.1805541-1-jhubbard@nvidia.com>

Change generic block/bio Direct IO routines to acquire FOLL_PIN user pages
via the recently added routines:

    iov_iter_pin_user_pages()
    iov_iter_pin_user_pages_alloc()
    pin_user_page()

This effectively converts several file systems (ext4, for example) that use
the common Direct IO routines.

Change the corresponding page release calls from put_page() to
unpin_user_page().

Change bio_release_pages() to handle FOLL_PIN pages. In fact, that is now
the *only* type of page it handles.

Design notes
============

Quite a few approaches have been considered over the years. This one is
inspired by Christoph Hellwig's July, 2019 observation that there are only
5 ITER_ types, and we can simplify handling of them for Direct IO [1].
Accordingly, this patch implements the following pseudocode:

    Direct IO behavior:

        ITER_IOVEC:
            pin_user_pages_fast();
            break;

        ITER_KVEC:    // already elevated page refcount, leave alone
        ITER_BVEC:    // already elevated page refcount, leave alone
        ITER_PIPE:    // just, no :)
        ITER_DISCARD: // discard
            return -EFAULT or -EINVAL;

...which works for callers that have already sorted out which case they are
in, such as Direct IO in the block/bio layers.
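Rendered as C, a caller that has already decided it only supports user-backed
iterators might apply the pseudocode above roughly like this (a sketch only,
not part of the patch; the helper name is hypothetical):

#include <linux/uio.h>

static ssize_t example_dio_pin_pages(struct iov_iter *iter,
                                     struct page **pages, size_t maxsize,
                                     unsigned int maxpages, size_t *start)
{
    /* ITER_IOVEC: user memory, acquire FOLL_PIN references. */
    if (iter_is_iovec(iter))
        return iov_iter_pin_user_pages(iter, pages, maxsize,
                                       maxpages, start);

    /*
     * ITER_KVEC and ITER_BVEC pages already carry elevated refcounts and
     * are left alone by this series; ITER_PIPE and ITER_DISCARD are not
     * valid for this Direct IO path.
     */
    return -EINVAL;
}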
Now, this does leave ITER_KVEC and ITER_BVEC unconverted, but on the other
hand, it's not clear that these are actually affected in the real world by
the get_user_pages()+filesystem interaction problems of [2]. If it turns out
to matter, then those can be handled too, but it's just more refactoring and
surgery to do so.

Page acquisition: The iov_iter_get_pages*() routines above are at just the
right level in the call stack: the callers already know which system to use,
and so it's a small change to just drop in the replacement routines. And it's
a fan-in/fan-out point: block/bio call sites for Direct IO funnel their page
acquisitions through the iov_iter_get_pages*() routines, and there are many
other callers of those. And we can't convert all of the callers at once--too
many subsystems are involved, and it would be too large and too risky a patch.

Page release: there are already separate release routines: put_page() vs.
unpin_user_page(), so it's already done there.

[1] https://lore.kernel.org/kvm/20190724061750.GA19397@infradead.org/
[2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/

Signed-off-by: John Hubbard
---
 block/bio.c          | 24 ++++++++++++------------
 block/blk-map.c      |  6 +++---
 fs/direct-io.c       | 28 ++++++++++++++--------------
 fs/iomap/direct-io.c |  2 +-
 4 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index c63ba04bd629..00d548e3c2b8 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -955,7 +955,7 @@ void bio_release_pages(struct bio *bio, bool mark_dirty)
     bio_for_each_segment_all(bvec, bio, iter_all) {
         if (mark_dirty && !PageCompound(bvec->bv_page))
             set_page_dirty_lock(bvec->bv_page);
-        put_page(bvec->bv_page);
+        unpin_user_page(bvec->bv_page);
     }
 }
 EXPORT_SYMBOL_GPL(bio_release_pages);
@@ -986,9 +986,9 @@ static int __bio_iov_bvec_add_pages(struct bio *bio, struct iov_iter *iter)
  * @iter: iov iterator describing the region to be mapped
  *
  * Pins pages from *iter and appends them to @bio's bvec array. The
- * pages will have to be released using put_page() when done.
- * For multi-segment *iter, this function only adds pages from the
- * next non-empty segment of the iov iterator.
+ * pages will have to be released using put_page() or unpin_user_page() when
+ * done. For multi-segment *iter, this function only adds pages from the next
+ * non-empty segment of the iov iterator.
  */
 static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 {
@@ -1009,7 +1009,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
     BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
     pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
 
-    size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
+    size = iov_iter_pin_user_pages(iter, pages, LONG_MAX, nr_pages, &offset);
     if (unlikely(size <= 0))
         return size ? size : -EFAULT;
 
@@ -1020,7 +1020,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 
         if (__bio_try_merge_page(bio, page, len, offset, &same_page)) {
             if (same_page)
-                put_page(page);
+                unpin_user_page(page);
         } else {
             if (WARN_ON_ONCE(bio_full(bio, len)))
                 return -EINVAL;
@@ -1056,7 +1056,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
     BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
     pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
 
-    size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
+    size = iov_iter_pin_user_pages(iter, pages, LONG_MAX, nr_pages, &offset);
     if (unlikely(size <= 0))
         return size ? size : -EFAULT;
 
@@ -1069,7 +1069,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
                 max_append_sectors, &same_page) != len)
             return -EINVAL;
         if (same_page)
-            put_page(page);
+            unpin_user_page(page);
         offset = 0;
     }
 
@@ -1113,8 +1113,8 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
         } else {
             if (is_bvec)
                 ret = __bio_iov_bvec_add_pages(bio, iter);
-            else
-                ret = __bio_iov_iter_get_pages(bio, iter);
+            else
+                ret = __bio_iov_iter_get_pages(bio, iter);
         }
     } while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
 
@@ -1326,8 +1326,8 @@ void bio_set_pages_dirty(struct bio *bio)
  * the BIO and re-dirty the pages in process context.
  *
  * It is expected that bio_check_pages_dirty() will wholly own the BIO from
- * here on. It will run one put_page() against each page and will run one
- * bio_put() against the BIO.
+ * here on. It will run one unpin_user_page() against each page
+ * and will run one bio_put() against the BIO.
  */
 
 static void bio_dirty_fn(struct work_struct *work);
diff --git a/block/blk-map.c b/block/blk-map.c
index 6e804892d5ec..7a095b4947ea 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -275,7 +275,7 @@ static struct bio *bio_map_user_iov(struct request_queue *q,
         size_t offs, added = 0;
         int npages;
 
-        bytes = iov_iter_get_pages_alloc(iter, &pages, LONG_MAX, &offs);
+        bytes = iov_iter_pin_user_pages_alloc(iter, &pages, LONG_MAX, &offs);
         if (unlikely(bytes <= 0)) {
             ret = bytes ? bytes : -EFAULT;
             goto out_unmap;
@@ -298,7 +298,7 @@ static struct bio *bio_map_user_iov(struct request_queue *q,
             if (!bio_add_hw_page(q, bio, page, n, offs,
                          max_sectors, &same_page)) {
                 if (same_page)
-                    put_page(page);
+                    unpin_user_page(page);
                 break;
             }
 
@@ -312,7 +312,7 @@ static struct bio *bio_map_user_iov(struct request_queue *q,
      * release the pages we didn't map into the bio, if any
      */
     while (j < npages)
-        put_page(pages[j++]);
+        unpin_user_page(pages[j++]);
     kvfree(pages);
     /* couldn't stuff something into bio? */
     if (bytes)
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 183299892465..b01c8d003bd3 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -170,7 +170,7 @@ static inline int dio_refill_pages(struct dio *dio, struct dio_submit *sdio)
 {
     ssize_t ret;
 
-    ret = iov_iter_get_pages(sdio->iter, dio->pages, LONG_MAX, DIO_PAGES,
+    ret = iov_iter_pin_user_pages(sdio->iter, dio->pages, LONG_MAX, DIO_PAGES,
                 &sdio->from);
 
     if (ret < 0 && sdio->blocks_available && (dio->op == REQ_OP_WRITE)) {
@@ -182,7 +182,7 @@ static inline int dio_refill_pages(struct dio *dio, struct dio_submit *sdio)
          */
         if (dio->page_errors == 0)
             dio->page_errors = ret;
-        get_page(page);
+        pin_user_page(page);
         dio->pages[0] = page;
         sdio->head = 0;
         sdio->tail = 1;
@@ -472,7 +472,7 @@ static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio)
 static inline void dio_cleanup(struct dio *dio, struct dio_submit *sdio)
 {
     while (sdio->head < sdio->tail)
-        put_page(dio->pages[sdio->head++]);
+        unpin_user_page(dio->pages[sdio->head++]);
 }
 
 /*
@@ -739,7 +739,7 @@ static inline int dio_bio_add_page(struct dio_submit *sdio)
      */
     if ((sdio->cur_page_len + sdio->cur_page_offset) == PAGE_SIZE)
         sdio->pages_in_io--;
-    get_page(sdio->cur_page);
+    pin_user_page(sdio->cur_page);
     sdio->final_block_in_bio = sdio->cur_page_block +
         (sdio->cur_page_len >> sdio->blkbits);
     ret = 0;
@@ -853,13 +853,13 @@ submit_page_section(struct dio *dio, struct dio_submit *sdio, struct page *page,
      */
     if (sdio->cur_page) {
         ret = dio_send_cur_page(dio, sdio, map_bh);
-        put_page(sdio->cur_page);
+        unpin_user_page(sdio->cur_page);
         sdio->cur_page = NULL;
         if (ret)
             return ret;
     }
 
-    get_page(page);        /* It is in dio */
+    pin_user_page(page);        /* It is in dio */
     sdio->cur_page = page;
     sdio->cur_page_offset = offset;
     sdio->cur_page_len = len;
@@ -874,7 +874,7 @@ submit_page_section(struct dio *dio, struct dio_submit *sdio, struct page *page,
         ret = dio_send_cur_page(dio, sdio, map_bh);
         if (sdio->bio)
             dio_bio_submit(dio, sdio);
-        put_page(sdio->cur_page);
+        unpin_user_page(sdio->cur_page);
         sdio->cur_page = NULL;
     }
     return ret;
@@ -974,7 +974,7 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
 
             ret = get_more_blocks(dio, sdio, map_bh);
             if (ret) {
-                put_page(page);
+                unpin_user_page(page);
                 goto out;
             }
             if (!buffer_mapped(map_bh))
@@ -1019,7 +1019,7 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
 
                 /* AKPM: eargh, -ENOTBLK is a hack */
                 if (dio->op == REQ_OP_WRITE) {
-                    put_page(page);
+                    unpin_user_page(page);
                     return -ENOTBLK;
                 }
 
@@ -1032,7 +1032,7 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
                 if (sdio->block_in_file >=
                         i_size_aligned >> blkbits) {
                     /* We hit eof */
-                    put_page(page);
+                    unpin_user_page(page);
                     goto out;
                 }
                 zero_user(page, from, 1 << blkbits);
@@ -1072,7 +1072,7 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
                           sdio->next_block_for_io,
                           map_bh);
             if (ret) {
-                put_page(page);
+                unpin_user_page(page);
                 goto out;
             }
             sdio->next_block_for_io += this_chunk_blocks;
@@ -1087,8 +1087,8 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
             break;
         }
 
-        /* Drop the ref which was taken in get_user_pages() */
-        put_page(page);
+        /* Drop the ref which was taken in pin_user_pages() */
+        unpin_user_page(page);
     }
 out:
     return ret;
@@ -1327,7 +1327,7 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
         ret2 = dio_send_cur_page(dio, &sdio, &map_bh);
         if (retval == 0)
             retval = ret2;
-        put_page(sdio.cur_page);
+        unpin_user_page(sdio.cur_page);
         sdio.cur_page = NULL;
     }
     if (sdio.bio)
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index c1aafb2ab990..390f611528ea 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -194,7 +194,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
     bio->bi_private = dio;
     bio->bi_end_io = iomap_dio_bio_end_io;
 
-    get_page(page);
+    pin_user_page(page);
     __bio_add_page(bio, page, len, 0);
     bio_set_op_attrs(bio, REQ_OP_WRITE, flags);
     iomap_dio_submit_bio(dio, iomap, bio, pos);

From patchwork Sat Aug 22 04:20:58 2020
X-Patchwork-Submitter: John Hubbard
X-Patchwork-Id: 11730825
From: John Hubbard
To: Andrew Morton
CC: Alexander Viro, Christoph Hellwig, Ilya Dryomov, Jens Axboe, Jeff Layton, LKML, John Hubbard
Subject: [PATCH 4/5] bio: introduce BIO_FOLL_PIN flag
Date: Fri, 21 Aug 2020 21:20:58 -0700
Message-ID: <20200822042059.1805541-5-jhubbard@nvidia.com>
In-Reply-To: <20200822042059.1805541-1-jhubbard@nvidia.com>

Add a new BIO_FOLL_PIN flag to struct bio, whose "short int" flags field was
full, thus triggering an expansion of the field from 16 to 32 bits. This
allows for a nice assertion in bio_release_pages(), that the bio page release
mechanism matches the page acquisition mechanism.

Set BIO_FOLL_PIN whenever pin_user_pages_fast() is used, and check for
BIO_FOLL_PIN before using unpin_user_page().
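A minimal sketch (not part of the patch) of how the flag pairs acquisition
with release; the function names are hypothetical, while bio_set_flag(),
bio_flagged() and the BIO_FOLL_PIN flag are the helpers used by this patch.

#include <linux/bio.h>
#include <linux/mm.h>

/* Acquisition side: mark the bio right where pin_user_pages_fast()
 * supplied its pages. */
static void example_note_pinned_pages(struct bio *bio)
{
    bio_set_flag(bio, BIO_FOLL_PIN);
}

/* Release side: refuse to unpin pages that were never pinned, so the
 * release mechanism always matches the acquisition mechanism. */
static void example_release_page(struct bio *bio, struct page *page)
{
    if (WARN_ON_ONCE(!bio_flagged(bio, BIO_FOLL_PIN)))
        return;

    unpin_user_page(page);
}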
Signed-off-by: John Hubbard
Reported-by: kernel test robot
---
 block/bio.c               | 9 +++++++--
 block/blk-map.c           | 3 ++-
 fs/direct-io.c            | 4 ++--
 include/linux/blk_types.h | 5 +++--
 include/linux/uio.h       | 5 +++--
 lib/iov_iter.c            | 9 +++++++--
 6 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 00d548e3c2b8..dd8e85618d5e 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -952,6 +952,9 @@ void bio_release_pages(struct bio *bio, bool mark_dirty)
     if (bio_flagged(bio, BIO_NO_PAGE_REF))
         return;
 
+    if (WARN_ON_ONCE(!bio_flagged(bio, BIO_FOLL_PIN)))
+        return;
+
     bio_for_each_segment_all(bvec, bio, iter_all) {
         if (mark_dirty && !PageCompound(bvec->bv_page))
             set_page_dirty_lock(bvec->bv_page);
@@ -1009,7 +1012,8 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
     BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
     pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
 
-    size = iov_iter_pin_user_pages(iter, pages, LONG_MAX, nr_pages, &offset);
+    size = iov_iter_pin_user_pages(bio, iter, pages, LONG_MAX, nr_pages,
+                       &offset);
     if (unlikely(size <= 0))
         return size ? size : -EFAULT;
 
@@ -1056,7 +1060,8 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
     BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
     pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
 
-    size = iov_iter_pin_user_pages(iter, pages, LONG_MAX, nr_pages, &offset);
+    size = iov_iter_pin_user_pages(bio, iter, pages, LONG_MAX, nr_pages,
+                       &offset);
     if (unlikely(size <= 0))
         return size ? size : -EFAULT;
diff --git a/block/blk-map.c b/block/blk-map.c
index 7a095b4947ea..ddfff2f0b1cb 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -275,7 +275,8 @@ static struct bio *bio_map_user_iov(struct request_queue *q,
         size_t offs, added = 0;
         int npages;
 
-        bytes = iov_iter_pin_user_pages_alloc(iter, &pages, LONG_MAX, &offs);
+        bytes = iov_iter_pin_user_pages_alloc(bio, iter, &pages,
+                              LONG_MAX, &offs);
         if (unlikely(bytes <= 0)) {
             ret = bytes ? bytes : -EFAULT;
             goto out_unmap;
diff --git a/fs/direct-io.c b/fs/direct-io.c
index b01c8d003bd3..4d0787ba85eb 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -170,8 +170,8 @@ static inline int dio_refill_pages(struct dio *dio, struct dio_submit *sdio)
 {
     ssize_t ret;
 
-    ret = iov_iter_pin_user_pages(sdio->iter, dio->pages, LONG_MAX, DIO_PAGES,
-                &sdio->from);
+    ret = iov_iter_pin_user_pages(sdio->bio, sdio->iter, dio->pages,
+                LONG_MAX, DIO_PAGES, &sdio->from);
 
     if (ret < 0 && sdio->blocks_available && (dio->op == REQ_OP_WRITE)) {
         struct page *page = ZERO_PAGE(0);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 4ecf4fed171f..d0e0da762af3 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -188,7 +188,7 @@ struct bio {
                          * top bits REQ_OP. Use
                          * accessors.
                          */
-    unsigned short        bi_flags;    /* status, etc and bvec pool number */
+    unsigned int        bi_flags;    /* status, etc and bvec pool number */
     unsigned short        bi_ioprio;
     unsigned short        bi_write_hint;
     blk_status_t        bi_status;
@@ -267,6 +267,7 @@ enum {
                  * of this bio. */
     BIO_CGROUP_ACCT,    /* has been accounted to a cgroup */
     BIO_TRACKED,        /* set if bio goes through the rq_qos path */
+    BIO_FOLL_PIN,        /* must release pages via unpin_user_pages() */
     BIO_FLAG_LAST
 };
 
@@ -285,7 +286,7 @@ enum {
  * freed.
  */
 #define BVEC_POOL_BITS        (3)
-#define BVEC_POOL_OFFSET    (16 - BVEC_POOL_BITS)
+#define BVEC_POOL_OFFSET    (32 - BVEC_POOL_BITS)
 #define BVEC_POOL_IDX(bio)    ((bio)->bi_flags >> BVEC_POOL_OFFSET)
 #if (1<< BVEC_POOL_BITS) < (BVEC_POOL_NR+1)
 # error "BVEC_POOL_BITS is too small"
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 29b0504a27cc..62bcf5e45f2b 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -209,6 +209,7 @@ size_t copy_to_iter_mcsafe(void *addr, size_t bytes, struct iov_iter *i)
     return _copy_to_iter_mcsafe(addr, bytes, i);
 }
 
+struct bio;
 size_t iov_iter_zero(size_t bytes, struct iov_iter *);
 unsigned long iov_iter_alignment(const struct iov_iter *i);
 unsigned long iov_iter_gap_alignment(const struct iov_iter *i);
@@ -229,9 +230,9 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages);
 
 const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags);
 
-ssize_t iov_iter_pin_user_pages(struct iov_iter *i, struct page **pages,
+ssize_t iov_iter_pin_user_pages(struct bio *bio, struct iov_iter *i, struct page **pages,
         size_t maxsize, unsigned int maxpages, size_t *start);
-ssize_t iov_iter_pin_user_pages_alloc(struct iov_iter *i, struct page ***pages,
+ssize_t iov_iter_pin_user_pages_alloc(struct bio *bio, struct iov_iter *i, struct page ***pages,
         size_t maxsize, size_t *start);
 
 static inline size_t iov_iter_count(const struct iov_iter *i)
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index d818b16d136b..a4bc1b3a3fda 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -3,6 +3,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1309,7 +1310,7 @@ static ssize_t pipe_get_pages(struct iov_iter *i,
     return __pipe_get_pages(i, min(maxsize, capacity), pages, iter_head, start);
 }
 
-ssize_t iov_iter_pin_user_pages(struct iov_iter *i,
+ssize_t iov_iter_pin_user_pages(struct bio *bio, struct iov_iter *i,
         struct page **pages, size_t maxsize, unsigned int maxpages,
         size_t *start)
 {
@@ -1335,6 +1336,8 @@ ssize_t iov_iter_pin_user_pages(struct iov_iter *i,
         addr &= ~(PAGE_SIZE - 1);
         n = DIV_ROUND_UP(len, PAGE_SIZE);
 
+        bio_set_flag(bio, BIO_FOLL_PIN);
+
         res = pin_user_pages_fast(addr, n,
                 iov_iter_rw(i) != WRITE ? FOLL_WRITE : 0,
                 pages);
@@ -1426,7 +1429,7 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
     return n;
 }
 
-ssize_t iov_iter_pin_user_pages_alloc(struct iov_iter *i,
+ssize_t iov_iter_pin_user_pages_alloc(struct bio *bio, struct iov_iter *i,
         struct page ***pages, size_t maxsize,
         size_t *start)
 {
@@ -1454,6 +1457,8 @@ ssize_t iov_iter_pin_user_pages_alloc(struct iov_iter *i,
         if (!p)
             return -ENOMEM;
 
+        bio_set_flag(bio, BIO_FOLL_PIN);
+
         res = pin_user_pages_fast(addr, n,
                 iov_iter_rw(i) != WRITE ? FOLL_WRITE : 0, p);
         if (unlikely(res < 0)) {

From patchwork Sat Aug 22 04:20:59 2020
X-Patchwork-Submitter: John Hubbard
X-Patchwork-Id: 11730829
From: John Hubbard
To: Andrew Morton
CC: Alexander Viro, Christoph Hellwig, Ilya Dryomov, Jens Axboe, Jeff Layton, LKML, John Hubbard
Subject: [PATCH 5/5] fs/ceph: use pipe_get_pages_alloc() for pipe
Date: Fri, 21 Aug 2020 21:20:59 -0700
Message-ID: <20200822042059.1805541-6-jhubbard@nvidia.com>
In-Reply-To: <20200822042059.1805541-1-jhubbard@nvidia.com>

This reduces, by one, the number of callers of iov_iter_get_pages(). That's
helpful because these calls are being audited and converted over to use
iov_iter_pin_user_pages(), where applicable. And this call site is already
known by the caller to be ITER_PIPE-only, so let's just simplify it now.
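For illustration (not part of the patch), the resulting split of
responsibilities looks roughly like this; the wrapper name is hypothetical.

#include <linux/mm.h>
#include <linux/uio.h>

/* Hypothetical wrapper: pipes use the pipe-specific helper made visible by
 * this patch, everything else keeps the general-purpose allocation helper. */
static ssize_t example_get_pages_alloc(struct iov_iter *to,
                                       struct page ***pages,
                                       size_t len, size_t *page_off)
{
    if (unlikely(iov_iter_is_pipe(to)))
        return pipe_get_pages_alloc(to, pages, len, page_off);

    return iov_iter_get_pages_alloc(to, pages, len, page_off);
}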
Signed-off-by: John Hubbard
Acked-by: Jeff Layton
---
 fs/ceph/file.c      | 3 +--
 include/linux/uio.h | 3 ++-
 lib/iov_iter.c      | 6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index d51c3f2fdca0..d3d7dd957390 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -879,8 +879,7 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to,
         more = len < iov_iter_count(to);
 
         if (unlikely(iov_iter_is_pipe(to))) {
-            ret = iov_iter_get_pages_alloc(to, &pages, len,
-                            &page_off);
+            ret = pipe_get_pages_alloc(to, &pages, len, &page_off);
             if (ret <= 0) {
                 ceph_osdc_put_request(req);
                 ret = -ENOMEM;
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 62bcf5e45f2b..76cd47ab3dfd 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -227,7 +227,8 @@ ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages,
 ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages,
         size_t maxsize, size_t *start);
 int iov_iter_npages(const struct iov_iter *i, int maxpages);
-
+ssize_t pipe_get_pages_alloc(struct iov_iter *i, struct page ***pages,
+        size_t maxsize, size_t *start);
 const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags);
 
 ssize_t iov_iter_pin_user_pages(struct bio *bio, struct iov_iter *i, struct page **pages,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index a4bc1b3a3fda..f571fe3ddbe8 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1396,9 +1396,8 @@ static struct page **get_pages_array(size_t n)
     return kvmalloc_array(n, sizeof(struct page *), GFP_KERNEL);
 }
 
-static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
-            struct page ***pages, size_t maxsize,
-            size_t *start)
+ssize_t pipe_get_pages_alloc(struct iov_iter *i, struct page ***pages,
+            size_t maxsize, size_t *start)
 {
     struct page **p;
     unsigned int iter_head, npages;
@@ -1428,6 +1427,7 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
     kvfree(p);
     return n;
 }
+EXPORT_SYMBOL(pipe_get_pages_alloc);
 
 ssize_t iov_iter_pin_user_pages_alloc(struct bio *bio, struct iov_iter *i,
         struct page ***pages, size_t maxsize,