From patchwork Fri Mar 6 13:25:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Claudio Imbrenda X-Patchwork-Id: 11423923 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 726501580 for ; Fri, 6 Mar 2020 13:25:51 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 462E920848 for ; Fri, 6 Mar 2020 13:25:51 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 462E920848 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.ibm.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B88EF6B0005; Fri, 6 Mar 2020 08:25:48 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id B11F36B0007; Fri, 6 Mar 2020 08:25:48 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7DF9D6B0006; Fri, 6 Mar 2020 08:25:48 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0163.hostedemail.com [216.40.44.163]) by kanga.kvack.org (Postfix) with ESMTP id 61C566B0005 for ; Fri, 6 Mar 2020 08:25:48 -0500 (EST) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 1D1D98248076 for ; Fri, 6 Mar 2020 13:25:48 +0000 (UTC) X-FDA: 76565009976.07.badge38_15742b5c4b955 X-Spam-Summary: 2,0,0,d918c9a282c1deee,d41d8cd98f00b204,imbrenda@linux.ibm.com,,RULES_HIT:41:69:355:379:541:800:960:973:988:989:1260:1261:1311:1314:1345:1359:1437:1515:1535:1542:1711:1730:1747:1777:1792:2198:2199:2393:2559:2562:2693:3138:3139:3140:3141:3142:3353:3608:3865:3867:3870:3871:3872:4321:4605:5007:6261:7875:7903:9592:10004:11026:11232:11473:11639:11658:11914:12043:12048:12291:12296:12297:12438:12555:12663:12683:12895:12986:13255:13869:13894:14096:14110:14181:14394:14721:21080:21451:21627:21987:21990:30005:30054:30064:30070,0,RBL:148.163.156.1:@linux.ibm.com:.lbl8.mailshell.net-62.2.0.100 64.100.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:190,LUA_SUMMARY:none X-HE-Tag: badge38_15742b5c4b955 X-Filterd-Recvd-Size: 5815 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by imf14.hostedemail.com (Postfix) with ESMTP for ; Fri, 6 Mar 2020 13:25:47 +0000 (UTC) Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 026DKCN4138088 for ; Fri, 6 Mar 2020 08:25:46 -0500 Received: from e06smtp01.uk.ibm.com (e06smtp01.uk.ibm.com [195.75.94.97]) by mx0a-001b2d01.pphosted.com with ESMTP id 2yk4jq2wj8-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Fri, 06 Mar 2020 08:25:45 -0500 Received: from localhost by e06smtp01.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 6 Mar 2020 13:25:43 -0000 Received: from b06avi18626390.portsmouth.uk.ibm.com (9.149.26.192) by e06smtp01.uk.ibm.com (192.168.101.131) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Fri, 6 Mar 2020 13:25:40 -0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06avi18626390.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 026DOebk23920902 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 6 Mar 2020 13:24:40 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 39C4AA4051; Fri, 6 Mar 2020 13:25:39 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 9C523A4057; Fri, 6 Mar 2020 13:25:38 +0000 (GMT) Received: from localhost.localdomain (unknown [9.145.0.1]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Fri, 6 Mar 2020 13:25:38 +0000 (GMT) From: Claudio Imbrenda To: linux-next@vger.kernel.org, akpm@linux-foundation.org, jack@suse.cz, kirill@shutemov.name Cc: borntraeger@de.ibm.com, david@redhat.com, aarcange@redhat.com, linux-mm@kvack.org, frankja@linux.ibm.com, sfr@canb.auug.org.au, jhubbard@nvidia.com, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org Subject: [PATCH v4 1/2] mm/gup: fixup for 9947ea2c1e608e32 "mm/gup: track FOLL_PIN pages" Date: Fri, 6 Mar 2020 14:25:36 +0100 X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200306132537.783769-1-imbrenda@linux.ibm.com> References: <20200306132537.783769-1-imbrenda@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 20030613-4275-0000-0000-000003A8FA56 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 20030613-4276-0000-0000-000038BE0CFD Message-Id: <20200306132537.783769-2-imbrenda@linux.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138,18.0.572 definitions=2020-03-06_04:2020-03-06,2020-03-06 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 adultscore=0 mlxlogscore=999 priorityscore=1501 lowpriorityscore=0 malwarescore=0 mlxscore=0 impostorscore=0 clxscore=1015 spamscore=0 suspectscore=0 bulkscore=0 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2001150001 definitions=main-2003060096 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: In case pin fails, we need to unpin, a simple put_page will not be enough fixup for commit 9947ea2c1e608e32 ("mm/gup: track FOLL_PIN pages") it can be simply squashed in Signed-off-by: Claudio Imbrenda Reviewed-by: John Hubbard --- mm/gup.c | 46 +++++++++++++++++++++++----------------------- 1 file changed, 23 insertions(+), 23 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index f589299b0d4a..81a95fbe9901 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -116,6 +116,28 @@ static __maybe_unused struct page *try_grab_compound_head(struct page *page, return NULL; } +static void put_compound_head(struct page *page, int refs, unsigned int flags) +{ + if (flags & FOLL_PIN) { + mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_RELEASED, + refs); + + if (hpage_pincount_available(page)) + hpage_pincount_sub(page, refs); + else + refs *= GUP_PIN_COUNTING_BIAS; + } + + VM_BUG_ON_PAGE(page_ref_count(page) < refs, page); + /* + * Calling put_page() for each ref is unnecessarily slow. Only the last + * ref needs a put_page(). + */ + if (refs > 1) + page_ref_sub(page, refs - 1); + put_page(page); +} + /** * try_grab_page() - elevate a page's refcount by a flag-dependent amount * @@ -2134,7 +2156,7 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, goto pte_unmap; if (unlikely(pte_val(pte) != pte_val(*ptep))) { - put_page(head); + put_compound_head(head, 1, flags); goto pte_unmap; } @@ -2267,28 +2289,6 @@ static int record_subpages(struct page *page, unsigned long addr, return nr; } -static void put_compound_head(struct page *page, int refs, unsigned int flags) -{ - if (flags & FOLL_PIN) { - mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_RELEASED, - refs); - - if (hpage_pincount_available(page)) - hpage_pincount_sub(page, refs); - else - refs *= GUP_PIN_COUNTING_BIAS; - } - - VM_BUG_ON_PAGE(page_ref_count(page) < refs, page); - /* - * Calling put_page() for each ref is unnecessarily slow. Only the last - * ref needs a put_page(). - */ - if (refs > 1) - page_ref_sub(page, refs - 1); - put_page(page); -} - #ifdef CONFIG_ARCH_HAS_HUGEPD static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end, unsigned long sz) From patchwork Fri Mar 6 13:25:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Claudio Imbrenda X-Patchwork-Id: 11423925 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9A0C692A for ; Fri, 6 Mar 2020 13:25:53 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6391920726 for ; Fri, 6 Mar 2020 13:25:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6391920726 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.ibm.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 1F3796B0006; Fri, 6 Mar 2020 08:25:51 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 0B3286B0007; Fri, 6 Mar 2020 08:25:51 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E6F7B6B0008; Fri, 6 Mar 2020 08:25:50 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0179.hostedemail.com [216.40.44.179]) by kanga.kvack.org (Postfix) with ESMTP id C5DE46B0006 for ; Fri, 6 Mar 2020 08:25:50 -0500 (EST) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 8273E181AC9C6 for ; Fri, 6 Mar 2020 13:25:50 +0000 (UTC) X-FDA: 76565010060.15.toe96_15d8665911a56 X-Spam-Summary: 2,0,0,bc348e119c5fe571,d41d8cd98f00b204,imbrenda@linux.ibm.com,,RULES_HIT:2:41:355:379:541:800:960:966:973:988:989:1260:1261:1311:1314:1345:1359:1437:1515:1535:1605:1606:1730:1747:1777:1792:2196:2198:2199:2200:2393:2559:2562:2693:2731:2897:2899:2901:2918:3138:3139:3140:3141:3142:3865:3866:3867:3868:3870:3871:3872:3874:4119:4250:4321:4385:4605:5007:6119:6261:8957:9592:10004:11026:11473:11658:11914:12043:12048:12219:12291:12296:12297:12438:12555:12663:12683:12895:12986:13223:13229:13894:14096:14394:21080:21212:21324:21451:21627:21795:21990:30003:30012:30051:30054:30064:30074,0,RBL:148.163.158.5:@linux.ibm.com:.lbl8.mailshell.net-62.2.0.100 64.100.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:25,LUA_SUMMARY:none X-HE-Tag: toe96_15d8665911a56 X-Filterd-Recvd-Size: 8630 Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by imf32.hostedemail.com (Postfix) with ESMTP for ; Fri, 6 Mar 2020 13:25:49 +0000 (UTC) Received: from pps.filterd (m0098420.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 026DJgE7127589 for ; Fri, 6 Mar 2020 08:25:49 -0500 Received: from e06smtp05.uk.ibm.com (e06smtp05.uk.ibm.com [195.75.94.101]) by mx0b-001b2d01.pphosted.com with ESMTP id 2ykataqpea-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Fri, 06 Mar 2020 08:25:49 -0500 Received: from localhost by e06smtp05.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 6 Mar 2020 13:25:47 -0000 Received: from b06cxnps3075.portsmouth.uk.ibm.com (9.149.109.195) by e06smtp05.uk.ibm.com (192.168.101.135) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Fri, 6 Mar 2020 13:25:41 -0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 026DPef340173606 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 6 Mar 2020 13:25:40 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id EA238A404D; Fri, 6 Mar 2020 13:25:39 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 4DEC9A4057; Fri, 6 Mar 2020 13:25:39 +0000 (GMT) Received: from localhost.localdomain (unknown [9.145.0.1]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Fri, 6 Mar 2020 13:25:39 +0000 (GMT) From: Claudio Imbrenda To: linux-next@vger.kernel.org, akpm@linux-foundation.org, jack@suse.cz, kirill@shutemov.name Cc: borntraeger@de.ibm.com, david@redhat.com, aarcange@redhat.com, linux-mm@kvack.org, frankja@linux.ibm.com, sfr@canb.auug.org.au, jhubbard@nvidia.com, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, Will Deacon Subject: [PATCH v4 2/2] mm/gup/writeback: add callbacks for inaccessible pages Date: Fri, 6 Mar 2020 14:25:37 +0100 X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200306132537.783769-1-imbrenda@linux.ibm.com> References: <20200306132537.783769-1-imbrenda@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 20030613-0020-0000-0000-000003B11798 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 20030613-0021-0000-0000-0000220957FA Message-Id: <20200306132537.783769-3-imbrenda@linux.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138,18.0.572 definitions=2020-03-06_04:2020-03-06,2020-03-06 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 spamscore=0 suspectscore=2 clxscore=1015 lowpriorityscore=0 priorityscore=1501 mlxlogscore=612 malwarescore=0 phishscore=0 mlxscore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2001150001 definitions=main-2003060096 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: With the introduction of protected KVM guests on s390 there is now a concept of inaccessible pages. These pages need to be made accessible before the host can access them. While cpu accesses will trigger a fault that can be resolved, I/O accesses will just fail. We need to add a callback into architecture code for places that will do I/O, namely when writeback is started or when a page reference is taken. This is not only to enable paging, file backing etc, it is also necessary to protect the host against a malicious user space. For example a bad QEMU could simply start direct I/O on such protected memory. We do not want userspace to be able to trigger I/O errors and thus the logic is "whenever somebody accesses that page (gup) or does I/O, make sure that this page can be accessed". When the guest tries to access that page we will wait in the page fault handler for writeback to have finished and for the page_ref to be the expected value. On s390x the function is not supposed to fail, so it is ok to use a WARN_ON on failure. If we ever need some more finegrained handling we can tackle this when we know the details. Signed-off-by: Claudio Imbrenda Acked-by: Will Deacon Reviewed-by: David Hildenbrand Reviewed-by: Christian Borntraeger Reviewed-by: John Hubbard --- include/linux/gfp.h | 6 ++++++ mm/gup.c | 30 +++++++++++++++++++++++++++--- mm/page-writeback.c | 9 ++++++++- 3 files changed, 41 insertions(+), 4 deletions(-) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index e5b817cb86e7..be2754841369 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -485,6 +485,12 @@ static inline void arch_free_page(struct page *page, int order) { } #ifndef HAVE_ARCH_ALLOC_PAGE static inline void arch_alloc_page(struct page *page, int order) { } #endif +#ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE +static inline int arch_make_page_accessible(struct page *page) +{ + return 0; +} +#endif struct page * __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid, diff --git a/mm/gup.c b/mm/gup.c index 81a95fbe9901..d0c4c6f336bb 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -413,6 +413,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, struct page *page; spinlock_t *ptl; pte_t *ptep, pte; + int ret; /* FOLL_GET and FOLL_PIN are mutually exclusive. */ if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) == @@ -471,8 +472,6 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, if (is_zero_pfn(pte_pfn(pte))) { page = pte_page(pte); } else { - int ret; - ret = follow_pfn_pte(vma, address, ptep, flags); page = ERR_PTR(ret); goto out; @@ -480,7 +479,6 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, } if (flags & FOLL_SPLIT && PageTransCompound(page)) { - int ret; get_page(page); pte_unmap_unlock(ptep, ptl); lock_page(page); @@ -497,6 +495,19 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, page = ERR_PTR(-ENOMEM); goto out; } + /* + * We need to make the page accessible if and only if we are going + * to access its content (the FOLL_PIN case). Please see + * Documentation/core-api/pin_user_pages.rst for details. + */ + if (flags & FOLL_PIN) { + ret = arch_make_page_accessible(page); + if (ret) { + unpin_user_page(page); + page = ERR_PTR(ret); + goto out; + } + } if (flags & FOLL_TOUCH) { if ((flags & FOLL_WRITE) && !pte_dirty(pte) && !PageDirty(page)) @@ -2162,6 +2173,19 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, VM_BUG_ON_PAGE(compound_head(page) != head, page); + /* + * We need to make the page accessible if and only if we are + * going to access its content (the FOLL_PIN case). Please + * see Documentation/core-api/pin_user_pages.rst for + * details. + */ + if (flags & FOLL_PIN) { + ret = arch_make_page_accessible(page); + if (ret) { + unpin_user_page(page); + goto pte_unmap; + } + } SetPageReferenced(page); pages[*nr] = page; (*nr)++; diff --git a/mm/page-writeback.c b/mm/page-writeback.c index ab5a3cee8ad3..b7f3d0766a5f 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -2764,7 +2764,7 @@ int test_clear_page_writeback(struct page *page) int __test_set_page_writeback(struct page *page, bool keep_write) { struct address_space *mapping = page_mapping(page); - int ret; + int ret, access_ret; lock_page_memcg(page); if (mapping && mapping_use_writeback_tags(mapping)) { @@ -2807,6 +2807,13 @@ int __test_set_page_writeback(struct page *page, bool keep_write) inc_zone_page_state(page, NR_ZONE_WRITE_PENDING); } unlock_page_memcg(page); + access_ret = arch_make_page_accessible(page); + /* + * If writeback has been triggered on a page that cannot be made + * accessible, it is too late to recover here. + */ + VM_BUG_ON_PAGE(access_ret != 0, page); + return ret; }