From patchwork Thu Feb 20 10:39:44 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Borntraeger
X-Patchwork-Id: 11393763
From: Christian Borntraeger
To: Christian Borntraeger, Janosch Frank, Andrew Morton
Cc: KVM, Cornelia Huck, David Hildenbrand, Thomas Huth, Ulrich Weigand,
 Claudio Imbrenda, linux-s390, Michael Mueller, Vasily Gorbik,
 Andrea Arcangeli, linux-mm@kvack.org, Will Deacon
Subject: [PATCH v3 01/37] mm:gup/writeback: add callbacks for inaccessible pages
Date: Thu, 20 Feb 2020 05:39:44 -0500
Message-Id: <20200220104020.5343-2-borntraeger@de.ibm.com>
X-Mailer: git-send-email 2.25.0
In-Reply-To: <20200220104020.5343-1-borntraeger@de.ibm.com>
References: <20200220104020.5343-1-borntraeger@de.ibm.com>

From: Claudio Imbrenda

With the introduction of protected KVM guests on s390 there is now a
concept of inaccessible pages. These pages need to be made accessible
before the host can access them.

While CPU accesses will trigger a fault that can be resolved, I/O
accesses will just fail. We need to add a callback into architecture
code for places that will do I/O, namely when writeback is started or
when a page reference is taken.

This is not only to enable paging, file backing, etc.; it is also
necessary to protect the host against malicious user space. For
example, a bad QEMU could simply start direct I/O on such protected
memory. We do not want userspace to be able to trigger I/O errors, and
thus the logic is "whenever somebody accesses that page (gup) or does
I/O, make sure that this page can be accessed". When the guest tries to
access that page, we will wait in the page fault handler for writeback
to have finished and for the page_ref to be the expected value.

Signed-off-by: Claudio Imbrenda
Acked-by: Will Deacon
Signed-off-by: Christian Borntraeger
Reviewed-by: David Hildenbrand
---
 include/linux/gfp.h |  6 ++++++
 mm/gup.c            | 15 ++++++++++++---
 mm/page-writeback.c |  5 +++++
 3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index e5b817cb86e7..be2754841369 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -485,6 +485,12 @@ static inline void arch_free_page(struct page *page, int order) { }
 #ifndef HAVE_ARCH_ALLOC_PAGE
 static inline void arch_alloc_page(struct page *page, int order) { }
 #endif
+#ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
+static inline int arch_make_page_accessible(struct page *page)
+{
+	return 0;
+}
+#endif
 
 struct page *
 __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
diff --git a/mm/gup.c b/mm/gup.c
index 1b521e0ac1de..354bcfbd844b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -193,6 +193,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 	struct page *page;
 	spinlock_t *ptl;
 	pte_t *ptep, pte;
+	int ret;
 
 	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
 	if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
@@ -250,8 +251,6 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 		if (is_zero_pfn(pte_pfn(pte))) {
 			page = pte_page(pte);
 		} else {
-			int ret;
-
 			ret = follow_pfn_pte(vma, address, ptep, flags);
 			page = ERR_PTR(ret);
 			goto out;
@@ -259,7 +258,6 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 	}
 
 	if (flags & FOLL_SPLIT && PageTransCompound(page)) {
-		int ret;
 		get_page(page);
 		pte_unmap_unlock(ptep, ptl);
 		lock_page(page);
@@ -276,6 +274,12 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 			page = ERR_PTR(-ENOMEM);
 			goto out;
 		}
+		ret = arch_make_page_accessible(page);
+		if (ret) {
+			put_page(page);
+			page = ERR_PTR(ret);
+			goto out;
+		}
 	}
 	if (flags & FOLL_TOUCH) {
 		if ((flags & FOLL_WRITE) &&
@@ -1919,6 +1923,11 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 
 		VM_BUG_ON_PAGE(compound_head(page) != head, page);
 
+		ret = arch_make_page_accessible(page);
+		if (ret) {
+			put_page(head);
+			goto pte_unmap;
+		}
 		SetPageReferenced(page);
 		pages[*nr] = page;
 		(*nr)++;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 2caf780a42e7..558d7063c117 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2807,6 +2807,11 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 		inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
 	}
 	unlock_page_memcg(page);
+	/*
+	 * If writeback has been triggered on a page that cannot be made
+	 * accessible, it is too late.
+	 */
+	WARN_ON(arch_make_page_accessible(page));
 	return ret;
 
 }
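
For readers who want to see how the override hook and the gup error path
fit together, here is a minimal userspace sketch. It is not kernel code
and not the s390 implementation: struct page here is a hypothetical
stand-in, the inaccessible and export_fails fields, model_get_page() and
model_selftest() are invented for illustration, and -EFAULT is an
arbitrary choice of error. Only the HAVE_ARCH_MAKE_PAGE_ACCESSIBLE guard
and the arch_make_page_accessible() signature are taken from the patch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct page {
	int refcount;
	int inaccessible;	/* models a page owned by a protected guest */
	int export_fails;	/* models an arch that cannot export the page */
};

/*
 * An architecture opts in by defining the macro and providing its own
 * implementation before the generic fallback is seen. The body below is
 * an assumption made for this sketch only.
 */
#define HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
static int arch_make_page_accessible(struct page *page)
{
	if (!page->inaccessible)
		return 0;
	if (page->export_fails)
		return -EFAULT;
	page->inaccessible = 0;
	return 0;
}

/* Generic fallback with the same shape as the stub added to gfp.h. */
#ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
static inline int arch_make_page_accessible(struct page *page)
{
	return 0;
}
#endif

/*
 * Models the gup path from the patch: take a reference first, then ask
 * the architecture to make the page accessible, and drop the reference
 * again if that fails.
 */
static int model_get_page(struct page *page)
{
	int ret;

	page->refcount++;
	ret = arch_make_page_accessible(page);
	if (ret) {
		page->refcount--;
		return ret;
	}
	return 0;
}

/* Small usage example covering the success and the failure path. */
int model_selftest(void)
{
	struct page ok = { .refcount = 1, .inaccessible = 1, .export_fails = 0 };
	struct page bad = { .refcount = 1, .inaccessible = 1, .export_fails = 1 };

	assert(model_get_page(&ok) == 0);
	assert(ok.refcount == 2 && ok.inaccessible == 0);

	assert(model_get_page(&bad) == -EFAULT);
	assert(bad.refcount == 1 && bad.inaccessible == 1);
	return 0;
}
```

Calling model_selftest() from a main() exercises both paths. The ordering
(reference first, callback second) mirrors the hunks in mm/gup.c above,
where put_page() undoes the reference when the callback returns an error.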