From patchwork Mon Jan 21 07:57:06 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10772799
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Maya Gokhale, Jerome Glisse, Johannes Weiner,
 peterx@redhat.com, Martin Cracauer, Denis Plotnikov, Shaohua Li,
 Andrea Arcangeli, Pavel Emelyanov, Mike Kravetz, Marty McFadden,
 Mike Rapoport, Mel Gorman, "Kirill A . Shutemov",
 "Dr . David Alan Gilbert"
Subject: [PATCH RFC 08/24] userfaultfd: wp: hook userfault handler to write
 protection fault
Date: Mon, 21 Jan 2019 15:57:06 +0800
Message-Id: <20190121075722.7945-9-peterx@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>
References: <20190121075722.7945-1-peterx@redhat.com>

From: Andrea Arcangeli

There are several cases in which a write protection fault happens: a
write to the zero page, to a swapped page, or to a userfault write
protected page. When the fault happens, there is no way to know whether
userfault write protected the page beforehand. Here we just blindly
issue a userfault notification for any vma with VM_UFFD_WP, regardless
of whether the application has write protected the page yet. The
application should be ready to handle such a wp fault.

v1: From: Shaohua Li

v2: Handle the userfault in the common do_wp_page. If we get there, a
pagetable is present and readonly, so there is no need to do further
processing until we solve the userfault.

In the swapin case, always swapin as readonly. This will cause false
positive userfaults. We need to decide later whether to eliminate them
with a flag like soft-dirty in the swap entry (see
_PAGE_SWP_SOFT_DIRTY).

hugetlbfs wouldn't need to worry about swapouts, and tmpfs would be
handled by a swap entry bit like anonymous memory.

The main problem, with no easy solution to eliminate the false
positives, will be if/when userfaultfd is extended to real filesystem
pagecache. When the pagecache is freed by reclaim we can't leave the
radix tree pinned if the inode, and in turn the radix tree, is
reclaimed as well.

The estimate is that full accuracy, with no false positives, can easily
be provided only for anonymous memory (as long as there is no fork, or
as long as MADV_DONTFORK is used on the userfaultfd anonymous range),
tmpfs and hugetlbfs. It is most certainly worth achieving, but in a
later incremental patch.

v3: Add hooking point for THP wrprotect faults.

CC: Shaohua Li
Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
---
 mm/memory.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 4ad2d293ddc2..89d51d1650e4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2482,6 +2482,11 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 
+	if (userfaultfd_wp(vma)) {
+		pte_unmap_unlock(vmf->pte, vmf->ptl);
+		return handle_userfault(vmf, VM_UFFD_WP);
+	}
+
 	vmf->page = vm_normal_page(vma, vmf->address, vmf->orig_pte);
 	if (!vmf->page) {
 		/*
@@ -2799,6 +2804,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
 	pte = mk_pte(page, vma->vm_page_prot);
+	if (userfaultfd_wp(vma))
+		vmf->flags &= ~FAULT_FLAG_WRITE;
 	if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
 		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
 		vmf->flags &= ~FAULT_FLAG_WRITE;
@@ -3662,8 +3669,11 @@ static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf)
 /* `inline' is required to avoid gcc 4.1.2 build error */
 static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf, pmd_t orig_pmd)
 {
-	if (vma_is_anonymous(vmf->vma))
+	if (vma_is_anonymous(vmf->vma)) {
+		if (userfaultfd_wp(vmf->vma))
+			return handle_userfault(vmf, VM_UFFD_WP);
 		return do_huge_pmd_wp_page(vmf, orig_pmd);
+	}
 	if (vmf->vma->vm_ops->huge_fault)
 		return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PMD);
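
Illustration only, not part of the patch: the commit message says a
VM_UFFD_WP vma can deliver wp faults for pages the application never
write protected, so userspace has to be able to tell the two cases
apart. A minimal sketch of one way to do that follows, assuming the
application keeps its own bitmap of the pages it protected. Every name
here (struct wp_tracker, wp_mark_protected, wp_classify_fault) and the
4 KiB page size are hypothetical and appear nowhere in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace bookkeeping, not kernel code. */
#define WP_TRACKED_PAGES 64	/* one bit per page in a uint64_t */
#define WP_PAGE_SHIFT    12	/* assumption: 4 KiB pages */

struct wp_tracker {
	uint64_t base;		 /* start of the registered range */
	uint64_t protected_bits; /* bit i set: the app protected page i */
};

enum wp_action {
	WP_ACTION_SPURIOUS,	/* app never protected this page */
	WP_ACTION_HANDLE,	/* app protected it: act on the write */
};

/* Map an address to a page index; false if outside the range. */
static bool wp_page_index(const struct wp_tracker *t, uint64_t addr,
			  unsigned int *idx)
{
	uint64_t off;

	if (addr < t->base)
		return false;
	off = (addr - t->base) >> WP_PAGE_SHIFT;
	if (off >= WP_TRACKED_PAGES)
		return false;
	*idx = (unsigned int)off;
	return true;
}

/* Record that the application write protected the page at addr. */
static void wp_mark_protected(struct wp_tracker *t, uint64_t addr)
{
	unsigned int idx = 0;
	bool in_range = wp_page_index(t, addr, &idx);

	assert(in_range);
	(void)in_range;		/* silence unused warning under NDEBUG */
	t->protected_bits |= UINT64_C(1) << idx;
}

/*
 * Classify a reported wp fault. A hit clears the bit, because the
 * application is expected to lift the protection while handling it.
 * In both cases the real handler must still resolve the fault so the
 * faulting thread can continue; that step is outside this sketch.
 */
static enum wp_action wp_classify_fault(struct wp_tracker *t,
					uint64_t fault_addr)
{
	unsigned int idx = 0;

	if (!wp_page_index(t, fault_addr, &idx))
		return WP_ACTION_SPURIOUS;
	if (!(t->protected_bits & (UINT64_C(1) << idx)))
		return WP_ACTION_SPURIOUS;
	t->protected_bits &= ~(UINT64_C(1) << idx);
	return WP_ACTION_HANDLE;
}
```

A fault handler would call wp_classify_fault() with the reported
address and do its real work (for example saving the old page
contents) only on WP_ACTION_HANDLE. A fixed 64-page bitmap keeps the
sketch short; a real tracker would size it to the registered range.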