From patchwork Wed Jun 22 18:50:37 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12891718
From: Nadav Amit
To: linux-mm@kvack.org
Cc: Nadav Amit, David Hildenbrand, Mike Kravetz, Hugh Dickins,
 Andrew Morton, Axel Rasmussen, Peter Xu, Mike Rapoport
Subject: [PATCH v1 4/5] userfaultfd: zero access/write hints
Date: Wed, 22 Jun 2022 11:50:37 -0700
Message-Id: <20220622185038.71740-5-namit@vmware.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220622185038.71740-1-namit@vmware.com>
References: <20220622185038.71740-1-namit@vmware.com>

From: Nadav Amit

When userfaultfd provides a zero page in response to the UFFDIO_ZEROPAGE
ioctl, it maps a read-only alias of the shared zero page. If the page is
later written (which is the likely scenario), a page fault occurs and
the page-fault handler allocates a page and rewires the page tables.
This is an expensive flow when the page is likely to be written to.
Users can use the copy ioctl to initialize a zeroed page (by copying
zeros), but this is also wasteful.

Allow userfaultfd users to efficiently map zero-initialized pages that
are writable. If UFFDIO_ZEROPAGE_MODE_WRITE_LIKELY is provided,
userfaultfd maps a cleared page instead of an alias to the zero page.

For consistency, also introduce UFFDIO_ZEROPAGE_MODE_ACCESS_LIKELY.

Suggested-by: David Hildenbrand
Cc: Mike Kravetz
Cc: Hugh Dickins
Cc: Andrew Morton
Cc: Axel Rasmussen
Cc: Peter Xu
Cc: Mike Rapoport
Signed-off-by: Nadav Amit
Acked-by: Peter Xu
---
 mm/userfaultfd.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 6e767f1e7007..48286746b0af 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -249,6 +249,37 @@ static int mfill_zeropage_pte(struct mm_struct *dst_mm,
 	return ret;
 }
 
+static int mfill_clearpage_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
+			       struct vm_area_struct *dst_vma,
+			       unsigned long dst_addr,
+			       uffd_flags_t uffd_flags)
+{
+	struct page *page;
+	int ret;
+
+	ret = -ENOMEM;
+	page = alloc_zeroed_user_highpage_movable(dst_vma, dst_addr);
+	if (!page)
+		goto out;
+
+	/* The PTE is not marked as dirty unconditionally */
+	SetPageDirty(page);
+	__SetPageUptodate(page);
+
+	if (mem_cgroup_charge(page_folio(page), dst_vma->vm_mm, GFP_KERNEL))
+		goto out_release;
+
+	ret = mfill_atomic_install_pte(dst_mm, dst_pmd, dst_vma, dst_addr,
+				       page, true, uffd_flags);
+	if (ret)
+		goto out_release;
+out:
+	return ret;
+out_release:
+	put_page(page);
+	goto out;
+}
+
 /* Handles UFFDIO_CONTINUE for all shmem VMAs (shared or private). */
 static int mcontinue_atomic_pte(struct mm_struct *dst_mm,
 				pmd_t *dst_pmd,
@@ -511,6 +542,10 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
 			err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
 					       dst_addr, src_addr, page,
 					       uffd_flags);
+		else if (!(uffd_flags & UFFD_FLAGS_WP) &&
+			 (uffd_flags & UFFD_FLAGS_WRITE_LIKELY))
+			err = mfill_clearpage_pte(dst_mm, dst_pmd, dst_vma,
+						  dst_addr, uffd_flags);
 		else
 			err = mfill_zeropage_pte(dst_mm, dst_pmd,
 						 dst_vma, dst_addr, uffd_flags);
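
As a reading aid, the selection logic added to mfill_atomic_pte() can be
modeled in userspace. This is only a sketch under stated assumptions: the
bit values of UFFD_FLAGS_WP and UFFD_FLAGS_WRITE_LIKELY are made up here
(their definitions are not part of this mail), the is_copy parameter
stands in for the condition guarding the mcopy_atomic_pte() call, which
lies outside the hunk, and choose_fill() and enum fill_kind are
hypothetical names that do not exist in the kernel.

```c
#include <stdint.h>

/* Hypothetical stand-ins: the real definitions are not shown in this mail. */
typedef uint32_t uffd_flags_t;
#define UFFD_FLAGS_WP           (1u << 0)   /* assumed bit value */
#define UFFD_FLAGS_WRITE_LIKELY (1u << 1)   /* assumed bit value */

/* Which fill helper the patched if / else-if / else chain would call. */
enum fill_kind {
	FILL_COPY,       /* mcopy_atomic_pte()    */
	FILL_CLEARPAGE,  /* mfill_clearpage_pte() */
	FILL_ZEROPAGE    /* mfill_zeropage_pte()  */
};

/*
 * Mirrors the order of the branches in the second hunk. is_copy is an
 * assumption: it models the (unseen) condition that selects the copy path.
 */
enum fill_kind choose_fill(int is_copy, uffd_flags_t uffd_flags)
{
	if (is_copy)
		return FILL_COPY;
	if (!(uffd_flags & UFFD_FLAGS_WP) &&
	    (uffd_flags & UFFD_FLAGS_WRITE_LIKELY))
		return FILL_CLEARPAGE;
	return FILL_ZEROPAGE;
}
```

In this model a write-protect request takes precedence over the hint:
the writable cleared page is chosen only when UFFD_FLAGS_WRITE_LIKELY is
set and UFFD_FLAGS_WP is not, and every other zero-fill request still
gets the read-only alias of the zero page.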