From patchwork Tue Jul 19 19:56:24 2022
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 12922972
Date: Tue, 19 Jul 2022 12:56:24 -0700
In-Reply-To: <20220719195628.3415852-1-axelrasmussen@google.com>
Message-Id: <20220719195628.3415852-2-axelrasmussen@google.com>
References: <20220719195628.3415852-1-axelrasmussen@google.com>
Subject: [PATCH v4 1/5] selftests: vm: add hugetlb_shared userfaultfd test to run_vmtests.sh
From: Axel Rasmussen
To: Alexander Viro, Andrew Morton, Dave Hansen, "Dmitry V . Levin",
    Gleb Fotengauer-Malinovskiy, Hugh Dickins, Jan Kara, Jonathan Corbet,
    Mel Gorman, Mike Kravetz, Mike Rapoport, Nadav Amit, Peter Xu,
    Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, zhangyi
Cc: Axel Rasmussen, linux-doc@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Shuah Khan

This not being included was just a simple oversight. There are certain
features (like minor fault support) which are only enabled on shared
mappings, so without including hugetlb_shared we actually lose a
significant amount of test coverage.

Reviewed-by: Shuah Khan
Reviewed-by: Peter Xu
Signed-off-by: Axel Rasmussen
---
 tools/testing/selftests/vm/run_vmtests.sh | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/vm/run_vmtests.sh b/tools/testing/selftests/vm/run_vmtests.sh
index 41fce8bea929..e70ae0f3aaf6 100755
--- a/tools/testing/selftests/vm/run_vmtests.sh
+++ b/tools/testing/selftests/vm/run_vmtests.sh
@@ -121,9 +121,11 @@ run_test ./gup_test -a
 run_test ./gup_test -ct -F 0x1 0 19 0x1000
 
 run_test ./userfaultfd anon 20 16
-# Test requires source and destination huge pages. Size of source
-# (half_ufd_size_MB) is passed as argument to test.
+# Hugetlb tests require source and destination huge pages. Pass in half the
+# size ($half_ufd_size_MB), which is used for *each*.
 run_test ./userfaultfd hugetlb "$half_ufd_size_MB" 32
+run_test ./userfaultfd hugetlb_shared "$half_ufd_size_MB" 32 "$mnt"/uffd-test
+rm -f "$mnt"/uffd-test
 run_test ./userfaultfd shmem 20 16
 
 #cleanup

From patchwork Tue Jul 19 19:56:25 2022
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 12922973
Date: Tue, 19 Jul 2022 12:56:25 -0700
In-Reply-To: <20220719195628.3415852-1-axelrasmussen@google.com>
Message-Id: <20220719195628.3415852-3-axelrasmussen@google.com>
References: <20220719195628.3415852-1-axelrasmussen@google.com>
Subject: [PATCH v4 2/5] userfaultfd: add /dev/userfaultfd for fine grained access control
From: Axel Rasmussen
To: Alexander Viro, Andrew Morton, Dave Hansen, "Dmitry V . Levin",
    Gleb Fotengauer-Malinovskiy, Hugh Dickins, Jan Kara, Jonathan Corbet,
    Mel Gorman, Mike Kravetz, Mike Rapoport, Nadav Amit, Peter Xu,
    Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, zhangyi
Cc: Axel Rasmussen, linux-doc@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, linux-kselftest@vger.kernel.org

Historically, it has been shown that intercepting kernel faults with
userfaultfd (thereby forcing the kernel to wait for an arbitrary amount of
time) can be exploited, or at least can make some kinds of exploits easier.
So, in 37cd0575b8 "userfaultfd: add UFFD_USER_MODE_ONLY" we changed things
so, in order for kernel faults to be handled by userfaultfd, either the
process needs CAP_SYS_PTRACE, or the vm.unprivileged_userfaultfd sysctl
must be configured so that any unprivileged user can do it.

In a typical implementation of a hypervisor with live migration (take
QEMU/KVM as one such example), we do indeed need to be able to handle
kernel faults. But, both options above are less than ideal:

- Toggling the sysctl increases attack surface by allowing any unprivileged
  user to do it.

- Granting the live migration process CAP_SYS_PTRACE gives it this ability,
  but *also* the ability to "observe and control the execution of another
  process [...], and examine and change [its] memory and registers" (from
  ptrace(2)). This isn't something we need or want to be able to do, so
  granting this permission violates the "principle of least privilege".

This is all a long-winded way to say: we want a more fine-grained way to
grant access to userfaultfd, without granting other additional permissions
at the same time.

To achieve this, add a /dev/userfaultfd misc device. This device provides an
alternative to the userfaultfd(2) syscall for the creation of new
userfaultfds.
The idea is, any userfaultfds created this way will be able to handle kernel
faults, without the caller having any special capabilities. Access to this
mechanism is instead restricted using e.g. standard filesystem permissions.

Signed-off-by: Axel Rasmussen
Acked-by: Peter Xu
Acked-by: Nadav Amit
---
 fs/userfaultfd.c                 | 69 +++++++++++++++++++++++++-------
 include/uapi/linux/userfaultfd.h |  4 ++
 2 files changed, 59 insertions(+), 14 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index e943370107d0..968f2517a281 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 
 int sysctl_unprivileged_userfaultfd __read_mostly;
 
@@ -413,13 +414,8 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 
 	if (ctx->features & UFFD_FEATURE_SIGBUS)
 		goto out;
-	if ((vmf->flags & FAULT_FLAG_USER) == 0 &&
-	    ctx->flags & UFFD_USER_MODE_ONLY) {
-		printk_once(KERN_WARNING "uffd: Set unprivileged_userfaultfd "
-			"sysctl knob to 1 if kernel faults must be handled "
-			"without obtaining CAP_SYS_PTRACE capability\n");
+	if (!(vmf->flags & FAULT_FLAG_USER) && (ctx->flags & UFFD_USER_MODE_ONLY))
 		goto out;
-	}
 
 	/*
 	 * If it's already released don't get it. This avoids to loop
@@ -2052,19 +2048,30 @@ static void init_once_userfaultfd_ctx(void *mem)
 	seqcount_spinlock_init(&ctx->refile_seq, &ctx->fault_pending_wqh.lock);
 }
 
-SYSCALL_DEFINE1(userfaultfd, int, flags)
+static inline bool userfaultfd_syscall_allowed(int flags)
+{
+	/* Userspace-only page faults are always allowed */
+	if (flags & UFFD_USER_MODE_ONLY)
+		return true;
+
+	/*
+	 * The user is requesting a userfaultfd which can handle kernel faults.
+	 * Privileged users are always allowed to do this.
+	 */
+	if (capable(CAP_SYS_PTRACE))
+		return true;
+
+	/* Otherwise, access to kernel fault handling is sysctl controlled. */
+	return sysctl_unprivileged_userfaultfd;
+}
+
+static int new_userfaultfd(bool is_syscall, int flags)
 {
 	struct userfaultfd_ctx *ctx;
 	int fd;
 
-	if (!sysctl_unprivileged_userfaultfd &&
-	    (flags & UFFD_USER_MODE_ONLY) == 0 &&
-	    !capable(CAP_SYS_PTRACE)) {
-		printk_once(KERN_WARNING "uffd: Set unprivileged_userfaultfd "
-			"sysctl knob to 1 if kernel faults must be handled "
-			"without obtaining CAP_SYS_PTRACE capability\n");
+	if (is_syscall && !userfaultfd_syscall_allowed(flags))
 		return -EPERM;
-	}
 
 	BUG_ON(!current->mm);
 
@@ -2098,8 +2105,42 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
 	return fd;
 }
 
+SYSCALL_DEFINE1(userfaultfd, int, flags)
+{
+	return new_userfaultfd(true, flags);
+}
+
+static int userfaultfd_dev_open(struct inode *inode, struct file *file)
+{
+	return 0;
+}
+
+static long userfaultfd_dev_ioctl(struct file *file, unsigned int cmd, unsigned long flags)
+{
+	if (cmd != USERFAULTFD_IOC_NEW)
+		return -EINVAL;
+
+	return new_userfaultfd(false, flags);
+}
+
+static const struct file_operations userfaultfd_dev_fops = {
+	.open = userfaultfd_dev_open,
+	.unlocked_ioctl = userfaultfd_dev_ioctl,
+	.compat_ioctl = userfaultfd_dev_ioctl,
+	.owner = THIS_MODULE,
+	.llseek = noop_llseek,
+};
+
+static struct miscdevice userfaultfd_misc = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = "userfaultfd",
+	.fops = &userfaultfd_dev_fops
+};
+
 static int __init userfaultfd_init(void)
 {
+	WARN_ON(misc_register(&userfaultfd_misc));
+
 	userfaultfd_ctx_cachep = kmem_cache_create("userfaultfd_ctx_cache",
 						sizeof(struct userfaultfd_ctx),
 						0,
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 7d32b1e797fb..005e5e306266 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -12,6 +12,10 @@
 
 #include
 
+/* ioctls for /dev/userfaultfd */
+#define USERFAULTFD_IOC 0xAA
+#define USERFAULTFD_IOC_NEW _IO(USERFAULTFD_IOC, 0x00)
+
 /*
  * If the UFFDIO_API is upgraded someday, the UFFDIO_UNREGISTER and
  * UFFDIO_WAKE ioctls should be defined as _IOW and not as _IOR. In

From patchwork Tue Jul 19 19:56:26 2022
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 12922974
Date: Tue, 19 Jul 2022 12:56:26 -0700
In-Reply-To: <20220719195628.3415852-1-axelrasmussen@google.com>
Message-Id: <20220719195628.3415852-4-axelrasmussen@google.com>
References: <20220719195628.3415852-1-axelrasmussen@google.com>
Subject: [PATCH v4 3/5] userfaultfd: selftests: modify selftest to use /dev/userfaultfd
From: Axel Rasmussen
To: Alexander Viro, Andrew Morton, Dave Hansen, "Dmitry V . Levin",
    Gleb Fotengauer-Malinovskiy, Hugh Dickins, Jan Kara, Jonathan Corbet,
    Mel Gorman, Mike Kravetz, Mike Rapoport, Nadav Amit, Peter Xu,
    Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, zhangyi
Cc: Axel Rasmussen, linux-doc@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, linux-kselftest@vger.kernel.org

We clearly want to ensure both userfaultfd(2) and /dev/userfaultfd keep
working into the future. Instead of always running the test twice, once
with each interface, let the user choose which to test.

As with other test features, change the behavior based on a new command
line flag. Introduce the idea of "test mods", which are generic (not
specific to a test type) modifications to the behavior of the test. This is
sort of borrowed from this RFC patch series [1], but simplified a bit.

The benefit is, in "typical" configurations this test is somewhat slow
(say, around 30 seconds). Testing both clearly doubles it, so it may not
always be desirable, as users are likely to use one or the other, but never
both, in the "real world".

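To illustrate the "test mods" syntax described above, here is a minimal
standalone sketch of splitting an argument such as "anon:dev" on ':' with
strsep(3). It is not the selftest code: struct test_config and
parse_test_arg() are hypothetical names made up for this illustration, and
the sketch only records the type string rather than checking it against the
supported test types. The mod names "dev" and "syscall" are the ones this
patch introduces.

```c
#define _DEFAULT_SOURCE /* exposes strsep() and strdup() on glibc */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed result; not part of the selftest. */
struct test_config {
	char type[32];	/* first token, e.g. "anon" or "hugetlb_shared" */
	bool use_dev;	/* true: /dev/userfaultfd, false: userfaultfd(2) */
};

/*
 * Split "type[:mod[:mod...]]" on ':'. The first token is taken as the test
 * type; every later token must be a known mod. Returns 0 on success, or -1
 * on an empty or overlong type, an unrecognized mod, or allocation failure.
 */
static int parse_test_arg(const char *raw, struct test_config *cfg)
{
	char *copy, *cursor;
	const char *token;
	bool have_type = false;
	int ret = 0;

	assert(raw != NULL && cfg != NULL);

	copy = strdup(raw);
	if (!copy)
		return -1;

	cfg->type[0] = '\0';
	cfg->use_dev = false;	/* "syscall" is the default mod */

	/* strsep() advances cursor, so keep copy around for free(). */
	cursor = copy;
	while (ret == 0 && (token = strsep(&cursor, ":")) != NULL) {
		if (!have_type) {
			if (*token == '\0' || strlen(token) >= sizeof(cfg->type)) {
				ret = -1;
			} else {
				strcpy(cfg->type, token);
				have_type = true;
			}
		} else if (!strcmp(token, "dev")) {
			cfg->use_dev = true;
		} else if (!strcmp(token, "syscall")) {
			cfg->use_dev = false;
		} else {
			ret = -1;	/* unrecognized test mod */
		}
	}

	free(copy);
	return ret;
}
```

With this sketch, parsing "hugetlb_shared:dev" would leave cfg.type as
"hugetlb_shared" with cfg.use_dev set, while an unknown mod such as
"anon:bogus" is rejected with -1. The pointer returned by strdup() is kept
separately because strsep() advances its argument, so the copy can still be
freed.
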
[1]: https://patchwork.kernel.org/project/linux-mm/patch/20201129004548.1619714-14-namit@vmware.com/

Signed-off-by: Axel Rasmussen
Acked-by: Peter Xu
---
 tools/testing/selftests/vm/userfaultfd.c | 69 ++++++++++++++++++++----
 1 file changed, 60 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 0bdfc1955229..0a126c620bc0 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -77,6 +77,11 @@ static int bounces;
 #define TEST_SHMEM 3
 static int test_type;
 
+#define UFFD_FLAGS (O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY)
+
+/* test using /dev/userfaultfd, instead of userfaultfd(2) */
+static bool test_dev_userfaultfd;
+
 /* exercise the test_uffdio_*_eexist every ALARM_INTERVAL_SECS */
 #define ALARM_INTERVAL_SECS 10
 static volatile bool test_uffdio_copy_eexist = true;
@@ -125,6 +130,8 @@ struct uffd_stats {
 const char *examples =
     "# Run anonymous memory test on 100MiB region with 99999 bounces:\n"
     "./userfaultfd anon 100 99999\n\n"
+    "# Run the same anonymous memory test, but using /dev/userfaultfd:\n"
+    "./userfaultfd anon:dev 100 99999\n\n"
     "# Run share memory test on 1GiB region with 99 bounces:\n"
     "./userfaultfd shmem 1000 99\n\n"
     "# Run hugetlb memory test on 256MiB region with 50 bounces:\n"
@@ -141,6 +148,14 @@ static void usage(void)
 		"[hugetlbfs_file]\n\n");
 	fprintf(stderr, "Supported : anon, hugetlb, "
 		"hugetlb_shared, shmem\n\n");
+	fprintf(stderr, "'Test mods' can be joined to the test type string with a ':'. "
+		"Supported mods:\n");
+	fprintf(stderr, "\tsyscall - Use userfaultfd(2) (default)\n");
+	fprintf(stderr, "\tdev - Use /dev/userfaultfd instead of userfaultfd(2)\n");
+	fprintf(stderr, "\nExample test mod usage:\n");
+	fprintf(stderr, "# Run anonymous memory test with /dev/userfaultfd:\n");
+	fprintf(stderr, "./userfaultfd anon:dev 100 99999\n\n");
+
 	fprintf(stderr, "Examples:\n\n");
 	fprintf(stderr, "%s", examples);
 	exit(1);
@@ -154,12 +169,14 @@ static void usage(void)
 			ret, __LINE__);				\
 	} while (0)
 
-#define err(fmt, ...)				\
+#define errexit(exitcode, fmt, ...)		\
 	do {					\
 		_err(fmt, ##__VA_ARGS__);	\
-		exit(1);			\
+		exit(exitcode);			\
 	} while (0)
 
+#define err(fmt, ...) errexit(1, fmt, ##__VA_ARGS__)
+
 static void uffd_stats_reset(struct uffd_stats *uffd_stats,
 			     unsigned long n_cpus)
 {
@@ -383,13 +400,29 @@ static void assert_expected_ioctls_present(uint64_t mode, uint64_t ioctls)
 	}
 }
 
+static int __userfaultfd_open_dev(void)
+{
+	int fd, _uffd = -1;
+
+	fd = open("/dev/userfaultfd", O_RDWR | O_CLOEXEC);
+	if (fd < 0)
+		return -1;
+
+	_uffd = ioctl(fd, USERFAULTFD_IOC_NEW, UFFD_FLAGS);
+	close(fd);
+	return _uffd;
+}
+
 static void userfaultfd_open(uint64_t *features)
 {
 	struct uffdio_api uffdio_api;
 
-	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
+	if (test_dev_userfaultfd)
+		uffd = __userfaultfd_open_dev();
+	else
+		uffd = syscall(__NR_userfaultfd, UFFD_FLAGS);
 	if (uffd < 0)
-		err("userfaultfd syscall not available in this kernel");
+		errexit(KSFT_SKIP, "creating userfaultfd failed");
 	uffd_flags = fcntl(uffd, F_GETFD, NULL);
 
 	uffdio_api.api = UFFD_API;
@@ -1584,8 +1617,6 @@ unsigned long default_huge_page_size(void)
 
 static void set_test_type(const char *type)
 {
-	uint64_t features = UFFD_API_FEATURES;
-
 	if (!strcmp(type, "anon")) {
 		test_type = TEST_ANON;
 		uffd_test_ops = &anon_uffd_test_ops;
@@ -1603,9 +1634,29 @@ static void set_test_type(const char *type)
 		test_type = TEST_SHMEM;
 		uffd_test_ops = &shmem_uffd_test_ops;
 		test_uffdio_minor = true;
-	} else {
-		err("Unknown test type: %s", type);
 	}
+}
+
+static void parse_test_type_arg(const char *raw_type)
+{
+	char *buf = strdup(raw_type);
+	uint64_t features = UFFD_API_FEATURES;
+
+	while (buf) {
+		const char *token = strsep(&buf, ":");
+
+		if (!test_type)
+			set_test_type(token);
+		else if (!strcmp(token, "dev"))
+			test_dev_userfaultfd = true;
+		else if (!strcmp(token, "syscall"))
+			test_dev_userfaultfd = false;
+		else
+			err("unrecognized test mod '%s'", token);
+	}
+
+	if (!test_type)
+		err("failed to parse test type argument: '%s'", raw_type);
 
 	if (test_type == TEST_HUGETLB)
 		page_size = default_huge_page_size();
@@ -1653,7 +1704,7 @@ int main(int argc, char **argv)
 		err("failed to arm SIGALRM");
 	alarm(ALARM_INTERVAL_SECS);
 
-	set_test_type(argv[1]);
+	parse_test_type_arg(argv[1]);
 
 	nr_cpus = sysconf(_SC_NPROCESSORS_ONLN);
 	nr_pages_per_cpu = atol(argv[2]) * 1024*1024 / page_size /

From patchwork Tue Jul 19 19:56:27 2022
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 12922975
Date: Tue, 19 Jul 2022 12:56:27 -0700
In-Reply-To: <20220719195628.3415852-1-axelrasmussen@google.com>
Message-Id: <20220719195628.3415852-5-axelrasmussen@google.com>
References: <20220719195628.3415852-1-axelrasmussen@google.com>
Subject: [PATCH v4 4/5] userfaultfd: update documentation to describe /dev/userfaultfd
From: Axel Rasmussen
To: Alexander Viro, Andrew Morton, Dave Hansen, "Dmitry V . Levin",
    Gleb Fotengauer-Malinovskiy, Hugh Dickins, Jan Kara, Jonathan Corbet,
    Mel Gorman, Mike Kravetz, Mike Rapoport, Nadav Amit, Peter Xu,
    Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, zhangyi
Cc: Axel Rasmussen, linux-doc@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, linux-kselftest@vger.kernel.org

Explain the different ways to create a new userfaultfd, and how access
control works for each way.

Signed-off-by: Axel Rasmussen
Acked-by: Peter Xu
---
 Documentation/admin-guide/mm/userfaultfd.rst | 41 ++++++++++++++++++--
 Documentation/admin-guide/sysctl/vm.rst      |  3 ++
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index 6528036093e1..a76c9dc1865b 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -17,7 +17,10 @@ of the ``PROT_NONE+SIGSEGV`` trick.
 Design
 ======
 
-Userfaults are delivered and resolved through the ``userfaultfd`` syscall.
+Userspace creates a new userfaultfd, initializes it, and registers one or more
+regions of virtual memory with it.
Then, any page faults which occur within the +region(s) result in a message being delivered to the userfaultfd, notifying +userspace of the fault. The ``userfaultfd`` (aside from registering and unregistering virtual memory ranges) provides two primary functionalities: @@ -34,12 +37,11 @@ The real advantage of userfaults if compared to regular virtual memory management of mremap/mprotect is that the userfaults in all their operations never involve heavyweight structures like vmas (in fact the ``userfaultfd`` runtime load never takes the mmap_lock for writing). - Vmas are not suitable for page- (or hugepage) granular fault tracking when dealing with virtual address spaces that could span Terabytes. Too many vmas would be needed for that. -The ``userfaultfd`` once opened by invoking the syscall, can also be +The ``userfaultfd``, once created, can also be passed using unix domain sockets to a manager process, so the same manager process could handle the userfaults of a multitude of different processes without them being aware about what is going on @@ -50,6 +52,39 @@ is a corner case that would currently return ``-EBUSY``). API === +Creating a userfaultfd +---------------------- + +There are two ways to create a new userfaultfd, each of which provide ways to +restrict access to this functionality (since historically userfaultfds which +handle kernel page faults have been a useful tool for exploiting the kernel). + +The first way, supported since userfaultfd was introduced, is the +userfaultfd(2) syscall. Access to this is controlled in several ways: + +- Any user can always create a userfaultfd which traps userspace page faults + only. Such a userfaultfd can be created using the userfaultfd(2) syscall + with the flag UFFD_USER_MODE_ONLY. + +- In order to also trap kernel page faults for the address space, then either + the process needs the CAP_SYS_PTRACE capability, or the system must have + vm.unprivileged_userfaultfd set to 1. 
By default, vm.unprivileged_userfaultfd + is set to 0. + +The second way, added to the kernel more recently, is by opening and issuing a +USERFAULTFD_IOC_NEW ioctl to /dev/userfaultfd. This method yields equivalent +userfaultfds to the userfaultfd(2) syscall. + +Unlike userfaultfd(2), access to /dev/userfaultfd is controlled via normal +filesystem permissions (user/group/mode), which gives fine grained access to +userfaultfd specifically, without also granting other unrelated privileges at +the same time (as e.g. granting CAP_SYS_PTRACE would do). Users who have access +to /dev/userfaultfd can always create userfaultfds that trap kernel page faults; +vm.unprivileged_userfaultfd is not considered. + +Initializing a userfaultfd +-------------------------- + When first opened the ``userfaultfd`` must be enabled invoking the ``UFFDIO_API`` ioctl specifying a ``uffdio_api.api`` value set to ``UFFD_API`` (or a later API version) which will specify the ``read/POLLIN`` protocol diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst index 5c9aa171a0d3..36cf21f3b7ab 100644 --- a/Documentation/admin-guide/sysctl/vm.rst +++ b/Documentation/admin-guide/sysctl/vm.rst @@ -928,6 +928,9 @@ calls without any restrictions. The default value is 0. +Another way to control permissions for userfaultfd is to use +/dev/userfaultfd instead of userfaultfd(2). See +Documentation/admin-guide/mm/userfaultfd.rst. 
 
 user_reserve_kbytes
 ===================

From patchwork Tue Jul 19 19:56:28 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 12922976
Date: Tue, 19 Jul 2022 12:56:28 -0700
In-Reply-To: <20220719195628.3415852-1-axelrasmussen@google.com>
Message-Id: <20220719195628.3415852-6-axelrasmussen@google.com>
References: <20220719195628.3415852-1-axelrasmussen@google.com>
X-Mailer: git-send-email 2.37.0.170.g444d1eabd0-goog
Subject: [PATCH v4 5/5] selftests: vm: add /dev/userfaultfd test cases to run_vmtests.sh
From: Axel Rasmussen
To: Alexander Viro, Andrew Morton, Dave Hansen, "Dmitry V . Levin",
 Gleb Fotengauer-Malinovskiy, Hugh Dickins, Jan Kara, Jonathan Corbet,
 Mel Gorman, Mike Kravetz, Mike Rapoport, Nadav Amit, Peter Xu, Shuah Khan,
 Suren Baghdasaryan, Vlastimil Babka, zhangyi
Cc: Axel Rasmussen, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org, Shuah Khan

This new mode was recently added to the userfaultfd selftest. We want to
exercise both userfaultfd(2) as well as /dev/userfaultfd, so add both
test cases to the script.

Reviewed-by: Shuah Khan
Acked-by: Peter Xu
Signed-off-by: Axel Rasmussen
---
 tools/testing/selftests/vm/run_vmtests.sh | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/testing/selftests/vm/run_vmtests.sh b/tools/testing/selftests/vm/run_vmtests.sh
index e70ae0f3aaf6..156f864030fc 100755
--- a/tools/testing/selftests/vm/run_vmtests.sh
+++ b/tools/testing/selftests/vm/run_vmtests.sh
@@ -121,12 +121,17 @@ run_test ./gup_test -a
 run_test ./gup_test -ct -F 0x1 0 19 0x1000
 
 run_test ./userfaultfd anon 20 16
+run_test ./userfaultfd anon:dev 20 16
 # Hugetlb tests require source and destination huge pages. Pass in half the
 # size ($half_ufd_size_MB), which is used for *each*.
 run_test ./userfaultfd hugetlb "$half_ufd_size_MB" 32
+run_test ./userfaultfd hugetlb:dev "$half_ufd_size_MB" 32
 run_test ./userfaultfd hugetlb_shared "$half_ufd_size_MB" 32 "$mnt"/uffd-test
 rm -f "$mnt"/uffd-test
+run_test ./userfaultfd hugetlb_shared:dev "$half_ufd_size_MB" 32 "$mnt"/uffd-test
+rm -f "$mnt"/uffd-test
 run_test ./userfaultfd shmem 20 16
+run_test ./userfaultfd shmem:dev 20 16
 
 #cleanup
 umount "$mnt"
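
Note on the "type:mod" argument form used by the new run_vmtests.sh
entries (anon:dev, hugetlb:dev, and so on): the selftest splits its first
argument on ':' with strsep(), treats the first token as the test type, and
treats "dev" or "syscall" as a modifier selecting /dev/userfaultfd or
userfaultfd(2). The following is a minimal standalone sketch of that
splitting, not the selftest code itself. The names struct test_arg and
parse_test_arg() are hypothetical, the sketch returns an error code where the
selftest calls err(), and it does not check the type against the selftest's
list of known test types. It also keeps a second pointer to the strdup()
copy so the copy can be freed, which the hunk as posted does not appear
to do.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for strsep() and strdup() on glibc */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch; the selftest uses globals. */
struct test_arg {
	char type[32];	/* first token, e.g. "anon" or "hugetlb_shared" */
	bool use_dev;	/* true: /dev/userfaultfd, false: userfaultfd(2) */
};

/*
 * Parse "type[:mod[:mod...]]". Returns 0 on success, or -1 on an empty or
 * over-long type, an unknown modifier, or allocation failure.
 */
static int parse_test_arg(const char *raw, struct test_arg *out)
{
	char *copy, *cursor, *token;
	bool have_type = false;
	int ret = 0;

	assert(raw && out);

	copy = strdup(raw);
	if (!copy)
		return -1;
	cursor = copy;	/* strsep() advances cursor; copy stays valid for free() */

	out->type[0] = '\0';
	out->use_dev = false;

	while ((token = strsep(&cursor, ":")) != NULL) {
		if (!have_type) {
			/* The first token is the test type and must fit the buffer. */
			if (token[0] == '\0' || strlen(token) >= sizeof(out->type)) {
				ret = -1;
				break;
			}
			strcpy(out->type, token);
			have_type = true;
		} else if (!strcmp(token, "dev")) {
			out->use_dev = true;
		} else if (!strcmp(token, "syscall")) {
			out->use_dev = false;
		} else {
			/* Unknown modifier: the selftest reports this via err(). */
			ret = -1;
			break;
		}
	}

	free(copy);
	return ret;
}
```

With this sketch, parse_test_arg("hugetlb_shared:dev", &arg) returns 0,
leaves arg.type as "hugetlb_shared" and sets arg.use_dev, while a plain
"anon" keeps arg.use_dev false, matching the syscall default implied by the
existing run_vmtests.sh entries.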