From: Roman Kagan
CC: Shuah Khan, Dragan Cvetic, Fares Mehanna, Alexander Graf,
    Derek Kiernan, Greg Kroah-Hartman, David Woodhouse, Andrew Morton,
    Arnd Bergmann
Subject: [PATCH RFC 0/3] add support for mm-local memory allocations
Date: Fri, 21 Jun 2024 22:14:58 +0200
Message-ID: <20240621201501.1059948-1-rkagan@amazon.de>

In a series posted a few years ago [1], a proposal was put forward to
allow the kernel to allocate memory local to a mm and thus push it out
of reach for current and future speculation-based cross-process attacks.
We still believe this is a nice thing to have.

However, in the time since that posting, Linux mm has gained quite a
few new facilities, so we'd like to explore whether this functionality
can be implemented with less effort and churn by leveraging them.

Specifically, this is a proof-of-concept attempt to implement mm-local
allocations by piggy-backing on memfd_secret(): it uses regular user
addresses, but pins the pages and flips the user/supervisor flag on the
respective PTEs to make them directly accessible from the kernel, and
seals the VMA to prevent userland from taking over the address range.
This approach let us delegate all the heavy lifting -- address
management, interactions with the direct map, cleanup on mm teardown --
to the existing infrastructure, and required zero architecture-specific
code.

Compared to the approach used in the original series, where a dedicated
kernel address range and thus a dedicated PGD was used for mm-local
allocations, the one proposed here may have certain drawbacks, in
particular:

- using user addresses for kernel memory may violate assumptions in
  various parts of the kernel code, which the smoke tests we ran may
  not have uncovered

- the allocated addresses are guessable by userland (at the moment they
  are even visible in /proc/PID/maps, but that's fixable), which may
  weaken the security posture

Also included are a simple test driver and a selftest to smoke-test and
showcase the feature.

The code is a PoC RFC and lacks many checks and much special-case
handling, but it demonstrates the idea. We'd appreciate any feedback on
whether this is a viable approach, or whether it should be abandoned in
favor of the one with a dedicated PGD / kernel address range, or
something else entirely.

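For readers unfamiliar with the facility this PoC builds on, here is a
minimal userspace sketch of plain memfd_secret() usage. It is not code
from this series and does not exercise the kernel-side mm-local
allocator; the helper name secretmem_roundtrip() is hypothetical, and
the fallback syscall number 447 is an assumption that holds on
architectures using the unified syscall numbering.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_memfd_secret
#define SYS_memfd_secret 447	/* assumption: unified syscall numbering */
#endif

/*
 * Hypothetical helper: store msg in a secretmem page and read it back.
 * Returns 0 on success, 1 if memfd_secret() is not usable on this
 * system, -1 if a later step (ftruncate/mmap) fails.
 */
int secretmem_roundtrip(const char *msg, char *out, size_t outlen)
{
	assert(msg && out && outlen > 0);

	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	int fd = (int)syscall(SYS_memfd_secret, 0);
	if (fd < 0)
		return 1;	/* e.g. ENOSYS when secretmem is disabled */

	int ret = -1;
	if (ftruncate(fd, (off_t)len) == 0) {
		/* secretmem mappings must be MAP_SHARED */
		char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_SHARED, fd, 0);
		if (p != MAP_FAILED) {
			/* first touch faults the page in; the kernel
			 * removes it from the direct map */
			snprintf(p, len, "%s", msg);
			snprintf(out, outlen, "%s", p);
			munmap(p, len);
			ret = 0;
		}
	}
	close(fd);
	return ret;
}
```

A caller would pass a short string and a buffer, e.g.
secretmem_roundtrip("mm-local", buf, sizeof(buf)), and treat a return
value of 1 as "secretmem not available here" rather than as a failure.
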
[1] https://lore.kernel.org/lkml/20190612170834.14855-1-mhillenb@amazon.de/

Fares Mehanna (2):
  mseal: expose interface to seal / unseal user memory ranges
  mm/secretmem: implement mm-local kernel allocations

Roman Kagan (1):
  drivers/misc: add test driver and selftest for proclocal allocator

 drivers/misc/Makefile                        |   1 +
 tools/testing/selftests/proclocal/Makefile   |   6 +
 include/linux/secretmem.h                    |   8 +
 mm/internal.h                                |   7 +
 drivers/misc/proclocal-test.c                | 200 +++++++++++++++++
 mm/gup.c                                     |   4 +-
 mm/mseal.c                                   |  81 ++++---
 mm/secretmem.c                               | 208 ++++++++++++++++++
 .../selftests/proclocal/proclocal-test.c     | 150 +++++++++++++
 drivers/misc/Kconfig                         |  15 ++
 tools/testing/selftests/proclocal/.gitignore |   1 +
 11 files changed, 649 insertions(+), 32 deletions(-)
 create mode 100644 tools/testing/selftests/proclocal/Makefile
 create mode 100644 drivers/misc/proclocal-test.c
 create mode 100644 tools/testing/selftests/proclocal/proclocal-test.c
 create mode 100644 tools/testing/selftests/proclocal/.gitignore