From: Patrick Roy
Subject: [RFC PATCH v3 0/6] Direct Map Removal for guest_memfd
Date: Wed, 30 Oct 2024 13:49:04 +0000
Message-ID: <20241030134912.515725-1-roypat@amazon.co.uk>

Unmapping virtual machine guest memory from the host kernel's direct map
is a successful mitigation against Spectre-style transient execution
issues: If the kernel page tables do not contain entries pointing to
guest memory, then any attempted speculative read through the direct map
will necessarily be blocked by the MMU before any observable
microarchitectural side-effects happen. This means that Spectre gadgets
and similar cannot be used to target virtual machine memory. Roughly 60%
of speculative execution issues fall into this category [1, Table 1].

This patch series extends guest_memfd with the ability to remove its
memory from the host kernel's direct map, to be able to attain the above
protection for KVM guests whose memory is backed by guest_memfd.

=== Changes to v2 ===

- Handle direct map removal for physically contiguous pages in arch code
  (Mike R.)
- Track the direct map state in guest_memfd itself instead of at the
  folio level, to prepare for huge pages support (Sean C.)
- Allow configuring direct map state of not-yet-faulted-in memory
  (Vishal A.)
- Pay attention to alignment in ftrace structs (Steven R.)

Most significantly, I've reduced the patch series to focus only on
direct map removal for guest_memfd for now, leaving the whole "how to do
non-CoCo VMs in guest_memfd" question for later. If this separation is
acceptable, then I think I can drop the RFC tag in the next revision
(I've mainly kept it here because I'm not entirely sure what to do with
patches 3 and 4).

=== Implementation ===

This patch series introduces a new flag to the KVM_CREATE_GUEST_MEMFD
ioctl that causes guest_memfd to remove its pages from the host kernel's
direct map immediately after population/preparation. It also adds
infrastructure for tracking the direct map state of all gmem folios
inside the guest_memfd inode. Storing this information in the inode has
two advantages. First, the code is ready for future hugepages
extensions, where removing/reinserting direct map entries for only a
sub-range of a huge folio is a valid use case. Second, it allows
pre-configuring the direct map state of not-yet-faulted-in parts of
memory (for example, when the VMM is receiving an RX virtio buffer from
the guest).
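
As a rough illustration only (this is not code from the series), the
behaviour described above can be modelled in plain userspace C. All
toy_* names, TOY_GMEM_NO_DIRECT_MAP, TOY_NR_PAGES and the flat per-page
arrays are hypothetical stand-ins: the real flag lives in
include/uapi/linux/kvm.h, the real tracking lives in the guest_memfd
inode, and real direct map changes go through the arch APIs from
patch 1. The sketch only shows the bookkeeping idea: a desired state is
recorded per page offset, even for pages that do not exist yet, and is
applied when the page is populated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the flag added by patch 2 (value assumed). */
#define TOY_GMEM_NO_DIRECT_MAP (1ULL << 0)
/* Hypothetical fixed size of the toy file, in pages. */
#define TOY_NR_PAGES 16

/* Toy model of a guest_memfd inode; not the kernel's data structure. */
struct toy_gmem_inode {
	uint64_t flags;
	bool populated[TOY_NR_PAGES];       /* page faulted in / prepared? */
	bool want_direct_map[TOY_NR_PAGES]; /* desired state per page offset */
	bool in_direct_map[TOY_NR_PAGES];   /* simulated direct map entry */
};

/* Create the toy inode; the flag selects the default desired state. */
static void toy_gmem_init(struct toy_gmem_inode *inode, uint64_t flags)
{
	assert(inode != NULL);
	inode->flags = flags;
	for (size_t i = 0; i < TOY_NR_PAGES; i++) {
		inode->populated[i] = false;
		inode->want_direct_map[i] = !(flags & TOY_GMEM_NO_DIRECT_MAP);
		inode->in_direct_map[i] = false;
	}
}

/*
 * Configure the desired direct map state for [start, start + nr).
 * Works for populated and not-yet-populated pages alike; populated
 * pages are updated immediately. Returns -1 if the range is invalid.
 */
static int toy_gmem_set_direct_map(struct toy_gmem_inode *inode,
				   size_t start, size_t nr, bool valid)
{
	if (start >= TOY_NR_PAGES || nr > TOY_NR_PAGES - start)
		return -1;
	for (size_t i = start; i < start + nr; i++) {
		inode->want_direct_map[i] = valid;
		if (inode->populated[i])
			inode->in_direct_map[i] = valid;
	}
	return 0;
}

/* Populate one page and apply the state recorded for its offset. */
static int toy_gmem_populate(struct toy_gmem_inode *inode, size_t idx)
{
	if (idx >= TOY_NR_PAGES)
		return -1;
	inode->populated[idx] = true;
	inode->in_direct_map[idx] = inode->want_direct_map[idx];
	return 0;
}

/* Would the host kernel reach this page through the direct map? */
static bool toy_gmem_host_mapped(const struct toy_gmem_inode *inode,
				 size_t idx)
{
	return idx < TOY_NR_PAGES && inode->populated[idx] &&
	       inode->in_direct_map[idx];
}
```

In this model, a caller would initialise the inode with
TOY_GMEM_NO_DIRECT_MAP, mark the offsets of an expected RX buffer as
valid with toy_gmem_set_direct_map() before they are faulted in, and
then populate pages; only the pre-configured offsets end up host-mapped.
A flat array is used here purely for brevity and says nothing about the
data structure the series actually uses.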

=== Summary ===

Patch 1 (from Mike Rapoport) adds arch APIs for manipulating the direct
map for ranges of physically contiguous pages, which are used by
guest_memfd in follow up patches. Patch 2 adds the
KVM_GMEM_NO_DIRECT_MAP flag and the logic for configuring direct map
state of freshly prepared folios. Patches 3 and 4 mainly serve an
illustrative purpose, to show how the framework from patch 2 can be
extended with routines for runtime direct map manipulation. Patches 5
and 6 deal with documentation and self-tests respectively.

[1]: https://download.vusec.net/papers/quarantine_raid23.pdf
[RFC v1]: https://lore.kernel.org/kvm/20240709132041.3625501-1-roypat@amazon.co.uk/
[RFC v2]: https://lore.kernel.org/kvm/20240910163038.1298452-1-roypat@amazon.co.uk/

Mike Rapoport (Microsoft) (1):
  arch: introduce set_direct_map_valid_noflush()

Patrick Roy (5):
  kvm: gmem: add flag to remove memory from kernel direct map
  kvm: gmem: implement direct map manipulation routines
  kvm: gmem: add trace point for direct map state changes
  kvm: document KVM_GMEM_NO_DIRECT_MAP flag
  kvm: selftests: run gmem tests with KVM_GMEM_NO_DIRECT_MAP set

 Documentation/virt/kvm/api.rst                |  14 +
 arch/arm64/include/asm/set_memory.h           |   1 +
 arch/arm64/mm/pageattr.c                      |  10 +
 arch/loongarch/include/asm/set_memory.h       |   1 +
 arch/loongarch/mm/pageattr.c                  |  21 ++
 arch/riscv/include/asm/set_memory.h           |   1 +
 arch/riscv/mm/pageattr.c                      |  15 +
 arch/s390/include/asm/set_memory.h            |   1 +
 arch/s390/mm/pageattr.c                       |  11 +
 arch/x86/include/asm/set_memory.h             |   1 +
 arch/x86/mm/pat/set_memory.c                  |   8 +
 include/linux/set_memory.h                    |   6 +
 include/trace/events/kvm.h                    |  22 ++
 include/uapi/linux/kvm.h                      |   2 +
 .../testing/selftests/kvm/guest_memfd_test.c  |   2 +-
 .../kvm/x86_64/private_mem_conversions_test.c |   7 +-
 virt/kvm/guest_memfd.c                        | 280 +++++++++++++++++-
 17 files changed, 384 insertions(+), 19 deletions(-)

base-commit: 5cb1659f412041e4780f2e8ee49b2e03728a2ba6