From patchwork Thu Dec 12 06:36:35 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Roth
X-Patchwork-Id: 13904724
From: Michael Roth <Michael.Roth@amd.com>
Subject: [PATCH 5/5] KVM: Add hugepage support for dedicated guest memory
Date: Thu, 12 Dec 2024 00:36:35 -0600
Message-ID: <20241212063635.712877-6-michael.roth@amd.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20241212063635.712877-1-michael.roth@amd.com>
References: <20241212063635.712877-1-michael.roth@amd.com>
MIME-Version: 1.0
From: Sean Christopherson

Extend guest_memfd to allow backing guest memory with hugepages. This is
done on a best-effort basis by default, until a better-defined mechanism is
put in place that can provide stronger control/assurances to userspace
about hugepage allocations.

When reporting the max order as KVM gets a pfn from guest_memfd, force
order-0 pages if the hugepage is not fully contained by the memslot
binding, e.g. if userspace requested hugepages but punches a hole in the
memslot bindings in order to emulate x86's VGA hole.

Link: https://lore.kernel.org/kvm/20231027182217.3615211-1-seanjc@google.com/T/#mccbd3e8bf9897f0ddbf864e6318d6f2f208b269c
Signed-off-by: Sean Christopherson
Message-Id: <20231027182217.3615211-18-seanjc@google.com>
[Allow even with CONFIG_TRANSPARENT_HUGEPAGE; dropped momentarily due to
 uneasiness about the API. - Paolo]
Signed-off-by: Paolo Bonzini
[mdr: based on discussion in the Link regarding original patch, make the
 following set of changes:
 - For now, don't introduce an opt-in flag to enable hugepage support. By
   default, just make a best-effort for PMD_ORDER allocations so that
   there are no false assurances to userspace that they'll get hugepages.
   Performance-wise, it's at least better than the current guarantee that
   they will get 4K pages every time. A more proper opt-in interface can
   then improve on things later.
 - Pass __GFP_NOWARN to the folio allocation so failures are not
   disruptive to normal operations.
 - Drop size checks during creation time. Instead just avoid huge
   allocations if they extend beyond end of the memfd.
 - Drop hugepage-related unit tests since everything is now handled
   transparently to userspace anyway.
 - Update commit message accordingly.]
Signed-off-by: Michael Roth
---
 include/linux/kvm_host.h |  2 ++
 virt/kvm/guest_memfd.c   | 68 +++++++++++++++++++++++++++++++---------
 virt/kvm/kvm_main.c      |  4 +++
 3 files changed, 59 insertions(+), 15 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c7e4f8be3e17..c946ec98d614 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2278,6 +2278,8 @@ extern unsigned int halt_poll_ns_grow;
 extern unsigned int halt_poll_ns_grow_start;
 extern unsigned int halt_poll_ns_shrink;
 
+extern unsigned int gmem_2m_enabled;
+
 struct kvm_device {
 	const struct kvm_device_ops *ops;
 	struct kvm *kvm;
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 9a5172de6a03..d0caec99fe03 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -273,6 +273,36 @@ static int kvm_gmem_prepare_folio(struct kvm *kvm, struct file *file,
 	return r;
 }
 
+static struct folio *kvm_gmem_get_huge_folio(struct inode *inode, pgoff_t index,
+					     unsigned int order)
+{
+	pgoff_t npages = 1UL << order;
+	pgoff_t huge_index = round_down(index, npages);
+	struct address_space *mapping = inode->i_mapping;
+	gfp_t gfp = mapping_gfp_mask(mapping) | __GFP_NOWARN;
+	loff_t size = i_size_read(inode);
+	struct folio *folio;
+
+	/* Make sure hugepages would be fully-contained by inode */
+	if ((huge_index + npages) * PAGE_SIZE > size)
+		return NULL;
+
+	if (filemap_range_has_page(mapping, (loff_t)huge_index << PAGE_SHIFT,
+				   (loff_t)(huge_index + npages - 1) << PAGE_SHIFT))
+		return NULL;
+
+	folio = filemap_alloc_folio(gfp, order);
+	if (!folio)
+		return NULL;
+
+	if (filemap_add_folio(mapping, folio, huge_index, gfp)) {
+		folio_put(folio);
+		return NULL;
+	}
+
+	return folio;
+}
+
 /*
  * Returns a locked folio on success. The caller is responsible for
  * setting the up-to-date flag before the memory is mapped into the guest.
@@ -284,8 +314,15 @@ static int kvm_gmem_prepare_folio(struct kvm *kvm, struct file *file,
  */
 static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index)
 {
-	/* TODO: Support huge pages. */
-	return filemap_grab_folio(inode->i_mapping, index);
+	struct folio *folio = NULL;
+
+	if (gmem_2m_enabled)
+		folio = kvm_gmem_get_huge_folio(inode, index, PMD_ORDER);
+
+	if (!folio)
+		folio = filemap_grab_folio(inode->i_mapping, index);
+
+	return folio;
 }
 
 static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start,
@@ -660,6 +697,7 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 	inode->i_size = size;
 	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
 	mapping_set_inaccessible(inode->i_mapping);
+	mapping_set_large_folios(inode->i_mapping);
 
 	/* Unmovable mappings are supposed to be marked unevictable as well. */
 	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
@@ -791,6 +829,7 @@ static struct folio *__kvm_gmem_get_pfn(struct file *file,
 {
 	struct kvm_gmem *gmem = file->private_data;
 	struct folio *folio;
+	pgoff_t huge_index;
 
 	if (file != slot->gmem.file) {
 		WARN_ON_ONCE(slot->gmem.file);
@@ -803,6 +842,17 @@ static struct folio *__kvm_gmem_get_pfn(struct file *file,
 		return ERR_PTR(-EIO);
 	}
 
+	/*
+	 * The folio can be mapped with a hugepage if and only if the folio is
+	 * fully contained by the range the memslot is bound to. Note, the
+	 * caller is responsible for handling gfn alignment, this only deals
+	 * with the file binding.
+	 */
+	huge_index = ALIGN_DOWN(index, 1ull << *max_order);
+	if (huge_index < slot->gmem.pgoff ||
+	    huge_index + (1ull << *max_order) > slot->gmem.pgoff + slot->npages)
+		*max_order = 0;
+
 	folio = kvm_gmem_get_folio(file_inode(file), index);
 	if (IS_ERR(folio))
 		return folio;
@@ -814,8 +864,7 @@ static struct folio *__kvm_gmem_get_pfn(struct file *file,
 	}
 
 	*pfn = folio_file_pfn(folio, index);
-	if (max_order)
-		*max_order = 0;
+	*max_order = min_t(int, *max_order, folio_order(folio));
 
 	return folio;
 }
@@ -910,17 +959,6 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t start_gfn, void __user *src, long
 			break;
 		}
 
-		/*
-		 * The max order shouldn't extend beyond the GFN range being
-		 * populated in this iteration, so set max_order accordingly.
-		 * __kvm_gmem_get_pfn() will then further adjust the order to
-		 * one that is contained by the backing memslot/folio.
-		 */
-		max_order = 0;
-		while (IS_ALIGNED(gfn, 1 << (max_order + 1)) &&
-		       (npages - i >= (1 << (max_order + 1))))
-			max_order++;
-
 		folio = __kvm_gmem_get_pfn(file, slot, index, &pfn, &max_order);
 		if (IS_ERR(folio)) {
 			ret = PTR_ERR(folio);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5901d03e372c..525d136ba235 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -94,6 +94,10 @@ unsigned int halt_poll_ns_shrink = 2;
 module_param(halt_poll_ns_shrink, uint, 0644);
 EXPORT_SYMBOL_GPL(halt_poll_ns_shrink);
 
+unsigned int gmem_2m_enabled;
+EXPORT_SYMBOL_GPL(gmem_2m_enabled);
+module_param(gmem_2m_enabled, uint, 0644);
+
 /*
  * Allow direct access (from KVM or the CPU) without MMU notifier protection
  * to unpinned pages.
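
[Editor's note: the memslot-containment check added to __kvm_gmem_get_pfn()
can be exercised as plain arithmetic. The sketch below is a userspace model
only, not kernel code; adjust_max_order() and its parameters are
hypothetical stand-ins for the patch's index, *max_order, slot->gmem.pgoff,
and slot->npages values.]

```c
#include <assert.h>

/*
 * Userspace model of the containment check in __kvm_gmem_get_pfn():
 * align `index` down to its naturally-aligned huge range, and if that
 * range is not fully covered by the memslot's file binding
 * [slot_pgoff, slot_pgoff + slot_npages), force order 0.
 */
static unsigned int adjust_max_order(unsigned long index,
				     unsigned int max_order,
				     unsigned long slot_pgoff,
				     unsigned long slot_npages)
{
	/* ALIGN_DOWN(index, 1 << max_order) */
	unsigned long huge_index = index & ~((1UL << max_order) - 1);

	if (huge_index < slot_pgoff ||
	    huge_index + (1UL << max_order) > slot_pgoff + slot_npages)
		return 0;
	return max_order;
}
```

For instance, with order 9 (a 512-page/2M range) and a slot bound at file
offset 512 covering 512 pages, index 768 keeps order 9, while an index whose
aligned 2M range begins before the binding (the VGA-hole case from the
commit message) is forced down to order 0.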
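[Editor's note: the best-effort shape of kvm_gmem_get_folio() (try a
PMD-order folio, silently fall back to order 0) can likewise be sketched in
userspace. Everything below is a hypothetical model: get_backing() and the
calloc() stand-in for filemap_alloc_folio() are illustrative, and it assumes
order-9 (2M) PMD pages on 4K-page x86.]

```c
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PMD_ORDER 9	/* 2M / 4K: the x86 assumption here */

/*
 * Model of the best-effort policy: when hugepages are enabled, attempt
 * a 2M-sized backing allocation first (mirroring the call to
 * kvm_gmem_get_huge_folio()); on failure, or when disabled, fall back
 * to a single 4K page so the guest always gets *some* backing.
 */
static void *get_backing(int huge_enabled, unsigned int *order_out)
{
	void *p = NULL;

	*order_out = 0;
	if (huge_enabled) {
		/* try 2M first; failure is silent (cf. __GFP_NOWARN) */
		p = calloc(1UL << SKETCH_PMD_ORDER, SKETCH_PAGE_SIZE);
		if (p)
			*order_out = SKETCH_PMD_ORDER;
	}
	if (!p)
		p = calloc(1, SKETCH_PAGE_SIZE);	/* 4K fallback */
	return p;
}
```

The design point this mirrors: callers never see the hugepage attempt fail;
they only observe a smaller resulting order, which is why the patch caps
*max_order with folio_order() rather than returning an error.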