From patchwork Tue Nov 5 16:45:47 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shivank Garg
X-Patchwork-Id: 13863217
From: Shivank Garg
CC: Shivansh Dhiman
Subject: [RFC PATCH 1/4] mm: Add mempolicy support to the filemap layer
Date: Tue, 5 Nov 2024 16:45:47 +0000
Message-ID: <20241105164549.154700-2-shivankg@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241105164549.154700-1-shivankg@amd.com>
References: <20241105164549.154700-1-shivankg@amd.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Shivansh Dhiman

Introduce mempolicy support to the filemap. Add filemap_grab_folio_mpol(),
filemap_alloc_folio_mpol_noprof() and __filemap_get_folio_mpol() APIs that
take a struct mempolicy as an argument.

The API is required by VMs using KVM guest-memfd memory backends for NUMA
mempolicy aware allocations.

Signed-off-by: Shivansh Dhiman
Signed-off-by: Shivank Garg
---
 include/linux/pagemap.h | 40 ++++++++++++++++++++++++++++++++++++++++
 mm/filemap.c            | 30 +++++++++++++++++++++++++-----
 2 files changed, 65 insertions(+), 5 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index d9c7edb6422b..b05b696f310b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -564,15 +564,25 @@ static inline void *detach_page_private(struct page *page)

 #ifdef CONFIG_NUMA
 struct folio *filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order);
+struct folio *filemap_alloc_folio_mpol_noprof(gfp_t gfp, unsigned int order,
+						struct mempolicy *mpol);
 #else
 static inline struct folio *filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order)
 {
 	return folio_alloc_noprof(gfp, order);
 }
+static inline struct folio *filemap_alloc_folio_mpol_noprof(gfp_t gfp,
+						unsigned int order,
+						struct mempolicy *mpol)
+{
+	return filemap_alloc_folio_noprof(gfp, order);
+}
 #endif

 #define filemap_alloc_folio(...) \
 	alloc_hooks(filemap_alloc_folio_noprof(__VA_ARGS__))
+#define filemap_alloc_folio_mpol(...) \
+	alloc_hooks(filemap_alloc_folio_mpol_noprof(__VA_ARGS__))

 static inline struct page *__page_cache_alloc(gfp_t gfp)
 {
@@ -652,6 +662,8 @@ static inline fgf_t fgf_set_order(size_t size)
 void *filemap_get_entry(struct address_space *mapping, pgoff_t index);
 struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 		fgf_t fgp_flags, gfp_t gfp);
+struct folio *__filemap_get_folio_mpol(struct address_space *mapping,
+		pgoff_t index, fgf_t fgp_flags, gfp_t gfp, struct mempolicy *mpol);
 struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
 		fgf_t fgp_flags, gfp_t gfp);

@@ -710,6 +722,34 @@ static inline struct folio *filemap_grab_folio(struct address_space *mapping,
 			mapping_gfp_mask(mapping));
 }

+/**
+ * filemap_grab_folio_mpol - grab a folio from the page cache
+ * @mapping: The address space to search
+ * @index: The page index
+ * @mpol: The mempolicy to apply
+ *
+ * Same as filemap_grab_folio(), except that it allocates the folio using
+ * given memory policy.
+ *
+ * Return: A found or created folio. ERR_PTR(-ENOMEM) if no folio is found
+ * and failed to create a folio.
+ */
+#ifdef CONFIG_NUMA
+static inline struct folio *filemap_grab_folio_mpol(struct address_space *mapping,
+					pgoff_t index, struct mempolicy *mpol)
+{
+	return __filemap_get_folio_mpol(mapping, index,
+			FGP_LOCK | FGP_ACCESSED | FGP_CREAT,
+			mapping_gfp_mask(mapping), mpol);
+}
+#else
+static inline struct folio *filemap_grab_folio_mpol(struct address_space *mapping,
+					pgoff_t index, struct mempolicy *mpol)
+{
+	return filemap_grab_folio(mapping, index);
+}
+#endif /* CONFIG_NUMA */
+
 /**
  * find_get_page - find and get a page reference
  * @mapping: the address_space to search
diff --git a/mm/filemap.c b/mm/filemap.c
index d62150418b91..a870a05296c8 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -990,8 +990,13 @@ int filemap_add_folio(struct address_space *mapping, struct folio *folio,
 EXPORT_SYMBOL_GPL(filemap_add_folio);

 #ifdef CONFIG_NUMA
-struct folio *filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order)
+struct folio *filemap_alloc_folio_mpol_noprof(gfp_t gfp, unsigned int order,
+		struct mempolicy *mpol)
 {
+	if (mpol)
+		return folio_alloc_mpol_noprof(gfp, order, mpol,
+				NO_INTERLEAVE_INDEX, numa_node_id());
+
 	int n;
 	struct folio *folio;

@@ -1007,6 +1012,12 @@ struct folio *filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order)
 	}
 	return folio_alloc_noprof(gfp, order);
 }
+EXPORT_SYMBOL(filemap_alloc_folio_mpol_noprof);
+
+struct folio *filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order)
+{
+	return filemap_alloc_folio_mpol_noprof(gfp, order, NULL);
+}
 EXPORT_SYMBOL(filemap_alloc_folio_noprof);
 #endif

@@ -1861,11 +1872,12 @@ void *filemap_get_entry(struct address_space *mapping, pgoff_t index)
 }

 /**
- * __filemap_get_folio - Find and get a reference to a folio.
+ * __filemap_get_folio_mpol - Find and get a reference to a folio.
  * @mapping: The address_space to search.
  * @index: The page index.
  * @fgp_flags: %FGP flags modify how the folio is returned.
  * @gfp: Memory allocation flags to use if %FGP_CREAT is specified.
+ * @mpol: The mempolicy to apply.
  *
  * Looks up the page cache entry at @mapping & @index.
  *
@@ -1876,8 +1888,8 @@ void *filemap_get_entry(struct address_space *mapping, pgoff_t index)
  *
  * Return: The found folio or an ERR_PTR() otherwise.
  */
-struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
-		fgf_t fgp_flags, gfp_t gfp)
+struct folio *__filemap_get_folio_mpol(struct address_space *mapping, pgoff_t index,
+		fgf_t fgp_flags, gfp_t gfp, struct mempolicy *mpol)
 {
 	struct folio *folio;

@@ -1947,7 +1959,7 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 			err = -ENOMEM;
 			if (order > 0)
 				alloc_gfp |= __GFP_NORETRY | __GFP_NOWARN;
-			folio = filemap_alloc_folio(alloc_gfp, order);
+			folio = filemap_alloc_folio_mpol(alloc_gfp, order, mpol);
 			if (!folio)
 				continue;

@@ -1978,6 +1990,14 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 		return ERR_PTR(-ENOENT);
 	return folio;
 }
+EXPORT_SYMBOL(__filemap_get_folio_mpol);
+
+struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
+		fgf_t fgp_flags, gfp_t gfp)
+{
+	return __filemap_get_folio_mpol(mapping, index,
+			fgp_flags, gfp, NULL);
+}
 EXPORT_SYMBOL(__filemap_get_folio);

 static inline struct folio *find_get_entry(struct xa_state *xas, pgoff_t max,

From patchwork Tue Nov 5 16:55:13 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shivank Garg
X-Patchwork-Id: 13863228
From: Shivank Garg
CC: Shivansh Dhiman
Subject: [RFC PATCH 2/4] Introduce fbind syscall
Date: Tue, 5 Nov 2024 16:55:13 +0000
Message-ID: <20241105165515.154941-1-shivankg@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241105164549.154700-1-shivankg@amd.com>
References: <20241105164549.154700-1-shivankg@amd.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

The new fbind syscall sets the NUMA memory policy for file-backed memory
and has the following signature:

long fbind(unsigned int fd, unsigned long mode,
           const unsigned long nodemask[(.maxnode + ULONG_WIDTH - 1) / ULONG_WIDTH],
           unsigned long maxnode, unsigned int flags);

fbind behaves similarly to mbind, except that it takes a file descriptor
as input instead of an address range.

TODO:
1. Support fbind syscall on all architectures.
2. Expand commit msg and add documentation.
3. Clean up the code.

[Shivansh: add create_mpol_from_args()]
Signed-off-by: Shivansh Dhiman
Signed-off-by: Shivank Garg
---
 arch/x86/entry/syscalls/syscall_32.tbl |  1 +
 arch/x86/entry/syscalls/syscall_64.tbl |  1 +
 include/linux/fs.h                     |  3 ++
 include/linux/mempolicy.h              |  3 ++
 include/linux/syscalls.h               |  3 ++
 include/uapi/asm-generic/unistd.h      |  5 ++-
 kernel/sys_ni.c                        |  1 +
 mm/Makefile                            |  2 +-
 mm/fbind.c                             | 49 +++++++++++++++++++++++
 mm/mempolicy.c                         | 55 ++++++++++++++++++++++++++
 10 files changed, 121 insertions(+), 2 deletions(-)
 create mode 100644 mm/fbind.c

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 534c74b14fab..0660ce6d08d8 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -468,3 +468,4 @@
 460 i386 lsm_set_self_attr sys_lsm_set_self_attr
 461 i386 lsm_list_modules sys_lsm_list_modules
 462 i386 mseal sys_mseal
+463 i386 fbind sys_fbind
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 7093ee21c0d1..9794347cc2e6 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -386,6 +386,7 @@
 460 common lsm_set_self_attr sys_lsm_set_self_attr
 461 common lsm_list_modules sys_lsm_list_modules
 462 common mseal sys_mseal
+463 common fbind sys_fbind

 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/include/linux/fs.h b/include/linux/fs.h
index fd34b5755c0b..42042b62bdcd 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2058,6 +2058,9 @@ struct file_operations {
 				   struct file *file_out, loff_t pos_out,
 				   loff_t len, unsigned int remap_flags);
 	int (*fadvise)(struct file *, loff_t, loff_t, int);
+#ifdef CONFIG_NUMA
+	int (*set_policy)(struct file *, struct mempolicy *);
+#endif
 	int (*uring_cmd)(struct io_uring_cmd *ioucmd, unsigned int issue_flags);
 	int (*uring_cmd_iopoll)(struct io_uring_cmd *, struct io_comp_batch *,
 				unsigned int poll_flags);
diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 1add16f21612..b9023f6246a7 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -299,4 +299,7 @@ static inline bool mpol_is_preferred_many(struct mempolicy *pol)
 }

 #endif /* CONFIG_NUMA */
+struct mempolicy *create_mpol_from_args(unsigned char mode,
+					const unsigned long __user *nmask,
+					unsigned short maxnode);
 #endif
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 4bcf6754738d..2dc686921b9f 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -502,6 +502,9 @@ asmlinkage long sys_readlinkat(int dfd, const char __user *path, char __user *bu
 asmlinkage long sys_newfstatat(int dfd, const char __user *filename,
 			       struct stat __user *statbuf, int flag);
 asmlinkage long sys_newfstat(unsigned int fd, struct stat __user *statbuf);
+asmlinkage long sys_fbind(unsigned int fd, unsigned long mode,
+			const unsigned long __user *nmask,
+			unsigned long maxnode, unsigned int flags);
 #if defined(__ARCH_WANT_STAT64) || defined(__ARCH_WANT_COMPAT_STAT64)
 asmlinkage long sys_fstat64(unsigned long fd, struct stat64 __user *statbuf);
 asmlinkage long sys_fstatat64(int dfd, const char __user *filename,
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 5bf6148cac2b..550730f36dae 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -841,8 +841,11 @@ __SYSCALL(__NR_lsm_list_modules, sys_lsm_list_modules)
 #define __NR_mseal 462
 __SYSCALL(__NR_mseal, sys_mseal)

+#define __NR_fbind 463
+__SYSCALL(__NR_fbind, sys_fbind)
+
 #undef __NR_syscalls
-#define __NR_syscalls 463
+#define __NR_syscalls 464

 /*
  * 32 bit systems traditionally used different
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index c00a86931f8c..f57350e581f6 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -195,6 +195,7 @@ COND_SYSCALL(move_pages);
 COND_SYSCALL(set_mempolicy_home_node);
 COND_SYSCALL(cachestat);
 COND_SYSCALL(mseal);
+COND_SYSCALL(fbind);

 COND_SYSCALL(perf_event_open);
 COND_SYSCALL(accept4);
diff --git a/mm/Makefile b/mm/Makefile
index d2915f8c9dc0..ba339ddc0be2 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -79,7 +79,7 @@ obj-$(CONFIG_ZSWAP) += zswap.o
 obj-$(CONFIG_HAS_DMA) += dmapool.o
 obj-$(CONFIG_HUGETLBFS) += hugetlb.o
 obj-$(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP) += hugetlb_vmemmap.o
-obj-$(CONFIG_NUMA) += mempolicy.o
+obj-$(CONFIG_NUMA) += mempolicy.o fbind.o
 obj-$(CONFIG_SPARSEMEM) += sparse.o
 obj-$(CONFIG_SPARSEMEM_VMEMMAP) += sparse-vmemmap.o
 obj-$(CONFIG_MMU_NOTIFIER) += mmu_notifier.o
diff --git a/mm/fbind.c b/mm/fbind.c
new file mode 100644
index 000000000000..85ec7d13345c
--- /dev/null
+++ b/mm/fbind.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Implement fbind() syscall.
+ *
+ * Copyright (c) 2024 AMD
+ *
+ * Author: Shivank Garg
+ */
+
+#include
+#include
+#include
+#include
+
+static long do_fbind(unsigned int fd, unsigned long mode,
+		const unsigned long __user *nmask,
+		unsigned long maxnode, unsigned int flags)
+{
+	struct mempolicy *mpol;
+	struct fd f;
+	int ret;
+
+	f = fdget(fd);
+	if (!f.file)
+		return -EBADF;
+
+	mpol = create_mpol_from_args(mode, nmask, maxnode);
+	if (IS_ERR_OR_NULL(mpol)) {
+		ret = PTR_ERR(mpol);
+		goto out_putf;
+	}
+
+	if (f.file->f_op->set_policy)
+		ret = f.file->f_op->set_policy(f.file, mpol);
+	else
+		ret = -EOPNOTSUPP;
+
+	mpol_put(mpol);
+out_putf:
+	fdput(f);
+	return ret;
+}
+
+SYSCALL_DEFINE5(fbind, unsigned int, fd, unsigned long, mode,
+		const unsigned long __user *, nmask,
+		unsigned long, maxnode, unsigned int, flags)
+{
+	return do_fbind(fd, mode, nmask, maxnode, flags);
+}
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b858e22b259d..3a697080ecad 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -3557,3 +3557,58 @@ static int __init mempolicy_sysfs_init(void)

 late_initcall(mempolicy_sysfs_init);
 #endif /* CONFIG_SYSFS */
+
+/**
+ * create_mpol_from_args - create a mempolicy structure from args
+ * @mode: NUMA memory policy mode
+ * @nmask: bitmask of NUMA nodes
+ * @maxnode: number of bits in the nodes bitmask
+ *
+ * Create a mempolicy from given nodemask and memory policy such as
+ * default, preferred, interleave or bind.
+ *
+ * Return: error encoded in a pointer or memory policy on success.
+ */
+struct mempolicy *create_mpol_from_args(unsigned char mode,
+					const unsigned long __user *nmask,
+					unsigned short maxnode)
+{
+	struct mm_struct *mm = current->mm;
+	unsigned short mode_flags;
+	struct mempolicy *mpol;
+	nodemask_t nodes;
+	int lmode = mode;
+	int err = -ENOMEM;
+
+	err = sanitize_mpol_flags(&lmode, &mode_flags);
+	if (err)
+		return ERR_PTR(err);
+
+	err = get_nodes(&nodes, nmask, maxnode);
+	if (err)
+		return ERR_PTR(err);
+
+	mpol = mpol_new(mode, mode_flags, &nodes);
+	if (IS_ERR_OR_NULL(mpol))
+		return mpol;
+
+	NODEMASK_SCRATCH(scratch);
+	if (!scratch) {
+		err = -ENOMEM;
+		goto err_out;
+	}
+
+	mmap_write_lock(mm);
+	err = mpol_set_nodemask(mpol, &nodes, scratch);
+	mmap_write_unlock(mm);
+	NODEMASK_SCRATCH_FREE(scratch);
+
+	if (err)
+		goto err_out;
+
+	return mpol;
+
+err_out:
+	mpol_put(mpol);
+	return ERR_PTR(err);
+}

From patchwork Tue Nov 5 16:55:14 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shivank Garg
X-Patchwork-Id: 13863229
From: Shivank Garg
Subject: [RFC PATCH 3/4] KVM: guest_memfd: Pass file pointer instead of inode in guest_memfd APIs
Date: Tue, 5 Nov 2024 16:55:14 +0000
Message-ID: <20241105165515.154941-2-shivankg@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241105165515.154941-1-shivankg@amd.com>
References: <20241105164549.154700-1-shivankg@amd.com> <20241105165515.154941-1-shivankg@amd.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
X-Microsoft-Antispam-Message-Info:
yWtUKC3+lYrQaPGckXHSaZZGA/1P/8AG+rQDXmYCIQtPHypCTizIy68gOl+/Sb9lO/JWITKK+pF+mTPdawJhvfjh4yubbxp7r8J9nV3YbVPICO/2YbcSw2Qa7fQFMwfcm9X/beYsr5VKQ04l4v1OlL7IuGbYy5V8CadVpzRD/D8qsRbm7fy4nn/TE+9yoGZh+9afGtgpS8gsAaZyM6uYH6A6vnc/r1VcVMdqKmFgAVqQ4VVwJEYO78PkQ77YDhNAtZUVrFVqKMG/b5mD+7uA6a5c1njDMfuz7U2yyEpwqayoltGlIow7czaqQERMsm23jh/w6pz9DUu522IA3QmJgVn14b1z9OKwnABQcJwNSfnVs/+SHaarTKUcMxCVqZx2EcizXZ7De1UwVcz07i9wgBSQDcyRR/3gQSEXDZOmVZFo4fQCF/IIexbUEehJNnkG6MBOV/MhF492Ld0ZyiAtHtQWKjDHsthvRHk8c3OihnOVu3TlBygX7SwtNwiVywG++Y4moLigQGBwDUtKYAZ4PrGutzyMrZalPhGiFRvdb+2fjIdhfQz+RFI3QBx7BlSsKaWQLpboZRMco7n5jZ4uyp/OvRIMvOVTqNW9NWOrAWKnRzdQscWHQq2T2D9+3oEAGrSfSi8b6Rl0OLg6wH2/Ab5gB8JE+NCjTc0cXeKU8DLjAOXnqucV/1dQ5QcYQhb0npRHiC76a5TL3CZt7st+4blSDOH0qHkzBucsaJVyd+4rQkzI4s9xbkpFsZxqLKuWlycY2LaP/wJ1YuGnSXEnP0VNDs/2pTF+XQydLyQlsCvksuEe5d47EMp/hGazFV+UR/RC+aapd4KXKowCmEfFieXXthnWvLSbH1EDwuh22xUDujSGAstc8iA6W50fDbUlB+5L332/NIEJeTDW11T4I+w6UOTwgFRZ7TxaqUQJ1AF81Nug6ohYFNSglTAsPqsE5pj4n1LphboDkpeqP6wH3idhOCjewLTiaMwcsatDvaGOKzW6p1BqTrM10iEZ6qL1wzbaI4Y87W40LaAQeKJHDhniw/KaN8PU3VgPmXlYePowTSx1992ijpOxg2VSyZYk02ikhC5j37pGjSF/itfHuBlNBqwM8VQTzD6YLZGFkMYx52kvML7I9i0x8DDCdiASdo7++ErLA8x5CIjGxSa3TdfuENqmOecemwokkA2bENRT6A+LgJuey9mBptoHXGCAi5QJX2p1Ttkh781/WwCmvQ180MNcemEk7bLwPtj19kM//S6LL8W5Y8dTHFfsdgwYxCDlGyp1CFRXRmmCm7dJcR6opKqtODHrtOinU6C7/QKI0d/eQG1P7A4EQbhR/Hl/IjTFFMsmhKiVzc8IUtJm1ORe03w56XA11Od67Moj57VgDxYKkc/IIoU1FrLzLWddb+WoYNKCERHfPL+SEcpq1g+2aGF5WjwPm6cl3CpAK1I= X-Forefront-Antispam-Report: CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(376014)(7416014)(36860700013)(82310400026)(1800799024)(921020);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 05 Nov 2024 16:56:05.2586 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 03792042-eef5-4eec-8dbb-08dcfdbabcdd X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d 
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com]
X-MS-Exchange-CrossTenant-AuthSource: CY4PEPF0000E9D7.namprd05.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: DS0PR12MB7993

Change the KVM guest_memfd APIs to pass file pointers instead of inodes
in the folio allocation functions. This is a preparatory patch for adding
NUMA support to guest memory allocations. The functional behavior
remains unchanged.

Signed-off-by: Shivank Garg
---
 virt/kvm/guest_memfd.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index e930014b4bdc..2c6fcf7c3ec9 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -91,7 +91,7 @@ static struct folio *kvm_gmem_get_huge_folio(struct inode *inode, pgoff_t index,
 {
 	pgoff_t npages = 1UL << order;
 	pgoff_t huge_index = round_down(index, npages);
-	struct address_space *mapping = inode->i_mapping;
+	struct address_space *mapping = inode->i_mapping;
 	gfp_t gfp = mapping_gfp_mask(mapping) | __GFP_NOWARN;
 	loff_t size = i_size_read(inode);
 	struct folio *folio;
@@ -125,16 +125,16 @@ static struct folio *kvm_gmem_get_huge_folio(struct inode *inode, pgoff_t index,
  * Ignore accessed, referenced, and dirty flags. The memory is
  * unevictable and there is no storage to write back to.
  */
-static struct folio *__kvm_gmem_get_folio(struct inode *inode, pgoff_t index,
+static struct folio *__kvm_gmem_get_folio(struct file *file, pgoff_t index,
 					  bool allow_huge)
 {
 	struct folio *folio = NULL;

 	if (gmem_2m_enabled && allow_huge)
-		folio = kvm_gmem_get_huge_folio(inode, index, PMD_ORDER);
+		folio = kvm_gmem_get_huge_folio(file_inode(file), index, PMD_ORDER);

 	if (!folio)
-		folio = filemap_grab_folio(inode->i_mapping, index);
+		folio = filemap_grab_folio(file_inode(file)->i_mapping, index);

 	pr_debug("%s: allocate folio with PFN %lx order %d\n",
 		 __func__, folio_pfn(folio), folio_order(folio));
@@ -150,9 +150,9 @@ static struct folio *__kvm_gmem_get_folio(struct inode *inode, pgoff_t index,
  * Ignore accessed, referenced, and dirty flags. The memory is
  * unevictable and there is no storage to write back to.
  */
-static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index)
+static struct folio *kvm_gmem_get_folio(struct file *file, pgoff_t index)
 {
-	return __kvm_gmem_get_folio(inode, index, true);
+	return __kvm_gmem_get_folio(file, index, true);
 }

 static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start,
@@ -228,8 +228,9 @@ static long kvm_gmem_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 	return 0;
 }

-static long kvm_gmem_allocate(struct inode *inode, loff_t offset, loff_t len)
+static long kvm_gmem_allocate(struct file *file, loff_t offset, loff_t len)
 {
+	struct inode *inode = file_inode(file);
 	struct address_space *mapping = inode->i_mapping;
 	pgoff_t start, index, end;
 	int r;
@@ -252,7 +253,7 @@ static long kvm_gmem_allocate(struct inode *inode, loff_t offset, loff_t len)
 			break;
 		}

-		folio = kvm_gmem_get_folio(inode, index);
+		folio = kvm_gmem_get_folio(file, index);
 		if (IS_ERR(folio)) {
 			r = PTR_ERR(folio);
 			break;
@@ -292,7 +293,7 @@ static long kvm_gmem_fallocate(struct file *file, int mode, loff_t offset,
 	if (mode & FALLOC_FL_PUNCH_HOLE)
 		ret = kvm_gmem_punch_hole(file_inode(file), offset, len);
 	else
-		ret = kvm_gmem_allocate(file_inode(file), offset, len);
+		ret = kvm_gmem_allocate(file, offset, len);

 	if (!ret)
 		file_modified(file);
@@ -626,7 +627,7 @@ __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
 		return ERR_PTR(-EIO);
 	}

-	folio = __kvm_gmem_get_folio(file_inode(file), index, allow_huge);
+	folio = __kvm_gmem_get_folio(file, index, allow_huge);
 	if (IS_ERR(folio))
 		return folio;

From patchwork Tue Nov 5 16:55:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shivank Garg
X-Patchwork-Id: 13863230
Received: from NAM10-DM6-obe.outbound.protection.outlook.com (mail-dm6nam10on2062.outbound.protection.outlook.com [40.107.93.62]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4252D1DB940; Tue, 5 Nov 2024 16:56:17 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.93.62
ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730825778; cv=fail; b=DoSV+h/NanKktAPmQqB2ozepgpBedaUFkG4j2ET/kT+0T7Nhlv4LPH3SwoH8d/UPJT0T3wsWn6wCyq1AWRPGeWuK/qgqKdm1Ydf9kVkxxF5grAow3Wuhd6ts4PraplQShZHGqkLSVgKtzeN34ThfUKrUlQd/bJNg2t4PJVYA0tE=
ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730825778; c=relaxed/simple; bh=zY82ZtP+UnMynnjFPI2nf6jmxD9SU4PbKqe5dE9+Yaw=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=XFhnHTIpjlCZfNhV9yj4PI0jKLYoz0dvfTykR3PpHsPgzqfW0r+CBISuGDSD9KX5smygmYqDMLAVxFNxkI553fObu1UOn09PonYAizftza3b0CSHALUqTns6YFBf60z5vgQ5WGUJyVJPZbI69QlQEIQVyi8M8a2YMtSCpXxrrmg=
ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amd.com; spf=fail smtp.mailfrom=amd.com; dkim=pass (1024-bit key) header.d=amd.com header.i=@amd.com header.b=Q5GKiOM+; arc=fail smtp.client-ip=40.107.93.62
Authentication-Results:
smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amd.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=amd.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amd.com header.i=@amd.com header.b="Q5GKiOM+" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=N3T2kqHH159/4ZnLZeL+pPJYhHFASc2AQ9Dv68cZshRsZNTx5TjfJK8p2bSysPJxcaZlzdyMw5HPOrnRq43lL7yYfpXZmTvKowL5N3XfyfqaOmXTPo/tFhKR45UN3SJZ4XI/KfrAQAP+jjE5HcXQh41cfUo+ADJy7IFN9/ayUpJBL84Q6MWaXhQK7uaglvG8XD2dtSHA9Ou3J/ZxPWNkEURN5IgOUP5FMw+t+1IoWt/Gpk9m0V/rxPofN8dLskm1NAlyauOZ3ma848EClY5XKMiLdghbWgDD4SDxtN5uxdBOoE0SIcpOGodqxP5bWrTru3IHaUTJxW0J4xhk8kwa/Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=WKH5iUczCRiHl86KVuC6tKdcCK/RD41GSpieUjteNDE=; b=kYe7bcrX5Ru7hbLXcxCtXrQ4sXdNNJ+ORSC3XY49lUCqBDJFNjWHz1TPt1h0CnwzwsBq7UIEnqw2NqvrQxVT39lTVj/Okj79Zn2EJes3uFWKQa3QlWOJVvX4a4GTsjbCN4OY++Bp7drUBJQ7F7BFSKZP0SOj6mamTqlJJMjbPfJPas4gSJ2hEFTt9wsPyC2y/1z1HfvPjU/nBzS9BdQqryE4KnFr+DHuvVMCGkIQnazaPyE6WuLYwyo+erzpT+apzfTPWg/3uhOdGFDMbOx3jpwXSGBGBkkHTcA1JqFF849bvPXg0yMw7kJ7DJZpwvYEjgh5HGPkqI75IP0wZ6VdvQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=kernel.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=WKH5iUczCRiHl86KVuC6tKdcCK/RD41GSpieUjteNDE=; 
b=Q5GKiOM+5BsFrJ0XTfIUstIgK3fFzkVKh4zRYnM7xr/EQ3i4reXpHyCQXArtr0Iah5eCUU2lHlsFEsJTNT6/90LO1n/y4fNh+IphY8NNDnTfEn/QgOxXf39Sf0sfogJ5vOjYovzQvaJAykykx/utpNO1tL+mkHZcyweDt3ZGNHQ= Received: from SA9PR11CA0030.namprd11.prod.outlook.com (2603:10b6:806:6e::35) by LV2PR12MB5942.namprd12.prod.outlook.com (2603:10b6:408:171::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8137.18; Tue, 5 Nov 2024 16:56:13 +0000 Received: from SN1PEPF00036F3D.namprd05.prod.outlook.com (2603:10b6:806:6e:cafe::ef) by SA9PR11CA0030.outlook.office365.com (2603:10b6:806:6e::35) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8114.30 via Frontend Transport; Tue, 5 Nov 2024 16:56:12 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by SN1PEPF00036F3D.mail.protection.outlook.com (10.167.248.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.8137.17 via Frontend Transport; Tue, 5 Nov 2024 16:56:12 +0000 Received: from kaveri.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Tue, 5 Nov 2024 10:56:05 -0600 From: Shivank Garg To: , , , , , , , , , , CC: , , , , , , , , , , , , , , , , , Subject: [RFC PATCH 4/4] KVM: guest_memfd: Enforce NUMA mempolicy if available Date: Tue, 5 Nov 2024 16:55:15 +0000 Message-ID: <20241105165515.154941-3-shivankg@amd.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20241105165515.154941-1-shivankg@amd.com> 
References: <20241105164549.154700-1-shivankg@amd.com> <20241105165515.154941-1-shivankg@amd.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: SATLEXMB03.amd.com (10.181.40.144) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SN1PEPF00036F3D:EE_|LV2PR12MB5942:EE_ X-MS-Office365-Filtering-Correlation-Id: 9805a6c8-2b6a-4feb-2820-08dcfdbac14a X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|36860700013|376014|7416014|82310400026|1800799024|921020; X-Microsoft-Antispam-Message-Info: AhKjNqwuOQ9yLHVD1PLSic7sN0fyQsPfG9/Vk6L0fVk2MB3PuDXuyIoo/17KWMlz+9KOsTHOIG5KVLJWiwP1ASbmb7Z9LlTWJZDgO+k7Ftwq35MLW4pDM5eMnYvh5oX4t5gT+rGsuiFicbnorcxjAvMp7PXljibx0pjgWpcZOOHOyZzW//5Vym8ofnzUQ7O6gJy6xWPMF3HITgB/HIVNixD57o/NaKFyos1s4BdLOUEZuiKcDVwPe3Fb4JtoJvKDe/31CkbnVKZ7gmaepZx6kedMYGLCfTc6F+VnZAFDikPevYKN/RhAqFGVX0ADrlW4BEYIM3srd/5GrJnTetzBuLEQSmpLTMx5c+x+ERFuu9ytozTO1cuaoKAAeuHli2Hox7F5paCs6YTM77DvIJZ2mjJqkLhk2DV8uBtqd+U1eRIgJ687FEC48H37RXu0YJOk3BlvOVPKpvDmeHDscWh/scJHQ6OOKjnJpvgTi5VDLXCXm3FPfGaNuAzPw8ygLo0AJF2ekyIxH8fkUWUgFx+ZJKE1/CcZoS7SYvz065QcGtzExroRi75OCNyMjLE6ffXDfpgYto/HpZ4JotyAKwl4QTvwMTnMZxw1tqAmlNDOycXaEOZUb/V4SD70u2skLeFHoN6qsPvg4mq6We1PxxcKS5ty8xgjKeaTPzyqopeMyp41h2O5Q+2iDxDYKiHAIBWpdtBQWBoDUFIKSeYj3FeyfUxzG+Yft8dKA3eod97P89PxVKRLA4nr0caWJ4ylYoWV3+59fmrO/THsJFO1ktbMQQgGmmC7Hjgt/+20L5CQ6+tdg9vTgWJhXY5Ri41hBD5GyXlTcZ9z94beC8j2mm0rgQ5XhdH3OBnb7ANAeDG1GYgE9Hg0M5fu61Mea+KYXw/RLexyXQe1m1u5saGhLNhRYN6LMgbcYhlO2YmlvrkChR1XInr15CsCVqjMU15P07+6iTkYrnOOvjzvY97ZRo470c0tyRtA7EuMxv9f8zyiekw5c0njaB1hfjBOBFrK2wlgHAmjPoFyhc0U80YsW08wtNaRyvTBAaddY7kbWPrCxPq8ig8HUO0X6Z50DZ3uo2vynuds4Gjg/qX+URPT6MKooxQ2yG111ZFaMivYxfTEpwSZaz8iM+MI7Zyi7uhHwLU6qbImYLYjReThcy0a+Aph5MbngjpGVRJNSiY6YF9dYw+kS05ek5imILvVH6XKWMz9uq/0eAYTvykAzX5hHMbT7gyx3tgAfr9NQSeXyvlRftBk6ZX77MZbzeCF+STwCUQU0JsKDkzQi
	z+2W9Ptxhinf0eDEUyP6x11WkiJ6DXKL9I+BtPLHTdGqB9jr/wWAKAPTZ66JqygA1vhiEoY7R1NTnaY6etOgjEK6SiM/QZfGn30JyVpGfSjhI/6Vh5IPp/IEycEc9zZolWjFboInzZnLMpoVzemB1BxTnCLqwpYfdU=
X-Forefront-Antispam-Report: CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(36860700013)(376014)(7416014)(82310400026)(1800799024)(921020);DIR:OUT;SFP:1101;
X-OriginatorOrg: amd.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 05 Nov 2024 16:56:12.7593 (UTC)
X-MS-Exchange-CrossTenant-Network-Message-Id: 9805a6c8-2b6a-4feb-2820-08dcfdbac14a
X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com]
X-MS-Exchange-CrossTenant-AuthSource: SN1PEPF00036F3D.namprd05.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: LV2PR12MB5942

Enforce a memory policy on guest_memfd to provide proper NUMA support.

Previously, in the absence of a process mempolicy, guest_memfd
allocations followed the local NUMA node, so memory placement was
effectively random. Moreover, the VMM cannot use mbind() because
guest_memfd memory is not mapped into userspace.

To support NUMA policies, the VMM calls the fbind() syscall, which
stores the mempolicy as f_policy in the struct kvm_gmem of the
guest_memfd. The allocation path retrieves f_policy and passes it to
filemap_grab_folio_mpol() so that allocations follow the specified
memory policy.
Signed-off-by: Shivank Garg
---
 mm/mempolicy.c         |  2 ++
 virt/kvm/guest_memfd.c | 49 ++++++++++++++++++++++++++++++++++++++----
 2 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3a697080ecad..af2e1ef4dae7 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -347,6 +347,7 @@ void __mpol_put(struct mempolicy *pol)
 		return;
 	kmem_cache_free(policy_cache, pol);
 }
+EXPORT_SYMBOL(__mpol_put);

 static void mpol_rebind_default(struct mempolicy *pol, const nodemask_t *nodes)
 {
@@ -2599,6 +2600,7 @@ struct mempolicy *__mpol_dup(struct mempolicy *old)
 	atomic_set(&new->refcnt, 1);
 	return new;
 }
+EXPORT_SYMBOL(__mpol_dup);

 /* Slow path of a mempolicy comparison */
 bool __mpol_equal(struct mempolicy *a, struct mempolicy *b)
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 2c6fcf7c3ec9..0237bda4382c 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -4,6 +4,7 @@
 #include
 #include
 #include
+#include

 #include "kvm_mm.h"

@@ -11,6 +12,7 @@ struct kvm_gmem {
 	struct kvm *kvm;
 	struct xarray bindings;
 	struct list_head entry;
+	struct mempolicy *f_policy;
 };

 /**
@@ -87,7 +89,8 @@ static int kvm_gmem_prepare_folio(struct kvm *kvm, struct kvm_memory_slot *slot,
 }

 static struct folio *kvm_gmem_get_huge_folio(struct inode *inode, pgoff_t index,
-					     unsigned int order)
+					     unsigned int order,
+					     struct mempolicy *policy)
 {
 	pgoff_t npages = 1UL << order;
 	pgoff_t huge_index = round_down(index, npages);
@@ -104,7 +107,7 @@ static struct folio *kvm_gmem_get_huge_folio(struct inode *inode, pgoff_t index,
 				   (loff_t)(huge_index + npages - 1) << PAGE_SHIFT))
 		return NULL;

-	folio = filemap_alloc_folio(gfp, order);
+	folio = filemap_alloc_folio_mpol(gfp, order, policy);
 	if (!folio)
 		return NULL;

@@ -129,12 +132,26 @@ static struct folio *__kvm_gmem_get_folio(struct file *file, pgoff_t index,
 					  bool allow_huge)
 {
 	struct folio *folio = NULL;
+	struct kvm_gmem *gmem = file->private_data;
+	struct mempolicy *policy = NULL;
+
+	/*
+	 * RCU lock is required to prevent any race condition with set_policy().
+	 */
+	if (IS_ENABLED(CONFIG_NUMA)) {
+		rcu_read_lock();
+		policy = READ_ONCE(gmem->f_policy);
+		mpol_get(policy);
+		rcu_read_unlock();
+	}

 	if (gmem_2m_enabled && allow_huge)
-		folio = kvm_gmem_get_huge_folio(file_inode(file), index, PMD_ORDER);
+		folio = kvm_gmem_get_huge_folio(file_inode(file), index, PMD_ORDER, policy);

 	if (!folio)
-		folio = filemap_grab_folio(file_inode(file)->i_mapping, index);
+		folio = filemap_grab_folio_mpol(file_inode(file)->i_mapping, index, policy);
+
+	mpol_put(policy);

 	pr_debug("%s: allocate folio with PFN %lx order %d\n",
 		 __func__, folio_pfn(folio), folio_order(folio));
@@ -338,6 +355,7 @@ static int kvm_gmem_release(struct inode *inode, struct file *file)
 	mutex_unlock(&kvm->slots_lock);

 	xa_destroy(&gmem->bindings);
+	mpol_put(gmem->f_policy);
 	kfree(gmem);
 	kvm_put_kvm(kvm);

@@ -356,10 +374,32 @@ static inline struct file *kvm_gmem_get_file(struct kvm_memory_slot *slot)
 	return get_file_active(&slot->gmem.file);
 }

+#ifdef CONFIG_NUMA
+static int kvm_gmem_set_policy(struct file *file, struct mempolicy *mpol)
+{
+	struct mempolicy *old, *new;
+	struct kvm_gmem *gmem = file->private_data;
+
+	new = mpol_dup(mpol);
+	if (IS_ERR(new))
+		return PTR_ERR(new);
+
+	old = gmem->f_policy;
+	WRITE_ONCE(gmem->f_policy, new);
+	synchronize_rcu();
+	mpol_put(old);
+
+	return 0;
+}
+#endif
+
 static struct file_operations kvm_gmem_fops = {
 	.open		= generic_file_open,
 	.release	= kvm_gmem_release,
 	.fallocate	= kvm_gmem_fallocate,
+#ifdef CONFIG_NUMA
+	.set_policy	= kvm_gmem_set_policy,
+#endif
 };

 void kvm_gmem_init(struct module *module)
@@ -489,6 +529,7 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)

 	kvm_get_kvm(kvm);
 	gmem->kvm = kvm;
+	gmem->f_policy = NULL;
 	xa_init(&gmem->bindings);
 	list_add(&gmem->entry, &inode->i_mapping->i_private_list);
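
Note on the f_policy lifetime in patch 4/4: the following is a minimal userspace sketch of the reference-counting lifecycle only, not kernel code. The names struct policy, struct gmem, policy_new(), policy_get(), policy_put(), gmem_set_policy(), gmem_alloc_mode() and gmem_release() are hypothetical stand-ins for struct mempolicy, struct kvm_gmem, mpol_dup()/mpol_get()/mpol_put() and the hooks touched by the diff. The sketch is single-threaded, so it deliberately omits the rcu_read_lock()/synchronize_rcu() pairing that the patch uses to order readers against kvm_gmem_set_policy().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mempolicy: a refcount and a mode. */
struct policy {
	int refcnt;
	int mode;
};

/* Hypothetical stand-in for struct kvm_gmem: only the policy pointer. */
struct gmem {
	struct policy *f_policy;
};

/* Number of policies allocated and not yet freed (leak check). */
static int live_policies;

static struct policy *policy_new(int mode)
{
	struct policy *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->refcnt = 1;
	p->mode = mode;
	live_policies++;
	return p;
}

/* Like mpol_get(): tolerates NULL, meaning "no policy set". */
static void policy_get(struct policy *p)
{
	if (p)
		p->refcnt++;
}

/* Like mpol_put(): tolerates NULL, frees on the last reference. */
static void policy_put(struct policy *p)
{
	if (!p)
		return;
	assert(p->refcnt > 0);
	if (--p->refcnt == 0) {
		free(p);
		live_policies--;
	}
}

/* Models kvm_gmem_set_policy(): duplicate, publish, drop the old copy. */
static int gmem_set_policy(struct gmem *g, const struct policy *src)
{
	struct policy *new = policy_new(src->mode);
	struct policy *old;

	if (!new)
		return -ENOMEM;
	old = g->f_policy;
	g->f_policy = new;
	policy_put(old);
	return 0;
}

/*
 * Models the allocation path: pin the policy, use it, unpin it.
 * Returns the mode that would be applied, or -1 when no policy is set.
 */
static int gmem_alloc_mode(struct gmem *g)
{
	struct policy *p = g->f_policy;
	int mode;

	policy_get(p);
	mode = p ? p->mode : -1;
	policy_put(p);
	return mode;
}

/* Models kvm_gmem_release(): drop the reference held by the file. */
static void gmem_release(struct gmem *g)
{
	policy_put(g->f_policy);
	g->f_policy = NULL;
}
```

In this model the file owns exactly one reference to its private copy of the policy, each allocation takes and drops a temporary reference, and the caller's policy is never shared with the file. That matches the reading that kvm_gmem_set_policy() duplicates the policy and does not keep the pointer it was given.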