From patchwork Wed Apr 5 18:01:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ankit Agrawal X-Patchwork-Id: 13202325 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70628C76188 for ; Wed, 5 Apr 2023 18:01:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BA9586B0071; Wed, 5 Apr 2023 14:01:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B58A96B0074; Wed, 5 Apr 2023 14:01:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9D2376B0075; Wed, 5 Apr 2023 14:01:54 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 90B0C6B0071 for ; Wed, 5 Apr 2023 14:01:54 -0400 (EDT) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 46FCE120D36 for ; Wed, 5 Apr 2023 18:01:54 +0000 (UTC) X-FDA: 80648105748.24.4D0B565 Received: from NAM11-CO1-obe.outbound.protection.outlook.com (mail-co1nam11on2059.outbound.protection.outlook.com [40.107.220.59]) by imf02.hostedemail.com (Postfix) with ESMTP id 09FCC80033 for ; Wed, 5 Apr 2023 18:01:50 +0000 (UTC) Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=Nvidia.com header.s=selector2 header.b=AUky1adr; arc=pass ("microsoft.com:s=arcselector9901:i=1"); spf=pass (imf02.hostedemail.com: domain of ankita@nvidia.com designates 40.107.220.59 as permitted sender) smtp.mailfrom=ankita@nvidia.com; dmarc=pass (policy=reject) header.from=nvidia.com ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680717711; 
h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=ixyjebktJREJsrnkZvwgfBAu2SPMa3QRXMAWKUw88R8=; b=0XO74xlu3H/5nUk9I2WiiTNaHKhgEB2N9fHb3xwsKGLm5T7qH3N32LSJVJAHLL4A+2nLQH xzrujbGF63z/g2IVvGIgEVUAoZFSnHk8TsuxxFkywPzIN9zz9cNNi8LpzH6tiYK/9d+4Kh Wq4HU5ICEyy7JJZQSna/g7OZ9QFPOXg= ARC-Authentication-Results: i=2; imf02.hostedemail.com; dkim=pass header.d=Nvidia.com header.s=selector2 header.b=AUky1adr; arc=pass ("microsoft.com:s=arcselector9901:i=1"); spf=pass (imf02.hostedemail.com: domain of ankita@nvidia.com designates 40.107.220.59 as permitted sender) smtp.mailfrom=ankita@nvidia.com; dmarc=pass (policy=reject) header.from=nvidia.com ARC-Seal: i=2; s=arc-20220608; d=hostedemail.com; t=1680717711; a=rsa-sha256; cv=pass; b=XHd77qsPHk13XsNz/IkFS/EXLSsed1KAJrydb7Vk+QE2iwr4nGGe2tCQfvcXPtAd3XZ2rN vIxNEPZV5BKw/1wjoRuELd+No0pxcMBvAZabuZY1lj301lia6kMSsMOIqQhIRn/Zch7HBq wAyt7JrNP4guyjMRIE2BVSyJAyLQ3EM= ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=EjNMwqRdGLxzODoF40zrVzQDI4XjC45frrIYPX89CqRHScyfeBhO37Aw4WIRXDJiMEbw9kAF0OzdzxtNupsh9NI+tEWXhcCwHjDXYr7q4b30T1H1FBKyO5lx6susZokR4ZGoAwTMLWmKZyMwSg6JC3PIGTt0xCzOTRtgbZDJz+0gWEW/R60SAj7OnA0XvVsYx2oE1sW0mAtDkza8a4Dng1oTfOcKNUTAEYzDQX3mGk6LrAzs/1vhBv8FdPfg5cItN89o0droKcVpBO7wb08rnrUv7DSu8fHF0Keh7+VWWxRMWo5mV+Tws2FZ9ShT8LdoQ6GBev6Hj6lGtu6ij64z/A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ixyjebktJREJsrnkZvwgfBAu2SPMa3QRXMAWKUw88R8=; 
b=X2fVm5gI7a+F3xVsXtxPCpbTF7gig3/5sOn8sTBnnbmDucS8JKWJ+33cd8mMgGyop2xU1twW7Q1DbkPitW29xqnwl6iiR3I6W3NC10xH1AXCZynvUT+GM8QSycjseVhUlmOvlmrFL8pTOpC2J5QaN6CRuxtSWPJxPGJ9jm3vFlCU1xJfjgD3h2IHxRR1mPnx5/03dgZL2QUXpAaXBcDQjtZLj4F+cLuOEDvz3mg1NmVbZks+ojdE6c9cLgmNQscDl3yUET83rMhsme6XpJp4MK0dobjvIhBGsUz8s8SAD8hzRcp8B2W+Z63NPeVnn79lSqVSm5qeP73mUXULzf5l5Q== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.232) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ixyjebktJREJsrnkZvwgfBAu2SPMa3QRXMAWKUw88R8=; b=AUky1adre3hnadQ9CFkTpT1NqUR3h5h3O3DB0hxmlprqknWBhJTefWIHvml6SkkfLfbZEA0t2Sh5rjVK14e/GBmQa/1Syp+0/NZJGGIrXV17rMltuTTfgUl/HbH0Pmgse3oZ0NOXpqkgYrYzmyRP0Is7lnXmw1JJfWGT4CaFYndbJBnYGj5M8QGSwOkZoa0/190RFoQ9L1aZNYTs8JwNyTyRIv7t7P/J4N9JNOzBw8Bhi+yYuFlah+Izbwy1GIX7JLS+zdXgntylD6QPDbBxPm8y1CxXQPbtRqcZrQjjbWDpp7j5xxx30wEjfv3X8orbYXDxFs22wNbaiDIsc+pzZg== Received: from MW4PR04CA0252.namprd04.prod.outlook.com (2603:10b6:303:88::17) by DM6PR12MB4169.namprd12.prod.outlook.com (2603:10b6:5:215::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6254.35; Wed, 5 Apr 2023 18:01:46 +0000 Received: from CO1NAM11FT045.eop-nam11.prod.protection.outlook.com (2603:10b6:303:88:cafe::40) by MW4PR04CA0252.outlook.office365.com (2603:10b6:303:88::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6277.29 via Frontend Transport; Wed, 5 Apr 2023 18:01:46 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.232) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass 
(protection.outlook.com: domain of nvidia.com designates 216.228.118.232 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.232; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.232) by CO1NAM11FT045.mail.protection.outlook.com (10.13.175.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6277.30 via Frontend Transport; Wed, 5 Apr 2023 18:01:46 +0000 Received: from drhqmail201.nvidia.com (10.126.190.180) by mail.nvidia.com (10.127.129.5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.5; Wed, 5 Apr 2023 11:01:35 -0700 Received: from drhqmail203.nvidia.com (10.126.190.182) by drhqmail201.nvidia.com (10.126.190.180) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.37; Wed, 5 Apr 2023 11:01:35 -0700 Received: from localhost.localdomain (10.127.8.14) by mail.nvidia.com (10.126.190.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.37 via Frontend Transport; Wed, 5 Apr 2023 11:01:35 -0700 From: To: , , , , , CC: , , , , , , , , , , , , Subject: [PATCH v3 1/6] kvm: determine memory type from VMA Date: Wed, 5 Apr 2023 11:01:29 -0700 Message-ID: <20230405180134.16932-2-ankita@nvidia.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230405180134.16932-1-ankita@nvidia.com> References: <20230405180134.16932-1-ankita@nvidia.com> MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1NAM11FT045:EE_|DM6PR12MB4169:EE_ X-MS-Office365-Filtering-Correlation-Id: 5c10b3d6-9f95-464f-6103-08db35ffd250 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
cD44gJ8CMfj/RP+gPq48x8AgtnGDcgbMvl/HPyIaRDasHrowVD2SMju+1vMCWU1iFpEH0LKsAM67a7/a3vbaP0N5actqFMXfAQ3EMBkp6DuOW0OYeYgdNSREiKtYT7Fgw2IZrhNPIkw2ZNKStVWSM1Cd45p5YHjSNkn3k2Ric701jN6OpmY4nUPJPX4AGOqTXiaQI6qOMi0i419J2B4a7qqCvEM3Ow2TKWHx//0man6rNe9ziwKwa+U+YjHMBStWfPQQgoicpvapStHr/BpwU1yRifkEmGABwc28TFJKc9xgZh8HU1EHTfosgLR/m486x9nhYujtdyrttu+Xz/YIm+weKCrexQ/VieMZy4npAoVj3pLfKvvTjBkIZF3v3A8nXtxlK8KrWUFpUj4gFLjKvj577ln4htyWqlfDTo27JqQfGZCkv63z/99BoipfVANZkgxmD/GLtO+TnyQSYtsz/ApMu/heYKpDA6dERuDNPLiyO67ADfv9BDIAzbnuXXzCgI/DHtPoUZVRtnRBFxQKLOZ6rs7yFRtzhk3TEbj1fLmxMrl3Y5CInutsX5bF5IDZk7E4HSPUFrgEmLaW5IQjR6QLNufm1a90U8c+UEtmMTrSly+gMhWj8slhTsID3Jae3rPpwlf8Wi5LJMkqX6gVPNGk7WF04FLlwsfRl+/0DfTJZvpG1UopDNki5F6HKN1FS53vVDSmq0sND1QBwoNbgVEDq1EEjg6BOUwtPP14TIYgJimOXFqSEBiM9luPj0j4eqiIh9J15NeyRdSyfoBNv27N6ZsYH8Yvgc/rqFXRLkItQ2xsGzcfiuI4gh3meKVBv0vQsIA2GrKd1fbsp9S7rw== X-Forefront-Antispam-Report: CIP:216.228.118.232;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge1.nvidia.com;CAT:NONE;SFS:(13230028)(4636009)(346002)(39860400002)(376002)(136003)(396003)(451199021)(36840700001)(46966006)(40470700004)(54906003)(41300700001)(8936002)(2906002)(316002)(19627235002)(70206006)(5660300002)(4326008)(2876002)(186003)(8676002)(70586007)(478600001)(110136005)(966005)(336012)(6666004)(47076005)(7636003)(356005)(40460700003)(26005)(1076003)(426003)(40480700001)(2616005)(36756003)(83380400001)(82310400005)(82740400003)(86362001)(36860700001);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 05 Apr 2023 18:01:46.3535 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 5c10b3d6-9f95-464f-6103-08db35ffd250 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.232];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: CO1NAM11FT045.eop-nam11.prod.protection.outlook.com 
X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM6PR12MB4169 X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 09FCC80033 X-Rspam-User: X-Stat-Signature: pp4e9qjorsi61r58ing5z71prde1oyng X-HE-Tag: 1680717710-175878 X-HE-Meta: U2FsdGVkX1/5pkgfiBaEKrKH/KcoNjv2pPhPXg1hQMmSgu23wGdGyQ9mU0tae7VOrMoswribY3tl7rZIbF+QBFd9i/e3RUqSnF3qiw81dBDTmIzQJ1UgAB7qZMwHPkqbK66xy3IRvpNycTgB1Ymw3vq26LfZFTbtA3QKHtMOeNXkIYbAT9bXJcSpMJUTUMqTo/uKgsuJ71jYe2abaElMaRxxT0SzAGLFNaIO0eDy24ipXrBL9cLaM265omDGDycmjNhdsszkLVJTIQDTNaOuIFaPG4XlJ/0w3vOAbQiFdm7omxfKFqXucgJUPxY4mzbda/aDUa2as87R353Rgdwcu3NZpQYFhzyfk08sQdtQDzKaU9s3+LJfQZGfjwfc8wCuUa6Jau8/YDJyyHov+t6BQED5vY958wJo/3f33Rb9yb0XZ15WmjawC871Rg+fHBV3OFMkDcDmzgj8N7zt01qbf0ruLlInW3xkgLmJHwmOmL3pvWM8wP98uhBAR2/3ux/sV9woOmoAJ98HGvA3U/fwWEpXX6srdwKiIWOPhGwx1UQvbFBYTddOol3stmhKE/HotMRGYBlUKpilyauNVkwjo+a3bw8bO9wpTCyBU2y32B2bIq+nq9fLB788OAloP42gLt8ptcKP+88RhA2Pmit87qfyKuXO6naNXR465ASaoLjjweL7ROo/MS+eC5pz/3fasx9RbC8VyGC/IjtjHd/eWdTV269Fi4N7VeYWIDWZPpwUzpHJYsbtCZfwwt2y4aDBwyBnmPq71OwpOznluCOa8vjVcvvIaEFzNglIlkuAgSHBGxp5BHjOXDHuGmJYMZ9blic3Go9MCOLZyPlGwvfcqIcREuRQBfBOvNMNAUeELYnaf/9LfyhGsvMfY4xXymztl7Q29eCcJaIrFbKX90qYI3qiZ79UaQZDsenmRUlzcJ+r+dZGftftMSNAGfuBIMhgRk3jB1Rw2/c9og5dAoK VY2AJaaD 4iEQNtnP3K1Qr2EtYKjDMo0tZh/6JQRDbTceXycnhCCaahtfl1QJOI95AZcOQaxh0iWfQwRsvt7OOUxHP8hQfM6yNhL0kyFXAUw5qV2Av2Kz6qj7wgXs+lNhjtJ4kc87gKzVBXS0IX+xpgt2mVpH+jLqF0k8xopaz94cdB8F0JBZfxuLbPSCGmrpJ6K9xkJX2EQrbsQp1pX/lpgncC9dVxPkalv2rs9bM/cKSi1TrVMs43NmJIDzVKb7BChfJr6HjTXKu7laHT3zlY0/ZkFeuybJeW+9toTd4ljjKGnJStfwSXU2dPAZ+gS+gZucLekCXvLas/gKjF5xVJPk= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Ankit Agrawal Each VMA stores the required pgprot for its mappings in vma->vm_page_prot. 
Based on this, we can determine the desired MT_DEVICE_* type for the VMA
directly, rather than guessing with heuristics built on
pfn_is_map_memory().

The following kinds of pgprot are available to userspace, with their
corresponding memory types:

  pgprot_noncached    -> MT_DEVICE_nGnRnE
  pgprot_writecombine -> MT_NORMAL_NC
  pgprot_device       -> MT_DEVICE_nGnRE
  pgprot_tagged       -> MT_NORMAL_TAGGED

Decode the relevant MT_* types in use and translate them into the
corresponding KVM_PGTABLE_PROT_*:

 - MT_DEVICE_nGnRE       -> KVM_PGTABLE_PROT_DEVICE_nGnRE (device)
 - MT_DEVICE_nGnRnE      -> KVM_PGTABLE_PROT_DEVICE_nGnRnE (noncached)
 - MT_NORMAL/_TAGGED/_NC -> 0

The stage-2 attribute encoding of 0 chosen for Device-nGnRnE
(MT_S2_DEVICE_nGnRnE, selected by KVM_PGTABLE_PROT_DEVICE_nGnRnE) is
based on [2].

Also worth noting is how the stage-1 and stage-2 attributes combine [3].
If FWB is not set, the result is the more restrictive of the two. In
order from most to least restrictive:

  DEVICE_nGnRnE -> DEVICE_nGnRE -> NORMAL/_TAGGED/_NC

If FWB is set, the stage-2 mapping type overrides stage-1 [1].

This solves the problem that KVM cannot carry the MT_NORMAL memory type
into the S2 mapping for memory that is not backed by struct page.
Instead, the VMA creator determines the MT type and the S2 mapping
follows it.

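For illustration only (this is not part of the patch), the translation
described above can be sketched in plain user-space C. The MT_* index
values and the position of the AttrIndx field (bits [4:2] of the PTE)
are assumptions that mirror arch/arm64/include/asm/memory.h and
pgtable-hwdef.h around the time of this series; s2_prot_from_pgprot()
is a hypothetical helper name, not a kernel function.

```c
#include <stdint.h>

/* Assumed MAIR indices, mirroring arch/arm64/include/asm/memory.h. */
#define MT_NORMAL		0
#define MT_NORMAL_TAGGED	1
#define MT_NORMAL_NC		2
#define MT_DEVICE_nGnRnE	3
#define MT_DEVICE_nGnRE		4

/* Assumed stage-1 PTE layout: AttrIndx occupies bits [4:2]. */
#define PTE_ATTRINDX_SHIFT	2
#define PTE_ATTRINDX(t)		((uint64_t)(t) << PTE_ATTRINDX_SHIFT)
#define PTE_ATTRINDX_MASK	((uint64_t)7 << PTE_ATTRINDX_SHIFT)

/* Bit positions as introduced by this patch in enum kvm_pgtable_prot. */
#define KVM_PGTABLE_PROT_DEVICE_nGnRE	(1ULL << 3)
#define KVM_PGTABLE_PROT_DEVICE_nGnRnE	(1ULL << 4)

/* Extract the MT_* index from a raw pgprot value. */
unsigned int mapping_type(uint64_t pgprot)
{
	return (unsigned int)((pgprot & PTE_ATTRINDX_MASK) >> PTE_ATTRINDX_SHIFT);
}

/*
 * Hypothetical helper: return the device bits that would be OR-ed into
 * the stage-2 prot for a given VMA pgprot. Normal memory types map to 0.
 */
uint64_t s2_prot_from_pgprot(uint64_t pgprot)
{
	switch (mapping_type(pgprot)) {
	case MT_DEVICE_nGnRE:
		return KVM_PGTABLE_PROT_DEVICE_nGnRE;
	case MT_DEVICE_nGnRnE:
		return KVM_PGTABLE_PROT_DEVICE_nGnRnE;
	default:
		/* MT_NORMAL, MT_NORMAL_TAGGED, MT_NORMAL_NC */
		return 0;
	}
}
```

Only the AttrIndx field takes part in the decision; other pgprot bits
(permissions, shareability) are masked off before the switch.
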
[1] https://developer.arm.com/documentation/102376/0100/Combining-Stage-1-and-Stage-2-attributes [2] ARMv8 reference manual: https://developer.arm.com/documentation/ddi0487/gb/ Section D5.5.3, Table D5-38 [3] ARMv8 reference manual: https://developer.arm.com/documentation/ddi0487/gb/ Table G5-20 on page G5-6330 Signed-off-by: Ankit Agrawal --- arch/arm64/include/asm/kvm_pgtable.h | 8 +++++--- arch/arm64/include/asm/memory.h | 6 ++++-- arch/arm64/kvm/hyp/pgtable.c | 16 +++++++++++----- arch/arm64/kvm/mmu.c | 27 ++++++++++++++++++++++----- 4 files changed, 42 insertions(+), 15 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 4cd6762bda80..d3166b6e6329 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -150,7 +150,8 @@ enum kvm_pgtable_stage2_flags { * @KVM_PGTABLE_PROT_X: Execute permission. * @KVM_PGTABLE_PROT_W: Write permission. * @KVM_PGTABLE_PROT_R: Read permission. - * @KVM_PGTABLE_PROT_DEVICE: Device attributes. + * @KVM_PGTABLE_PROT_DEVICE_nGnRE: Device nGnRE attributes. + * @KVM_PGTABLE_PROT_DEVICE_nGnRnE: Device nGnRnE attributes. * @KVM_PGTABLE_PROT_SW0: Software bit 0. * @KVM_PGTABLE_PROT_SW1: Software bit 1. * @KVM_PGTABLE_PROT_SW2: Software bit 2. 
@@ -161,7 +162,8 @@ enum kvm_pgtable_prot { KVM_PGTABLE_PROT_W = BIT(1), KVM_PGTABLE_PROT_R = BIT(2), - KVM_PGTABLE_PROT_DEVICE = BIT(3), + KVM_PGTABLE_PROT_DEVICE_nGnRE = BIT(3), + KVM_PGTABLE_PROT_DEVICE_nGnRnE = BIT(4), KVM_PGTABLE_PROT_SW0 = BIT(55), KVM_PGTABLE_PROT_SW1 = BIT(56), @@ -178,7 +180,7 @@ enum kvm_pgtable_prot { #define PAGE_HYP KVM_PGTABLE_PROT_RW #define PAGE_HYP_EXEC (KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_X) #define PAGE_HYP_RO (KVM_PGTABLE_PROT_R) -#define PAGE_HYP_DEVICE (PAGE_HYP | KVM_PGTABLE_PROT_DEVICE) +#define PAGE_HYP_DEVICE (PAGE_HYP | KVM_PGTABLE_PROT_DEVICE_nGnRE) typedef bool (*kvm_pgtable_force_pte_cb_t)(u64 addr, u64 end, enum kvm_pgtable_prot prot); diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index 78e5163836a0..4ebbc4b1ba4d 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -147,14 +147,16 @@ * Memory types for Stage-2 translation */ #define MT_S2_NORMAL 0xf +#define MT_S2_DEVICE_nGnRnE 0x0 #define MT_S2_DEVICE_nGnRE 0x1 /* * Memory types for Stage-2 translation when ID_AA64MMFR2_EL1.FWB is 0001 * Stage-2 enforces Normal-WB and Device-nGnRE */ -#define MT_S2_FWB_NORMAL 6 -#define MT_S2_FWB_DEVICE_nGnRE 1 +#define MT_S2_FWB_NORMAL 0x6 +#define MT_S2_FWB_DEVICE_nGnRnE 0x0 +#define MT_S2_FWB_DEVICE_nGnRE 0x1 #ifdef CONFIG_ARM64_4K_PAGES #define IOREMAP_MAX_ORDER (PUD_SHIFT) diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 3d61bd3e591d..7a8238b41590 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -355,7 +355,7 @@ struct hyp_map_data { static int hyp_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep) { - bool device = prot & KVM_PGTABLE_PROT_DEVICE; + bool device = prot & KVM_PGTABLE_PROT_DEVICE_nGnRE; u32 mtype = device ? 
MT_DEVICE_nGnRE : MT_NORMAL; kvm_pte_t attr = FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_ATTRIDX, mtype); u32 sh = KVM_PTE_LEAF_ATTR_LO_S1_SH_IS; @@ -636,14 +636,20 @@ static bool stage2_has_fwb(struct kvm_pgtable *pgt) static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot prot, kvm_pte_t *ptep) { - bool device = prot & KVM_PGTABLE_PROT_DEVICE; - kvm_pte_t attr = device ? KVM_S2_MEMATTR(pgt, DEVICE_nGnRE) : - KVM_S2_MEMATTR(pgt, NORMAL); u32 sh = KVM_PTE_LEAF_ATTR_LO_S2_SH_IS; + kvm_pte_t attr; + + if (prot & KVM_PGTABLE_PROT_DEVICE_nGnRE) + attr = KVM_S2_MEMATTR(pgt, DEVICE_nGnRE); + else if (prot & KVM_PGTABLE_PROT_DEVICE_nGnRnE) + attr = KVM_S2_MEMATTR(pgt, DEVICE_nGnRnE); + else + attr = KVM_S2_MEMATTR(pgt, NORMAL); if (!(prot & KVM_PGTABLE_PROT_X)) attr |= KVM_PTE_LEAF_ATTR_HI_S2_XN; - else if (device) + else if (prot & KVM_PGTABLE_PROT_DEVICE_nGnRE || + prot & KVM_PGTABLE_PROT_DEVICE_nGnRnE) return -EINVAL; if (prot & KVM_PGTABLE_PROT_R) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 7113587222ff..8d63aa951c33 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -897,7 +897,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, int ret = 0; struct kvm_mmu_memory_cache cache = { .gfp_zero = __GFP_ZERO }; struct kvm_pgtable *pgt = kvm->arch.mmu.pgt; - enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_DEVICE | + enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_DEVICE_nGnRE | KVM_PGTABLE_PROT_R | (writable ? KVM_PGTABLE_PROT_W : 0); @@ -1186,6 +1186,15 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma) return vma->vm_flags & VM_MTE_ALLOWED; } +/* + * Determine the memory region cacheability from VMA's pgprot. This + * is used to set the stage 2 PTEs. 
+ */ +static unsigned long mapping_type(pgprot_t page_prot) +{ + return ((pgprot_val(page_prot) & PTE_ATTRINDX_MASK) >> 2); +} + static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, struct kvm_memory_slot *memslot, unsigned long hva, unsigned long fault_status) @@ -1368,10 +1377,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (exec_fault) prot |= KVM_PGTABLE_PROT_X; - if (device) - prot |= KVM_PGTABLE_PROT_DEVICE; - else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC)) - prot |= KVM_PGTABLE_PROT_X; + switch (mapping_type(vma->vm_page_prot)) { + case MT_DEVICE_nGnRE: + prot |= KVM_PGTABLE_PROT_DEVICE_nGnRE; + break; + case MT_DEVICE_nGnRnE: + prot |= KVM_PGTABLE_PROT_DEVICE_nGnRnE; + break; + /* MT_NORMAL/_TAGGED/_NC */ + default: + if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC)) + prot |= KVM_PGTABLE_PROT_X; + } /* * Under the premise of getting a FSC_PERM fault, we just need to relax From patchwork Wed Apr 5 18:01:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ankit Agrawal X-Patchwork-Id: 13202328 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92F07C76188 for ; Wed, 5 Apr 2023 18:02:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 344DF6B007B; Wed, 5 Apr 2023 14:02:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2CC0D6B007D; Wed, 5 Apr 2023 14:02:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1460D6B007E; Wed, 5 Apr 2023 14:02:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 023056B007B for ; Wed, 5 Apr 2023 14:02:19 -0400 (EDT) 
Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id BEE7B12068F for ; Wed, 5 Apr 2023 18:02:18 +0000 (UTC) X-FDA: 80648106756.05.D56D655 Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2071.outbound.protection.outlook.com [40.107.244.71]) by imf25.hostedemail.com (Postfix) with ESMTP id 647C0A000E for ; Wed, 5 Apr 2023 18:02:13 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=Nvidia.com header.s=selector2 header.b=WHK1HL9p; spf=pass (imf25.hostedemail.com: domain of ankita@nvidia.com designates 40.107.244.71 as permitted sender) smtp.mailfrom=ankita@nvidia.com; arc=pass ("microsoft.com:s=arcselector9901:i=1"); dmarc=pass (policy=reject) header.from=nvidia.com ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680717733; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=1rpV84bedB/SLY9kiz3nuV5S5K+BBy2AxYHvceEx63c=; b=5JfDMv6T8y0ACpjUhGyuKN2B97Mr/yHpYnqWVPaSC9nBRUeLc/Ru1yZPXL7FO9suyEWZR8 zD097np/Ck6fGUsLOKNJpkN7Skb5cmxFwLQq3UIHRjvzr6kuTeMoGen8HFo4UeCLdCz07c a4/DD/Twryt6NEffl8VSHigzp1xjIxQ= ARC-Authentication-Results: i=2; imf25.hostedemail.com; dkim=pass header.d=Nvidia.com header.s=selector2 header.b=WHK1HL9p; spf=pass (imf25.hostedemail.com: domain of ankita@nvidia.com designates 40.107.244.71 as permitted sender) smtp.mailfrom=ankita@nvidia.com; arc=pass ("microsoft.com:s=arcselector9901:i=1"); dmarc=pass (policy=reject) header.from=nvidia.com ARC-Seal: i=2; s=arc-20220608; d=hostedemail.com; t=1680717733; a=rsa-sha256; cv=pass; b=uHNxTfvbUswty+XBC8MYsDTYXtSSmVRSf1lB9b307ir4skFDopvn/LrWzux9Yniy3jhyHG CrrpX7kGayhBFplWoBJNaIF7kR5/+pZ6MaT0ImK2SRNUSSSVq6dIpTJfe6+pCkKrjUI2+7 AXCpqrx22/jP91b3hxB5ewIcxkP6zf0= 
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=VLTu+o0ehF/5rWCm6ZElBfQHG27Lbl3t2cb3Gwft/z3v7C8YLBWMwZkdPXZJRah3w1qVkk0F3gNErZW5QnPy7Ig/SMqqui/OCu7AdUUnLJivd6w6p9NKH/3ti53BEt4bUhFGlVcy+Rt0zwt5SdZWhrzRYZi0p4VXYwU4unHZ6gQ5JPtWcNQET3QAH+o5kHQP+Q5kStGFCPEPetTLCXVppPU4EPlHHLk6U9TcOA0G15OWPFgWK3iLlKwIdv8JcZQ89ELuHA4pGqg7BHigodQ78oTIbybeiyYZR5GPLyoLcbIc0JZ0mr0Yq5kU+LDt/F+haOlX3LUpY1GuK5Pta2cHCQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=1rpV84bedB/SLY9kiz3nuV5S5K+BBy2AxYHvceEx63c=; b=WkzRXFzuyW2JOCeD9lmFg228tlh3g0mJmz8FFAV3E4WtYpoPAoSBHLO2QW3b8djsHQ0By44hda33MNWVa4Qpx5tHfaNky3ecX4Xepq8XuZ8YqOjvcnmLFIRjRZeZ+JEd7J4lPOSHsAfZGyawePNhJhQ4vx86MdrOkrxxyM11StPMruzxBeocszp0LpA9uzfwCiHvVmncfDGKakCd6CkL0nkcCT93/P0qnxLE4yfxes18kdLlCkpr+owSXuSePgqfhIlT/yJwxFR1Tt/n3XzXJXWAI6FCOP1tHq4W7Cb5RhRcOJw7lkB7W1kqH5LzXqd65z0/WArrV4OwkTRIuCvM/g== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.233) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=1rpV84bedB/SLY9kiz3nuV5S5K+BBy2AxYHvceEx63c=; b=WHK1HL9pCY9VSpgicLim/6WWnpBhG1Zyh6ZpOCD32ld8EXgqG51l0Xv5hwH9m5MGyfUqSGEC4a7BDrfEEF0dKXrQJeCz7uf24MhaLpJX8aZWdEOKaAn7Meg+ETURFL8YWK2z/yvNtQ+7EqbvrJilw33CPp28L0JHOKfk46WK++cAFnzVJ3yA8JDFkSi94zy+kAgo6eS6rASMbmcBHLWQX7NocWAJveH8RRDcPI26hJiJYNU1JzL+Q04bLhLIc5RtWNrmUAN8UVi8PSgMSgotAa51RIx6RTsmiPu6y3oqlCQSPpfKl604u59Bg8VK56VVlcEVREMyCSZ54keDZXyRXQ== Received: from DM6PR17CA0023.namprd17.prod.outlook.com 
(2603:10b6:5:1b3::36) by MN0PR12MB5929.namprd12.prod.outlook.com (2603:10b6:208:37c::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6254.35; Wed, 5 Apr 2023 18:02:10 +0000 Received: from DM6NAM11FT049.eop-nam11.prod.protection.outlook.com (2603:10b6:5:1b3:cafe::33) by DM6PR17CA0023.outlook.office365.com (2603:10b6:5:1b3::36) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6254.22 via Frontend Transport; Wed, 5 Apr 2023 18:02:10 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.233) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.233 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.233; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.233) by DM6NAM11FT049.mail.protection.outlook.com (10.13.172.188) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6277.28 via Frontend Transport; Wed, 5 Apr 2023 18:02:09 +0000 Received: from drhqmail201.nvidia.com (10.126.190.180) by mail.nvidia.com (10.127.129.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.5; Wed, 5 Apr 2023 11:01:35 -0700 Received: from drhqmail203.nvidia.com (10.126.190.182) by drhqmail201.nvidia.com (10.126.190.180) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.37; Wed, 5 Apr 2023 11:01:35 -0700 Received: from localhost.localdomain (10.127.8.14) by mail.nvidia.com (10.126.190.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.37 via Frontend Transport; Wed, 5 Apr 2023 11:01:35 -0700 From: To: , , , , , CC: , , , , , , , , , , , , Subject: [PATCH v3 2/6] 
vfio/nvgpu: expose GPU device memory as BAR1 Date: Wed, 5 Apr 2023 11:01:30 -0700 Message-ID: <20230405180134.16932-3-ankita@nvidia.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230405180134.16932-1-ankita@nvidia.com> References: <20230405180134.16932-1-ankita@nvidia.com> MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DM6NAM11FT049:EE_|MN0PR12MB5929:EE_ X-MS-Office365-Filtering-Correlation-Id: 68feaf94-b7bb-431a-26b2-08db35ffe060 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 31IPSctlLXlk6/oJbDUTpNFrgGkYnQxIic6BXdjpLqKLO9VAwFMlgIXY4rnzaaxRUdSLlMxVRph0SbLz2mXRKdH3GfcykEcR3tJB8X3NN4BJ0pialKYPyHqSoWdI5erIWM6wyWCEgQD1zpCc3Vz7HkmBb0/392brL/dGt7sKZty0N+9V9nZmYqDAQGL+touu8Gmiu1/etdwVM7RV/8nkyT+f7C3c6+tMyjrJBpXwSKVHi5gcLMFAzksjxiASg6drNndgXw1NAfLC81fbdkiovQkCMnNycCc8ohaxg69DVKZL2NkKFJFZavZJ/OnB0MY/1v8n6QUJCRySbeiCCu/uAhTAJ5DX27xiext5p8dfBU2u6jIfqVyQfa4u608LQJmpnrN2WwOzbnBCYzjJRS86o22RBSFBGr6xD8ZJHuNhiTn53sayJo3dUXyjU68XHzYXEyRC1AucKdG1YRoZaJcsw54ZfaNfa0/NrjLwSeTEGaxwckbUyNLo7QdtFB0wJ2e/ttCuCIW6XkzyafHQuh4U9i/lPAMBoWArWGRmFZ91rS1eyMtV2Ewejgsz/A360xr3bvUZwgEyq43GPKuIF7tSA0evecBsm9xraa6iow8nrguBGGQU0dK5Ih4LPlQu8rZGWozSjiVTnIKwRphtwpwYZxpVmgSZbfEcnyQqlAj9GvSTDAwnwWOsgnpeILX9KtVDM3ohyFlSYMZFA+/pQsNJOlMu+xBui7hruxQ2B9L8j0iTlVezL4xDoLl0iHqy8ttz X-Forefront-Antispam-Report: CIP:216.228.118.233;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge2.nvidia.com;CAT:NONE;SFS:(13230028)(4636009)(346002)(136003)(396003)(376002)(39860400002)(451199021)(36840700001)(46966006)(40470700004)(40480700001)(186003)(6666004)(82310400005)(26005)(8936002)(336012)(5660300002)(2906002)(86362001)(2876002)(30864003)(36860700001)(316002)(4326008)(70586007)(47076005)(1076003)(83380400001)(426003)(70206006)(41300700001)(356005)(82740400003)(110136005)(40460700003)(2616005)(8676002)(54906003)(36756003)(7636003)(478600001);DIR:OUT;SFP:1101; 
X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 05 Apr 2023 18:02:09.8981 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 68feaf94-b7bb-431a-26b2-08db35ffe060 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.233];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: DM6NAM11FT049.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: MN0PR12MB5929 X-Rspamd-Queue-Id: 647C0A000E X-Stat-Signature: ewfw76b8gko8i7qa8p1dp1muij8nz1rk X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1680717733-528996 X-HE-Meta: U2FsdGVkX18Ok5z4/SEYbu0rXDQjRODTKSxG/N+iIEpa5AkvlFNGRxGFi2MRhvEZiV9fa3/Owe4gEGaDYouTBG82Zmmd9gTrIdqry7wGoct5SfF4K29hGWIK5TxEbKUKuJpO92OPBUEAbzDsns694YvYjZ1zlrkZcYTdqWWFuePTlZ+G7ivgPvGpvoX6vH4VMJ0Xffblk3yFercXrsUa6foMeDqNUk38ZhSWv6GiFE9eZvy2GE+qzqPdodHHI8bYoN12f+logbe/Ry4U+N3Vi/HWWIqlSu4ugiCUeYytFjde1z67P2ars6wuoauLcVgRdc0mdKYZN5NXGhHXFVlKUbZuPBU4lEoxRfrDZGuKFfZS9RKWY+Ocp6sWSzor0lKvnrmggVyqfBH8MseZB29Z1WEuA2NsXf64orI9iX+1MuHFm4MN3u34DFZjpTAL+qqfhGkkDveGkQY1reU+5d1tpvcCUSz2PZMPNylg9nFMTPbEqh38tgLwuzaY+c47XB5H298BHX0l6B/kM74EY7AyJ2hPbmcwjohzeE13vDsMnnB97GFQsc1TcbJGsbuZQjbDBLntLJ0rexiET9+crzb9G9eSIcnK+AJHfTigUFwJoRaFPr0wWBLZZ9smQNNN2HIIEp/UnvwZKXnM4hn9dvillBRwZ0qtKyTMn8jKaCkRohjIpgbak/JT5RWEQgp4qG6nPooNajj53xLWDIt+uQeAWI1TW3ke3gm/nMtWMx17PWJ0F18KXg2EpqAWYPjO/5NJ45Wh/KKscq8h5dfjc/y4FmGHIqwRMb5XP0pi8ikM+YyRxv5+aGHfRPbQBkVWNAng9BUSijab3sOALLjvDSGbKee73cD0CRz19b0flYMVAHgb1m/nIgVcT4mggyBxEzcvNu6NRzIHL141n5NsR6qb4R9Vb66tAHptbj12I+N9hU8i1UlS6wUm2kw9j3oe80v00NjA+TqZKRNVVZ+bnYZ iMba2PKl 
TUTskdoPinejyk/wcEEFY3wT3P80tZ0ygcN25IvMFijtXXTb+m47QWoQ0/wi48vMlGfaRuqtqJSjinG8Kg5icGvfD0OBrfVjS98mqE2SrhzxM93cWfQIZ00nRe8SRV2IFObAyFpCNzUmlJBgCy8NFOfLvMzDz0cRgEYv3V7AQBsT3c+HteDDPm0LoT1QI4F7vuDg0EsIGGU9DXeh/gnj4isoIrgx5UoRWlGhkRE58XSQFjDLO0PivysY3q3Eidb+NplIRxqTyVY5AXBkqTDsT7Ciyuw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Ankit Agrawal The NVIDIA Grace Hopper superchip does not model the coherent GPU memory aperture as a PCI config space BAR. Introduce an in-tree VFIO PCI variant module (nvgpu-vfio-pci) to expose the GPU memory to userspace as BAR1. The GPU memory size and physical address are obtained from ACPI using device_property_read_u64() and exported to userspace as a VFIO region. QEMU will naturally generate a PCI device in the VM where the cacheable aperture is reported in BAR1. QEMU can fetch the region information and mmap it. The mmap call is handled by the mmap() callback of the nvgpu-vfio-pci module, which establishes the mapping to the GPU memory using remap_pfn_range(). 
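For illustration only (this is not the driver code), the offset
decoding and bounds check that such an mmap() handler needs can be
sketched in user-space C. PAGE_SHIFT = 12 and VFIO_PCI_OFFSET_SHIFT = 40
are assumptions matching a 4K-page kernel and the vfio-pci core at the
time of this series; struct bar1_window and the three helper names are
hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT		12	/* assumption: 4K pages */
#define VFIO_PCI_OFFSET_SHIFT	40	/* assumption: vfio-pci core value */
#define REGION_PGOFF_BITS	(VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)

/* Hypothetical stand-in for the hpa/mem_length pair read from ACPI. */
struct bar1_window {
	uint64_t hpa;		/* host physical base of the GPU memory */
	uint64_t mem_length;	/* size of the GPU memory in bytes */
};

/* The region index lives in the upper bits of the mmap page offset. */
unsigned int region_index(uint64_t vm_pgoff)
{
	return (unsigned int)(vm_pgoff >> REGION_PGOFF_BITS);
}

/* The page offset within the region lives in the lower bits. */
uint64_t region_pgoff(uint64_t vm_pgoff)
{
	return vm_pgoff & ((1ULL << REGION_PGOFF_BITS) - 1);
}

/*
 * Compute the first PFN to hand to a remap_pfn_range()-style call.
 * Returns 0 on success, -1 if the request starts or ends outside the
 * GPU memory window.
 */
int bar1_first_pfn(const struct bar1_window *w, uint64_t vm_pgoff,
		   uint64_t req_len, uint64_t *pfn)
{
	uint64_t pgoff = region_pgoff(vm_pgoff);
	uint64_t npages = w->mem_length >> PAGE_SHIFT;
	uint64_t req_pages = (req_len + (1ULL << PAGE_SHIFT) - 1) >> PAGE_SHIFT;

	if (pgoff >= npages || req_pages > npages - pgoff)
		return -1;

	*pfn = (w->hpa >> PAGE_SHIFT) + pgoff;
	return 0;
}
```

Writing the length check as req_pages > npages - pgoff, after pgoff has
been validated, avoids an overflow that pgoff + req_pages could hit with
hostile inputs.
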

Signed-off-by: Ankit Agrawal
---
 MAINTAINERS                     |   6 +
 drivers/vfio/pci/Kconfig        |   2 +
 drivers/vfio/pci/Makefile       |   2 +
 drivers/vfio/pci/nvgpu/Kconfig  |  10 ++
 drivers/vfio/pci/nvgpu/Makefile |   3 +
 drivers/vfio/pci/nvgpu/main.c   | 255 ++++++++++++++++++++++++++++++++
 6 files changed, 278 insertions(+)
 create mode 100644 drivers/vfio/pci/nvgpu/Kconfig
 create mode 100644 drivers/vfio/pci/nvgpu/Makefile
 create mode 100644 drivers/vfio/pci/nvgpu/main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 1dc8bd26b6cf..6b48756c30d3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -21954,6 +21954,12 @@ L:	kvm@vger.kernel.org
 S:	Maintained
 F:	drivers/vfio/pci/mlx5/
 
+VFIO NVIDIA PCI DRIVER
+M:	Ankit Agrawal
+L:	kvm@vger.kernel.org
+S:	Maintained
+F:	drivers/vfio/pci/nvgpu/
+
 VGA_SWITCHEROO
 R:	Lukas Wunner
 S:	Maintained
diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index f9d0c908e738..ade18b0ffb7b 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -59,4 +59,6 @@ source "drivers/vfio/pci/mlx5/Kconfig"
 
 source "drivers/vfio/pci/hisilicon/Kconfig"
 
+source "drivers/vfio/pci/nvgpu/Kconfig"
+
 endif
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 24c524224da5..0c93d452d0da 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -11,3 +11,5 @@ obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
 obj-$(CONFIG_MLX5_VFIO_PCI) += mlx5/
 
 obj-$(CONFIG_HISI_ACC_VFIO_PCI) += hisilicon/
+
+obj-$(CONFIG_NVGPU_VFIO_PCI) += nvgpu/
diff --git a/drivers/vfio/pci/nvgpu/Kconfig b/drivers/vfio/pci/nvgpu/Kconfig
new file mode 100644
index 000000000000..066f764f7c5f
--- /dev/null
+++ b/drivers/vfio/pci/nvgpu/Kconfig
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config NVGPU_VFIO_PCI
+	tristate "VFIO support for the GPU in the NVIDIA Grace Hopper Superchip"
+	depends on ARM64 || (COMPILE_TEST && 64BIT)
+	select VFIO_PCI_CORE
+	help
+	  VFIO support for the GPU in the NVIDIA Grace Hopper Superchip is
+	  required to assign the GPU device to a VM using KVM/qemu/etc.
+
+	  If you don't know what to do here, say N.
diff --git a/drivers/vfio/pci/nvgpu/Makefile b/drivers/vfio/pci/nvgpu/Makefile
new file mode 100644
index 000000000000..00fd3a078218
--- /dev/null
+++ b/drivers/vfio/pci/nvgpu/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_NVGPU_VFIO_PCI) += nvgpu-vfio-pci.o
+nvgpu-vfio-pci-y := main.o
diff --git a/drivers/vfio/pci/nvgpu/main.c b/drivers/vfio/pci/nvgpu/main.c
new file mode 100644
index 000000000000..2dd8cc6e0145
--- /dev/null
+++ b/drivers/vfio/pci/nvgpu/main.c
@@ -0,0 +1,255 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#include
+#include
+
+#define DUMMY_PFN \
+	(((nvdev->mem_prop.hpa + nvdev->mem_prop.mem_length) >> PAGE_SHIFT) - 1)
+
+struct dev_mem_properties {
+	uint64_t hpa;
+	uint64_t mem_length;
+	int bar1_start_offset;
+};
+
+struct nvgpu_vfio_pci_core_device {
+	struct vfio_pci_core_device core_device;
+	struct dev_mem_properties mem_prop;
+};
+
+static int vfio_get_bar1_start_offset(struct vfio_pci_core_device *vdev)
+{
+	u8 val = 0;
+
+	pci_read_config_byte(vdev->pdev, 0x10, &val);
+	/*
+	 * The BAR1 start offset in the PCI config space depends on the BAR0 size.
+	 * Check if the BAR0 is 64b and return the appropriate BAR1 offset.
+	 */
+	if (val & PCI_BASE_ADDRESS_MEM_TYPE_64)
+		return VFIO_PCI_BAR2_REGION_INDEX;
+
+	return VFIO_PCI_BAR1_REGION_INDEX;
+}
+
+static int nvgpu_vfio_pci_open_device(struct vfio_device *core_vdev)
+{
+	struct nvgpu_vfio_pci_core_device *nvdev = container_of(
+		core_vdev, struct nvgpu_vfio_pci_core_device, core_device.vdev);
+	struct vfio_pci_core_device *vdev =
+		container_of(core_vdev, struct vfio_pci_core_device, vdev);
+	int ret;
+
+	ret = vfio_pci_core_enable(vdev);
+	if (ret)
+		return ret;
+
+	vfio_pci_core_finish_enable(vdev);
+
+	nvdev->mem_prop.bar1_start_offset = vfio_get_bar1_start_offset(vdev);
+
+	return ret;
+}
+
+int nvgpu_vfio_pci_mmap(struct vfio_device *core_vdev,
+			struct vm_area_struct *vma)
+{
+	struct nvgpu_vfio_pci_core_device *nvdev = container_of(
+		core_vdev, struct nvgpu_vfio_pci_core_device, core_device.vdev);
+
+	unsigned long start_pfn;
+	unsigned int index;
+	u64 req_len, pgoff;
+	int ret = 0;
+
+	index = vma->vm_pgoff >> (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT);
+	if (index != nvdev->mem_prop.bar1_start_offset)
+		return vfio_pci_core_mmap(core_vdev, vma);
+
+	/*
+	 * Request to mmap the BAR1. Map to the CPU accessible memory on the
+	 * GPU using the memory information gathered from the system ACPI
+	 * tables.
+	 */
+	start_pfn = nvdev->mem_prop.hpa >> PAGE_SHIFT;
+	req_len = vma->vm_end - vma->vm_start;
+	pgoff = vma->vm_pgoff &
+		((1U << (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
+	if (pgoff >= (nvdev->mem_prop.mem_length >> PAGE_SHIFT))
+		return -EINVAL;
+
+	/*
+	 * Perform a PFN map to the memory. The device BAR1 is backed by the
+	 * GPU memory now. Check that the mapping does not overflow out of
+	 * the GPU memory size.
+	 */
+	ret = remap_pfn_range(vma, vma->vm_start, start_pfn + pgoff,
+			      min(req_len, nvdev->mem_prop.mem_length - (pgoff << PAGE_SHIFT)),
+			      vma->vm_page_prot);
+	if (ret)
+		return ret;
+
+	vma->vm_pgoff = start_pfn + pgoff;
+
+	return 0;
+}
+
+long nvgpu_vfio_pci_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
+			  unsigned long arg)
+{
+	struct nvgpu_vfio_pci_core_device *nvdev = container_of(
+		core_vdev, struct nvgpu_vfio_pci_core_device, core_device.vdev);
+
+	unsigned long minsz = offsetofend(struct vfio_region_info, offset);
+	struct vfio_region_info info;
+
+	switch (cmd) {
+	case VFIO_DEVICE_GET_REGION_INFO:
+		if (copy_from_user(&info, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (info.argsz < minsz)
+			return -EINVAL;
+
+		if (info.index == nvdev->mem_prop.bar1_start_offset) {
+			/*
+			 * Request to determine the BAR1 region information. Send the
+			 * GPU memory information.
+			 */
+			info.offset = VFIO_PCI_INDEX_TO_OFFSET(info.index);
+			info.size = nvdev->mem_prop.mem_length;
+			info.flags = VFIO_REGION_INFO_FLAG_READ |
+				     VFIO_REGION_INFO_FLAG_WRITE |
+				     VFIO_REGION_INFO_FLAG_MMAP;
+			return copy_to_user((void __user *)arg, &info, minsz) ?
+				       -EFAULT : 0;
+		}
+
+		if (info.index == nvdev->mem_prop.bar1_start_offset + 1) {
+			/*
+			 * The BAR1 region is 64b. Ignore this access.
+			 */
+			info.offset = VFIO_PCI_INDEX_TO_OFFSET(info.index);
+			info.size = 0;
+			info.flags = 0;
+			return copy_to_user((void __user *)arg, &info, minsz) ?
+				       -EFAULT : 0;
+		}
+
+		return vfio_pci_core_ioctl(core_vdev, cmd, arg);
+
+	default:
+		return vfio_pci_core_ioctl(core_vdev, cmd, arg);
+	}
+}
+
+static const struct vfio_device_ops nvgpu_vfio_pci_ops = {
+	.name = "nvgpu-vfio-pci",
+	.init = vfio_pci_core_init_dev,
+	.release = vfio_pci_core_release_dev,
+	.open_device = nvgpu_vfio_pci_open_device,
+	.close_device = vfio_pci_core_close_device,
+	.ioctl = nvgpu_vfio_pci_ioctl,
+	.read = vfio_pci_core_read,
+	.write = vfio_pci_core_write,
+	.mmap = nvgpu_vfio_pci_mmap,
+	.request = vfio_pci_core_request,
+	.match = vfio_pci_core_match,
+	.bind_iommufd = vfio_iommufd_physical_bind,
+	.unbind_iommufd = vfio_iommufd_physical_unbind,
+	.attach_ioas = vfio_iommufd_physical_attach_ioas,
+};
+
+static struct nvgpu_vfio_pci_core_device *nvgpu_drvdata(struct pci_dev *pdev)
+{
+	struct vfio_pci_core_device *core_device = dev_get_drvdata(&pdev->dev);
+
+	return container_of(core_device, struct nvgpu_vfio_pci_core_device,
+			    core_device);
+}
+
+static int
+nvgpu_vfio_pci_fetch_memory_property(struct pci_dev *pdev,
+				     struct nvgpu_vfio_pci_core_device *nvdev)
+{
+	int ret = 0;
+
+	/*
+	 * The memory information is present in the system ACPI tables as DSD
+	 * properties nvidia,gpu-mem-base-pa and nvidia,gpu-mem-size.
+	 */
+	ret = device_property_read_u64(&(pdev->dev), "nvidia,gpu-mem-base-pa",
+				       &(nvdev->mem_prop.hpa));
+	if (ret)
+		return ret;
+
+	ret = device_property_read_u64(&(pdev->dev), "nvidia,gpu-mem-size",
+				       &(nvdev->mem_prop.mem_length));
+	return ret;
+}
+
+static int nvgpu_vfio_pci_probe(struct pci_dev *pdev,
+				const struct pci_device_id *id)
+{
+	struct nvgpu_vfio_pci_core_device *nvdev;
+	int ret;
+
+	nvdev = vfio_alloc_device(nvgpu_vfio_pci_core_device, core_device.vdev,
+				  &pdev->dev, &nvgpu_vfio_pci_ops);
+	if (IS_ERR(nvdev))
+		return PTR_ERR(nvdev);
+
+	dev_set_drvdata(&pdev->dev, nvdev);
+
+	ret = nvgpu_vfio_pci_fetch_memory_property(pdev, nvdev);
+	if (ret)
+		goto out_put_vdev;
+
+	ret = vfio_pci_core_register_device(&nvdev->core_device);
+	if (ret)
+		goto out_put_vdev;
+
+	return ret;
+
+out_put_vdev:
+	vfio_put_device(&nvdev->core_device.vdev);
+	return ret;
+}
+
+static void nvgpu_vfio_pci_remove(struct pci_dev *pdev)
+{
+	struct nvgpu_vfio_pci_core_device *nvdev = nvgpu_drvdata(pdev);
+	struct vfio_pci_core_device *vdev = &nvdev->core_device;
+
+	vfio_pci_core_unregister_device(vdev);
+	vfio_put_device(&vdev->vdev);
+}
+
+static const struct pci_device_id nvgpu_vfio_pci_table[] = {
+	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_VENDOR_ID_NVIDIA, 0x2342) },
+	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_VENDOR_ID_NVIDIA, 0x2343) },
+	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_VENDOR_ID_NVIDIA, 0x2345) },
+	{}
+};
+
+MODULE_DEVICE_TABLE(pci, nvgpu_vfio_pci_table);
+
+static struct pci_driver nvgpu_vfio_pci_driver = {
+	.name = KBUILD_MODNAME,
+	.id_table = nvgpu_vfio_pci_table,
+	.probe = nvgpu_vfio_pci_probe,
+	.remove = nvgpu_vfio_pci_remove,
+	.err_handler = &vfio_pci_core_err_handlers,
+	.driver_managed_dma = true,
+};
+
+module_pci_driver(nvgpu_vfio_pci_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Ankit Agrawal ");
+MODULE_AUTHOR("Aniket Agashe ");
+MODULE_DESCRIPTION(
+	"VFIO NVGPU PF - User Level driver for NVIDIA devices with CPU coherently accessible device memory");

From patchwork Wed Apr 5 18:01:31 2023
X-Patchwork-Submitter: Ankit Agrawal
X-Patchwork-Id: 13202329
Subject: [PATCH v3 3/6] mm: handle poisoning of pfn without struct pages
Date: Wed, 5 Apr 2023 11:01:31 -0700
Message-ID: <20230405180134.16932-4-ankita@nvidia.com>
In-Reply-To: <20230405180134.16932-1-ankita@nvidia.com>
References: <20230405180134.16932-1-ankita@nvidia.com>
From: Ankit Agrawal

The kernel MM does not currently handle ECC errors / poison on a memory
region that is not backed by struct pages. In this series, QEMU's mapping
request for the device memory is carried out with remap_pfn_range(), so
add a new mechanism to handle memory failure on such memory.

Make kernel MM expose a function that lets modules managing device memory
register a failure function and the address space associated with that
memory. MM maintains this information as an interval tree. MM uses the
registered memory failure function to notify the module of the poisoned
PFN so that the module can take any required action; for example, the
module may use the information to track the poisoned pages.

In this implementation, kernel MM follows a sequence that is (mostly)
similar to the memory_failure() handler for struct page backed memory:
1. memory_failure() is triggered on reception of a poison error. The
   absence of a struct page is detected and, consequently,
   memory_failure_pfn() is executed.
2. memory_failure_pfn() calls the newly introduced failure handler exposed
   by the module managing the poisoned memory to notify it of the
   problematic PFN.
3. memory_failure_pfn() unmaps the stage-2 mapping to the PFN.
4. memory_failure_pfn() collects the processes mapped to the PFN.
5. memory_failure_pfn() sends SIGBUS (BUS_MCEERR_AO) to all the processes
   mapping the faulty PFN using kill_procs().
6. A later access to the faulty PFN by an operation in the VM is trapped
   and user_mem_abort() is called.
7. user_mem_abort() calls __gfn_to_pfn_memslot() on the PFN, and the
   following execution path is followed: __gfn_to_pfn_memslot() ->
   hva_to_pfn() -> hva_to_pfn_remapped() -> fixup_user_fault() ->
   handle_mm_fault() -> handle_pte_fault() -> do_fault(). do_fault() is
   expected to return VM_FAULT_HWPOISON on the PFN (it currently does not;
   this is fixed by another patch in the series).
8. __gfn_to_pfn_memslot() then returns KVM_PFN_ERR_HWPOISON, which causes
   the poison to be reported to the QEMU process with SIGBUS
   (BUS_MCEERR_AR) through kvm_send_hwpoison_signal().
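
The registration and notification flow in steps 1 and 2 can be modelled in
userspace roughly as follows. This is only a sketch under stated assumptions:
it uses a linked list instead of the kernel interval tree, rejects overlapping
ranges itself rather than through request_mem_region(), does no locking, and
every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_pfn_space;

/* Hypothetical stand-in for the per-owner failure callback table. */
struct sketch_pfn_space_ops {
	void (*failure)(struct sketch_pfn_space *space, unsigned long pfn);
};

struct sketch_pfn_space {
	unsigned long start;		/* first PFN, inclusive */
	unsigned long last;		/* last PFN, inclusive */
	const struct sketch_pfn_space_ops *ops;
	struct sketch_pfn_space *next;	/* list link; the patch uses a tree */
	unsigned long poisoned_count;	/* example of owner-side tracking */
};

static struct sketch_pfn_space *sketch_spaces;

/* Register a range; returns -1 if it overlaps an already registered one. */
static int sketch_register(struct sketch_pfn_space *space)
{
	struct sketch_pfn_space *it;

	assert(space && space->ops && space->ops->failure);

	for (it = sketch_spaces; it; it = it->next)
		if (space->start <= it->last && it->start <= space->last)
			return -1;

	space->next = sketch_spaces;
	sketch_spaces = space;
	return 0;
}

/* Notify every registered owner whose range contains the poisoned PFN. */
static int sketch_memory_failure_pfn(unsigned long pfn)
{
	struct sketch_pfn_space *it;
	int rc = -1;	/* stays -1 when no owner claims the PFN */

	for (it = sketch_spaces; it; it = it->next) {
		if (pfn < it->start || pfn > it->last)
			continue;
		it->ops->failure(it, pfn);
		rc = 0;
	}
	return rc;
}

/* Example callback: the owner simply counts poisoned PFNs. */
static void sketch_count_failure(struct sketch_pfn_space *space,
				 unsigned long pfn)
{
	(void)pfn;
	space->poisoned_count++;
}
```

The sketch covers only the lookup and the callback; unmapping the PFN and
signalling the processes that map it (steps 3 to 5) are left out.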

Signed-off-by: Ankit Agrawal
---
 include/linux/memory-failure.h |  22 +++++
 include/linux/mm.h             |   1 +
 include/ras/ras_event.h        |   1 +
 mm/memory-failure.c            | 148 +++++++++++++++++++++++++++++----
 4 files changed, 154 insertions(+), 18 deletions(-)
 create mode 100644 include/linux/memory-failure.h

diff --git a/include/linux/memory-failure.h b/include/linux/memory-failure.h
new file mode 100644
index 000000000000..9a579960972a
--- /dev/null
+++ b/include/linux/memory-failure.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_MEMORY_FAILURE_H
+#define _LINUX_MEMORY_FAILURE_H
+
+#include
+
+struct pfn_address_space;
+
+struct pfn_address_space_ops {
+	void (*failure)(struct pfn_address_space *pfn_space, unsigned long pfn);
+};
+
+struct pfn_address_space {
+	struct interval_tree_node node;
+	const struct pfn_address_space_ops *ops;
+	struct address_space *mapping;
+};
+
+int register_pfn_address_space(struct pfn_address_space *pfn_space);
+void unregister_pfn_address_space(struct pfn_address_space *pfn_space);
+
+#endif /* _LINUX_MEMORY_FAILURE_H */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1f79667824eb..e3ef52d3d45a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3530,6 +3530,7 @@ enum mf_action_page_type {
 	MF_MSG_BUDDY,
 	MF_MSG_DAX,
 	MF_MSG_UNSPLIT_THP,
+	MF_MSG_PFN,
 	MF_MSG_UNKNOWN,
 };
 
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index cbd3ddd7c33d..5c62a4d17172 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -373,6 +373,7 @@ TRACE_EVENT(aer_event,
 	EM ( MF_MSG_BUDDY, "free buddy page" ) \
 	EM ( MF_MSG_DAX, "dax page" ) \
 	EM ( MF_MSG_UNSPLIT_THP, "unsplit thp" ) \
+	EM ( MF_MSG_PFN, "non struct page pfn" ) \
 	EMe ( MF_MSG_UNKNOWN, "unknown page" )
 
 /*
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index fae9baf3be16..2c1a2ec42f7b 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -38,6 +38,7 @@
 
 #include
 #include
+#include
 #include
 #include
 #include
@@ -62,6 +63,7 @@
 #include
 #include
 #include
+#include
 #include "swap.h"
 #include "internal.h"
 #include "ras/ras_event.h"
@@ -122,6 +124,10 @@ const struct attribute_group memory_failure_attr_group = {
 	.attrs = memory_failure_attr,
 };
 
+static struct rb_root_cached pfn_space_itree = RB_ROOT_CACHED;
+
+static DEFINE_MUTEX(pfn_space_lock);
+
 /*
  * Return values:
  *   1:   the page is dissolved (if needed) and taken off from buddy,
@@ -399,15 +405,14 @@ static unsigned long dev_pagemap_mapping_shift(struct vm_area_struct *vma,
  * Schedule a process for later kill.
  * Uses GFP_ATOMIC allocations to avoid potential recursions in the VM.
  *
- * Note: @fsdax_pgoff is used only when @p is a fsdax page and a
- *   filesystem with a memory failure handler has claimed the
- *   memory_failure event. In all other cases, page->index and
- *   page->mapping are sufficient for mapping the page back to its
+ * Notice: @pgoff is used either when @p is a fsdax page or a PFN is not
+ *   backed by struct page and a filesystem with a memory failure handler
+ *   has claimed the memory_failure event. In all other cases, page->index
+ *   and page->mapping are sufficient for mapping the page back to its
  *   corresponding user virtual address.
  */
-static void add_to_kill(struct task_struct *tsk, struct page *p,
-			pgoff_t fsdax_pgoff, struct vm_area_struct *vma,
-			struct list_head *to_kill)
+static void add_to_kill(struct task_struct *tsk, struct page *p, pgoff_t pgoff,
+			struct vm_area_struct *vma, struct list_head *to_kill)
 {
 	struct to_kill *tk;
 
@@ -417,13 +422,20 @@ static void add_to_kill(struct task_struct *tsk, struct page *p,
 		return;
 	}
 
-	tk->addr = page_address_in_vma(p, vma);
-	if (is_zone_device_page(p)) {
-		if (fsdax_pgoff != FSDAX_INVALID_PGOFF)
-			tk->addr = vma_pgoff_address(fsdax_pgoff, 1, vma);
+	if (vma->vm_flags & VM_PFNMAP) {
+		tk->addr =
+			vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
+		tk->size_shift = PAGE_SHIFT;
+	} else if (is_zone_device_page(p)) {
+		if (pgoff != FSDAX_INVALID_PGOFF)
+			tk->addr = vma_pgoff_address(pgoff, 1, vma);
+		else
+			tk->addr = page_address_in_vma(p, vma);
 		tk->size_shift = dev_pagemap_mapping_shift(vma, tk->addr);
-	} else
+	} else {
+		tk->addr = page_address_in_vma(p, vma);
 		tk->size_shift = page_shift(compound_head(p));
+	}
 
 	/*
 	 * Send SIGKILL if "tk->addr == -EFAULT". Also, as
@@ -617,13 +629,12 @@ static void collect_procs_file(struct page *page, struct list_head *to_kill,
 	i_mmap_unlock_read(mapping);
 }
 
-#ifdef CONFIG_FS_DAX
 /*
  * Collect processes when the error hit a fsdax page.
  */
-static void collect_procs_fsdax(struct page *page,
-		struct address_space *mapping, pgoff_t pgoff,
-		struct list_head *to_kill)
+static void collect_procs_pgoff(struct page *page,
+				struct address_space *mapping, pgoff_t pgoff,
+				struct list_head *to_kill)
 {
 	struct vm_area_struct *vma;
 	struct task_struct *tsk;
@@ -643,7 +654,6 @@ static void collect_procs_fsdax(struct page *page,
 	read_unlock(&tasklist_lock);
 	i_mmap_unlock_read(mapping);
 }
-#endif /* CONFIG_FS_DAX */
 
 /*
  * Collect the processes who have the corrupted page mapped to kill.
@@ -835,6 +845,7 @@ static const char * const action_page_types[] = {
 	[MF_MSG_BUDDY] = "free buddy page",
 	[MF_MSG_DAX] = "dax page",
 	[MF_MSG_UNSPLIT_THP] = "unsplit thp",
+	[MF_MSG_PFN] = "non struct page pfn",
 	[MF_MSG_UNKNOWN] = "unknown page",
 };
 
@@ -1745,7 +1756,7 @@ int mf_dax_kill_procs(struct address_space *mapping, pgoff_t index,
 
 		SetPageHWPoison(page);
 
-		collect_procs_fsdax(page, mapping, index, &to_kill);
+		collect_procs_pgoff(page, mapping, index, &to_kill);
 		unmap_and_kill(&to_kill, page_to_pfn(page), mapping,
 				index, mf_flags);
 unlock:
@@ -2052,6 +2063,99 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
 	return rc;
 }
 
+/**
+ * register_pfn_address_space - Register PA region for poison notification.
+ * @pfn_space: structure containing region range and callback function on
+ *             poison detection.
+ *
+ * This function is called by a kernel module to register a PA region and
+ * a callback function with the kernel. On detection of poison, the
+ * kernel code will go through all registered regions and call the
+ * appropriate callback function associated with the range. The kernel
+ * module is responsible for tracking the poisoned pages.
+ *
+ * Return: 0 if successfully registered,
+ *         -EBUSY if the region is already registered
+ */
+int register_pfn_address_space(struct pfn_address_space *pfn_space)
+{
+	if (!request_mem_region(pfn_space->node.start << PAGE_SHIFT,
+		(pfn_space->node.last - pfn_space->node.start + 1) << PAGE_SHIFT, ""))
+		return -EBUSY;
+
+	mutex_lock(&pfn_space_lock);
+	interval_tree_insert(&pfn_space->node, &pfn_space_itree);
+	mutex_unlock(&pfn_space_lock);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(register_pfn_address_space);
+
+/**
+ * unregister_pfn_address_space - Unregister a PA region from poison
+ * notification.
+ * @pfn_space: structure containing region range to be unregistered.
+ *
+ * This function is called by a kernel module to unregister the PA region
+ * from the kernel's poison tracking.
+ */
+void unregister_pfn_address_space(struct pfn_address_space *pfn_space)
+{
+	mutex_lock(&pfn_space_lock);
+	interval_tree_remove(&pfn_space->node, &pfn_space_itree);
+	mutex_unlock(&pfn_space_lock);
+	release_mem_region(pfn_space->node.start << PAGE_SHIFT,
+		(pfn_space->node.last - pfn_space->node.start + 1) << PAGE_SHIFT);
+}
+EXPORT_SYMBOL_GPL(unregister_pfn_address_space);
+
+static int memory_failure_pfn(unsigned long pfn, int flags)
+{
+	struct interval_tree_node *node;
+	int rc = -EBUSY;
+	LIST_HEAD(tokill);
+
+	mutex_lock(&pfn_space_lock);
+	/*
+	 * Modules register with MM the address space mapping to the device memory they
+	 * manage. Iterate to identify exactly which address space has mapped to this
+	 * failing PFN.
+	 */
+	for (node = interval_tree_iter_first(&pfn_space_itree, pfn, pfn); node;
+	     node = interval_tree_iter_next(node, pfn, pfn)) {
+		struct pfn_address_space *pfn_space =
+			container_of(node, struct pfn_address_space, node);
+		rc = 0;
+
+		/*
+		 * Modules managing the device memory need to be notified of the
+		 * memory failure so that the poisoned PFN can be tracked.
+		 */
+		pfn_space->ops->failure(pfn_space, pfn);
+
+		collect_procs_pgoff(NULL, pfn_space->mapping, pfn, &tokill);
+
+		unmap_mapping_range(pfn_space->mapping, pfn << PAGE_SHIFT,
+				    PAGE_SIZE, 0);
+	}
+	mutex_unlock(&pfn_space_lock);
+
+	/*
+	 * Unlike System-RAM there is no possibility to swap in a different
+	 * physical page at a given virtual address, so all userspace
+	 * consumption of direct PFN memory necessitates SIGBUS (i.e.
+	 * MF_MUST_KILL)
+	 */
+	flags |= MF_ACTION_REQUIRED | MF_MUST_KILL;
+	kill_procs(&tokill, true, false, pfn, flags);
+
+	pr_err("%#lx: recovery action for %s: %s\n",
+		pfn, action_page_types[MF_MSG_PFN],
+		action_name[rc ? MF_FAILED : MF_RECOVERED]);
+
+	return rc;
+}
+
 static DEFINE_MUTEX(mf_mutex);
 
 /**
@@ -2093,6 +2197,11 @@ int memory_failure(unsigned long pfn, int flags)
 	if (!(flags & MF_SW_SIMULATED))
 		hw_memory_failure = true;
 
+	if (!pfn_valid(pfn) && !arch_is_platform_page(PFN_PHYS(pfn))) {
+		res = memory_failure_pfn(pfn, flags);
+		goto unlock_mutex;
+	}
+
 	p = pfn_to_online_page(pfn);
 	if (!p) {
 		res = arch_memory_failure(pfn, flags);
@@ -2106,6 +2215,9 @@ int memory_failure(unsigned long pfn, int flags)
 								 pgmap);
 				goto unlock_mutex;
 			}
+
+			res = memory_failure_pfn(pfn, flags);
+			goto unlock_mutex;
 		}
 		pr_err("%#lx: memory outside kernel control\n", pfn);
 		res = -ENXIO;

From patchwork Wed Apr 5 18:01:32 2023
X-Patchwork-Submitter: Ankit Agrawal
X-Patchwork-Id: 13202324
	(mail-dm6nam10on2069.outbound.protection.outlook.com [40.107.93.69])
	by imf24.hostedemail.com (Postfix) with ESMTP id D1AFF180033
	for ; Wed, 5 Apr 2023 18:01:51 +0000 (UTC)
Subject: [PATCH v3 4/6] mm: Add poison error check in fixup_user_fault() for mapped PFN
Date: Wed, 5 Apr 2023 11:01:32 -0700
Message-ID: <20230405180134.16932-5-ankita@nvidia.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230405180134.16932-1-ankita@nvidia.com>
References: <20230405180134.16932-1-ankita@nvidia.com>
MIME-Version: 1.0

From: Ankit Agrawal

fixup_user_fault() currently does not expect VM_FAULT_HWPOISON and
hence does not check for it when calling vm_fault_to_errno(). Since a
new code path can now trigger such a case, change fixup_user_fault() to
look for VM_FAULT_HWPOISON. Also make hva_to_pfn() check the result of
hva_to_pfn_remapped() for -EHWPOISON and communicate the poison fault
up to user_mem_abort().

Signed-off-by: Ankit Agrawal
---
 mm/gup.c            | 2 +-
 virt/kvm/kvm_main.c | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index eab18ba045db..507a96e91bad 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1290,7 +1290,7 @@ int fixup_user_fault(struct mm_struct *mm,
 	}
 
 	if (ret & VM_FAULT_ERROR) {
-		int err = vm_fault_to_errno(ret, 0);
+		int err = vm_fault_to_errno(ret, FOLL_HWPOISON);
 
 		if (err)
 			return err;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d255964ec331..09b6973e679d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2688,6 +2688,12 @@ kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool interruptible,
 		r = hva_to_pfn_remapped(vma, addr, write_fault, writable, &pfn);
 		if (r == -EAGAIN)
 			goto retry;
+
+		if (r == -EHWPOISON) {
+			pfn = KVM_PFN_ERR_HWPOISON;
+			goto exit;
+		}
+
 		if (r < 0)
 			pfn = KVM_PFN_ERR_FAULT;
 	} else {

From patchwork Wed Apr 5 18:01:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ankit Agrawal
X-Patchwork-Id: 13202326
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Subject: [PATCH v3 5/6] mm: Change ghes code to allow poison of non-struct PFN
Date: Wed, 5 Apr 2023 11:01:33 -0700
Message-ID: <20230405180134.16932-6-ankita@nvidia.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230405180134.16932-1-ankita@nvidia.com>
References: <20230405180134.16932-1-ankita@nvidia.com>
MIME-Version: 1.0

From: Ankit Agrawal

The GHES code calls memory_failure() only on PFNs that pass the
pfn_valid() check. This contract breaks for remapped PFNs, which fail
the check, so ghes_do_memory_failure() returns without triggering
memory_failure(). Update the code to allow the memory_failure() call on
PFNs that fail pfn_valid().

Signed-off-by: Ankit Agrawal
---
 drivers/acpi/apei/ghes.c | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 34ad071a64e9..2ab7fec8127d 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -459,20 +459,10 @@ static void ghes_kick_task_work(struct callback_head *head)
 
 static bool ghes_do_memory_failure(u64 physical_addr, int flags)
 {
-	unsigned long pfn;
-
 	if (!IS_ENABLED(CONFIG_ACPI_APEI_MEMORY_FAILURE))
 		return false;
 
-	pfn = PHYS_PFN(physical_addr);
-	if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) {
-		pr_warn_ratelimited(FW_WARN GHES_PFX
-		"Invalid address in generic error data: %#llx\n",
-		physical_addr);
-		return false;
-	}
-
-	memory_failure_queue(pfn, flags);
+	memory_failure_queue(PHYS_PFN(physical_addr), flags);
 	return true;
 }
 

From patchwork Wed Apr 5 18:01:34 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ankit Agrawal
X-Patchwork-Id: 13202330
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 00ADDC7619A
	for ; Wed, 5 Apr 2023 18:02:27 +0000 (UTC)
Received: by kanga.kvack.org (Postfix)
	id 95DBA6B007E; Wed, 5 Apr 2023 14:02:27 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40)
	id 90DD86B0080; Wed, 5 Apr 2023 14:02:27 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
	id 7D6426B0081; Wed, 5 Apr 2023 14:02:27 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10])
	by kanga.kvack.org (Postfix) with ESMTP id 6FDB36B007E
	for ; Wed, 5 Apr 2023 14:02:27 -0400 (EDT)
Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by
unirelay09.hostedemail.com (Postfix) with ESMTP id 31C778047A for ; Wed, 5 Apr 2023 18:02:27 +0000 (UTC) X-FDA: 80648107134.19.06EFEC0 Received: from NAM02-BN1-obe.outbound.protection.outlook.com (mail-bn1nam02on2076.outbound.protection.outlook.com [40.107.212.76]) by imf09.hostedemail.com (Postfix) with ESMTP id 37C72140036 for ; Wed, 5 Apr 2023 18:02:20 +0000 (UTC) Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=Nvidia.com header.s=selector2 header.b=ubW5dCoK; spf=pass (imf09.hostedemail.com: domain of ankita@nvidia.com designates 40.107.212.76 as permitted sender) smtp.mailfrom=ankita@nvidia.com; dmarc=pass (policy=reject) header.from=nvidia.com; arc=pass ("microsoft.com:s=arcselector9901:i=1") ARC-Seal: i=2; s=arc-20220608; d=hostedemail.com; t=1680717741; a=rsa-sha256; cv=pass; b=dJgu0dZL7NBeNwJ4X5oXIu5rMd2AiF8pfsfVQuNFkTmlWCQLe24nyMOcJUC2/G6TWURZOn vVG+lUvuJ5bvxDXbedFUEXlWjtFhond0xWYE4Bhv1SRU8cPw6TJrm2wjulnE2Gwa2JtJvw nCRwR2wqXoRVSAp2fTpZDoUGTHRL+V0= ARC-Authentication-Results: i=2; imf09.hostedemail.com; dkim=pass header.d=Nvidia.com header.s=selector2 header.b=ubW5dCoK; spf=pass (imf09.hostedemail.com: domain of ankita@nvidia.com designates 40.107.212.76 as permitted sender) smtp.mailfrom=ankita@nvidia.com; dmarc=pass (policy=reject) header.from=nvidia.com; arc=pass ("microsoft.com:s=arcselector9901:i=1") ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680717741; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=vdzN4GTXlcXcoY0qTILPkK5gTwxyEEnt0xYFTZZrQps=; b=DRSRFrEPkvvhNnU1vJvyl1PU6qI4CLN2Zj/UciJs8y3E0JdFoiqNPjHxEcI8wiRQ1RI2dC 9P/xkX6MMN0upcOLmJEEPnaMXlieUUIjXO4GCnm9m5j/6q26UD9T1FU+p4I6MmHOsaZc6c t4VSvG8HbePZbNOdTdXiYbtKVslL0gk= ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
b=k/LJWdS4RjzCTH/KoVH5nAkrW9pmx8WQDAWwRyOlx5jL5rLyFuklY07scJXyN/smnmXkDolzQYNQI7ZpDEmWMploHFUa1Iwcqd8FOD955pS1zhCMPwp2ugTps1qPCKtYeEXId3ObePRsA9Eu19adT7Ne8Ff5pDyPUEIK9PT613szNHM8bOY0IZWDcjbnxnA8OXlC3pCjcggMpy4hlKsxX/5PfdKj2u1BcWpxIXpENKuE9ct/WX7TP4qBr6uP9yheLT7euFQQZZHqtdFR8X4yJsiS9nL8yJaJKVzpPmNNIffaca8YOBmHU8LUtYv8XJYzA8BGxlgHpfy/zSs4MlXJeA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=vdzN4GTXlcXcoY0qTILPkK5gTwxyEEnt0xYFTZZrQps=; b=N7eZrdrYc3HXGPDuae21yBPy9z+1wKud2/2U0RYXzfP5ZC8Tjp/+Gfwi30Q4JhBPZqLiFtHnVdjAPmxDddSFxk2IKtyalHjyM7FQ5lcRYdKgYPKId8gQblBxF6EClhkRQhStR1ukWfDMlxAEyDIARgrIuI3OZ7y7L9ajd1p5vjBeQ2Dh6238hCvzGvPyi0Id8ZFoaMDrRI049mJLFDUoKfAtnMnm7V7EiMPxkoJdDyLp9cvAUJZnVlBpm4d40thGV9r0z//j8GLfcoXgQRLHDrbVSlWRqGuXnpeNdewBXVcuOVTZtwtws0aySw2LD/k5CZ+56dY0ACZ09AzEPZPyCg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.233) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=vdzN4GTXlcXcoY0qTILPkK5gTwxyEEnt0xYFTZZrQps=; b=ubW5dCoKjzw5cTfA5hKqxz5SjfyJGVBAh2LK12BVTyNFAupHBGFm8n4F8WiEydoE20mjLwetNf/fJIfb5dHmBjZpe9HRp8fBFOERCK7d8RlSr5aX2HrFWu/Ql1ctzmBo/2ANJdRFEZYoBqk5yIp9y36+2TaKItA88pZ7zaqZCMn5cWv2rVSe5Afl1hyqk9MqRS21cf0rUhde/eroMoPvOND1OCQsMCfKShg1FSdR+bPD8OABfslRhtUyp13wDNJ10Z+bRAz610Ss9RcO/3QboXplbfXUWiBQbLDGxxAKW9Bi2l1CvemOjH69ivcvd7GBY5+DWILkOBEHDvkGlv+rBQ== Received: from DM6PR17CA0005.namprd17.prod.outlook.com (2603:10b6:5:1b3::18) by SJ0PR12MB6901.namprd12.prod.outlook.com (2603:10b6:a03:47e::21) with 
Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6254.33; Wed, 5 Apr 2023 18:02:14 +0000 Received: from DM6NAM11FT049.eop-nam11.prod.protection.outlook.com (2603:10b6:5:1b3:cafe::41) by DM6PR17CA0005.outlook.office365.com (2603:10b6:5:1b3::18) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6178.44 via Frontend Transport; Wed, 5 Apr 2023 18:02:13 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.233) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.233 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.233; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.233) by DM6NAM11FT049.mail.protection.outlook.com (10.13.172.188) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6277.28 via Frontend Transport; Wed, 5 Apr 2023 18:02:13 +0000 Received: from drhqmail201.nvidia.com (10.126.190.180) by mail.nvidia.com (10.127.129.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.5; Wed, 5 Apr 2023 11:01:36 -0700 Received: from drhqmail203.nvidia.com (10.126.190.182) by drhqmail201.nvidia.com (10.126.190.180) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.37; Wed, 5 Apr 2023 11:01:36 -0700 Received: from localhost.localdomain (10.127.8.14) by mail.nvidia.com (10.126.190.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.37 via Frontend Transport; Wed, 5 Apr 2023 11:01:36 -0700 From: To: , , , , , CC: , , , , , , , , , , , , Subject: [PATCH v3 6/6] vfio/nvgpu: register device memory for poison handling Date: Wed, 5 Apr 2023 11:01:34 -0700 
Message-ID: <20230405180134.16932-7-ankita@nvidia.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230405180134.16932-1-ankita@nvidia.com> References: <20230405180134.16932-1-ankita@nvidia.com> MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DM6NAM11FT049:EE_|SJ0PR12MB6901:EE_ X-MS-Office365-Filtering-Correlation-Id: 23ea251e-83ae-480e-4f9e-08db35ffe2bb X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 1ThF1QE+sshigyAroD389Se81EL2sgk1Xm6rlSoq83A0oALJZTJp74i4qj1tB+lvSecnvy3symPZUJPA683YkRpwfqEzE0iNv8Rhwa6roJbppfXN3Kpq5foiebR5Q5i+XMnxpGHCETBRYrLTivrseacxvifN3v1+IyKuFIYkyxGpOp6vbnbTK0rWgAGTXcJqiG8XKoTZX069dHRWQH9JRHv5baJL0+oFzLrvtx72a/nvtWqaO2cvIPe4smtQXhk77aKFnE0CHMbhaxXscipevEztXpyso9+rxnM4a+pIQthJiTXJ95eMVD5rmuzWKc3SsgKwfes90Wyl8rXIr533IfADKe4xiy30IJqmeHkEPObHsNQtWqzS4wAJc1DzX6CaUgNwJovEz9k3BF19ZVMjEDZrmUWkrv12rUb22aMF0d12+UhXUfOaKwauj3NYO6oQwxCLCPyJK9lpOpp8Uhm9bjhNAwTyZ/j7BZz5UUZhoa3oG6kFYd08QWCGp32rJ3vs82OgAOfDB3BBkE8WyOtsLpgAJAqheSg1ufP153djRht4poeiLtWBgook0s0X3nSMoqS7tLD5ZnNZhOSV2FUNvb/d1nIcuxC5JcEpHsZq5LiULW0Zk5T0K0EWAwHVNpPXb7YXP/WEe2Y0A7G16hVslGMY08U8pDw5tbqVu9TWJidtbJ1ZA1Zlb1g+tBCWQEcZfez/HmDzUCZ18FxqiHAzCuND9PDuRI+qXFCUyhw/5vJhu7v74HTUe1H8yy030JiC X-Forefront-Antispam-Report: CIP:216.228.118.233;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge2.nvidia.com;CAT:NONE;SFS:(13230028)(4636009)(396003)(376002)(346002)(136003)(39860400002)(451199021)(46966006)(40470700004)(36840700001)(186003)(336012)(426003)(40460700003)(41300700001)(40480700001)(2616005)(4326008)(70586007)(8676002)(36860700001)(70206006)(2876002)(36756003)(2906002)(356005)(86362001)(7636003)(82740400003)(82310400005)(83380400001)(5660300002)(110136005)(8936002)(316002)(54906003)(478600001)(47076005)(1076003)(26005);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 05 Apr 2023 18:02:13.8510 (UTC) 
X-MS-Exchange-CrossTenant-Network-Message-Id: 23ea251e-83ae-480e-4f9e-08db35ffe2bb X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.233];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: DM6NAM11FT049.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: SJ0PR12MB6901 X-Rspam-User: X-Rspamd-Queue-Id: 37C72140036 X-Rspamd-Server: rspam01 X-Stat-Signature: 3ebr8rxe9iq4oc8hejw5yzaqz9d141bt X-HE-Tag: 1680717740-36626 X-HE-Meta: U2FsdGVkX1/O3TBDUfQ20UeC40GFEBVD2H93d991CJVqaYkpLB9jvMH/VZS9DhmuHPFysfKgGU9EqKHxtBYBJM5/EwWL/ZTWO1OP5escuBwy/vd0TiE0EA+n09nZpuyzceFDm3hM+zyn9gh3iXIHJv1h6OjlDKECGmB8Qh7S6ogkk3umjBwcJSpXMDWW3xQvjYEEdvZJhtPubFhGBQlP9Dn6LD+FIUcC3QH2HSIeagcQe9tdlgZLa/FCzYbPunGeE2HjasSGhe6xH0rs6NgNugcptbkyc79SzKHDw51EUBrPcXyknetwBsEjgN9C8g35FukF+XXuuanS/2CAKs0s9RU8IteZF5O59A0zPqrDLpYVeOztDdu6IlCyXrg3nqMd2AeVDUwgXNdAHwWO3mz1/hjVqyrko2q2B0gWPWokvZT18olLB7bJa2kqsspKvkNwYtELXO0IaE+dsOoEnRy0Ye8sXTc/WWjJ3VMyQJtD1VFXaD8xvyvXrUlG0+oBvUWZB9ycO1/5q3YeiMWcbVZn/q++8q844GIv8YZ7s9X72jL9NlQOlKxbKZ5bXDzeKm+loqJ0RYbVk6WA+A/OmS9StxpycelQuscMdlsrFaij+nCbXCLorUtGe+tby4z7BfzLQ8ykyR59BPwJ7TGbLmI2zi3/RWdIC9O0M4fMV/uL2OZjjH6sUx7tehVi878ICbT5vrKHZGdkW0MKkRYHmpNVN6ZCs0l2t2TV8WOwUQu77BUK2dtgh47Y50tSLeB7mlcIMtnIqHaDe1yfIFlclGYhr7YBfZ/8wu/zYbrmdgy/KOWchnb13RfZoIlsgmm8F9ZonHkEjrdUbuFjhQqaSN0ejTQbCWCZP4ekK9qyPfHRFJuJtdK+uUPpdIH7tbwfxEspqW8Se0Lcy4mOZxtZtuj5dAwsViRHul2ApTu5rOfrQqBCYKPGkcPv6uPDhk5ut2mZeVDn+1qFk9ovMZGjZ65 kDXW54oU vchcA13CkmYMZ4QUcflnfBs3FqvVuqhxBpxQ5rFj+Bs1SGHIO7ETmmBins5NYTZVSgaIWIpwBXCAkPwWhh0a8EED4YcjHAb8CEON6Fia9dhPsfiQX2kG+IVfTPZneTLnZY6Rmsr7dboYOlAJdHzW9U9WzPbS0xSQHPH6FF9Im0Jt3STNppweppezeZK8gawOOgeZ0j+oGghyFiElRuS2oXHuJ7Q== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Ankit Agrawal

The nvgpu-vfio-pci module maps the QEMU VMA to device memory through
remap_pfn_range(). This patch uses the new mechanism for handling poison
on memory that is not backed by struct page.

nvgpu-vfio-pci defines nvgpu_vfio_pci_pfn_memory_failure() to receive the
PFN with the ECC error from the MM. The function is registered with the
kernel MM, along with the address space and PFN range, through
register_pfn_address_space().

Track poisoned PFNs in the nvgpu-vfio-pci module as a bitmap with one bit
per PFN. The kernel MM communicates the PFN to the module through the
failure function, which sets the corresponding bit in the bitmap.

Register VMA fault ops for the module. The fault handler returns
VM_FAULT_HWPOISON if the bit for the faulting PFN is set in the bitmap.

Clear the bitmap on reset to reflect the clean state of the device memory
after reset.

Signed-off-by: Ankit Agrawal
---
 drivers/vfio/pci/nvgpu/main.c | 116 ++++++++++++++++++++++++++++++++--
 1 file changed, 110 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/pci/nvgpu/main.c b/drivers/vfio/pci/nvgpu/main.c
index 2dd8cc6e0145..8ccd3fe33a0f 100644
--- a/drivers/vfio/pci/nvgpu/main.c
+++ b/drivers/vfio/pci/nvgpu/main.c
@@ -5,6 +5,8 @@
 
 #include
 #include
+#include
+#include
 
 #define DUMMY_PFN \
 	(((nvdev->mem_prop.hpa + nvdev->mem_prop.mem_length) >> PAGE_SHIFT) - 1)
@@ -12,12 +14,78 @@
 struct dev_mem_properties {
 	uint64_t hpa;
 	uint64_t mem_length;
+	unsigned long *pfn_bitmap;
 	int bar1_start_offset;
 };
 
 struct nvgpu_vfio_pci_core_device {
 	struct vfio_pci_core_device core_device;
 	struct dev_mem_properties mem_prop;
+	struct pfn_address_space pfn_address_space;
+};
+
+void nvgpu_vfio_pci_pfn_memory_failure(struct pfn_address_space *pfn_space,
+				       unsigned long pfn)
+{
+	struct nvgpu_vfio_pci_core_device *nvdev = container_of(
+		pfn_space, struct nvgpu_vfio_pci_core_device, pfn_address_space);
+
+	/*
+	 * MM has called to notify a poisoned page. Track that in the bitmap.
+	 */
+	__set_bit(pfn - (pfn_space->node.start), nvdev->mem_prop.pfn_bitmap);
+}
+
+struct pfn_address_space_ops nvgpu_vfio_pci_pas_ops = {
+	.failure = nvgpu_vfio_pci_pfn_memory_failure,
+};
+
+static int
+nvgpu_vfio_pci_register_pfn_range(struct nvgpu_vfio_pci_core_device *nvdev,
+				  struct vm_area_struct *vma)
+{
+	unsigned long nr_pages;
+	int ret = 0;
+
+	nr_pages = nvdev->mem_prop.mem_length >> PAGE_SHIFT;
+
+	nvdev->pfn_address_space.node.start = vma->vm_pgoff;
+	nvdev->pfn_address_space.node.last = vma->vm_pgoff + nr_pages - 1;
+	nvdev->pfn_address_space.ops = &nvgpu_vfio_pci_pas_ops;
+	nvdev->pfn_address_space.mapping = vma->vm_file->f_mapping;
+
+	ret = register_pfn_address_space(&(nvdev->pfn_address_space));
+
+	return ret;
+}
+
+static vm_fault_t nvgpu_vfio_pci_fault(struct vm_fault *vmf)
+{
+	unsigned long mem_offset = vmf->pgoff - vmf->vma->vm_pgoff;
+	struct nvgpu_vfio_pci_core_device *nvdev = container_of(
+		vmf->vma->vm_file->private_data,
+		struct nvgpu_vfio_pci_core_device, core_device.vdev);
+	int ret;
+
+	/*
+	 * Check if the page is poisoned.
+	 */
+	if (mem_offset < (nvdev->mem_prop.mem_length >> PAGE_SHIFT) &&
+	    test_bit(mem_offset, nvdev->mem_prop.pfn_bitmap))
+		return VM_FAULT_HWPOISON;
+
+	ret = remap_pfn_range(vmf->vma,
+			      vmf->vma->vm_start + (mem_offset << PAGE_SHIFT),
+			      DUMMY_PFN, PAGE_SIZE,
+			      vmf->vma->vm_page_prot);
+	if (ret)
+		return VM_FAULT_ERROR;
+
+	return VM_FAULT_NOPAGE;
+}
+
+static const struct vm_operations_struct nvgpu_vfio_pci_mmap_ops = {
+	.fault = nvgpu_vfio_pci_fault,
 };
 
 static int vfio_get_bar1_start_offset(struct vfio_pci_core_device *vdev)
@@ -26,8 +94,9 @@ static int vfio_get_bar1_start_offset(struct vfio_pci_core_device *vdev)
 	pci_read_config_byte(vdev->pdev, 0x10, &val);
 
 	/*
-	 * The BAR1 start offset in the PCI config space depends on the BAR0size.
-	 * Check if the BAR0 is 64b and return the approproiate BAR1 offset.
+	 * The BAR1 start offset in the PCI config space depends on the BAR0
+	 * size. Check if the BAR0 is 64b and return the appropriate BAR1
+	 * offset.
 	 */
 	if (val & PCI_BASE_ADDRESS_MEM_TYPE_64)
 		return VFIO_PCI_BAR2_REGION_INDEX;
@@ -54,6 +123,16 @@ static int nvgpu_vfio_pci_open_device(struct vfio_device *core_vdev)
 	return ret;
 }
 
+void nvgpu_vfio_pci_close_device(struct vfio_device *core_vdev)
+{
+	struct nvgpu_vfio_pci_core_device *nvdev = container_of(
+		core_vdev, struct nvgpu_vfio_pci_core_device, core_device.vdev);
+
+	unregister_pfn_address_space(&(nvdev->pfn_address_space));
+
+	vfio_pci_core_close_device(core_vdev);
+}
+
 int nvgpu_vfio_pci_mmap(struct vfio_device *core_vdev,
 			struct vm_area_struct *vma)
 {
@@ -93,8 +172,11 @@ int nvgpu_vfio_pci_mmap(struct vfio_device *core_vdev,
 		return ret;
 
 	vma->vm_pgoff = start_pfn + pgoff;
+	vma->vm_ops = &nvgpu_vfio_pci_mmap_ops;
 
-	return 0;
+	ret = nvgpu_vfio_pci_register_pfn_range(nvdev, vma);
+
+	return ret;
 }
 
 long nvgpu_vfio_pci_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
@@ -140,7 +222,14 @@ long nvgpu_vfio_pci_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
 		}
 
 		return vfio_pci_core_ioctl(core_vdev, cmd, arg);
-
+	case VFIO_DEVICE_RESET:
+		/*
+		 * Resetting the GPU clears up the poisoned page. Reset the
+		 * poisoned page bitmap.
+		 */
+		memset(nvdev->mem_prop.pfn_bitmap, 0,
+		       nvdev->mem_prop.mem_length >> (PAGE_SHIFT + 3));
+		return vfio_pci_core_ioctl(core_vdev, cmd, arg);
 	default:
 		return vfio_pci_core_ioctl(core_vdev, cmd, arg);
 	}
@@ -151,7 +240,7 @@ static const struct vfio_device_ops nvgpu_vfio_pci_ops = {
 	.init = vfio_pci_core_init_dev,
 	.release = vfio_pci_core_release_dev,
 	.open_device = nvgpu_vfio_pci_open_device,
-	.close_device = vfio_pci_core_close_device,
+	.close_device = nvgpu_vfio_pci_close_device,
 	.ioctl = nvgpu_vfio_pci_ioctl,
 	.read = vfio_pci_core_read,
 	.write = vfio_pci_core_write,
@@ -188,7 +277,20 @@ nvgpu_vfio_pci_fetch_memory_property(struct pci_dev *pdev,
 
 	ret = device_property_read_u64(&(pdev->dev), "nvidia,gpu-mem-size",
 				       &(nvdev->mem_prop.mem_length));
-	return ret;
+	if (ret)
+		return ret;
+
+	/*
+	 * A bitmap is maintained to track the pages that are poisoned. Each
+	 * page is represented by a bit. Allocation size in bytes is
+	 * determined by shifting the device memory size by PAGE_SHIFT to
+	 * determine the number of pages; and further shifted by 3 as each
+	 * byte could track 8 pages.
+	 */
+	nvdev->mem_prop.pfn_bitmap
+		= vzalloc(nvdev->mem_prop.mem_length >> (PAGE_SHIFT + 3));
+
+	return 0;
 }
 
 static int nvgpu_vfio_pci_probe(struct pci_dev *pdev,
@@ -224,6 +326,8 @@ static void nvgpu_vfio_pci_remove(struct pci_dev *pdev)
 	struct nvgpu_vfio_pci_core_device *nvdev = nvgpu_drvdata(pdev);
 	struct vfio_pci_core_device *vdev = &nvdev->core_device;
 
+	vfree(nvdev->mem_prop.pfn_bitmap);
+
 	vfio_pci_core_unregister_device(vdev);
 	vfio_put_device(&vdev->vdev);
 }
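
The bitmap bookkeeping described in the commit message (mark a PFN on failure, test it on fault, clear everything on reset) can be illustrated outside the kernel. The following is only a userspace sketch under stated assumptions, not the driver's code: the `poison_map` type, its helper functions, and `SKETCH_PAGE_SHIFT` (4 KiB pages) are hypothetical names introduced for illustration, and `calloc()`/`memset()` stand in for `vzalloc()` and the kernel bit operations. Unlike the patch, the sketch rounds the allocation up to whole words.

```c
/* Userspace sketch of a one-bit-per-page poison bitmap (not kernel code). */
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the driver's per-device tracking state. */
struct poison_map {
	unsigned long *bits;	/* one bit per page of device memory */
	uint64_t first_pfn;	/* first PFN of the tracked range */
	uint64_t nr_pages;	/* number of pages in the range */
};

/* Words needed to hold nr_pages bits, rounded up. */
static size_t poison_map_words(uint64_t nr_pages)
{
	return (size_t)((nr_pages + BITS_PER_WORD - 1) / BITS_PER_WORD);
}

/* Allocate a zeroed bitmap covering mem_length bytes of device memory. */
static int poison_map_init(struct poison_map *map, uint64_t first_pfn,
			   uint64_t mem_length)
{
	map->first_pfn = first_pfn;
	map->nr_pages = mem_length >> SKETCH_PAGE_SHIFT;
	if (map->nr_pages == 0)
		return -1;
	map->bits = calloc(poison_map_words(map->nr_pages),
			   sizeof(unsigned long));
	return map->bits ? 0 : -1;
}

/* Failure-callback analogue: record that pfn is poisoned. */
static void poison_map_mark(struct poison_map *map, uint64_t pfn)
{
	uint64_t idx = pfn - map->first_pfn;

	assert(pfn >= map->first_pfn && idx < map->nr_pages);
	map->bits[idx / BITS_PER_WORD] |= 1UL << (idx % BITS_PER_WORD);
}

/* Fault-path analogue: true if the page at page_offset is poisoned. */
static bool poison_map_test(const struct poison_map *map, uint64_t page_offset)
{
	if (page_offset >= map->nr_pages)
		return false;
	return (map->bits[page_offset / BITS_PER_WORD] >>
		(page_offset % BITS_PER_WORD)) & 1UL;
}

/* Reset analogue: forget all poisoned pages. */
static void poison_map_clear(struct poison_map *map)
{
	memset(map->bits, 0,
	       poison_map_words(map->nr_pages) * sizeof(unsigned long));
}

static void poison_map_free(struct poison_map *map)
{
	free(map->bits);
	map->bits = NULL;
}
```

As a worked example of the sizing arithmetic: for a 20-page region with 4 KiB pages, `mem_length >> (PAGE_SHIFT + 3)` evaluates to 2 bytes, which holds bits for only 16 pages, while the rounded-up word count in the sketch covers all 20. Whether this matters for the driver depends on the alignment of the device memory size, which the patch does not state.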