From patchwork Thu Apr 4 15:13:44 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617985
From: Yazen Ghannam
Subject: [PATCH v2 01/16] x86/mce: Define mce_setup() helpers for common and per-CPU fields
Date: Thu, 4 Apr 2024 10:13:44 -0500
Message-ID: <20240404151359.47970-2-yazen.ghannam@amd.com>
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
X-Mailing-List: linux-edac@vger.kernel.org

Generally, MCA information for an error is gathered on the CPU that
reported the error. In this case, CPU-specific information from the
running CPU will be correct.

However, this will be incorrect if the MCA information is gathered while
running on a CPU that didn't report the error. One example is creating
an MCA record using mce_setup() for errors reported from ACPI.

Split mce_setup() so that there is a helper function to gather common,
i.e. not CPU-specific, information and another helper for CPU-specific
information.

Leave mce_setup() defined as-is for the common case when running on the
reporting CPU.

Get MCG_CAP in the global helper even though the register is per-CPU.
This value is not already cached per-CPU like other values. And it does
not assist with any per-CPU decoding or handling.

Signed-off-by: Yazen Ghannam
---

Notes:
    Link:
    https://lkml.kernel.org/r/20231118193248.1296798-3-yazen.ghannam@amd.com

    v1->v2:
    * Change helper names and pass-in CPU number (Boris)

 arch/x86/kernel/cpu/mce/core.c     | 34 ++++++++++++++++++++----------
 arch/x86/kernel/cpu/mce/internal.h |  2 ++
 2 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index b5cc557cfc37..7a857b33f515 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -117,20 +117,32 @@ static struct irq_work mce_irq_work;
  */
 BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain);
 
-/* Do initial initialization of a struct mce */
-void mce_setup(struct mce *m)
+void mce_setup_common(struct mce *m)
 {
 	memset(m, 0, sizeof(struct mce));
-	m->cpu = m->extcpu = smp_processor_id();
+
+	m->cpuid = cpuid_eax(1);
+	m->cpuvendor = boot_cpu_data.x86_vendor;
+	m->mcgcap = __rdmsr(MSR_IA32_MCG_CAP);
 	/* need the internal __ version to avoid deadlocks */
-	m->time = __ktime_get_real_seconds();
-	m->cpuvendor = boot_cpu_data.x86_vendor;
-	m->cpuid = cpuid_eax(1);
-	m->socketid = cpu_data(m->extcpu).topo.pkg_id;
-	m->apicid = cpu_data(m->extcpu).topo.initial_apicid;
-	m->mcgcap = __rdmsr(MSR_IA32_MCG_CAP);
-	m->ppin = cpu_data(m->extcpu).ppin;
-	m->microcode = boot_cpu_data.microcode;
+	m->time = __ktime_get_real_seconds();
+}
+
+void mce_setup_for_cpu(unsigned int cpu, struct mce *m)
+{
+	m->cpu = cpu;
+	m->extcpu = cpu;
+	m->apicid = cpu_data(m->extcpu).topo.initial_apicid;
+	m->microcode = cpu_data(m->extcpu).microcode;
+	m->ppin = cpu_data(m->extcpu).ppin;
+	m->socketid = cpu_data(m->extcpu).topo.pkg_id;
+}
+
+/* Do initial initialization of a struct mce */
+void mce_setup(struct mce *m)
+{
+	mce_setup_common(m);
+	mce_setup_for_cpu(smp_processor_id(), m);
 }
 
 DEFINE_PER_CPU(struct mce, injectm);
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index 01f8f03969e6..e86e53695828 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -261,6 +261,8 @@ enum mca_msr {
 
 /* Decide whether to add MCE record to MCE event pool or filter it out. */
 extern bool filter_mce(struct mce *m);
+void mce_setup_common(struct mce *m);
+void mce_setup_for_cpu(unsigned int cpu, struct mce *m);
 
 #ifdef CONFIG_X86_MCE_AMD
 extern bool amd_filter_mce(struct mce *m);

From patchwork Thu Apr 4 15:13:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617983
From: Yazen Ghannam
Subject: [PATCH v2 02/16] x86/mce: Use mce_setup() helpers for apei_smca_report_x86_error()
Date: Thu, 4 Apr 2024 10:13:45 -0500
Message-ID: <20240404151359.47970-3-yazen.ghannam@amd.com>
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
X-Mailing-List: linux-edac@vger.kernel.org

Current AMD systems may report MCA errors using the ACPI Boot Error
Record Table (BERT). The BERT entries for MCA errors will be an x86
Common Platform Error Record (CPER) with an MSR register context that
matches the MCAX/SMCA register space.

However, the BERT will not necessarily be processed on the CPU that
reported the MCA errors.
Therefore, the correct CPU number needs to be determined and the
information saved in struct mce.

The CPU number is determined by searching all possible CPUs for a Local
APIC ID matching the value in the x86 CPER.

Use the newly defined mce_setup_*() helpers to get the correct data.

Signed-off-by: Yazen Ghannam
---

Notes:
    Link:
    https://lkml.kernel.org/r/20231118193248.1296798-4-yazen.ghannam@amd.com

    v1->v2:
    * Trim commit message (Boris)
    * Rebase on earlier changes (Yazen)

 arch/x86/kernel/cpu/mce/apei.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
index 7f7309ff67d0..e4e32e337110 100644
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -97,20 +97,17 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 	if (ctx_info->reg_arr_size < 48)
 		return -EINVAL;
 
-	mce_setup(&m);
-
-	m.extcpu = -1;
-	m.socketid = -1;
-
 	for_each_possible_cpu(cpu) {
-		if (cpu_data(cpu).topo.initial_apicid == lapic_id) {
-			m.extcpu = cpu;
-			m.socketid = cpu_data(m.extcpu).topo.pkg_id;
+		if (cpu_data(cpu).topo.initial_apicid == lapic_id)
 			break;
-		}
 	}
 
-	m.apicid = lapic_id;
+	if (!cpu_possible(cpu))
+		return -EINVAL;
+
+	mce_setup_common(&m);
+	mce_setup_for_cpu(cpu, &m);
+
 	m.bank = (ctx_info->msr_addr >> 4) & 0xFF;
 	m.status = *i_mce;
 	m.addr = *(i_mce + 1);

From patchwork Thu Apr 4 15:13:46 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617980
From: Yazen Ghannam
Subject: [PATCH v2 03/16] x86/mce/amd: Use fixed bank number for quirks
Date: Thu, 4 Apr 2024 10:13:46 -0500
Message-ID: <20240404151359.47970-4-yazen.ghannam@amd.com>
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
X-Mailing-List: linux-edac@vger.kernel.org

Quirks break micro-architectural definitions. Therefore, quirk
conditions don't need to follow micro-architectural requirements.

Currently, there is a quirk to filter some errors from the Instruction
Fetch (IF) unit on specific models. The IF unit is represented by MCA
bank 1 for these models. Related to this quirk is code to disable MCA
Thresholding for the IF bank.

Check the bank number for the quirks instead of determining the bank
type.

Signed-off-by: Yazen Ghannam
---

Notes:
    Link:
    https://lkml.kernel.org/r/20231118193248.1296798-8-yazen.ghannam@amd.com

    v1->v2:
    * No change.

 arch/x86/kernel/cpu/mce/amd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 9a0133ef7e20..bc78e751dfcc 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -605,13 +605,12 @@ prepare_threshold_block(unsigned int bank, unsigned int block, u32 addr,
 
 bool amd_filter_mce(struct mce *m)
 {
-	enum smca_bank_types bank_type = smca_get_bank_type(m->extcpu, m->bank);
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
 	/* See Family 17h Models 10h-2Fh Erratum #1114. */
 	if (c->x86 == 0x17 &&
 	    c->x86_model >= 0x10 && c->x86_model <= 0x2F &&
-	    bank_type == SMCA_IF && XEC(m->status, 0x3f) == 10)
+	    m->bank == 1 && XEC(m->status, 0x3f) == 10)
 		return true;
 
 	/* NB GART TLB error reporting is disabled by default. */
@@ -643,7 +642,7 @@ static void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank)
 	} else if (c->x86 == 0x17 &&
 		   (c->x86_model >= 0x10 && c->x86_model <= 0x2F)) {
 
-		if (smca_get_bank_type(smp_processor_id(), bank) != SMCA_IF)
+		if (bank != 1)
 			return;
 
 		msrs[0] = MSR_AMD64_SMCA_MCx_MISC(bank);

From patchwork Thu Apr 4 15:13:47 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617982
From: Yazen Ghannam
Subject: [PATCH v2 04/16] x86/mce/amd: Look up bank type by IPID
Date: Thu, 4 Apr 2024 10:13:47 -0500
Message-ID: <20240404151359.47970-5-yazen.ghannam@amd.com>
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
X-Mailing-List: linux-edac@vger.kernel.org

Scalable
MCA systems use values within the MCA_IPID register to describe a
bank's type. Other information is not needed.

Currently, the bank types are cached during boot and this information is
used during boot and run time. The cached values are per-CPU and
per-bank. The boot path needs the cached values, but this should be
removed. The run time path does not need the cached values.

Determine a Scalable MCA bank's type using only the MCA_IPID values.

Keep old code until init path is cleaned up.

Signed-off-by: Yazen Ghannam
---

Notes:
    Link:
    https://lkml.kernel.org/r/20231118193248.1296798-9-yazen.ghannam@amd.com

    v1->v2:
    * Include bitops started in dropped patches. (Yazen)
    * Update all users of smca_get_bank_type(). (Yazen)

 arch/x86/include/asm/mce.h              |  4 +-
 arch/x86/kernel/cpu/mce/amd.c           | 99 ++++++++++++++++++++++---
 drivers/edac/amd64_edac.c               |  2 +-
 drivers/edac/mce_amd.c                  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c |  2 +-
 5 files changed, 94 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index de3118305838..adad99bac567 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -59,8 +59,6 @@
  * - TCC bit is present in MCx_STATUS.
  */
 #define MCI_CONFIG_MCAX		0x1
-#define MCI_IPID_MCATYPE	0xFFFF0000
-#define MCI_IPID_HWID		0xFFF
 
 /*
  * Note that the full MCACOD field of IA32_MCi_STATUS MSR is
@@ -342,7 +340,7 @@ extern int mce_threshold_create_device(unsigned int cpu);
 extern int mce_threshold_remove_device(unsigned int cpu);
 
 void mce_amd_feature_init(struct cpuinfo_x86 *c);
-enum smca_bank_types smca_get_bank_type(unsigned int cpu, unsigned int bank);
+enum smca_bank_types smca_get_bank_type(u64 ipid);
 #else
 
 static inline int mce_threshold_create_device(unsigned int cpu) { return 0; };
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index bc78e751dfcc..c76bc158b6b6 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -7,6 +7,7 @@
  *
  * All MC4_MISCi registers are shared between cores on a node.
  */
+#include
 #include
 #include
 #include
@@ -51,6 +52,10 @@
 #define DEF_INT_TYPE_APIC	0x2
 
 /* Scalable MCA: */
+#define MCI_IPID_MCATYPE	GENMASK_ULL(47, 44)
+#define MCI_IPID_HWID		GENMASK_ULL(43, 32)
+#define MCI_IPID_MCATYPE_OLD	0xFFFF0000
+#define MCI_IPID_HWID_OLD	0xFFF
 
 /* Threshold LVT offset is at MSR0xC0000410[15:12] */
 #define SMCA_THR_LVT_OFF	0xF000
@@ -131,7 +136,7 @@ static const char *smca_get_name(enum smca_bank_types t)
 	return smca_names[t];
 }
 
-enum smca_bank_types smca_get_bank_type(unsigned int cpu, unsigned int bank)
+static enum smca_bank_types smca_get_bank_type_old(unsigned int cpu, unsigned int bank)
 {
 	struct smca_bank *b;
 
@@ -144,9 +149,8 @@ enum smca_bank_types smca_get_bank_type(unsigned int cpu, unsigned int bank)
 
 	return b->hwid->bank_type;
 }
-EXPORT_SYMBOL_GPL(smca_get_bank_type);
 
-static const struct smca_hwid smca_hwid_mcatypes[] = {
+static const struct smca_hwid smca_hwid_mcatypes_old[] = {
 	/* { bank_type, hwid_mcatype } */
 
 	/* Reserved type */
@@ -210,6 +214,83 @@ static const struct smca_hwid smca_hwid_mcatypes[] = {
 	{ SMCA_GMI_PHY,	 HWID_MCATYPE(0x269, 0x0)	},
 };
 
+/* Keep sorted first by HWID then by McaType. */
+static const u32 smca_hwid_mcatypes[] = {
+	/* Reserved type */
+	[SMCA_RESERVED]	= HWID_MCATYPE(0x00, 0x0),
+
+	/* System Management Unit MCA type */
+	[SMCA_SMU]	= HWID_MCATYPE(0x01, 0x0),
+	[SMCA_SMU_V2]	= HWID_MCATYPE(0x01, 0x1),
+
+	/* Microprocessor 5 Unit MCA type */
+	[SMCA_MP5]	= HWID_MCATYPE(0x01, 0x2),
+
+	/* MPDMA MCA type */
+	[SMCA_MPDMA]	= HWID_MCATYPE(0x01, 0x3),
+
+	/* Parameter Block MCA type */
+	[SMCA_PB]	= HWID_MCATYPE(0x05, 0x0),
+
+	/* Northbridge IO Unit MCA type */
+	[SMCA_NBIO]	= HWID_MCATYPE(0x18, 0x0),
+
+	/* Data Fabric MCA types */
+	[SMCA_CS]	= HWID_MCATYPE(0x2E, 0x0),
+	[SMCA_PIE]	= HWID_MCATYPE(0x2E, 0x1),
+	[SMCA_CS_V2]	= HWID_MCATYPE(0x2E, 0x2),
+
+	/* PCI Express Unit MCA type */
+	[SMCA_PCIE]	= HWID_MCATYPE(0x46, 0x0),
+	[SMCA_PCIE_V2]	= HWID_MCATYPE(0x46, 0x1),
+
+	[SMCA_XGMI_PCS]	= HWID_MCATYPE(0x50, 0x0),
+	[SMCA_NBIF]	= HWID_MCATYPE(0x6C, 0x0),
+	[SMCA_SHUB]	= HWID_MCATYPE(0x80, 0x0),
+
+	/* Unified Memory Controller MCA type */
+	[SMCA_UMC]	= HWID_MCATYPE(0x96, 0x0),
+	[SMCA_UMC_V2]	= HWID_MCATYPE(0x96, 0x1),
+
+	[SMCA_SATA]	= HWID_MCATYPE(0xA8, 0x0),
+	[SMCA_USB]	= HWID_MCATYPE(0xAA, 0x0),
+
+	/* ZN Core (HWID=0xB0) MCA types */
+	[SMCA_LS]	= HWID_MCATYPE(0xB0, 0x0),
+	[SMCA_IF]	= HWID_MCATYPE(0xB0, 0x1),
+	[SMCA_L2_CACHE]	= HWID_MCATYPE(0xB0, 0x2),
+	[SMCA_DE]	= HWID_MCATYPE(0xB0, 0x3),
+	/* HWID 0xB0 MCATYPE 0x4 is Reserved */
+	[SMCA_EX]	= HWID_MCATYPE(0xB0, 0x5),
+	[SMCA_FP]	= HWID_MCATYPE(0xB0, 0x6),
+	[SMCA_L3_CACHE]	= HWID_MCATYPE(0xB0, 0x7),
+	[SMCA_LS_V2]	= HWID_MCATYPE(0xB0, 0x10),
+
+	/* Platform Security Processor MCA type */
+	[SMCA_PSP]	= HWID_MCATYPE(0xFF, 0x0),
+	[SMCA_PSP_V2]	= HWID_MCATYPE(0xFF, 0x1),
+
+	[SMCA_GMI_PCS]	= HWID_MCATYPE(0x241, 0x0),
+	[SMCA_XGMI_PHY]	= HWID_MCATYPE(0x259, 0x0),
+	[SMCA_WAFL_PHY]	= HWID_MCATYPE(0x267, 0x0),
+	[SMCA_GMI_PHY]	= HWID_MCATYPE(0x269, 0x0),
+};
+
+enum smca_bank_types smca_get_bank_type(u64 ipid)
+{
+	enum smca_bank_types type;
+	u32 hwid_mcatype = HWID_MCATYPE(FIELD_GET(MCI_IPID_HWID, ipid),
+					FIELD_GET(MCI_IPID_MCATYPE, ipid));
+
+	for (type = 0; type < ARRAY_SIZE(smca_hwid_mcatypes); type++) {
+		if (hwid_mcatype == smca_hwid_mcatypes[type])
+			return type;
+	}
+
+	return N_SMCA_BANK_TYPES;
+}
+EXPORT_SYMBOL_GPL(smca_get_bank_type);
+
 /*
  * In SMCA enabled processors, we can have multiple banks for a given IP type.
  * So to define a unique name for each bank, we use a temp c-string to append
@@ -310,11 +391,11 @@ static void smca_configure(unsigned int bank, unsigned int cpu)
 		return;
 	}
 
-	hwid_mcatype = HWID_MCATYPE(high & MCI_IPID_HWID,
-				    (high & MCI_IPID_MCATYPE) >> 16);
+	hwid_mcatype = HWID_MCATYPE(high & MCI_IPID_HWID_OLD,
+				    (high & MCI_IPID_MCATYPE_OLD) >> 16);
 
-	for (i = 0; i < ARRAY_SIZE(smca_hwid_mcatypes); i++) {
-		s_hwid = &smca_hwid_mcatypes[i];
+	for (i = 0; i < ARRAY_SIZE(smca_hwid_mcatypes_old); i++) {
+		s_hwid = &smca_hwid_mcatypes_old[i];
 
 		if (hwid_mcatype == s_hwid->hwid_mcatype) {
 			this_cpu_ptr(smca_banks)[bank].hwid = s_hwid;
@@ -724,7 +805,7 @@ static bool smca_mce_is_memory_error(struct mce *m)
 	if (XEC(m->status, 0x3f))
 		return false;
 
-	bank_type = smca_get_bank_type(m->extcpu, m->bank);
+	bank_type = smca_get_bank_type(m->ipid);
 
 	return bank_type == SMCA_UMC || bank_type == SMCA_UMC_V2;
 }
@@ -1097,7 +1178,7 @@ static const char *get_name(unsigned int cpu, unsigned int bank, struct threshol
 		return th_names[bank];
 	}
 
-	bank_type = smca_get_bank_type(cpu, bank);
+	bank_type = smca_get_bank_type_old(cpu, bank);
 	if (bank_type >= N_SMCA_BANK_TYPES)
 		return NULL;
 
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 1f3520d76861..4b3764ea7c59 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -1041,7 +1041,7 @@ static int fixup_node_id(int node_id, struct mce *m)
 	/* MCA_IPID[InstanceIdHi] give the AMD Node ID for the bank. */
 	u8 nid = (m->ipid >> 44) & 0xF;
 
-	if (smca_get_bank_type(m->extcpu, m->bank) != SMCA_UMC_V2)
+	if (smca_get_bank_type(m->ipid) != SMCA_UMC_V2)
 		return node_id;
 
 	/* Nodes below the GPU base node are CPU nodes and don't need a fixup. */
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 8130c3dc64da..e02af5da1ec2 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -731,7 +731,7 @@ static const char *smca_get_long_name(enum smca_bank_types t)
 /* Decode errors according to Scalable MCA specification */
 static void decode_smca_error(struct mce *m)
 {
-	enum smca_bank_types bank_type = smca_get_bank_type(m->extcpu, m->bank);
+	enum smca_bank_types bank_type = smca_get_bank_type(m->ipid);
 	u8 xec = XEC(m->status, xec_mask);
 
 	if (bank_type >= N_SMCA_BANK_TYPES)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 8ebab6f22e5a..c543600b759b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -3546,7 +3546,7 @@ static int amdgpu_bad_page_notifier(struct notifier_block *nb,
 	 * and error occurred in DramECC (Extended error code = 0) then only
 	 * process the error, else bail out.
 	 */
-	if (!m || !((smca_get_bank_type(m->extcpu, m->bank) == SMCA_UMC_V2) &&
+	if (!m || !((smca_get_bank_type(m->ipid) == SMCA_UMC_V2) &&
 		    (XEC(m->status, 0x3f) == 0x0)))
 		return NOTIFY_DONE;
 

From patchwork Thu Apr 4 15:13:48 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617995
From: Yazen Ghannam
Subject: [PATCH v2 05/16] x86/mce/amd: Clean up SMCA configuration
Date: Thu, 4 Apr 2024 10:13:48 -0500
Message-ID: <20240404151359.47970-6-yazen.ghannam@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org

The current SMCA configuration function does more than just configure
SMCA features. It also detects and caches the SMCA bank types.
However, the bank type caching flow will be removed during the init path
clean up.

Define a new function that only configures SMCA features. This will
operate on the MCA_CONFIG MSR, so include updated register field
definitions using bitops.

Leave old code until init path is cleaned up.

Signed-off-by: Yazen Ghannam
---

Notes:
    Link:
    https://lkml.kernel.org/r/20231118193248.1296798-10-yazen.ghannam@amd.com

    v1->v2:
    * No change.

 arch/x86/kernel/cpu/mce/amd.c | 84 ++++++++++++++++++++---------------
 1 file changed, 49 insertions(+), 35 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index c76bc158b6b6..3093fed06194 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -50,6 +50,7 @@
 #define MASK_DEF_INT_TYPE	0x00000006
 #define DEF_LVT_OFF		0x2
 #define DEF_INT_TYPE_APIC	0x2
+#define INTR_TYPE_APIC		0x1
 
 /* Scalable MCA: */
 #define MCI_IPID_MCATYPE	GENMASK_ULL(47, 44)
@@ -57,6 +58,12 @@
 #define MCI_IPID_MCATYPE_OLD	0xFFFF0000
 #define MCI_IPID_HWID_OLD	0xFFF
 
+/* MCA_CONFIG register, one per MCA bank */
+#define CFG_DFR_INT_TYPE	GENMASK_ULL(38, 37)
+#define CFG_MCAX_EN		BIT_ULL(32)
+#define CFG_LSB_IN_STATUS	BIT_ULL(8)
+#define CFG_DFR_INT_SUPP	BIT_ULL(5)
+
 /* Threshold LVT offset is at MSR0xC0000410[15:12] */
 #define SMCA_THR_LVT_OFF	0xF000
 
@@ -344,45 +351,51 @@ static void smca_set_misc_banks_map(unsigned int bank, unsigned int cpu)
 
 }
 
-static void smca_configure(unsigned int bank, unsigned int cpu)
+/* Set appropriate bits in MCA_CONFIG. */
+static void configure_smca(unsigned int bank)
 {
-	u8 *bank_counts = this_cpu_ptr(smca_bank_counts);
-	const struct smca_hwid *s_hwid;
-	unsigned int i, hwid_mcatype;
-	u32 high, low;
-	u32 smca_config = MSR_AMD64_SMCA_MCx_CONFIG(bank);
+	u64 mca_config;
 
-	/* Set appropriate bits in MCA_CONFIG */
-	if (!rdmsr_safe(smca_config, &low, &high)) {
-		/*
-		 * OS is required to set the MCAX bit to acknowledge that it is
-		 * now using the new MSR ranges and new registers under each
-		 * bank. It also means that the OS will configure deferred
-		 * errors in the new MCx_CONFIG register. If the bit is not set,
-		 * uncorrectable errors will cause a system panic.
-		 *
-		 * MCA_CONFIG[MCAX] is bit 32 (0 in the high portion of the MSR.)
-		 */
-		high |= BIT(0);
+	if (!mce_flags.smca)
+		return;
 
-		/*
-		 * SMCA sets the Deferred Error Interrupt type per bank.
-		 *
-		 * MCA_CONFIG[DeferredIntTypeSupported] is bit 5, and tells us
-		 * if the DeferredIntType bit field is available.
-		 *
-		 * MCA_CONFIG[DeferredIntType] is bits [38:37] ([6:5] in the
-		 * high portion of the MSR). OS should set this to 0x1 to enable
-		 * APIC based interrupt. First, check that no interrupt has been
-		 * set.
-		 */
-		if ((low & BIT(5)) && !((high >> 5) & 0x3))
-			high |= BIT(5);
+	if (rdmsrl_safe(MSR_AMD64_SMCA_MCx_CONFIG(bank), &mca_config))
+		return;
+
+	/*
+	 * OS is required to set the MCAX enable bit to acknowledge that it is
+	 * now using the new MSR ranges and new registers under each
+	 * bank. It also means that the OS will configure deferred
+	 * errors in the new MCA_CONFIG register. If the bit is not set,
+	 * uncorrectable errors will cause a system panic.
+	 */
+	mca_config |= FIELD_PREP(CFG_MCAX_EN, 0x1);
 
-		this_cpu_ptr(mce_banks_array)[bank].lsb_in_status = !!(low & BIT(8));
+	/*
+	 * SMCA sets the Deferred Error Interrupt type per bank.
+	 *
+	 * MCA_CONFIG[DeferredIntTypeSupported] is bit 5, and tells us
+	 * if the DeferredIntType bit field is available.
+	 *
+	 * MCA_CONFIG[DeferredIntType] is bits [38:37]. OS should set
+	 * this to 0x1 to enable APIC based interrupt. First, check that
+	 * no interrupt has been set.
+	 */
+	if (FIELD_GET(CFG_DFR_INT_SUPP, mca_config) && !FIELD_GET(CFG_DFR_INT_TYPE, mca_config))
+		mca_config |= FIELD_PREP(CFG_DFR_INT_TYPE, INTR_TYPE_APIC);
 
-		wrmsr(smca_config, low, high);
-	}
+	if (FIELD_GET(CFG_LSB_IN_STATUS, mca_config))
+		this_cpu_ptr(mce_banks_array)[bank].lsb_in_status = true;
+
+	wrmsrl(MSR_AMD64_SMCA_MCx_CONFIG(bank), mca_config);
+}
+
+static void smca_configure_old(unsigned int bank, unsigned int cpu)
+{
+	u8 *bank_counts = this_cpu_ptr(smca_bank_counts);
+	const struct smca_hwid *s_hwid;
+	unsigned int i, hwid_mcatype;
+	u32 high, low;
 
 	smca_set_misc_banks_map(bank, cpu);
 
@@ -758,8 +771,9 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 
 	for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) {
 		if (mce_flags.smca)
-			smca_configure(bank, cpu);
+			smca_configure_old(bank, cpu);
 
+		configure_smca(bank);
 		disable_err_thresholding(c, bank);
 
 		for (block = 0; block < NR_BLOCKS; ++block) {

From patchwork Thu Apr 4 15:13:49 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617981
From: Yazen Ghannam
Subject: [PATCH v2 06/16] x86/mce/amd: Prep DFR handler before enabling banks
Date: Thu, 4 Apr 2024 10:13:49 -0500
Message-ID: <20240404151359.47970-7-yazen.ghannam@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org

Scalable MCA systems use the per-bank MCA_CONFIG register to enable
deferred error interrupts. This is done as part of SMCA configuration.

Currently, the deferred error interrupt handler is set up after SMCA
configuration.

Move the deferred error interrupt handler set up before SMCA
configuration. This ensures the kernel is ready to receive the
interrupts before the hardware is configured to send them.

Signed-off-by: Yazen Ghannam
Reviewed-by: Borislav Petkov (AMD)
---

Notes:
    Link: https://lkml.kernel.org/r/20231118193248.1296798-11-yazen.ghannam@amd.com

    v1->v2:
    * No change.

 arch/x86/kernel/cpu/mce/amd.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 3093fed06194..e8e78d91082b 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -589,6 +589,9 @@ static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c)
 	u32 low = 0, high = 0;
 	int def_offset = -1, def_new;
 
+	if (!mce_flags.succor)
+		return;
+
 	if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high))
 		return;
 
@@ -768,6 +771,7 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 	u32 low = 0, high = 0, address = 0;
 	int offset = -1;
 
+	deferred_error_interrupt_enable(c);
 
 	for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) {
 		if (mce_flags.smca)
@@ -794,9 +798,6 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 			offset = prepare_threshold_block(bank, block, address, offset, high);
 		}
 	}
-
-	if (mce_flags.succor)
-		deferred_error_interrupt_enable(c);
 }
 
 /*

From patchwork Thu Apr 4 15:13:50 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617989
Received: from NAM12-BN8-obe.outbound.protection.outlook.com (mail-bn8nam12on2078.outbound.protection.outlook.com [40.107.237.78]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 69B311D543; Thu, 4 Apr 2024 15:14:30 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.237.78
ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712243672; cv=fail;
Received: from SATLEXMB04.amd.com (165.204.84.17) by DS3PEPF000099D6.mail.protection.outlook.com (10.167.17.7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.7452.22 via Frontend Transport; Thu, 4 Apr 2024 15:14:23 +0000
Received: from quartz-7b1chost.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Thu, 4 Apr 2024 10:14:12 -0500
From: Yazen Ghannam
To:
CC: , , , , , Yazen Ghannam
Subject: [PATCH v2 07/16] x86/mce/amd: Simplify DFR handler setup
Date: Thu, 4 Apr 2024 10:13:50 -0500
Message-ID: <20240404151359.47970-8-yazen.ghannam@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org
MIME-Version: 1.0

AMD systems with the SUCCOR feature can send an APIC LVT interrupt for
deferred errors. The LVT offset is 0x2 by convention, i.e. this is the
default as listed in hardware documentation.

However, the MCA registers may list a different LVT offset for this
interrupt. The kernel should honor the value from the hardware.

Simplify the enable flow by using the hardware-provided value. Any
conflicts will be caught by setup_APIC_eilvt(). Conflicts on production
systems can be handled as quirks, if needed.

Also, rename the function using a "verb-first" style.

Signed-off-by: Yazen Ghannam
---

Notes:
    Link: https://lkml.kernel.org/r/20231118193248.1296798-12-yazen.ghannam@amd.com

    v1->v2:
    * No change.

 arch/x86/kernel/cpu/mce/amd.c | 33 ++++++++++-----------------------
 1 file changed, 10 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index e8e78d91082b..32628a30a5c1 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -48,7 +48,6 @@
 #define MSR_CU_DEF_ERR		0xC0000410
 #define MASK_DEF_LVTOFF		0x000000F0
 #define MASK_DEF_INT_TYPE	0x00000006
-#define DEF_LVT_OFF		0x2
 #define DEF_INT_TYPE_APIC	0x2
 #define INTR_TYPE_APIC			0x1
 
@@ -575,19 +574,9 @@ static int setup_APIC_mce_threshold(int reserved, int new)
 	return reserved;
 }
 
-static int setup_APIC_deferred_error(int reserved, int new)
+static void enable_deferred_error_interrupt(void)
 {
-	if (reserved < 0 && !setup_APIC_eilvt(new, DEFERRED_ERROR_VECTOR,
-					      APIC_EILVT_MSG_FIX, 0))
-		return new;
-
-	return reserved;
-}
-
-static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c)
-{
-	u32 low = 0, high = 0;
-	int def_offset = -1, def_new;
+	u32 low = 0, high = 0, def_new;
 
 	if (!mce_flags.succor)
 		return;
@@ -595,17 +584,15 @@ static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c)
 	if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high))
 		return;
 
+	/*
+	 * Trust the value from hardware.
+	 * If there's a conflict, then setup_APIC_eilvt() will throw an error.
+	 */
 	def_new = (low & MASK_DEF_LVTOFF) >> 4;
-	if (!(low & MASK_DEF_LVTOFF)) {
-		pr_err(FW_BUG "Your BIOS is not setting up LVT offset 0x2 for deferred error IRQs correctly.\n");
-		def_new = DEF_LVT_OFF;
-		low = (low & ~MASK_DEF_LVTOFF) | (DEF_LVT_OFF << 4);
-	}
+	if (setup_APIC_eilvt(def_new, DEFERRED_ERROR_VECTOR, APIC_EILVT_MSG_FIX, 0))
+		return;
 
-	def_offset = setup_APIC_deferred_error(def_offset, def_new);
-	if ((def_offset == def_new) &&
-	    (deferred_error_int_vector != amd_deferred_error_interrupt))
-		deferred_error_int_vector = amd_deferred_error_interrupt;
+	deferred_error_int_vector = amd_deferred_error_interrupt;
 
 	if (!mce_flags.smca)
 		low = (low & ~MASK_DEF_INT_TYPE) | DEF_INT_TYPE_APIC;
@@ -771,7 +758,7 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 	u32 low = 0, high = 0, address = 0;
 	int offset = -1;
 
-	deferred_error_interrupt_enable(c);
+	enable_deferred_error_interrupt();
 
 	for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) {
 		if (mce_flags.smca)

From patchwork Thu Apr 4 15:13:51 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617984
Received: from NAM10-BN7-obe.outbound.protection.outlook.com (mail-bn7nam10on2061.outbound.protection.outlook.com [40.107.92.61]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A6F7412A172; Thu, 4 Apr 2024 15:14:28 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.92.61
ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712243670; c=relaxed/simple;
Received: from SATLEXMB04.amd.com (165.204.84.17) by DS3PEPF000099DA.mail.protection.outlook.com (10.167.17.11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.7452.22 via Frontend Transport; Thu, 4 Apr 2024 15:14:24 +0000
Received: from quartz-7b1chost.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Thu, 4 Apr 2024 10:14:12 -0500
From: Yazen Ghannam
To:
CC: , , , , , Yazen Ghannam
Subject: [PATCH v2 08/16] x86/mce/amd: Clean up enable_deferred_error_interrupt()
Date: Thu, 4 Apr 2024 10:13:51 -0500
Message-ID: <20240404151359.47970-9-yazen.ghannam@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org
MIME-Version: 1.0

Switch to bitops to help with clarity. Also, avoid an unnecessary
wrmsr() for SMCA systems.

Use the updated name for MSR 0xC000_0410 to match the documentation for
Family 0x17 and later systems.

This MSR is used for setting up both Deferred and MCA Thresholding
interrupts on current systems. So read it once during init and pass to
functions that need it. Start with the Deferred error interrupt case.
The MCA Thresholding interrupt case will be handled during refactoring.

Signed-off-by: Yazen Ghannam
---

Notes:
    Link: https://lkml.kernel.org/r/20231118193248.1296798-13-yazen.ghannam@amd.com

    v1->v2:
    * Remove invalid SMCA check in get_mca_intr_cfg(). (Yazen)

 arch/x86/kernel/cpu/mce/amd.c | 46 +++++++++++++++++++++++------------
 1 file changed, 30 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 32628a30a5c1..f59f4a1c9b21 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -44,11 +44,11 @@
 #define MASK_BLKPTR_LO		0xFF000000
 #define MCG_XBLK_ADDR		0xC0000400
 
-/* Deferred error settings */
+/* MCA Interrupt Configuration register, one per CPU */
 #define MSR_CU_DEF_ERR		0xC0000410
-#define MASK_DEF_LVTOFF		0x000000F0
-#define MASK_DEF_INT_TYPE	0x00000006
-#define DEF_INT_TYPE_APIC	0x2
+#define MSR_MCA_INTR_CFG		0xC0000410
+#define INTR_CFG_DFR_LVT_OFFSET		GENMASK_ULL(7, 4)
+#define INTR_CFG_LEGACY_DFR_INTR_TYPE	GENMASK_ULL(2, 1)
 #define INTR_TYPE_APIC			0x1
 
 /* Scalable MCA: */
@@ -574,30 +574,30 @@ static int setup_APIC_mce_threshold(int reserved, int new)
 	return reserved;
 }
 
-static void enable_deferred_error_interrupt(void)
+static void enable_deferred_error_interrupt(u64 mca_intr_cfg)
 {
-	u32 low = 0, high = 0, def_new;
+	u8 dfr_offset;
 
-	if (!mce_flags.succor)
-		return;
-
-	if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high))
+	if (!mca_intr_cfg)
 		return;
 
 	/*
 	 * Trust the value from hardware.
 	 * If there's a conflict, then setup_APIC_eilvt() will throw an error.
 	 */
-	def_new = (low & MASK_DEF_LVTOFF) >> 4;
-	if (setup_APIC_eilvt(def_new, DEFERRED_ERROR_VECTOR, APIC_EILVT_MSG_FIX, 0))
+	dfr_offset = FIELD_GET(INTR_CFG_DFR_LVT_OFFSET, mca_intr_cfg);
+	if (setup_APIC_eilvt(dfr_offset, DEFERRED_ERROR_VECTOR, APIC_EILVT_MSG_FIX, 0))
 		return;
 
 	deferred_error_int_vector = amd_deferred_error_interrupt;
 
-	if (!mce_flags.smca)
-		low = (low & ~MASK_DEF_INT_TYPE) | DEF_INT_TYPE_APIC;
+	if (mce_flags.smca)
+		return;
 
-	wrmsr(MSR_CU_DEF_ERR, low, high);
+	mca_intr_cfg &= ~INTR_CFG_LEGACY_DFR_INTR_TYPE;
+	mca_intr_cfg |= FIELD_PREP(INTR_CFG_LEGACY_DFR_INTR_TYPE, INTR_TYPE_APIC);
+
+	wrmsrl(MSR_MCA_INTR_CFG, mca_intr_cfg);
 }
 
 static u32 smca_get_block_address(unsigned int bank, unsigned int block,
@@ -751,14 +751,28 @@ static void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank)
 	wrmsrl(MSR_K7_HWCR, hwcr);
 }
 
+static u64 get_mca_intr_cfg(void)
+{
+	u64 mca_intr_cfg;
+
+	if (!mce_flags.succor)
+		return 0;
+
+	if (rdmsrl_safe(MSR_MCA_INTR_CFG, &mca_intr_cfg))
+		return 0;
+
+	return mca_intr_cfg;
+}
+
 /* cpu init entry point, called from mce.c with preempt off */
 void mce_amd_feature_init(struct cpuinfo_x86 *c)
 {
 	unsigned int bank, block, cpu = smp_processor_id();
+	u64 mca_intr_cfg = get_mca_intr_cfg();
 	u32 low = 0, high = 0, address = 0;
 	int offset = -1;
 
-	enable_deferred_error_interrupt();
+	enable_deferred_error_interrupt(mca_intr_cfg);
 
 	for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) {
 		if (mce_flags.smca)

From patchwork Thu Apr 4 15:13:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617990
Received: from NAM04-BN8-obe.outbound.protection.outlook.com (mail-bn8nam04on2087.outbound.protection.outlook.com [40.107.100.87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 18CB212AAF2; Thu, 4 Apr 2024
15:14:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.100.87 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712243673; cv=fail; b=ayE28q+AdONwkjC9QYY0k5EmYNJGl7SgFBj0g+ojEXnUWW4CDpMqA1RQ54H9WmWVUxMHKo9upmxmWkm0VYh+VhoAkiCMYwjKuL7SFmeZsqLdJPGq9jNpFo8SPfT+D/6vM54LvfUZzmjxluB6Oadc1/XLYvRmF9jO/aLT7OxWD9w= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712243673; c=relaxed/simple; bh=l1IfGhAprb4MXyxttDnp262mOHHRz2QlO69nfS/Pkus=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=WpomCofvEGlQ1ne1km0ypcPHcm+FwXwZIaEgok4d0lxVxrYDfk0OTy38/iJO06FpSCZhGtA/SrWh6XjMY9BbG/2wea3s4gX+XES8h0YDURelpdSEGdhfJiqWVAw/pbvgEcWbLi+rz9joqiUe9DXE/xawd/v14BlmvqJJVED3yCE= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amd.com; spf=fail smtp.mailfrom=amd.com; dkim=pass (1024-bit key) header.d=amd.com header.i=@amd.com header.b=gKEdyBAQ; arc=fail smtp.client-ip=40.107.100.87 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amd.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=amd.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amd.com header.i=@amd.com header.b="gKEdyBAQ" ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=ao6El1XuFphas2y7rXYS2nULKRe/sGZPM1wsAX+L8EgGcHWHHmZd2/1hOHPiKrR5Mu1GBG8lyv3I134jfTCNO4u2T+Qn9SAatAxbYwp/BFu48Aluf/tbtUexM2HRdVfHAPTSaumsjwUCrC0+BvTh/RERAjm/D3TJ57quUMHO3Hw+9DYUGqqZS2Et3+xZ+9T54sThpGQVR393hE+Eqy5NXK/4YfjWxjcXd+/N5l2W0FqabccEotTZ0Vl73KkCCY6YfOc4KGexsHqpp1mMs6AAGxYOYrA70KXcM4Urtg+wPTJykvF7JQSvLBJ4IwhyWMbTe/0kTFPGnvyxqm6/iPonzg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=3Lz+p388J6AoIRShka38TwLszdNjbES/X4b4hJsFWDY=; b=QNBeAIyN5qHQcPD+ZUO1Y96WCHC3h/6tFuCyT/kDJtk8ZDOLn/K/kjMAzJGEssbjrlzuFZf6aFN5QK9gDA09fKN44s9RamA77492oMginWOPHdrHrQYnSNl+agyat359oSpFwBplqMKqEluEIx+PJ4Nfvt4Nn7JueKS+ZDJx9w1N3Tl2j6Ao57QXHgCa7MYvqytOUIxl4NwvroNbBodmWm7RTg7844K8usPzjtUBMNJVnXeUsCteulySxmhRImYwEYsFWCkbKwM+7jN1wNEQsS6ezP2yoIjidt1UHmoEX305Sp0Tc2IFy0omdidZ6aCkKpxo+qEd2PTKMURBiLoMXw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=3Lz+p388J6AoIRShka38TwLszdNjbES/X4b4hJsFWDY=; b=gKEdyBAQowiKZdJ9c1hmF5eJ863fv5PWFyXBC12rNm2WmQC3pMxMeGdlNEHtJywHMszrNMANnXZgp2ONv/JiYeBHOUTycoJogfPB0awjIdBO71dhHrBHqXAixPSxbSKit2GaGFzk/823+RlHd9i7A8Xm3EdgAQxRdaDPvYitr3c= Received: from DM6PR05CA0041.namprd05.prod.outlook.com (2603:10b6:5:335::10) by SA3PR12MB7829.namprd12.prod.outlook.com (2603:10b6:806:316::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7409.46; Thu, 4 Apr 2024 15:14:25 +0000 Received: from DS3PEPF000099DA.namprd04.prod.outlook.com (2603:10b6:5:335:cafe::4a) by DM6PR05CA0041.outlook.office365.com (2603:10b6:5:335::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7472.10 via Frontend Transport; Thu, 4 Apr 2024 15:14:25 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none 
header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by DS3PEPF000099DA.mail.protection.outlook.com (10.167.17.11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.7452.22 via Frontend Transport; Thu, 4 Apr 2024 15:14:25 +0000 Received: from quartz-7b1chost.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Thu, 4 Apr 2024 10:14:12 -0500 From: Yazen Ghannam To: CC: , , , , , Yazen Ghannam Subject: [PATCH v2 09/16] x86/mce: Unify AMD THR handler with MCA Polling Date: Thu, 4 Apr 2024 10:13:52 -0500 Message-ID: <20240404151359.47970-10-yazen.ghannam@amd.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com> References: <20240404151359.47970-1-yazen.ghannam@amd.com> Precedence: bulk X-Mailing-List: linux-edac@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: SATLEXMB03.amd.com (10.181.40.144) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS3PEPF000099DA:EE_|SA3PR12MB7829:EE_ X-MS-Office365-Filtering-Correlation-Id: e02aaa20-7d65-4899-2dbe-08dc54b9ea33 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
kJ0JT0GRpJjWZdaSmYS31xUXpQ2tFvYUB0uwFEr4AXuDg4IgfRX1rf+noQ3lDNg2tEEKJvGc4Ac6x2ot+3SidzfZDJgkmpNdV1C2jwOe1JRgqeKyqyf0oqMGjbzD4EGs58RyIjQPWD6XgjjAEZMJ08zU/7FMkkwF448kkoX3ZSYirASL2a1w2G9iRVU0eix15S8RANnYBAvGby5JSnQt8/4mXUr306B4Gbafnbt0AuIXlIuGnrLOXufrR+99CEQUfyegIx9DDLpw7d2woOrD/1lseRkyll9Rw/+yALW2KeCc5B3iye6jPGv7Z5DhpPRcxgmDBW5IxuXKVaQw4bH7hhu+92nH0yLdVQXNDqe5ij3MM8RGxfv1J3VS4mJJN7KompsN4l22qfG9n4+sVDcJjIWiTM+gYfgFaFjQFI5wMLUPdN5EqO9HnztUxPBKOV3+iz5CxIDyxbS0scf+gFZ0uiMzLDKyx6c9Ng7kbxrHJQFgVoK61egS4plw1pz52QzS8W2s8sAetB+E5+5d7xasJHiiIDO/MCd5VmZQK33L7bcfw0GgCJKCZE/8DAkGHWmtkQwk6QDxvRdZ1DTZulIqhcC1mJzEQ2I6hBDeR/IuozQqUwH/+Cb2C3YUlnp3i/wyebST8qWwEfbKE/EWslWkoUpLbtU9M99MeeYpWXJAonmQG5jFloiBsXQhRu0sy7684lXiYvzyz+0sKSNRJhRo1w== X-Forefront-Antispam-Report: CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230031)(82310400014)(376005)(36860700004)(1800799015);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 04 Apr 2024 15:14:25.2485 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: e02aaa20-7d65-4899-2dbe-08dc54b9ea33 X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: DS3PEPF000099DA.namprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA3PR12MB7829 AMD systems optionally support an MCA Thresholding interrupt. The interrupt should be used as another signal to trigger MCA polling. This is similar to how the Intel Corrected Machine Check interrupt (CMCI) is handled. AMD MCA Thresholding is managed using the MCA_MISC registers within an MCA bank. 
The OS will need to modify the hardware error count field in order to
reset the threshold limit and rearm the interrupt. Management of the
MCA_MISC register should be done as a follow up to the basic MCA polling
flow. It should not be the main focus of the interrupt handler.

Furthermore, future systems will have the ability to send an MCA
Thresholding interrupt to the OS even when the OS does not manage the
feature, i.e. MCA_MISC registers are Read-as-Zero/Locked.

Call the common MCA polling function when handling the MCA Thresholding
interrupt. This will allow the OS to find any valid errors whether or
not the MCA Thresholding feature is OS-managed. Also, this allows the
common MCA polling options and kernel parameters to apply to AMD
systems.

Add a callback to the MCA polling function to handle vendor-specific
operations. Start by handling the AMD MCA Thresholding "block reset"
flow.

Signed-off-by: Yazen Ghannam
---

Notes:
    Link:
    https://lkml.kernel.org/r/20231118193248.1296798-14-yazen.ghannam@amd.com

    v1->v2:
    * No change.

 arch/x86/kernel/cpu/mce/amd.c      | 57 ++++++++++++++----------------
 arch/x86/kernel/cpu/mce/core.c     |  8 +++++
 arch/x86/kernel/cpu/mce/internal.h |  2 ++
 3 files changed, 37 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index f59f4a1c9b21..75195d6fe971 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -979,12 +979,7 @@ static void amd_deferred_error_interrupt(void)
 		log_error_deferred(bank);
 }
 
-static void log_error_thresholding(unsigned int bank, u64 misc)
-{
-	_log_error_deferred(bank, misc);
-}
-
-static void log_and_reset_block(struct threshold_block *block)
+static void reset_block(struct threshold_block *block)
 {
 	struct thresh_restart tr;
 	u32 low = 0, high = 0;
@@ -998,49 +993,51 @@ static void log_and_reset_block(struct threshold_block *block)
 	if (!(high & MASK_OVERFLOW_HI))
 		return;
 
-	/* Log the MCE which caused the threshold event. */
-	log_error_thresholding(block->bank, ((u64)high << 32) | low);
-
 	/* Reset threshold block after logging error. */
 	memset(&tr, 0, sizeof(tr));
 	tr.b = block;
 	threshold_restart_bank(&tr);
 }
 
-/*
- * Threshold interrupt handler will service THRESHOLD_APIC_VECTOR. The interrupt
- * goes off when error_count reaches threshold_limit.
- */
-static void amd_threshold_interrupt(void)
+static void reset_thr_blocks(unsigned int bank)
 {
 	struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
 	struct threshold_bank **bp = this_cpu_read(threshold_banks);
-	unsigned int bank, cpu = smp_processor_id();
 
 	/*
 	 * Validate that the threshold bank has been initialized already. The
 	 * handler is installed at boot time, but on a hotplug event the
 	 * interrupt might fire before the data has been initialized.
 	 */
-	if (!bp)
+	if (!bp || !bp[bank])
 		return;
 
-	for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) {
-		if (!(per_cpu(bank_map, cpu) & BIT_ULL(bank)))
-			continue;
+	first_block = bp[bank]->blocks;
+	if (!first_block)
+		return;
 
-		first_block = bp[bank]->blocks;
-		if (!first_block)
-			continue;
+	/*
+	 * The first block is also the head of the list. Check it first
+	 * before iterating over the rest.
+	 */
+	reset_block(first_block);
+	list_for_each_entry_safe(block, tmp, &first_block->miscj, miscj)
+		reset_block(block);
+}
 
-		/*
-		 * The first block is also the head of the list. Check it first
-		 * before iterating over the rest.
-		 */
-		log_and_reset_block(first_block);
-		list_for_each_entry_safe(block, tmp, &first_block->miscj, miscj)
-			log_and_reset_block(block);
-	}
+/*
+ * Threshold interrupt handler will service THRESHOLD_APIC_VECTOR. The interrupt
+ * goes off when error_count reaches threshold_limit.
+ */
+static void amd_threshold_interrupt(void)
+{
+	/* Check all banks for now. This could be optimized in the future. */
+	machine_check_poll(MCP_TIMESTAMP, this_cpu_ptr(&mce_poll_banks));
+}
+
+void amd_handle_error(struct mce *m)
+{
+	reset_thr_blocks(m->bank);
 }
 
 /*
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 7a857b33f515..75297e7eb980 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -672,6 +672,12 @@ static noinstr void mce_read_aux(struct mce *m, int i)
 	}
 }
 
+static void vendor_handle_error(struct mce *m)
+{
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+		return amd_handle_error(m);
+}
+
 DEFINE_PER_CPU(unsigned, mce_poll_count);
 
 /*
@@ -787,6 +793,8 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 			mce_log(&m);
 
 clear_it:
+		vendor_handle_error(&m);
+
 		/*
 		 * Clear state for this bank.
 		 */
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index e86e53695828..96b108175ca2 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -267,6 +267,7 @@ void mce_setup_for_cpu(unsigned int cpu, struct mce *m);
 #ifdef CONFIG_X86_MCE_AMD
 extern bool amd_filter_mce(struct mce *m);
 bool amd_mce_usable_address(struct mce *m);
+void amd_handle_error(struct mce *m);
 
 /*
  * If MCA_CONFIG[McaLsbInStatusSupported] is set, extract ErrAddr in bits
@@ -295,6 +296,7 @@ static __always_inline void smca_extract_err_addr(struct mce *m)
 #else
 static inline bool amd_filter_mce(struct mce *m) { return false; }
 static inline bool amd_mce_usable_address(struct mce *m) { return false; }
+static inline void amd_handle_error(struct mce *m) { }
 static inline void smca_extract_err_addr(struct mce *m) { }
 #endif
 

From patchwork Thu Apr 4 15:13:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617986
From: Yazen Ghannam
Subject: [PATCH v2 10/16] x86/mce: Unify AMD DFR handler with MCA Polling
Date: Thu, 4 Apr 2024 10:13:53 -0500
Message-ID: <20240404151359.47970-11-yazen.ghannam@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org

AMD systems optionally support a Deferred error interrupt. The interrupt
should be used as another signal to trigger MCA polling. This is similar
to how other MCA interrupts are handled.

Deferred errors do not require any special handling related to the
interrupt, e.g. resetting or rearming the interrupt, etc.
However, Scalable MCA systems include a pair of registers, MCA_DESTAT
and MCA_DEADDR, that should be checked for valid errors. This check
should be done whenever MCA registers are polled. Currently, the
Deferred error interrupt does this check, but the MCA polling function
does not.

Call the MCA polling function when handling the Deferred error
interrupt. This keeps all "polling" cases in a common function.

Call the polling function only for banks that have the Deferred error
interrupt enabled.

Add a "SMCA DFR handler" for Deferred errors to the AMD vendor-specific
error handler callback. This will do the same status check, register
clearing, and logging that the interrupt handler has done. And it
extends the common polling flow to find AMD Deferred errors.

Remove old code whose functionality is already covered in the common MCA
code.

Signed-off-by: Yazen Ghannam
---

Notes:
    Link:
    https://lkml.kernel.org/r/20231118193248.1296798-15-yazen.ghannam@amd.com

    v1->v2:
    * Keep separate interrupt entry points. (Yazen)
    * Move DFR error setup for MCA_CONFIG to a helper. (Yazen)

 arch/x86/kernel/cpu/mce/amd.c  | 155 +++++++++++++--------------------
 arch/x86/kernel/cpu/mce/core.c |  16 +++-
 2 files changed, 76 insertions(+), 95 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 75195d6fe971..40912c5e35d1 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -62,11 +62,13 @@
 #define CFG_MCAX_EN			BIT_ULL(32)
 #define CFG_LSB_IN_STATUS		BIT_ULL(8)
 #define CFG_DFR_INT_SUPP		BIT_ULL(5)
+#define CFG_DFR_LOG_SUPP		BIT_ULL(2)
 
 /* Threshold LVT offset is at MSR0xC0000410[15:12] */
 #define SMCA_THR_LVT_OFF	0xF000
 
 static bool thresholding_irq_en;
+static DEFINE_PER_CPU(mce_banks_t, mce_dfr_int_banks);
 
 static const char * const th_names[] = {
 	"load_store",
@@ -350,6 +352,28 @@ static void smca_set_misc_banks_map(unsigned int bank, unsigned int cpu)
 
 }
 
+/* SMCA sets the Deferred Error Interrupt type per bank. */
+static void configure_smca_dfr(unsigned int bank, u64 *mca_config)
+{
+	/* Nothing to do if the bank doesn't support deferred error logging. */
+	if (!FIELD_GET(CFG_DFR_LOG_SUPP, *mca_config))
+		return;
+
+	/* Nothing to do if the bank doesn't support setting the interrupt type. */
+	if (!FIELD_GET(CFG_DFR_INT_SUPP, *mca_config))
+		return;
+
+	/*
+	 * Nothing to do if the interrupt type is already set. Either it was set by
+	 * the OS already. Or it was set by firmware, and the OS should leave it as-is.
+	 */
+	if (FIELD_GET(CFG_DFR_INT_TYPE, *mca_config))
+		return;
+
+	*mca_config |= FIELD_PREP(CFG_DFR_INT_TYPE, INTR_TYPE_APIC);
+	set_bit(bank, (void *)this_cpu_ptr(&mce_dfr_int_banks));
+}
+
 /* Set appropriate bits in MCA_CONFIG. */
 static void configure_smca(unsigned int bank)
 {
@@ -370,18 +394,7 @@ static void configure_smca(unsigned int bank)
 	 */
 	mca_config |= FIELD_PREP(CFG_MCAX_EN, 0x1);
 
-	/*
-	 * SMCA sets the Deferred Error Interrupt type per bank.
-	 *
-	 * MCA_CONFIG[DeferredIntTypeSupported] is bit 5, and tells us
-	 * if the DeferredIntType bit field is available.
-	 *
-	 * MCA_CONFIG[DeferredIntType] is bits [38:37]. OS should set
-	 * this to 0x1 to enable APIC based interrupt. First, check that
-	 * no interrupt has been set.
-	 */
-	if (FIELD_GET(CFG_DFR_INT_SUPP, mca_config) && !FIELD_GET(CFG_DFR_INT_TYPE, mca_config))
-		mca_config |= FIELD_PREP(CFG_DFR_INT_TYPE, INTR_TYPE_APIC);
+	configure_smca_dfr(bank, &mca_config);
 
 	if (FIELD_GET(CFG_LSB_IN_STATUS, mca_config))
 		this_cpu_ptr(mce_banks_array)[bank].lsb_in_status = true;
@@ -872,33 +885,6 @@ bool amd_mce_usable_address(struct mce *m)
 	return false;
 }
 
-static void __log_error(unsigned int bank, u64 status, u64 addr, u64 misc)
-{
-	struct mce m;
-
-	mce_setup(&m);
-
-	m.status = status;
-	m.misc   = misc;
-	m.bank   = bank;
-	m.tsc	 = rdtsc();
-
-	if (m.status & MCI_STATUS_ADDRV) {
-		m.addr = addr;
-
-		smca_extract_err_addr(&m);
-	}
-
-	if (mce_flags.smca) {
-		rdmsrl(MSR_AMD64_SMCA_MCx_IPID(bank), m.ipid);
-
-		if (m.status & MCI_STATUS_SYNDV)
-			rdmsrl(MSR_AMD64_SMCA_MCx_SYND(bank), m.synd);
-	}
-
-	mce_log(&m);
-}
-
 DEFINE_IDTENTRY_SYSVEC(sysvec_deferred_error)
 {
 	trace_deferred_error_apic_entry(DEFERRED_ERROR_VECTOR);
@@ -908,75 +894,46 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_deferred_error)
 	apic_eoi();
 }
 
-/*
- * Returns true if the logged error is deferred. False, otherwise.
- */
-static inline bool
-_log_error_bank(unsigned int bank, u32 msr_stat, u32 msr_addr, u64 misc)
-{
-	u64 status, addr = 0;
-
-	rdmsrl(msr_stat, status);
-	if (!(status & MCI_STATUS_VAL))
-		return false;
-
-	if (status & MCI_STATUS_ADDRV)
-		rdmsrl(msr_addr, addr);
-
-	__log_error(bank, status, addr, misc);
-
-	wrmsrl(msr_stat, 0);
-
-	return status & MCI_STATUS_DEFERRED;
-}
-
-static bool _log_error_deferred(unsigned int bank, u32 misc)
-{
-	if (!_log_error_bank(bank, mca_msr_reg(bank, MCA_STATUS),
-			     mca_msr_reg(bank, MCA_ADDR), misc))
-		return false;
-
-	/*
-	 * Non-SMCA systems don't have MCA_DESTAT/MCA_DEADDR registers.
-	 * Return true here to avoid accessing these registers.
-	 */
-	if (!mce_flags.smca)
-		return true;
-
-	/* Clear MCA_DESTAT if the deferred error was logged from MCA_STATUS. */
-	wrmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank), 0);
-	return true;
-}
-
 /*
  * We have three scenarios for checking for Deferred errors:
  *
  * 1) Non-SMCA systems check MCA_STATUS and log error if found.
+ *    This is already handled in machine_check_poll().
  * 2) SMCA systems check MCA_STATUS. If error is found then log it and also
  *    clear MCA_DESTAT.
  * 3) SMCA systems check MCA_DESTAT, if error was not found in MCA_STATUS, and
  *    log it.
  */
-static void log_error_deferred(unsigned int bank)
+static void handle_smca_dfr_error(struct mce *m)
 {
-	if (_log_error_deferred(bank, 0))
+	struct mce m_dfr;
+	u64 mca_destat;
+
+	/* Non-SMCA systems don't have MCA_DESTAT/MCA_DEADDR registers. */
+	if (!mce_flags.smca)
 		return;
 
-	/*
-	 * Only deferred errors are logged in MCA_DE{STAT,ADDR} so just check
-	 * for a valid error.
-	 */
-	_log_error_bank(bank, MSR_AMD64_SMCA_MCx_DESTAT(bank),
-			      MSR_AMD64_SMCA_MCx_DEADDR(bank), 0);
-}
+	/* Clear MCA_DESTAT if the deferred error was logged from MCA_STATUS. */
+	if (m->status & MCI_STATUS_DEFERRED)
+		goto out;
 
-/* APIC interrupt handler for deferred errors */
-static void amd_deferred_error_interrupt(void)
-{
-	unsigned int bank;
+	/* MCA_STATUS didn't have a deferred error, so check MCA_DESTAT for one. */
+	mca_destat = mce_rdmsrl(MSR_AMD64_SMCA_MCx_DESTAT(m->bank));
+
+	if (!(mca_destat & MCI_STATUS_VAL))
+		return;
+
+	/* Reuse the same data collected from machine_check_poll(). */
+	memcpy(&m_dfr, m, sizeof(m_dfr));
+
+	/* Save the MCA_DE{STAT,ADDR} values. */
+	m_dfr.status = mca_destat;
+	m_dfr.addr = mce_rdmsrl(MSR_AMD64_SMCA_MCx_DEADDR(m_dfr.bank));
 
-	for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank)
-		log_error_deferred(bank);
+	mce_log(&m_dfr);
+
+out:
+	wrmsrl(MSR_AMD64_SMCA_MCx_DESTAT(m->bank), 0);
 }
 
 static void reset_block(struct threshold_block *block)
@@ -1035,9 +992,19 @@ static void amd_threshold_interrupt(void)
 	machine_check_poll(MCP_TIMESTAMP, this_cpu_ptr(&mce_poll_banks));
 }
 
+/*
+ * Deferred error interrupt handler will service DEFERRED_ERROR_VECTOR. The interrupt
+ * is triggered when a bank logs a deferred error.
+ */
+static void amd_deferred_error_interrupt(void)
+{
+	machine_check_poll(MCP_TIMESTAMP, this_cpu_ptr(&mce_dfr_int_banks));
+}
+
 void amd_handle_error(struct mce *m)
 {
 	reset_thr_blocks(m->bank);
+	handle_smca_dfr_error(m);
 }
 
 /*
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 75297e7eb980..308766868f39 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -680,6 +680,14 @@ static void vendor_handle_error(struct mce *m)
 
 DEFINE_PER_CPU(unsigned, mce_poll_count);
 
+static bool smca_destat_is_valid(unsigned int bank)
+{
+	if (!mce_flags.smca)
+		return false;
+
+	return mce_rdmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank)) & MCI_STATUS_VAL;
+}
+
 /*
  * Poll for corrected events or events that happened before reset.
  * Those are just logged through /dev/mcelog.
@@ -731,8 +739,14 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 		mce_track_storm(&m);
 
 		/* If this entry is not valid, ignore it */
-		if (!(m.status & MCI_STATUS_VAL))
+		if (!(m.status & MCI_STATUS_VAL)) {
+			if (smca_destat_is_valid(i)) {
+				mce_read_aux(&m, i);
+				goto clear_it;
+			}
+
 			continue;
+		}
 
 		/*
 		 * If we are logging everything (at CPU online) or this

From patchwork Thu Apr 4 15:13:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617988
From: Yazen Ghannam
Subject: [PATCH v2 11/16] x86/mce: Skip AMD threshold init if no threshold banks found
Date: Thu, 4 Apr 2024 10:13:54 -0500
Message-ID: <20240404151359.47970-12-yazen.ghannam@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org

AMD systems optionally support MCA Thresholding. This feature is
discovered by checking capability bits in the MCA_MISC* registers.

Currently, MCA Thresholding is set up in two passes. The first is during
CPU init where available banks are detected, and the "bank_map" variable
is updated. The second is during sysfs/device init when the thresholding
data structures are allocated and hardware is fully configured.

During device init, the "threshold_banks" array is allocated even if no
available banks were discovered. Furthermore, the thresholding reset
flow checks if the top-level "threshold_banks" array is non-NULL, but it
doesn't check if individual "threshold_bank" structures are non-NULL.
This is avoided because the hardware interrupt is not enabled in this
case. But this issue becomes present if enabling the interrupt when the
thresholding data structures are not initialized.

Check "bank_map" to determine if the thresholding structures should be
allocated and initialized. Also, remove "mce_flags.amd_threshold" which
is redundant when checking "bank_map".

Signed-off-by: Yazen Ghannam
---

Notes:
    Link:
    https://lkml.kernel.org/r/20231118193248.1296798-16-yazen.ghannam@amd.com

    v1->v2:
    * Update mce_vendor_flags reserved bits.
(Yazen) arch/x86/kernel/cpu/mce/amd.c | 2 +- arch/x86/kernel/cpu/mce/core.c | 1 - arch/x86/kernel/cpu/mce/internal.h | 5 +---- 3 files changed, 2 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 40912c5e35d1..08ee647cb6ce 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -1455,7 +1455,7 @@ int mce_threshold_create_device(unsigned int cpu) struct threshold_bank **bp; int err; - if (!mce_flags.amd_threshold) + if (!this_cpu_read(bank_map)) return 0; bp = this_cpu_read(threshold_banks); diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 308766868f39..17cf5a9df3cd 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -2024,7 +2024,6 @@ static void __mcheck_cpu_init_early(struct cpuinfo_x86 *c) mce_flags.overflow_recov = !!cpu_has(c, X86_FEATURE_OVERFLOW_RECOV); mce_flags.succor = !!cpu_has(c, X86_FEATURE_SUCCOR); mce_flags.smca = !!cpu_has(c, X86_FEATURE_SMCA); - mce_flags.amd_threshold = 1; } } diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h index 96b108175ca2..9802f7c6cc93 100644 --- a/arch/x86/kernel/cpu/mce/internal.h +++ b/arch/x86/kernel/cpu/mce/internal.h @@ -214,9 +214,6 @@ struct mce_vendor_flags { /* Zen IFU quirk */ zen_ifu_quirk : 1, - /* AMD-style error thresholding banks present. 
*/ - amd_threshold : 1, - /* Pentium, family 5-style MCA */ p5 : 1, @@ -229,7 +226,7 @@ struct mce_vendor_flags { /* Skylake, Cascade Lake, Cooper Lake REP;MOVS* quirk */ skx_repmov_quirk : 1, - __reserved_0 : 55; + __reserved_0 : 56; }; extern struct mce_vendor_flags mce_flags;

From patchwork Thu Apr 4 15:13:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617987
From: Yazen Ghannam
To: CC: , , , , , Yazen Ghannam
Subject: [PATCH v2 12/16] x86/mce/amd: Support SMCA Corrected Error Interrupt
Date: Thu, 4 Apr 2024 10:13:55 -0500
Message-ID: <20240404151359.47970-13-yazen.ghannam@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org

AMD systems optionally support MCA
Thresholding which provides the ability for hardware to send an interrupt when a set error threshold is reached. This feature counts errors of all severities, but it is commonly used to report correctable errors with an interrupt rather than polling.

Scalable MCA systems allow the Platform to take control of this feature. In this case, the OS will not see the feature configuration and control bits in the MCA_MISC* registers. The OS will not receive the MCA Thresholding interrupt, and it will need to poll for correctable errors.

A "corrected error interrupt" will be available on Scalable MCA systems. This will be used in the same configuration where the Platform controls MCA Thresholding. However, the Platform will now be able to send the MCA Thresholding interrupt to the OS.

Check for the feature bit in the MCA_CONFIG register and attempt to set up the MCA Thresholding interrupt handler. If successful, set the feature enable bit in the MCA_CONFIG register to indicate to the Platform that the OS is ready for the interrupt.

Signed-off-by: Yazen Ghannam
---

Notes:
    Link: https://lkml.kernel.org/r/20231118193248.1296798-17-yazen.ghannam@amd.com

    v1->v2:
    * Rebase on earlier changes.
(Yazen) arch/x86/kernel/cpu/mce/amd.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 08ee647cb6ce..a81d911d608e 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -47,6 +47,7 @@ /* MCA Interrupt Configuration register, one per CPU */ #define MSR_CU_DEF_ERR 0xC0000410 #define MSR_MCA_INTR_CFG 0xC0000410 +#define INTR_CFG_THR_LVT_OFFSET GENMASK_ULL(15, 12) #define INTR_CFG_DFR_LVT_OFFSET GENMASK_ULL(7, 4) #define INTR_CFG_LEGACY_DFR_INTR_TYPE GENMASK_ULL(2, 1) #define INTR_TYPE_APIC 0x1 @@ -58,8 +59,10 @@ #define MCI_IPID_HWID_OLD 0xFFF /* MCA_CONFIG register, one per MCA bank */ +#define CFG_CE_INT_EN BIT_ULL(40) #define CFG_DFR_INT_TYPE GENMASK_ULL(38, 37) #define CFG_MCAX_EN BIT_ULL(32) +#define CFG_CE_INT_PRESENT BIT_ULL(10) #define CFG_LSB_IN_STATUS BIT_ULL(8) #define CFG_DFR_INT_SUPP BIT_ULL(5) #define CFG_DFR_LOG_SUPP BIT_ULL(2) @@ -352,6 +355,17 @@ static void smca_set_misc_banks_map(unsigned int bank, unsigned int cpu) } +static bool smca_thr_handler_enabled(u64 mca_intr_cfg) +{ + u8 offset = FIELD_GET(INTR_CFG_THR_LVT_OFFSET, mca_intr_cfg); + + if (setup_APIC_eilvt(offset, THRESHOLD_APIC_VECTOR, APIC_EILVT_MSG_FIX, 0)) + return false; + + mce_threshold_vector = amd_threshold_interrupt; + return true; +} + /* SMCA sets the Deferred Error Interrupt type per bank. */ static void configure_smca_dfr(unsigned int bank, u64 *mca_config) { @@ -375,7 +389,7 @@ static void configure_smca_dfr(unsigned int bank, u64 *mca_config) } /* Set appropriate bits in MCA_CONFIG. 
*/ -static void configure_smca(unsigned int bank) +static void configure_smca(unsigned int bank, u64 mca_intr_cfg) { u64 mca_config; @@ -399,6 +413,9 @@ static void configure_smca(unsigned int bank) if (FIELD_GET(CFG_LSB_IN_STATUS, mca_config)) this_cpu_ptr(mce_banks_array)[bank].lsb_in_status = true; + if (FIELD_GET(CFG_CE_INT_PRESENT, mca_config) && smca_thr_handler_enabled(mca_intr_cfg)) + mca_config |= FIELD_PREP(CFG_CE_INT_EN, 0x1); + wrmsrl(MSR_AMD64_SMCA_MCx_CONFIG(bank), mca_config); } @@ -791,7 +808,7 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c) if (mce_flags.smca) smca_configure_old(bank, cpu); - configure_smca(bank); + configure_smca(bank, mca_intr_cfg); disable_err_thresholding(c, bank); for (block = 0; block < NR_BLOCKS; ++block) {

From patchwork Thu Apr 4 15:13:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617994
From: Yazen Ghannam
To: CC: , , , , , Borislav Petkov , Yazen Ghannam
Subject: [PATCH v2 13/16] x86/mce: Add wrapper for struct mce to export vendor specific info
Date: Thu, 4 Apr 2024 10:13:56 -0500
Message-ID: <20240404151359.47970-14-yazen.ghannam@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org

From: Avadhut Naik

Currently, exporting new additional machine check error information involves adding new fields for the same at the end of the struct mce. This additional information can then be consumed through mcelog or tracepoint.

However, as new MSRs are being added (and will be added in the future) by CPU vendors on their newer CPUs with additional machine check error information to be exported, the size of struct mce will balloon on some CPUs, unnecessarily, since those fields are vendor-specific. Moreover, different CPU vendors may export the additional information in varying sizes.

The problem particularly intensifies since struct mce is exposed to userspace as part of UAPI. Its bloating through vendor-specific data should be avoided to limit the information being sent out to userspace.

Add a new structure mce_hw_err to wrap the existing struct mce. The same will prevent its ballooning since vendor-specific data, if any, can now be exported through a union within the wrapper structure and through __dynamic_array in mce_record tracepoint.

Furthermore, new internal kernel fields can be added to the wrapper struct without impacting the user space API.

[Yazen: Add last commit message paragraph. Rebase on other MCA updates.]
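For illustration only (not part of this patch): a minimal userspace sketch of the wrapper pattern described above. The reduced struct mce, the vendor union, its amd.synd1 member, and the helpers hw_err_init() and hw_err_bank() are all hypothetical names chosen for this example; the patch itself adds only the m member. The sketch assumes the calling convention used throughout the diff: callers zero the whole wrapper, fill the legacy fields through a struct mce pointer, and pass the wrapper on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reduced userspace stand-in for the UAPI struct mce (only three fields). */
struct mce {
	uint64_t status;
	uint64_t addr;
	uint8_t bank;
};

/*
 * The wrapper. This patch adds only 'm'. The 'vendor' union and
 * 'amd.synd1' are HYPOTHETICAL, shown to illustrate where
 * vendor-specific data could live without growing struct mce.
 */
struct mce_hw_err {
	struct mce m;

	union {
		struct {
			uint64_t synd1;	/* hypothetical extra syndrome register */
		} amd;
	} vendor;
};

/* Producer side: zero the whole wrapper, then fill the legacy part via 'm'. */
static void hw_err_init(struct mce_hw_err *err, uint8_t bank, uint64_t status)
{
	struct mce *m;

	assert(err != NULL);
	memset(err, 0, sizeof(*err));

	m = &err->m;
	m->bank = bank;
	m->status = status;
}

/* Consumer side: take the wrapper, derive the struct mce pointer from it. */
static uint8_t hw_err_bank(const struct mce_hw_err *err)
{
	const struct mce *m = &err->m;

	return m->bank;
}
```

Because consumers only ever reach the legacy record through err->m, fields added to the wrapper stay invisible to anything that still deals in struct mce, which is the property the commit message relies on for keeping the UAPI layout unchanged.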
Suggested-by: Borislav Petkov (AMD)
Signed-off-by: Avadhut Naik
Signed-off-by: Yazen Ghannam
---

Notes:
    Link: https://lkml.kernel.org/r/20231118193248.1296798-18-yazen.ghannam@amd.com

    v1->v2:
    * Update all MCE notifier blocks. (Yazen)
    * Rebase on upstream changes for MCE trace event. (Avadhut)

 arch/x86/include/asm/mce.h              |   6 +-
 arch/x86/kernel/cpu/mce/amd.c           |  24 ++--
 arch/x86/kernel/cpu/mce/apei.c          |  46 +++---
 arch/x86/kernel/cpu/mce/core.c          | 181 +++++++++++++-----------
 arch/x86/kernel/cpu/mce/dev-mcelog.c    |   2 +-
 arch/x86/kernel/cpu/mce/genpool.c       |  20 +--
 arch/x86/kernel/cpu/mce/inject.c        |   4 +-
 arch/x86/kernel/cpu/mce/internal.h      |   8 +-
 drivers/acpi/acpi_extlog.c              |   2 +-
 drivers/acpi/nfit/mce.c                 |   3 +-
 drivers/edac/i7core_edac.c              |   2 +-
 drivers/edac/igen6_edac.c               |   2 +-
 drivers/edac/pnd2_edac.c                |   2 +-
 drivers/edac/sb_edac.c                  |   2 +-
 drivers/edac/skx_common.c               |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c |   2 +-
 drivers/ras/amd/fmpm.c                  |   2 +-
 drivers/ras/cec.c                       |   3 +-
 include/trace/events/mce.h              |  42 +++---
 19 files changed, 196 insertions(+), 159 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index adad99bac567..e4ad9807b3e3 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -183,6 +183,10 @@ enum mce_notifier_prios { MCE_PRIO_HIGHEST = MCE_PRIO_CEC }; +struct mce_hw_err { + struct mce m; +}; + struct notifier_block; extern void mce_register_decode_chain(struct notifier_block *nb); extern void mce_unregister_decode_chain(struct notifier_block *nb); @@ -218,7 +222,7 @@ static inline int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, #endif void mce_setup(struct mce *m); -void mce_log(struct mce *m); +void mce_log(struct mce_hw_err *err); DECLARE_PER_CPU(struct device *, mce_device); /* Maximum number of MCA banks per CPU.
*/ diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index a81d911d608e..40e6c5a98dce 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -921,9 +921,9 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_deferred_error) * 3) SMCA systems check MCA_DESTAT, if error was not found in MCA_STATUS, and * log it. */ -static void handle_smca_dfr_error(struct mce *m) +static void handle_smca_dfr_error(struct mce_hw_err *err) { - struct mce m_dfr; + struct mce_hw_err err_dfr; u64 mca_destat; /* Non-SMCA systems don't have MCA_DESTAT/MCA_DEADDR registers. */ @@ -931,26 +931,26 @@ static void handle_smca_dfr_error(struct mce *m) return; /* Clear MCA_DESTAT if the deferred error was logged from MCA_STATUS. */ - if (m->status & MCI_STATUS_DEFERRED) + if (err->m.status & MCI_STATUS_DEFERRED) goto out; /* MCA_STATUS didn't have a deferred error, so check MCA_DESTAT for one. */ - mca_destat = mce_rdmsrl(MSR_AMD64_SMCA_MCx_DESTAT(m->bank)); + mca_destat = mce_rdmsrl(MSR_AMD64_SMCA_MCx_DESTAT(err->m.bank)); if (!(mca_destat & MCI_STATUS_VAL)) return; /* Reuse the same data collected from machine_check_poll(). */ - memcpy(&m_dfr, m, sizeof(m_dfr)); + memcpy(&err_dfr, err, sizeof(err_dfr)); /* Save the MCA_DE{STAT,ADDR} values. 
*/ - m_dfr.status = mca_destat; - m_dfr.addr = mce_rdmsrl(MSR_AMD64_SMCA_MCx_DEADDR(m_dfr.bank)); + err_dfr.m.status = mca_destat; + err_dfr.m.addr = mce_rdmsrl(MSR_AMD64_SMCA_MCx_DEADDR(err_dfr.m.bank)); - mce_log(&m_dfr); + mce_log(&err_dfr); out: - wrmsrl(MSR_AMD64_SMCA_MCx_DESTAT(m->bank), 0); + wrmsrl(MSR_AMD64_SMCA_MCx_DESTAT(err->m.bank), 0); } static void reset_block(struct threshold_block *block) @@ -1018,10 +1018,10 @@ static void amd_deferred_error_interrupt(void) machine_check_poll(MCP_TIMESTAMP, this_cpu_ptr(&mce_dfr_int_banks)); } -void amd_handle_error(struct mce *m) +void amd_handle_error(struct mce_hw_err *err) { - reset_thr_blocks(m->bank); - handle_smca_dfr_error(m); + reset_thr_blocks(err->m.bank); + handle_smca_dfr_error(err); } /* diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c index e4e32e337110..89a8ebac53ea 100644 --- a/arch/x86/kernel/cpu/mce/apei.c +++ b/arch/x86/kernel/cpu/mce/apei.c @@ -28,9 +28,12 @@ void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err) { - struct mce m; + struct mce_hw_err err; + struct mce *m = &err.m; int lsb; + memset(&err, 0, sizeof(struct mce_hw_err)); + if (!(mem_err->validation_bits & CPER_MEM_VALID_PA)) return; @@ -44,30 +47,33 @@ void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err) else lsb = PAGE_SHIFT; - mce_setup(&m); - m.bank = -1; + mce_setup(m); + m->bank = -1; /* Fake a memory read error with unknown channel */ - m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | MCI_STATUS_MISCV | 0x9f; - m.misc = (MCI_MISC_ADDR_PHYS << 6) | lsb; + m->status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | MCI_STATUS_MISCV | 0x9f; + m->misc = (MCI_MISC_ADDR_PHYS << 6) | lsb; if (severity >= GHES_SEV_RECOVERABLE) - m.status |= MCI_STATUS_UC; + m->status |= MCI_STATUS_UC; if (severity >= GHES_SEV_PANIC) { - m.status |= MCI_STATUS_PCC; - m.tsc = rdtsc(); + m->status |= MCI_STATUS_PCC; + m->tsc = rdtsc(); } - m.addr = 
mem_err->physical_addr; - mce_log(&m); + m->addr = mem_err->physical_addr; + mce_log(&err); } EXPORT_SYMBOL_GPL(apei_mce_report_mem_error); int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id) { const u64 *i_mce = ((const u64 *) (ctx_info + 1)); + struct mce_hw_err err; + struct mce *m = &err.m; unsigned int cpu; - struct mce m; + + memset(&err, 0, sizeof(struct mce_hw_err)); if (!boot_cpu_has(X86_FEATURE_SMCA)) return -EINVAL; @@ -105,18 +111,18 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id) if (!cpu_possible(cpu)) return -EINVAL; - mce_setup_common(&m); - mce_setup_for_cpu(cpu, &m); + mce_setup_common(m); + mce_setup_for_cpu(cpu, m); - m.bank = (ctx_info->msr_addr >> 4) & 0xFF; - m.status = *i_mce; - m.addr = *(i_mce + 1); - m.misc = *(i_mce + 2); + m->bank = (ctx_info->msr_addr >> 4) & 0xFF; + m->status = *i_mce; + m->addr = *(i_mce + 1); + m->misc = *(i_mce + 2); /* Skipping MCA_CONFIG */ - m.ipid = *(i_mce + 4); - m.synd = *(i_mce + 5); + m->ipid = *(i_mce + 4); + m->synd = *(i_mce + 5); - mce_log(&m); + mce_log(&err); return 0; } diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 17cf5a9df3cd..fef025bda2af 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -88,7 +88,7 @@ struct mca_config mca_cfg __read_mostly = { .monarch_timeout = -1 }; -static DEFINE_PER_CPU(struct mce, mces_seen); +static DEFINE_PER_CPU(struct mce_hw_err, hw_errs_seen); static unsigned long mce_need_notify; /* @@ -148,9 +148,9 @@ void mce_setup(struct mce *m) DEFINE_PER_CPU(struct mce, injectm); EXPORT_PER_CPU_SYMBOL_GPL(injectm); -void mce_log(struct mce *m) +void mce_log(struct mce_hw_err *err) { - if (!mce_gen_pool_add(m)) + if (!mce_gen_pool_add(err)) irq_work_queue(&mce_irq_work); } EXPORT_SYMBOL_GPL(mce_log); @@ -171,8 +171,10 @@ void mce_unregister_decode_chain(struct notifier_block *nb) } EXPORT_SYMBOL_GPL(mce_unregister_decode_chain); -static void 
__print_mce(struct mce *m) +static void __print_mce(struct mce_hw_err *err) { + struct mce *m = &err->m; + pr_emerg(HW_ERR "CPU %d: Machine Check%s: %Lx Bank %d: %016Lx\n", m->extcpu, (m->mcgstatus & MCG_STATUS_MCIP ? " Exception" : ""), @@ -214,9 +216,11 @@ static void __print_mce(struct mce *m) m->microcode); } -static void print_mce(struct mce *m) +static void print_mce(struct mce_hw_err *err) { - __print_mce(m); + struct mce *m = &err->m; + + __print_mce(err); if (m->cpuvendor != X86_VENDOR_AMD && m->cpuvendor != X86_VENDOR_HYGON) pr_emerg_ratelimited(HW_ERR "Run the above through 'mcelog --ascii'\n"); @@ -251,7 +255,7 @@ static const char *mce_dump_aux_info(struct mce *m) return NULL; } -static noinstr void mce_panic(const char *msg, struct mce *final, char *exp) +static noinstr void mce_panic(const char *msg, struct mce_hw_err *final, char *exp) { struct llist_node *pending; struct mce_evt_llist *l; @@ -282,20 +286,22 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp) pending = mce_gen_pool_prepare_records(); /* First print corrected ones that are still unlogged */ llist_for_each_entry(l, pending, llnode) { - struct mce *m = &l->mce; + struct mce_hw_err *err = &l->err; + struct mce *m = &err->m; if (!(m->status & MCI_STATUS_UC)) { - print_mce(m); + print_mce(err); if (!apei_err) apei_err = apei_write_mce(m); } } /* Now print uncorrected but with the final one last */ llist_for_each_entry(l, pending, llnode) { - struct mce *m = &l->mce; + struct mce_hw_err *err = &l->err; + struct mce *m = &err->m; if (!(m->status & MCI_STATUS_UC)) continue; - if (!final || mce_cmp(m, final)) { - print_mce(m); + if (!final || mce_cmp(m, &final->m)) { + print_mce(err); if (!apei_err) apei_err = apei_write_mce(m); } @@ -303,12 +309,12 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp) if (final) { print_mce(final); if (!apei_err) - apei_err = apei_write_mce(final); + apei_err = apei_write_mce(&final->m); } if (exp) 
pr_emerg(HW_ERR "Machine check: %s\n", exp); - memmsg = mce_dump_aux_info(final); + memmsg = mce_dump_aux_info(&final->m); if (memmsg) pr_emerg(HW_ERR "Machine check: %s\n", memmsg); @@ -323,9 +329,9 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp) * panic. */ if (kexec_crash_loaded()) { - if (final && (final->status & MCI_STATUS_ADDRV)) { + if (final && (final->m.status & MCI_STATUS_ADDRV)) { struct page *p; - p = pfn_to_online_page(final->addr >> PAGE_SHIFT); + p = pfn_to_online_page(final->m.addr >> PAGE_SHIFT); if (p) SetPageHWPoison(p); } @@ -574,13 +580,13 @@ EXPORT_SYMBOL_GPL(mce_is_correctable); static int mce_early_notifier(struct notifier_block *nb, unsigned long val, void *data) { - struct mce *m = (struct mce *)data; + struct mce_hw_err *err = (struct mce_hw_err *)data; - if (!m) + if (!err) return NOTIFY_DONE; /* Emit the trace record: */ - trace_mce_record(m); + trace_mce_record(err); set_bit(0, &mce_need_notify); @@ -597,7 +603,8 @@ static struct notifier_block early_nb = { static int uc_decode_notifier(struct notifier_block *nb, unsigned long val, void *data) { - struct mce *mce = (struct mce *)data; + struct mce_hw_err *err = (struct mce_hw_err *)data; + struct mce *mce = &err->m; unsigned long pfn; if (!mce || !mce_usable_address(mce)) @@ -624,13 +631,13 @@ static struct notifier_block mce_uc_nb = { static int mce_default_notifier(struct notifier_block *nb, unsigned long val, void *data) { - struct mce *m = (struct mce *)data; + struct mce_hw_err *err = (struct mce_hw_err *)data; - if (!m) + if (!err) return NOTIFY_DONE; - if (mca_cfg.print_all || !m->kflags) - __print_mce(m); + if (mca_cfg.print_all || !(err->m.kflags)) + __print_mce(err); return NOTIFY_DONE; } @@ -672,10 +679,10 @@ static noinstr void mce_read_aux(struct mce *m, int i) } } -static void vendor_handle_error(struct mce *m) +static void vendor_handle_error(struct mce_hw_err *err) { if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) - return 
amd_handle_error(m); + return amd_handle_error(err); } DEFINE_PER_CPU(unsigned, mce_poll_count); @@ -707,26 +714,29 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b) { struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array); bool error_seen = false; - struct mce m; + struct mce_hw_err err; + struct mce *m = &err.m; int i; + memset(&err, 0, sizeof(struct mce_hw_err)); + this_cpu_inc(mce_poll_count); - mce_gather_info(&m, NULL); + mce_gather_info(m, NULL); if (flags & MCP_TIMESTAMP) - m.tsc = rdtsc(); + m->tsc = rdtsc(); for (i = 0; i < this_cpu_read(mce_num_banks); i++) { if (!mce_banks[i].ctl || !test_bit(i, *b)) continue; - m.misc = 0; - m.addr = 0; - m.bank = i; + m->misc = 0; + m->addr = 0; + m->bank = i; barrier(); - m.status = mce_rdmsrl(mca_msr_reg(i, MCA_STATUS)); + m->status = mce_rdmsrl(mca_msr_reg(i, MCA_STATUS)); /* * Update storm tracking here, before checking for the @@ -736,12 +746,12 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b) * storm status. */ if (!mca_cfg.cmci_disabled) - mce_track_storm(&m); + mce_track_storm(m); /* If this entry is not valid, ignore it */ - if (!(m.status & MCI_STATUS_VAL)) { + if (!(m->status & MCI_STATUS_VAL)) { if (smca_destat_is_valid(i)) { - mce_read_aux(&m, i); + mce_read_aux(m, i); goto clear_it; } @@ -752,7 +762,7 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b) * If we are logging everything (at CPU online) or this * is a corrected error, then we must log it. */ - if ((flags & MCP_UC) || !(m.status & MCI_STATUS_UC)) + if ((flags & MCP_UC) || !(m->status & MCI_STATUS_UC)) goto log_it; /* @@ -762,20 +772,20 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b) * everything else. 
*/ if (!mca_cfg.ser) { - if (m.status & MCI_STATUS_UC) + if (m->status & MCI_STATUS_UC) continue; goto log_it; } /* Log "not enabled" (speculative) errors */ - if (!(m.status & MCI_STATUS_EN)) + if (!(m->status & MCI_STATUS_EN)) goto log_it; /* * Log UCNA (SDM: 15.6.3 "UCR Error Classification") * UC == 1 && PCC == 0 && S == 0 */ - if (!(m.status & MCI_STATUS_PCC) && !(m.status & MCI_STATUS_S)) + if (!(m->status & MCI_STATUS_PCC) && !(m->status & MCI_STATUS_S)) goto log_it; /* @@ -791,23 +801,24 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b) if (flags & MCP_DONTLOG) goto clear_it; - mce_read_aux(&m, i); - m.severity = mce_severity(&m, NULL, NULL, false); + mce_read_aux(m, i); + m->severity = mce_severity(m, NULL, NULL, false); + /* * Don't get the IP here because it's unlikely to * have anything to do with the actual error location. */ - if (mca_cfg.dont_log_ce && !mce_usable_address(&m)) + if (mca_cfg.dont_log_ce && !mce_usable_address(m)) goto clear_it; if (flags & MCP_QUEUE_LOG) - mce_gen_pool_add(&m); + mce_gen_pool_add(&err); else - mce_log(&m); + mce_log(&err); clear_it: - vendor_handle_error(&m); + vendor_handle_error(&err); /* * Clear state for this bank. @@ -1044,6 +1055,7 @@ static noinstr int mce_timed_out(u64 *t, const char *msg) static void mce_reign(void) { int cpu; + struct mce_hw_err *err = NULL; struct mce *m = NULL; int global_worst = 0; char *msg = NULL; @@ -1054,11 +1066,13 @@ static void mce_reign(void) * Grade the severity of the errors of all the CPUs. 
*/ for_each_possible_cpu(cpu) { - struct mce *mtmp = &per_cpu(mces_seen, cpu); + struct mce_hw_err *etmp = &per_cpu(hw_errs_seen, cpu); + struct mce *mtmp = &etmp->m; if (mtmp->severity > global_worst) { global_worst = mtmp->severity; - m = &per_cpu(mces_seen, cpu); + err = &per_cpu(hw_errs_seen, cpu); + m = &err->m; } } @@ -1070,7 +1084,7 @@ static void mce_reign(void) if (m && global_worst >= MCE_PANIC_SEVERITY) { /* call mce_severity() to get "msg" for panic */ mce_severity(m, NULL, &msg, true); - mce_panic("Fatal machine check", m, msg); + mce_panic("Fatal machine check", err, msg); } /* @@ -1087,11 +1101,11 @@ static void mce_reign(void) mce_panic("Fatal machine check from unknown source", NULL, NULL); /* - * Now clear all the mces_seen so that they don't reappear on + * Now clear all the hw_errs_seen so that they don't reappear on * the next mce. */ for_each_possible_cpu(cpu) - memset(&per_cpu(mces_seen, cpu), 0, sizeof(struct mce)); + memset(&per_cpu(hw_errs_seen, cpu), 0, sizeof(struct mce_hw_err)); } static atomic_t global_nwo; @@ -1295,12 +1309,13 @@ static noinstr bool mce_check_crashing_cpu(void) } static __always_inline int -__mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *final, +__mc_scan_banks(struct mce_hw_err *err, struct pt_regs *regs, struct mce *final, unsigned long *toclear, unsigned long *valid_banks, int no_way_out, int *worst) { struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array); struct mca_config *cfg = &mca_cfg; + struct mce *m = &err->m; int severity, i, taint = 0; for (i = 0; i < this_cpu_read(mce_num_banks); i++) { @@ -1356,7 +1371,7 @@ __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *final, * done in #MC context, where instrumentation is disabled. 
*/ instrumentation_begin(); - mce_log(m); + mce_log(err); instrumentation_end(); if (severity > *worst) { @@ -1426,8 +1441,9 @@ static void kill_me_never(struct callback_head *cb) set_mce_nospec(pfn); } -static void queue_task_work(struct mce *m, char *msg, void (*func)(struct callback_head *)) +static void queue_task_work(struct mce_hw_err *err, char *msg, void (*func)(struct callback_head *)) { + struct mce *m = &err->m; int count = ++current->mce_count; /* First call, save all the details */ @@ -1441,11 +1457,12 @@ static void queue_task_work(struct mce *m, char *msg, void (*func)(struct callba /* Ten is likely overkill. Don't expect more than two faults before task_work() */ if (count > 10) - mce_panic("Too many consecutive machine checks while accessing user data", m, msg); + mce_panic("Too many consecutive machine checks while accessing user data", + err, msg); /* Second or later call, make sure page address matches the one from first call */ if (count > 1 && (current->mce_addr >> PAGE_SHIFT) != (m->addr >> PAGE_SHIFT)) - mce_panic("Consecutive machine checks to different user pages", m, msg); + mce_panic("Consecutive machine checks to different user pages", err, msg); /* Do not call task_work_add() more than once */ if (count > 1) @@ -1494,8 +1511,14 @@ noinstr void do_machine_check(struct pt_regs *regs) int worst = 0, order, no_way_out, kill_current_task, lmce, taint = 0; DECLARE_BITMAP(valid_banks, MAX_NR_BANKS) = { 0 }; DECLARE_BITMAP(toclear, MAX_NR_BANKS) = { 0 }; - struct mce m, *final; + struct mce_hw_err *final; + struct mce_hw_err err; char *msg = NULL; + struct mce *m; + + memset(&err, 0, sizeof(struct mce_hw_err)); + + m = &err.m; if (unlikely(mce_flags.p5)) return pentium_machine_check(regs); @@ -1533,13 +1556,13 @@ noinstr void do_machine_check(struct pt_regs *regs) this_cpu_inc(mce_exception_count); - mce_gather_info(&m, regs); - m.tsc = rdtsc(); + mce_gather_info(m, regs); + m->tsc = rdtsc(); - final = this_cpu_ptr(&mces_seen); - *final = m; + 
final = this_cpu_ptr(&hw_errs_seen); + final->m = *m; - no_way_out = mce_no_way_out(&m, &msg, valid_banks, regs); + no_way_out = mce_no_way_out(m, &msg, valid_banks, regs); barrier(); @@ -1548,15 +1571,15 @@ noinstr void do_machine_check(struct pt_regs *regs) * Assume the worst for now, but if we find the * severity is MCE_AR_SEVERITY we have other options. */ - if (!(m.mcgstatus & MCG_STATUS_RIPV)) + if (!(m->mcgstatus & MCG_STATUS_RIPV)) kill_current_task = 1; /* * Check if this MCE is signaled to only this logical processor, * on Intel, Zhaoxin only. */ - if (m.cpuvendor == X86_VENDOR_INTEL || - m.cpuvendor == X86_VENDOR_ZHAOXIN) - lmce = m.mcgstatus & MCG_STATUS_LMCES; + if (m->cpuvendor == X86_VENDOR_INTEL || + m->cpuvendor == X86_VENDOR_ZHAOXIN) + lmce = m->mcgstatus & MCG_STATUS_LMCES; /* * Local machine check may already know that we have to panic. @@ -1567,12 +1590,12 @@ noinstr void do_machine_check(struct pt_regs *regs) */ if (lmce) { if (no_way_out) - mce_panic("Fatal local machine check", &m, msg); + mce_panic("Fatal local machine check", &err, msg); } else { order = mce_start(&no_way_out); } - taint = __mc_scan_banks(&m, regs, final, toclear, valid_banks, no_way_out, &worst); + taint = __mc_scan_banks(&err, regs, &final->m, toclear, valid_banks, no_way_out, &worst); if (!no_way_out) mce_clear_state(toclear); @@ -1587,7 +1610,7 @@ noinstr void do_machine_check(struct pt_regs *regs) no_way_out = worst >= MCE_PANIC_SEVERITY; if (no_way_out) - mce_panic("Fatal machine check on current CPU", &m, msg); + mce_panic("Fatal machine check on current CPU", &err, msg); } } else { /* @@ -1599,8 +1622,8 @@ noinstr void do_machine_check(struct pt_regs *regs) * make sure we have the right "msg". 
*/ if (worst >= MCE_PANIC_SEVERITY) { - mce_severity(&m, regs, &msg, true); - mce_panic("Local fatal machine check!", &m, msg); + mce_severity(m, regs, &msg, true); + mce_panic("Local fatal machine check!", &err, msg); } } @@ -1618,14 +1641,14 @@ noinstr void do_machine_check(struct pt_regs *regs) goto out; /* Fault was in user mode and we need to take some action */ - if ((m.cs & 3) == 3) { + if ((m->cs & 3) == 3) { /* If this triggers there is no way to recover. Die hard. */ BUG_ON(!on_thread_stack() || !user_mode(regs)); - if (!mce_usable_address(&m)) - queue_task_work(&m, msg, kill_me_now); + if (!mce_usable_address(m)) + queue_task_work(&err, msg, kill_me_now); else - queue_task_work(&m, msg, kill_me_maybe); + queue_task_work(&err, msg, kill_me_maybe); } else { /* @@ -1637,13 +1660,13 @@ noinstr void do_machine_check(struct pt_regs *regs) * corresponding exception handler which would do that is the * proper one. */ - if (m.kflags & MCE_IN_KERNEL_RECOV) { + if (m->kflags & MCE_IN_KERNEL_RECOV) { if (!fixup_exception(regs, X86_TRAP_MC, 0, 0)) - mce_panic("Failed kernel mode recovery", &m, msg); + mce_panic("Failed kernel mode recovery", &err, msg); } - if (m.kflags & MCE_IN_KERNEL_COPYIN) - queue_task_work(&m, msg, kill_me_never); + if (m->kflags & MCE_IN_KERNEL_COPYIN) + queue_task_work(&err, msg, kill_me_never); } out: diff --git a/arch/x86/kernel/cpu/mce/dev-mcelog.c b/arch/x86/kernel/cpu/mce/dev-mcelog.c index a05ac0716ecf..4a0e3bb4a4fb 100644 --- a/arch/x86/kernel/cpu/mce/dev-mcelog.c +++ b/arch/x86/kernel/cpu/mce/dev-mcelog.c @@ -36,7 +36,7 @@ static DECLARE_WAIT_QUEUE_HEAD(mce_chrdev_wait); static int dev_mce_log(struct notifier_block *nb, unsigned long val, void *data) { - struct mce *mce = (struct mce *)data; + struct mce *mce = &((struct mce_hw_err *)data)->m; unsigned int entry; if (mce->kflags & MCE_HANDLED_CEC) diff --git a/arch/x86/kernel/cpu/mce/genpool.c b/arch/x86/kernel/cpu/mce/genpool.c index 4284749ec803..3337ea5c428d 100644 --- 
a/arch/x86/kernel/cpu/mce/genpool.c +++ b/arch/x86/kernel/cpu/mce/genpool.c @@ -31,15 +31,15 @@ static LLIST_HEAD(mce_event_llist); */ static bool is_duplicate_mce_record(struct mce_evt_llist *t, struct mce_evt_llist *l) { + struct mce_hw_err *err1, *err2; struct mce_evt_llist *node; - struct mce *m1, *m2; - m1 = &t->mce; + err1 = &t->err; llist_for_each_entry(node, &l->llnode, llnode) { - m2 = &node->mce; + err2 = &node->err; - if (!mce_cmp(m1, m2)) + if (!mce_cmp(&err1->m, &err2->m)) return true; } return false; @@ -73,9 +73,9 @@ struct llist_node *mce_gen_pool_prepare_records(void) void mce_gen_pool_process(struct work_struct *__unused) { + struct mce_hw_err *err; struct llist_node *head; struct mce_evt_llist *node, *tmp; - struct mce *mce; head = llist_del_all(&mce_event_llist); if (!head) @@ -83,8 +83,8 @@ void mce_gen_pool_process(struct work_struct *__unused) head = llist_reverse_order(head); llist_for_each_entry_safe(node, tmp, head, llnode) { - mce = &node->mce; - blocking_notifier_call_chain(&x86_mce_decoder_chain, 0, mce); + err = &node->err; + blocking_notifier_call_chain(&x86_mce_decoder_chain, 0, err); gen_pool_free(mce_evt_pool, (unsigned long)node, sizeof(*node)); } } @@ -94,11 +94,11 @@ bool mce_gen_pool_empty(void) return llist_empty(&mce_event_llist); } -int mce_gen_pool_add(struct mce *mce) +int mce_gen_pool_add(struct mce_hw_err *err) { struct mce_evt_llist *node; - if (filter_mce(mce)) + if (filter_mce(&err->m)) return -EINVAL; if (!mce_evt_pool) @@ -110,7 +110,7 @@ int mce_gen_pool_add(struct mce *mce) return -ENOMEM; } - memcpy(&node->mce, mce, sizeof(*mce)); + memcpy(&node->err, err, sizeof(*err)); llist_add(&node->llnode, &mce_event_llist); return 0; diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c index 94953d749475..1905938e2fd5 100644 --- a/arch/x86/kernel/cpu/mce/inject.c +++ b/arch/x86/kernel/cpu/mce/inject.c @@ -498,6 +498,7 @@ static void prepare_msrs(void *info) static void do_inject(void) { + struct 
mce_hw_err err; u64 mcg_status = 0; unsigned int cpu = i_mce.extcpu; u8 b = i_mce.bank; @@ -513,7 +514,8 @@ static void do_inject(void) i_mce.status |= MCI_STATUS_SYNDV; if (inj_type == SW_INJ) { - mce_log(&i_mce); + err.m = i_mce; + mce_log(&err); return; } diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h index 9802f7c6cc93..c9db046e7124 100644 --- a/arch/x86/kernel/cpu/mce/internal.h +++ b/arch/x86/kernel/cpu/mce/internal.h @@ -26,12 +26,12 @@ extern struct blocking_notifier_head x86_mce_decoder_chain; struct mce_evt_llist { struct llist_node llnode; - struct mce mce; + struct mce_hw_err err; }; void mce_gen_pool_process(struct work_struct *__unused); bool mce_gen_pool_empty(void); -int mce_gen_pool_add(struct mce *mce); +int mce_gen_pool_add(struct mce_hw_err *err); int mce_gen_pool_init(void); struct llist_node *mce_gen_pool_prepare_records(void); @@ -264,7 +264,7 @@ void mce_setup_for_cpu(unsigned int cpu, struct mce *m); #ifdef CONFIG_X86_MCE_AMD extern bool amd_filter_mce(struct mce *m); bool amd_mce_usable_address(struct mce *m); -void amd_handle_error(struct mce *m); +void amd_handle_error(struct mce_hw_err *err); /* * If MCA_CONFIG[McaLsbInStatusSupported] is set, extract ErrAddr in bits @@ -293,7 +293,7 @@ static __always_inline void smca_extract_err_addr(struct mce *m) #else static inline bool amd_filter_mce(struct mce *m) { return false; } static inline bool amd_mce_usable_address(struct mce *m) { return false; } -static inline void amd_handle_error(struct mce *m) { } +static inline void amd_handle_error(struct mce_hw_err *err) { } static inline void smca_extract_err_addr(struct mce *m) { } #endif diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c index ca87a0939135..4864191918db 100644 --- a/drivers/acpi/acpi_extlog.c +++ b/drivers/acpi/acpi_extlog.c @@ -134,7 +134,7 @@ static int print_extlog_rcd(const char *pfx, static int extlog_print(struct notifier_block *nb, unsigned long val, void *data) { - 
struct mce *mce = (struct mce *)data; + struct mce *mce = &((struct mce_hw_err *)data)->m; int bank = mce->bank; int cpu = mce->extcpu; struct acpi_hest_generic_status *estatus, *tmp; diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c index d48a388b796e..18916a73a363 100644 --- a/drivers/acpi/nfit/mce.c +++ b/drivers/acpi/nfit/mce.c @@ -13,8 +13,9 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val, void *data) { - struct mce *mce = (struct mce *)data; + struct mce_hw_err *err = (struct mce_hw_err *)data; struct acpi_nfit_desc *acpi_desc; + struct mce *mce = &err->m; struct nfit_spa *nfit_spa; /* We only care about uncorrectable memory errors */ diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c index 91e0a88ef904..d1e47cba0ff2 100644 --- a/drivers/edac/i7core_edac.c +++ b/drivers/edac/i7core_edac.c @@ -1810,7 +1810,7 @@ static void i7core_check_error(struct mem_ctl_info *mci, struct mce *m) static int i7core_mce_check_error(struct notifier_block *nb, unsigned long val, void *data) { - struct mce *mce = (struct mce *)data; + struct mce *mce = &((struct mce_hw_err *)data)->m; struct i7core_dev *i7_dev; struct mem_ctl_info *mci; diff --git a/drivers/edac/igen6_edac.c b/drivers/edac/igen6_edac.c index cdd8480e7368..2c112d4d842b 100644 --- a/drivers/edac/igen6_edac.c +++ b/drivers/edac/igen6_edac.c @@ -911,7 +911,7 @@ static int ecclog_nmi_handler(unsigned int cmd, struct pt_regs *regs) static int ecclog_mce_handler(struct notifier_block *nb, unsigned long val, void *data) { - struct mce *mce = (struct mce *)data; + struct mce *mce = &((struct mce_hw_err *)data)->m; char *type; if (mce->kflags & MCE_HANDLED_CEC) diff --git a/drivers/edac/pnd2_edac.c b/drivers/edac/pnd2_edac.c index 2afcd148fcf8..e2fb2d75af04 100644 --- a/drivers/edac/pnd2_edac.c +++ b/drivers/edac/pnd2_edac.c @@ -1366,7 +1366,7 @@ static void pnd2_unregister_mci(struct mem_ctl_info *mci) */ static int pnd2_mce_check_error(struct notifier_block *nb, 
unsigned long val, void *data) { - struct mce *mce = (struct mce *)data; + struct mce *mce = &((struct mce_hw_err *)data)->m; struct mem_ctl_info *mci; struct dram_addr daddr; char *type; diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c index 26cca5a9322d..0c4e45245153 100644 --- a/drivers/edac/sb_edac.c +++ b/drivers/edac/sb_edac.c @@ -3255,7 +3255,7 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci, static int sbridge_mce_check_error(struct notifier_block *nb, unsigned long val, void *data) { - struct mce *mce = (struct mce *)data; + struct mce *mce = &((struct mce_hw_err *)data)->m; struct mem_ctl_info *mci; char *type; diff --git a/drivers/edac/skx_common.c b/drivers/edac/skx_common.c index 9c5b6f8bd8bd..e0a4a1ecd25e 100644 --- a/drivers/edac/skx_common.c +++ b/drivers/edac/skx_common.c @@ -633,7 +633,7 @@ static bool skx_error_in_mem(const struct mce *m) int skx_mce_check_error(struct notifier_block *nb, unsigned long val, void *data) { - struct mce *mce = (struct mce *)data; + struct mce *mce = &((struct mce_hw_err *)data)->m; struct decoded_addr res; struct mem_ctl_info *mci; char *type; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index c543600b759b..7c3e7ce811ce 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -3536,7 +3536,7 @@ static struct amdgpu_device *find_adev(uint32_t node_id) static int amdgpu_bad_page_notifier(struct notifier_block *nb, unsigned long val, void *data) { - struct mce *m = (struct mce *)data; + struct mce *m = &((struct mce_hw_err *)data)->m; struct amdgpu_device *adev = NULL; uint32_t gpu_id = 0; uint32_t umc_inst = 0, ch_inst = 0; diff --git a/drivers/ras/amd/fmpm.c b/drivers/ras/amd/fmpm.c index 271dfad05d68..d3ce41a46ac4 100644 --- a/drivers/ras/amd/fmpm.c +++ b/drivers/ras/amd/fmpm.c @@ -400,7 +400,7 @@ static void retire_dram_row(u64 addr, u64 id, u32 cpu) static int fru_handle_mem_poison(struct 
notifier_block *nb, unsigned long val, void *data) { - struct mce *m = (struct mce *)data; + struct mce *m = &((struct mce_hw_err *)data)->m; struct fru_rec *rec; if (!mce_is_memory_error(m)) diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c index e440b15fbabc..4940e97fbcdc 100644 --- a/drivers/ras/cec.c +++ b/drivers/ras/cec.c @@ -534,7 +534,8 @@ static int __init create_debugfs_nodes(void) static int cec_notifier(struct notifier_block *nb, unsigned long val, void *data) { - struct mce *m = (struct mce *)data; + struct mce_hw_err *err = (struct mce_hw_err *)data; + struct mce *m = &err->m; if (!m) return NOTIFY_DONE; diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h index f0f7b3cb2041..65aba1afcd07 100644 --- a/include/trace/events/mce.h +++ b/include/trace/events/mce.h @@ -19,9 +19,9 @@ TRACE_EVENT(mce_record, - TP_PROTO(struct mce *m), + TP_PROTO(struct mce_hw_err *err), - TP_ARGS(m), + TP_ARGS(err), TP_STRUCT__entry( __field( u64, mcgcap ) @@ -46,25 +46,25 @@ TRACE_EVENT(mce_record, ), TP_fast_assign( - __entry->mcgcap = m->mcgcap; - __entry->mcgstatus = m->mcgstatus; - __entry->status = m->status; - __entry->addr = m->addr; - __entry->misc = m->misc; - __entry->synd = m->synd; - __entry->ipid = m->ipid; - __entry->ip = m->ip; - __entry->tsc = m->tsc; - __entry->ppin = m->ppin; - __entry->walltime = m->time; - __entry->cpu = m->extcpu; - __entry->cpuid = m->cpuid; - __entry->apicid = m->apicid; - __entry->socketid = m->socketid; - __entry->cs = m->cs; - __entry->bank = m->bank; - __entry->cpuvendor = m->cpuvendor; - __entry->microcode = m->microcode; + __entry->mcgcap = err->m.mcgcap; + __entry->mcgstatus = err->m.mcgstatus; + __entry->status = err->m.status; + __entry->addr = err->m.addr; + __entry->misc = err->m.misc; + __entry->synd = err->m.synd; + __entry->ipid = err->m.ipid; + __entry->ip = err->m.ip; + __entry->tsc = err->m.tsc; + __entry->ppin = err->m.ppin; + __entry->walltime = err->m.time; + __entry->cpu = err->m.extcpu; + 
__entry->cpuid = err->m.cpuid; + __entry->apicid = err->m.apicid; + __entry->socketid = err->m.socketid; + __entry->cs = err->m.cs; + __entry->bank = err->m.bank; + __entry->cpuvendor = err->m.cpuvendor; + __entry->microcode = err->m.microcode; ), TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR: %016Lx, MISC: %016Lx, SYND: %016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, vendor: %u, CPUID: %x, time: %llu, socket: %u, APIC: %x, microcode: %x",

From patchwork Thu Apr 4 15:13:57 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617991
From: Yazen Ghannam
Subject: [PATCH v2 14/16] x86/mce, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
Date: Thu, 4 Apr 2024 10:13:57 -0500
Message-ID: <20240404151359.47970-15-yazen.ghannam@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org

From: Avadhut Naik

AMD's Scalable MCA systems, namely Genoa, will include two new registers: MCA_SYND1 and MCA_SYND2. These registers hold supplemental error information in addition to the existing MCA_SYND register. The data in these registers is considered valid if MCA_STATUS[SyndV] is set.

Add fields for these registers as vendor-specific error information in struct mce_hw_err. Save and print these registers wherever MCA_STATUS[SyndV]/MCA_SYND is currently used.

Also, modify the mce_record tracepoint to export these new registers through a __dynamic_array. The size of this __dynamic_array is currently determined with the sizeof() operator. If needed in the future, this can be replaced by caching the size of the vendor-specific error information as part of struct mce_hw_err.

Signed-off-by: Avadhut Naik
Signed-off-by: Yazen Ghannam
---
Notes:
    Link: https://lkml.kernel.org/r/20231118193248.1296798-19-yazen.ghannam@amd.com

    v1->v2:
    * Rebase on upstream changes for MCE trace event. (Avadhut)

 arch/x86/include/asm/mce.h | 12 ++++++++++++
 arch/x86/kernel/cpu/mce/core.c | 26 ++++++++++++++++++--------
 drivers/edac/mce_amd.c | 10 +++++++---
 include/trace/events/mce.h | 9 +++++++--
 4 files changed, 44 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index e4ad9807b3e3..a701290f80a1 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -118,6 +118,9 @@ #define MSR_AMD64_SMCA_MC0_DESTAT 0xc0002008 #define MSR_AMD64_SMCA_MC0_DEADDR 0xc0002009 #define MSR_AMD64_SMCA_MC0_MISC1 0xc000200a +/* Registers MISC2 to MISC4 are at offsets B to D.
*/ +#define MSR_AMD64_SMCA_MC0_SYND1 0xc000200e +#define MSR_AMD64_SMCA_MC0_SYND2 0xc000200f #define MSR_AMD64_SMCA_MCx_CTL(x) (MSR_AMD64_SMCA_MC0_CTL + 0x10*(x)) #define MSR_AMD64_SMCA_MCx_STATUS(x) (MSR_AMD64_SMCA_MC0_STATUS + 0x10*(x)) #define MSR_AMD64_SMCA_MCx_ADDR(x) (MSR_AMD64_SMCA_MC0_ADDR + 0x10*(x)) @@ -128,6 +131,8 @@ #define MSR_AMD64_SMCA_MCx_DESTAT(x) (MSR_AMD64_SMCA_MC0_DESTAT + 0x10*(x)) #define MSR_AMD64_SMCA_MCx_DEADDR(x) (MSR_AMD64_SMCA_MC0_DEADDR + 0x10*(x)) #define MSR_AMD64_SMCA_MCx_MISCy(x, y) ((MSR_AMD64_SMCA_MC0_MISC1 + y) + (0x10*(x))) +#define MSR_AMD64_SMCA_MCx_SYND1(x) (MSR_AMD64_SMCA_MC0_SYND1 + 0x10*(x)) +#define MSR_AMD64_SMCA_MCx_SYND2(x) (MSR_AMD64_SMCA_MC0_SYND2 + 0x10*(x)) #define XEC(x, mask) (((x) >> 16) & mask) @@ -185,6 +190,13 @@ enum mce_notifier_prios { struct mce_hw_err { struct mce m; + + union vendor_info { + struct { + u64 synd1; + u64 synd2; + } amd; + } vi; }; struct notifier_block; diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index fef025bda2af..aa27729f7899 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -201,6 +201,10 @@ static void __print_mce(struct mce_hw_err *err) if (mce_flags.smca) { if (m->synd) pr_cont("SYND %llx ", m->synd); + if (err->vi.amd.synd1) + pr_cont("SYND1 %llx ", err->vi.amd.synd1); + if (err->vi.amd.synd2) + pr_cont("SYND2 %llx ", err->vi.amd.synd2); if (m->ipid) pr_cont("IPID %llx ", m->ipid); } @@ -651,8 +655,10 @@ static struct notifier_block mce_default_nb = { /* * Read ADDR and MISC registers. 
  */
-static noinstr void mce_read_aux(struct mce *m, int i)
+static noinstr void mce_read_aux(struct mce_hw_err *err, int i)
 {
+	struct mce *m = &err->m;
+
 	if (m->status & MCI_STATUS_MISCV)
 		m->misc = mce_rdmsrl(mca_msr_reg(i, MCA_MISC));

@@ -674,8 +680,11 @@ static noinstr void mce_read_aux(struct mce *m, int i)
 	if (mce_flags.smca) {
 		m->ipid = mce_rdmsrl(MSR_AMD64_SMCA_MCx_IPID(i));

-		if (m->status & MCI_STATUS_SYNDV)
+		if (m->status & MCI_STATUS_SYNDV) {
 			m->synd = mce_rdmsrl(MSR_AMD64_SMCA_MCx_SYND(i));
+			err->vi.amd.synd1 = mce_rdmsrl(MSR_AMD64_SMCA_MCx_SYND1(i));
+			err->vi.amd.synd2 = mce_rdmsrl(MSR_AMD64_SMCA_MCx_SYND2(i));
+		}
 	}
 }

@@ -751,7 +760,7 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 		/* If this entry is not valid, ignore it */
 		if (!(m->status & MCI_STATUS_VAL)) {
 			if (smca_destat_is_valid(i)) {
-				mce_read_aux(m, i);
+				mce_read_aux(&err, i);
 				goto clear_it;
 			}

@@ -801,7 +810,7 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 		if (flags & MCP_DONTLOG)
 			goto clear_it;

-		mce_read_aux(m, i);
+		mce_read_aux(&err, i);
 		m->severity = mce_severity(m, NULL, NULL, false);
 		/*
@@ -943,9 +952,10 @@ static __always_inline void quirk_zen_ifu(int bank, struct mce *m, struct pt_reg
  * Do a quick check if any of the events requires a panic.
  * This decides if we keep the events around or clear them.
  */
-static __always_inline int mce_no_way_out(struct mce *m, char **msg, unsigned long *validp,
+static __always_inline int mce_no_way_out(struct mce_hw_err *err, char **msg, unsigned long *validp,
 					  struct pt_regs *regs)
 {
+	struct mce *m = &err->m;
 	char *tmp = *msg;
 	int i;

@@ -963,7 +973,7 @@ static __always_inline int mce_no_way_out(struct mce *m, char **msg, unsigned lo

 		m->bank = i;
 		if (mce_severity(m, regs, &tmp, true) >= MCE_PANIC_SEVERITY) {
-			mce_read_aux(m, i);
+			mce_read_aux(err, i);
 			*msg = tmp;
 			return 1;
 		}
@@ -1361,7 +1371,7 @@ __mc_scan_banks(struct mce_hw_err *err, struct pt_regs *regs, struct mce *final,
 		if (severity == MCE_NO_SEVERITY)
 			continue;

-		mce_read_aux(m, i);
+		mce_read_aux(err, i);

 		/* assuming valid severity level != 0 */
 		m->severity = severity;
@@ -1562,7 +1572,7 @@ noinstr void do_machine_check(struct pt_regs *regs)
 	final = this_cpu_ptr(&hw_errs_seen);
 	final->m = *m;

-	no_way_out = mce_no_way_out(m, &msg, valid_banks, regs);
+	no_way_out = mce_no_way_out(&err, &msg, valid_banks, regs);

 	barrier();

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index e02af5da1ec2..32bf4cc564a3 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -792,7 +792,8 @@ static const char *decode_error_status(struct mce *m)
 static int
 amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
 {
-	struct mce *m = (struct mce *)data;
+	struct mce_hw_err *err = (struct mce_hw_err *)data;
+	struct mce *m = &err->m;
 	unsigned int fam = x86_family(m->cpuid);
 	int ecc;

@@ -850,8 +851,11 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
 	if (boot_cpu_has(X86_FEATURE_SMCA)) {
 		pr_emerg(HW_ERR "IPID: 0x%016llx", m->ipid);

-		if (m->status & MCI_STATUS_SYNDV)
-			pr_cont(", Syndrome: 0x%016llx", m->synd);
+		if (m->status & MCI_STATUS_SYNDV) {
+			pr_cont(", Syndrome: 0x%016llx\n", m->synd);
+			pr_emerg(HW_ERR "Syndrome1: 0x%016llx, Syndrome2: 0x%016llx",
+				 err->vi.amd.synd1, err->vi.amd.synd2);
+		}

 		pr_cont("\n");

diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
index 65aba1afcd07..43e8ecc11881 100644
--- a/include/trace/events/mce.h
+++ b/include/trace/events/mce.h
@@ -43,6 +43,8 @@ TRACE_EVENT(mce_record,
 		__field(	u8,		bank		)
 		__field(	u8,		cpuvendor	)
 		__field(	u32,		microcode	)
+		__field(	u8,		len		)
+		__dynamic_array(u8, v_data, sizeof(err->vi))
 	),

 	TP_fast_assign(
@@ -65,9 +67,11 @@ TRACE_EVENT(mce_record,
 		__entry->bank		= err->m.bank;
 		__entry->cpuvendor	= err->m.cpuvendor;
 		__entry->microcode	= err->m.microcode;
+		__entry->len		= sizeof(err->vi);
+		memcpy(__get_dynamic_array(v_data), &err->vi, sizeof(err->vi));
 	),

-	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR: %016Lx, MISC: %016Lx, SYND: %016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, vendor: %u, CPUID: %x, time: %llu, socket: %u, APIC: %x, microcode: %x",
+	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016llx, IPID: %016llx, ADDR: %016llx, MISC: %016llx, SYND: %016llx, RIP: %02x:<%016llx>, TSC: %llx, PPIN: %llx, vendor: %u, CPUID: %x, time: %llu, socket: %u, APIC: %x, microcode: %x, vendor data: %s",
 		__entry->cpu,
 		__entry->mcgcap, __entry->mcgstatus,
 		__entry->bank, __entry->status,
@@ -83,7 +87,8 @@ TRACE_EVENT(mce_record,
 		__entry->walltime,
 		__entry->socketid,
 		__entry->apicid,
-		__entry->microcode)
+		__entry->microcode,
+		__print_array(__get_dynamic_array(v_data), __entry->len / 8, 8))
 );

 #endif /* _TRACE_MCE_H */

From patchwork Thu Apr 4 15:13:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617993
From: Yazen Ghannam
CC: Yazen Ghannam
Subject: [PATCH v2 15/16] x86/mce/apei: Handle variable register array size
Date: Thu, 4 Apr 2024 10:13:58 -0500
Message-ID: <20240404151359.47970-16-yazen.ghannam@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org
MIME-Version: 1.0

The ACPI Boot Error Record Table (BERT) is used by the kernel to report
errors that occurred in a previous boot. On some modern AMD systems,
these errors are reported within the BERT through the x86 Common
Platform Error Record (CPER) format, which consists of one or more
Processor Context Information Structures. These context structures
provide a starting address and represent an x86 MSR range in which the
data constitutes a contiguous set of MSRs starting from, and including,
the starting address.

On AMD systems that implement this behavior, the MSR range commonly
represents the MCAX register space used for the Scalable MCA feature.
The apei_smca_report_x86_error() function decodes and passes this
information through the MCE notifier chain. However, this function
assumes a fixed register array size based on the original HW/FW
implementation. This assumption breaks with the addition of two new
MCAX registers: MCA_SYND1 and MCA_SYND2. These registers are added at
the end of the MCAX register space, so they won't be included when
decoding the CPER data.

Rework apei_smca_report_x86_error() to support a variable register
array size. This covers any case where the MSR context information
starts at the MCAX address for MCA_STATUS and ends at any other
register within the MCAX register space.

Add code comments indicating the MCAX register at each offset.

Co-developed-by: Avadhut Naik
Signed-off-by: Avadhut Naik
Signed-off-by: Yazen Ghannam
---

Notes:
    Link: https://lkml.kernel.org/r/20231118193248.1296798-20-yazen.ghannam@amd.com

    v1->v2:
    * No change.
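
For illustration only, the variable-size decode described above can be sketched in userspace C roughly as follows. This is a minimal sketch, not the kernel code: `struct smca_regs`, `smca_decode()` and `SMCA_MAX_REGS` are hypothetical names invented for this example, and an if-chain stands in for the kernel's `switch` with `fallthrough`. The register offsets follow the code comments in the patch (MCA_STATUS at index 0, MCA_SYND2 at index 14).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the fields the patch fills in. */
struct smca_regs {
	uint64_t status, addr, misc, ipid, synd, synd1, synd2;
};

/* MCA_STATUS through MCA_SYND2, as capped in the patch. */
#define SMCA_MAX_REGS 15

/*
 * Copy as many registers as reg_arr_size (in bytes) covers.
 * Returns the number of registers considered, or -1 if the array
 * does not hold even one 8-byte register.
 */
int smca_decode(const uint64_t *regs, size_t reg_arr_size,
		struct smca_regs *out)
{
	size_t n = reg_arr_size / sizeof(uint64_t);

	memset(out, 0, sizeof(*out));

	if (!n)
		return -1;
	if (n > SMCA_MAX_REGS)
		n = SMCA_MAX_REGS;

	/* Highest offsets first, mirroring the order of the case labels. */
	if (n >= 15)
		out->synd2 = regs[14];	/* MCA_SYND2 */
	if (n >= 14)
		out->synd1 = regs[13];	/* MCA_SYND1 */
	if (n >= 6)
		out->synd = regs[5];	/* MCA_SYND */
	if (n >= 5)
		out->ipid = regs[4];	/* MCA_IPID */
	if (n >= 3)
		out->misc = regs[2];	/* MCA_MISC0 */
	if (n >= 2)
		out->addr = regs[1];	/* MCA_ADDR */
	out->status = regs[0];		/* MCA_STATUS */

	return (int)n;
}
```

With a 48-byte array (the size the old code required) only the first six registers are consumed and the two new syndrome fields stay zero; with a 120-byte array all fifteen are available.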
 arch/x86/kernel/cpu/mce/apei.c | 73 +++++++++++++++++++++++++++-------
 1 file changed, 59 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
index 89a8ebac53ea..43622241c379 100644
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -69,9 +69,9 @@ EXPORT_SYMBOL_GPL(apei_mce_report_mem_error);
 int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 {
 	const u64 *i_mce = ((const u64 *) (ctx_info + 1));
+	unsigned int cpu, num_registers;
 	struct mce_hw_err err;
 	struct mce *m = &err.m;
-	unsigned int cpu;

 	memset(&err, 0, sizeof(struct mce_hw_err));

@@ -91,16 +91,12 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 		return -EINVAL;

 	/*
-	 * The register array size must be large enough to include all the
-	 * SMCA registers which need to be extracted.
-	 *
 	 * The number of registers in the register array is determined by
 	 * Register Array Size/8 as defined in UEFI spec v2.8, sec N.2.4.2.2.
-	 * The register layout is fixed and currently the raw data in the
-	 * register array includes 6 SMCA registers which the kernel can
-	 * extract.
+	 * Ensure that the array size includes at least 1 register.
 	 */
-	if (ctx_info->reg_arr_size < 48)
+	num_registers = ctx_info->reg_arr_size >> 3;
+	if (!num_registers)
 		return -EINVAL;

 	for_each_possible_cpu(cpu) {
@@ -115,12 +111,61 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 	mce_setup_for_cpu(cpu, m);

 	m->bank = (ctx_info->msr_addr >> 4) & 0xFF;
-	m->status = *i_mce;
-	m->addr = *(i_mce + 1);
-	m->misc = *(i_mce + 2);
-	/* Skipping MCA_CONFIG */
-	m->ipid = *(i_mce + 4);
-	m->synd = *(i_mce + 5);
+
+	/*
+	 * The SMCA register layout is fixed and includes 16 registers.
+	 * The end of the array may be variable, but the beginning is known.
+	 * Switch on the number of registers. Cap the number of registers to
+	 * expected max (15).
+	 */
+	if (num_registers > 15)
+		num_registers = 15;
+
+	switch (num_registers) {
+	/* MCA_SYND2 */
+	case 15:
+		err.vi.amd.synd2 = *(i_mce + 14);
+		fallthrough;
+	/* MCA_SYND1 */
+	case 14:
+		err.vi.amd.synd1 = *(i_mce + 13);
+		fallthrough;
+	/* MCA_MISC4 */
+	case 13:
+	/* MCA_MISC3 */
+	case 12:
+	/* MCA_MISC2 */
+	case 11:
+	/* MCA_MISC1 */
+	case 10:
+	/* MCA_DEADDR */
+	case 9:
+	/* MCA_DESTAT */
+	case 8:
+	/* reserved */
+	case 7:
+	/* MCA_SYND */
+	case 6:
+		m->synd = *(i_mce + 5);
+		fallthrough;
+	/* MCA_IPID */
+	case 5:
+		m->ipid = *(i_mce + 4);
+		fallthrough;
+	/* MCA_CONFIG */
+	case 4:
+	/* MCA_MISC0 */
+	case 3:
+		m->misc = *(i_mce + 2);
+		fallthrough;
+	/* MCA_ADDR */
+	case 2:
+		m->addr = *(i_mce + 1);
+		fallthrough;
+	/* MCA_STATUS */
+	case 1:
+		m->status = *i_mce;
+	}

 	mce_log(&err);

From patchwork Thu Apr 4 15:13:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yazen Ghannam
X-Patchwork-Id: 13617992
From: Yazen Ghannam
CC: Yazen Ghannam
Subject: [PATCH v2 16/16] EDAC/mce_amd: Add support for FRU Text in MCA
Date: Thu, 4 Apr 2024 10:13:59 -0500
Message-ID: <20240404151359.47970-17-yazen.ghannam@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240404151359.47970-1-yazen.ghannam@amd.com>
References: <20240404151359.47970-1-yazen.ghannam@amd.com>
Precedence: bulk
X-Mailing-List: linux-edac@vger.kernel.org
MIME-Version: 1.0

A new "FRU Text in MCA" feature is defined where the Field Replaceable
Unit (FRU) Text for a device is represented by a string in the new
MCA_SYND1 and MCA_SYND2 registers. This feature is supported per MCA
bank, and it is advertised by the McaFruTextInMca bit (MCA_CONFIG[9]).

The FRU Text is populated dynamically for each individual error state
(MCA_STATUS, MCA_ADDR, et al.). This handles the case where an MCA bank
covers multiple devices, for example, a Unified Memory Controller (UMC)
bank that manages two DIMMs.

Print the FRU Text string, if available, when decoding an MCA error.

Also, add a field for the MCA_CONFIG MSR in struct mce_hw_err as
vendor-specific error information and save the value of the MSR. The
value can then be exported through the tracepoint for userspace tools
like rasdaemon to print the FRU Text, if available.

Co-developed-by: Avadhut Naik
Signed-off-by: Avadhut Naik
Signed-off-by: Yazen Ghannam
---

Notes:
    Link: https://lkml.kernel.org/r/20231118193248.1296798-21-yazen.ghannam@amd.com

    v1->v2:
    * No change.
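
For illustration only, a userspace tool that receives the MCA_CONFIG, MCA_SYND1 and MCA_SYND2 values could rebuild the FRU Text along these lines. This is a minimal sketch under stated assumptions: `frutext_present()`, `frutext_decode()`, `CONFIG_FRUTEXT_BIT` and `FRUTEXT_LEN` are hypothetical names for this example, and the byte order of the string is assumed to be the in-memory order of the two 64-bit values, which is what the memcpy() calls in the patch rely on.

```c
#include <stdint.h>
#include <string.h>

/* McaFruTextInMca, MCA_CONFIG[9], as described in the commit message. */
#define CONFIG_FRUTEXT_BIT	(1ULL << 9)

/* Two 8-byte registers hold up to 16 characters. */
#define FRUTEXT_LEN		16

/* Nonzero if the bank advertises FRU Text in MCA_SYND1/MCA_SYND2. */
int frutext_present(uint64_t mca_config)
{
	return (mca_config & CONFIG_FRUTEXT_BIT) != 0;
}

/*
 * Build a NUL-terminated string from the two syndrome registers.
 * buf must hold at least FRUTEXT_LEN + 1 bytes.
 */
void frutext_decode(uint64_t synd1, uint64_t synd2, char *buf)
{
	memset(buf, 0, FRUTEXT_LEN + 1);
	memcpy(buf, &synd1, 8);
	memcpy(buf + 8, &synd2, 8);
}
```

The extra byte in the buffer guarantees termination when all 16 characters are used, which is the reason the patch declares `char frutext[17]`.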
 arch/x86/include/asm/mce.h     |  2 ++
 arch/x86/kernel/cpu/mce/apei.c |  2 ++
 arch/x86/kernel/cpu/mce/core.c |  3 +++
 drivers/edac/mce_amd.c         | 21 ++++++++++++++-------
 4 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index a701290f80a1..2a8997d7ba4d 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -59,6 +59,7 @@
  *  - TCC bit is present in MCx_STATUS.
  */
 #define MCI_CONFIG_MCAX		0x1
+#define MCI_CONFIG_FRUTEXT	BIT_ULL(9)

 /*
  * Note that the full MCACOD field of IA32_MCi_STATUS MSR is
@@ -195,6 +196,7 @@ struct mce_hw_err {
 		struct {
 			u64 synd1;
 			u64 synd2;
+			u64 config;
 		} amd;
 	} vi;
 };
diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
index 43622241c379..a9c28614530b 100644
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -154,6 +154,8 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 		fallthrough;
 	/* MCA_CONFIG */
 	case 4:
+		err.vi.amd.config = *(i_mce + 3);
+		fallthrough;
 	/* MCA_MISC0 */
 	case 3:
 		m->misc = *(i_mce + 2);
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index aa27729f7899..a4d09dda5fae 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -207,6 +207,8 @@ static void __print_mce(struct mce_hw_err *err)
 			pr_cont("SYND2 %llx ", err->vi.amd.synd2);
 		if (m->ipid)
 			pr_cont("IPID %llx ", m->ipid);
+		if (err->vi.amd.config)
+			pr_cont("CONFIG %llx ", err->vi.amd.config);
 	}

 	pr_cont("\n");
@@ -679,6 +681,7 @@ static noinstr void mce_read_aux(struct mce_hw_err *err, int i)

 	if (mce_flags.smca) {
 		m->ipid = mce_rdmsrl(MSR_AMD64_SMCA_MCx_IPID(i));
+		err->vi.amd.config = mce_rdmsrl(MSR_AMD64_SMCA_MCx_CONFIG(i));

 		if (m->status & MCI_STATUS_SYNDV) {
 			m->synd = mce_rdmsrl(MSR_AMD64_SMCA_MCx_SYND(i));
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 32bf4cc564a3..f68b3d1b558e 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -795,6 +795,7 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
 	struct mce_hw_err *err = (struct mce_hw_err *)data;
 	struct mce *m = &err->m;
 	unsigned int fam = x86_family(m->cpuid);
+	u64 mca_config = err->vi.amd.config;
 	int ecc;

 	if (m->kflags & MCE_HANDLED_CEC)
@@ -814,11 +815,7 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
 		((m->status & MCI_STATUS_PCC)	? "PCC"	  : "-"));

 	if (boot_cpu_has(X86_FEATURE_SMCA)) {
-		u32 low, high;
-		u32 addr = MSR_AMD64_SMCA_MCx_CONFIG(m->bank);
-
-		if (!rdmsr_safe(addr, &low, &high) &&
-		    (low & MCI_CONFIG_MCAX))
+		if (mca_config & MCI_CONFIG_MCAX)
 			pr_cont("|%s", ((m->status & MCI_STATUS_TCC) ? "TCC" : "-"));

 		pr_cont("|%s", ((m->status & MCI_STATUS_SYNDV) ? "SyndV" : "-"));
@@ -853,8 +850,18 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)

 		if (m->status & MCI_STATUS_SYNDV) {
 			pr_cont(", Syndrome: 0x%016llx\n", m->synd);
-			pr_emerg(HW_ERR "Syndrome1: 0x%016llx, Syndrome2: 0x%016llx",
-				 err->vi.amd.synd1, err->vi.amd.synd2);
+			if (mca_config & MCI_CONFIG_FRUTEXT) {
+				char frutext[17];
+
+				memset(frutext, 0, sizeof(frutext));
+				memcpy(&frutext[0], &err->vi.amd.synd1, 8);
+				memcpy(&frutext[8], &err->vi.amd.synd2, 8);
+
+				pr_emerg(HW_ERR "FRU Text: %s", frutext);
+			} else {
+				pr_emerg(HW_ERR "Syndrome1: 0x%016llx, Syndrome2: 0x%016llx",
+					 err->vi.amd.synd1, err->vi.amd.synd2);
+			}
 		}

 		pr_cont("\n");