From patchwork Thu Nov 9 15:14:35 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tyler Baicar
X-Patchwork-Id: 10051127
Subject: Re: [PATCH V3 2/2] acpi: apei: call into AER handling regardless of severity
To: Borislav Petkov
Cc: rjw@rjwysocki.net, tony.luck@intel.com, will.deacon@arm.com,
 james.morse@arm.com, linux-acpi@vger.kernel.org,
 linux-kernel@vger.kernel.org
References: <1510168392-30114-1-git-send-email-tbaicar@codeaurora.org>
 <1510168392-30114-3-git-send-email-tbaicar@codeaurora.org>
 <20171109094654.daymsvizctfrypbo@pd.tnic>
From: Tyler Baicar <tbaicar@codeaurora.org>
Message-ID: <74cee7d2-dffa-0559-d529-5c86023161e3@codeaurora.org>
Date: Thu, 9 Nov 2017 10:14:35 -0500
In-Reply-To: <20171109094654.daymsvizctfrypbo@pd.tnic>
X-Mailing-List: linux-acpi@vger.kernel.org

On 11/9/2017 4:46 AM, Borislav Petkov wrote:
> On Wed, Nov 08, 2017 at 12:13:12PM -0700, Tyler Baicar wrote:
>> Currently the GHES code only calls into the AER driver for
>> recoverable type errors. This is incorrect because errors of
>> other severities do not get logged by the AER driver and do not
>> get exposed to user space via the AER trace event. So, call
>> into the AER driver for PCIe errors regardless of the severity
>>
>> Signed-off-by: Tyler Baicar
>> ---
>>  drivers/acpi/apei/ghes.c | 8 +++-----
>>  1 file changed, 3 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
>> index 839c3d5..bb65fa6 100644
>> --- a/drivers/acpi/apei/ghes.c
>> +++ b/drivers/acpi/apei/ghes.c
>> @@ -458,14 +458,12 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int
>>  #endif
>>  }
> Where did the explanatory comment go?
>
> +/*
> + * PCIe AER errors need to be sent to the AER driver for reporting and
> + * recovery. The GHES severities map to the following AER severities and
> + * require the following handling:
> + *
> + * GHES_SEV_CORRECTABLE -> AER_CORRECTABLE
> + *     These need to be reported by the AER driver but no recovery is
> + *     necessary.
> + * GHES_SEV_RECOVERABLE -> AER_NONFATAL
> + * GHES_SEV_RECOVERABLE && CPER_SEC_RESET -> AER_FATAL
> + *     These both need to be reported and recovered from by the AER driver.
> + * GHES_SEV_PANIC does not make it to this handling since the kernel must
> + *     panic.
> + */
>
> <--- ???

Updated patch including the comment:

Currently the GHES code only calls into the AER driver for
recoverable type errors. This is incorrect because errors of
other severities do not get logged by the AER driver and do not
get exposed to user space via the AER trace event. So, call
into the AER driver for PCIe errors regardless of the severity

Signed-off-by: Tyler Baicar
---
 drivers/acpi/apei/ghes.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

-- 
Thanks,
Tyler

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 839c3d5..15dbf65 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -458,14 +458,26 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int
 #endif
 }
 
-static void ghes_handle_aer(struct acpi_hest_generic_data *gdata, int sev, int sec_sev)
+/*
+ * PCIe AER errors need to be sent to the AER driver for reporting and
+ * recovery. The GHES severities map to the following AER severities and
+ * require the following handling:
+ *
+ * GHES_SEV_CORRECTABLE -> AER_CORRECTABLE
+ *     These need to be reported by the AER driver but no recovery is
+ *     necessary.
+ * GHES_SEV_RECOVERABLE -> AER_NONFATAL
+ * GHES_SEV_RECOVERABLE && CPER_SEC_RESET -> AER_FATAL
+ *     These both need to be reported and recovered from by the AER driver.
+ * GHES_SEV_PANIC does not make it to this handling since the kernel must
+ *     panic.
+ */
+static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
 {
 #ifdef CONFIG_ACPI_APEI_PCIEAER
 	struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
 
-	if (sev == GHES_SEV_RECOVERABLE &&
-	    sec_sev == GHES_SEV_RECOVERABLE &&
-	    pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
+	if (pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
 	    pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO) {
 		unsigned int devfn;
 		int aer_severity;
@@ -519,7 +531,7 @@ static void ghes_do_proc(struct ghes *ghes,
 			ghes_handle_memory_failure(gdata, sev);
 		}
 		else if (guid_equal(sec_type, &CPER_SEC_PCIE)) {
-			ghes_handle_aer(gdata, sev, sec_sev);
+			ghes_handle_aer(gdata);
 		}
 		else if (guid_equal(sec_type, &CPER_SEC_PROC_ARM)) {
 			struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
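
For anyone reading along, the severity table in the new comment can be sketched as standalone userspace C. This is only an illustration of the mapping as the comment words it, not the kernel code: every sketch_* name is hypothetical, the enum values are local stand-ins rather than the kernel's definitions, CPER_SEC_RESET is modeled as a plain bool, and the validation_bits checks from ghes_handle_aer() are left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for illustration only. They borrow the names used
 * in the patch comment but are NOT the kernel's definitions or values.
 */
enum sketch_ghes_sev {
	SKETCH_GHES_SEV_CORRECTABLE,
	SKETCH_GHES_SEV_RECOVERABLE,
	SKETCH_GHES_SEV_PANIC,
};

enum sketch_aer_sev {
	SKETCH_AER_NONE,	/* never handed to the AER driver */
	SKETCH_AER_CORRECTABLE,
	SKETCH_AER_NONFATAL,
	SKETCH_AER_FATAL,
};

/* What the comment says the AER driver has to do for a given error. */
struct sketch_aer_action {
	enum sketch_aer_sev sev;
	bool report;	/* log the error / emit the trace event */
	bool recover;	/* run AER recovery */
};

/* sec_reset models "CPER_SEC_RESET is set for this section" (assumption). */
static struct sketch_aer_action
sketch_map_severity(enum sketch_ghes_sev sev, bool sec_reset)
{
	struct sketch_aer_action act = { SKETCH_AER_NONE, false, false };

	switch (sev) {
	case SKETCH_GHES_SEV_CORRECTABLE:
		/* Reported, but no recovery is necessary. */
		act.sev = SKETCH_AER_CORRECTABLE;
		act.report = true;
		break;
	case SKETCH_GHES_SEV_RECOVERABLE:
		/* Reported and recovered; a reset request makes it fatal. */
		act.sev = sec_reset ? SKETCH_AER_FATAL : SKETCH_AER_NONFATAL;
		act.report = true;
		act.recover = true;
		break;
	case SKETCH_GHES_SEV_PANIC:
		/* The kernel panics before this handling is reached. */
		break;
	}
	return act;
}

/* Small usage example that walks the rows of the table. */
static void sketch_self_check(void)
{
	struct sketch_aer_action a;

	a = sketch_map_severity(SKETCH_GHES_SEV_CORRECTABLE, false);
	assert(a.sev == SKETCH_AER_CORRECTABLE && a.report && !a.recover);

	a = sketch_map_severity(SKETCH_GHES_SEV_RECOVERABLE, false);
	assert(a.sev == SKETCH_AER_NONFATAL && a.report && a.recover);

	a = sketch_map_severity(SKETCH_GHES_SEV_RECOVERABLE, true);
	assert(a.sev == SKETCH_AER_FATAL && a.report && a.recover);
}
```

The sketch shows why the sev/sec_sev test is dropped from the if condition: correctable errors still need to reach the AER driver for reporting even though no recovery follows.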