Message ID | 4E571D66.1070302@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Aug 25, 2011 at 10:13 PM, Huang Ying <ying.huang@intel.com> wrote: > Hi, Pavel, > > Sorry, there is a minor issue in the patch. Do you have time to try > the updated version attached. I think Yinghai's patch is a better approach (though it needs a changelog). The ACPI NVS space should not be marked as "busy" by the e820 code in the iomem_resource tree. It's way too complicated to mess around with registering NVS space and trying to deal with it specially in APEI. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/26/2011 09:43 PM, Bjorn Helgaas wrote: > On Thu, Aug 25, 2011 at 10:13 PM, Huang Ying <ying.huang@intel.com> wrote: >> Hi, Pavel, >> >> Sorry, there is a minor issue in the patch. Do you have time to try >> the updated version attached. > > I think Yinghai's patch is a better approach (though it needs a changelog). > > The ACPI NVS space should not be marked as "busy" by the e820 code in > the iomem_resource tree. > > It's way too complicated to mess around with registering NVS space and > trying to deal with it specially in APEI. ACPI NVS is different, it can be used only by firmware, and its interpreter, such as ACPI AML interpreter and APEI interpreter. It can not be used by ordinary driver. If my understanding is correct, this is why ACPI NVS is marked as busy. Best Regards, Huang Ying -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/27/2011 11:28 AM, Pavel Ivanov wrote: > Huang, > > I can confirm that new patch got rid of the old error message but > introduced a new one: > > ERST: Failed to get Error Log Address Range > > Also I noticed there's another suspicious message in dmesg and it was > there earlier too (although it's not shown on console): > > GHES: Failed to enable APEI firmware first mode > > Is it bad? What does it mean? Maybe this has been fixed, please this patch, https://lkml.org/lkml/2011/8/11/297 Best Regards, Huang Ying -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Aug 28, 2011 at 7:27 PM, Huang Ying <ying.huang@intel.com> wrote: > On 08/26/2011 09:43 PM, Bjorn Helgaas wrote: >> On Thu, Aug 25, 2011 at 10:13 PM, Huang Ying <ying.huang@intel.com> wrote: >>> Hi, Pavel, >>> >>> Sorry, there is a minor issue in the patch. Do you have time to try >>> the updated version attached. >> >> I think Yinghai's patch is a better approach (though it needs a changelog). >> >> The ACPI NVS space should not be marked as "busy" by the e820 code in >> the iomem_resource tree. >> >> It's way too complicated to mess around with registering NVS space and >> trying to deal with it specially in APEI. > > ACPI NVS is different, it can be used only by firmware, and its > interpreter, such as ACPI AML interpreter and APEI interpreter. It can > not be used by ordinary driver. If my understanding is correct, this is > why ACPI NVS is marked as busy. I don't understand why ACPI NVS is different. If we reserve it in iomem_resource (without marking it busy), it's already unavailable to anything else unless you know something connected to the NVS region. The only way to allocate space inside NVS is to first discover the ACPI NVS resource, so you could pass it to request_resource() or allocate_resource() as the "root" to allocate from. There's no mechanism for locating that resource, so it's already protected in that sense. To use request_mem_region() (which marks the region busy), you all you need is an address. The only way for a driver to learn an address inside NVS is from the ACPI tables. So I think it's effectively limited to APEI there, too. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/29/2011 10:48 PM, Bjorn Helgaas wrote: > On Sun, Aug 28, 2011 at 7:27 PM, Huang Ying <ying.huang@intel.com> wrote: >> On 08/26/2011 09:43 PM, Bjorn Helgaas wrote: >>> On Thu, Aug 25, 2011 at 10:13 PM, Huang Ying <ying.huang@intel.com> wrote: >>>> Hi, Pavel, >>>> >>>> Sorry, there is a minor issue in the patch. Do you have time to try >>>> the updated version attached. >>> >>> I think Yinghai's patch is a better approach (though it needs a changelog). >>> >>> The ACPI NVS space should not be marked as "busy" by the e820 code in >>> the iomem_resource tree. >>> >>> It's way too complicated to mess around with registering NVS space and >>> trying to deal with it specially in APEI. >> >> ACPI NVS is different, it can be used only by firmware, and its >> interpreter, such as ACPI AML interpreter and APEI interpreter. It can >> not be used by ordinary driver. If my understanding is correct, this is >> why ACPI NVS is marked as busy. > > I don't understand why ACPI NVS is different. If we reserve it in > iomem_resource (without marking it busy), it's already unavailable to > anything else unless you know something connected to the NVS region. I think ACPI NVS can be seen as reserved by ACPI or firmware interpreter. So that, all other drivers should not access it. Best Regards, Huang Ying -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 29, 2011 at 1:37 AM, Huang Ying <ying.huang@intel.com> wrote: > On 08/27/2011 11:28 AM, Pavel Ivanov wrote: >> Also I noticed there's another suspicious message in dmesg and it was >> there earlier too (although it's not shown on console): >> >> GHES: Failed to enable APEI firmware first mode >> >> Is it bad? What does it mean? > > Maybe this has been fixed, please this patch, > > https://lkml.org/lkml/2011/8/11/297 Yes, this patch indeed fixed this failure message. It's now replaced by GHES: APEI firmware first mode is enabled by WHEA _OSC. Can I ask you another question: what does the following message mean? [Firmware Warn]: GHES: Poll interval is 0 for generic hardware error source: 1, disabled. Pavel P.S. No word from Asus yet on the ERST issue. -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/03/2011 10:44 AM, Pavel Ivanov wrote: > On Mon, Aug 29, 2011 at 1:37 AM, Huang Ying <ying.huang@intel.com> wrote: >> On 08/27/2011 11:28 AM, Pavel Ivanov wrote: >>> Also I noticed there's another suspicious message in dmesg and it was >>> there earlier too (although it's not shown on console): >>> >>> GHES: Failed to enable APEI firmware first mode >>> >>> Is it bad? What does it mean? >> >> Maybe this has been fixed, please this patch, >> >> https://lkml.org/lkml/2011/8/11/297 > > Yes, this patch indeed fixed this failure message. It's now replaced by > > GHES: APEI firmware first mode is enabled by WHEA _OSC. > > Can I ask you another question: what does the following message mean? > > [Firmware Warn]: GHES: Poll interval is 0 for generic hardware error > source: 1, disabled. This means: The generic hardware error source with ID 1 is notified by poll, that is, Linux should poll the specified address to check whether there is hardware error report from hardware. But the poll interval specified by firmware is "0". I think this should be a firmware error, and disabled it. I think maybe you can contact firmware vendor to fix this. Best Regards, Huang Ying -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Subject: [BUGFIX] ACPI, APEI, Resolve false conflict between ACPI NVS and APEI Some firmware will access memory in ACPI NVS region via APEI. That is, instructions in APEI ERST/EINJ table will read/write ACPI NVS region. The original resource conflict checking in APEI code will check memory/ioport accessed by APEI via general resource management mech. But ACPI NVS region is marked as busy already, so that the false resource conflict will prevent APEI ERST/EINJ to work. To fix this, this patch excludes ACPI NVS regions when APEI components request resources. So that they will not conflict with ACPI NVS regions. Reported-by: Pavel Ivanov <paivanof@gmail.com> Signed-off-by: Huang Ying <ying.huang@intel.com> --- drivers/acpi/apei/apei-base.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) --- a/drivers/acpi/apei/apei-base.c +++ b/drivers/acpi/apei/apei-base.c @@ -438,8 +438,19 @@ int apei_resources_sub(struct apei_resou } EXPORT_SYMBOL_GPL(apei_resources_sub); +static int apei_get_nvs_callback(__u64 start, __u64 size, void *data) +{ + struct apei_resources *resources = data; + return apei_res_add(&resources->iomem, start, size); +} + +static int apei_get_nvs_resources(struct apei_resources *resources) +{ + return acpi_nvs_for_each_region(apei_get_nvs_callback, resources); +} + /* - * IO memory/port rersource management mechanism is used to check + * IO memory/port resource management mechanism is used to check * whether memory/port area used by GARs conflicts with normal memory * or IO memory/port of devices. */ @@ -448,12 +459,26 @@ int apei_resources_request(struct apei_r { struct apei_res *res, *res_bak = NULL; struct resource *r; + struct apei_resources nvs_resources; int rc; rc = apei_resources_sub(resources, &apei_resources_all); if (rc) return rc; + /* + * Some firmware uses ACPI NVS region, that has been marked as + * busy, so exclude it from APEI resources to avoid false + * conflict. + */ + apei_resources_init(&nvs_resources); + rc = apei_get_nvs_resources(&nvs_resources); + if (rc) + goto res_fini; + rc = apei_resources_sub(resources, &nvs_resources); + if (rc) + goto res_fini; + rc = -EINVAL; list_for_each_entry(res, &resources->iomem, list) { r = request_mem_region(res->start, res->end - res->start, @@ -500,6 +525,8 @@ err_unmap_iomem: break; release_mem_region(res->start, res->end - res->start); } +res_fini: + apei_resources_fini(&nvs_resources); return rc; } EXPORT_SYMBOL_GPL(apei_resources_request);