Message ID | 20200903123456.1823-1-shiju.jose@huawei.com (mailing list archive) |
---|---|
Series | ACPI / APEI: Add support to notify the vendor specific HW errors |
Hello,

Can you help to merge this series?

Thanks,
Shiju

>-----Original Message-----
>From: Linuxarm [mailto:linuxarm-bounces@huawei.com] On Behalf Of Shiju Jose
>Sent: 03 September 2020 13:35
>To: linux-acpi@vger.kernel.org; linux-pci@vger.kernel.org;
>linux-kernel@vger.kernel.org; rjw@rjwysocki.net; helgaas@kernel.org;
>bp@alien8.de; james.morse@arm.com; lorenzo.pieralisi@arm.com;
>robh@kernel.org; lenb@kernel.org; tony.luck@intel.com;
>dan.carpenter@oracle.com; andriy.shevchenko@linux.intel.com
>Cc: Linuxarm <linuxarm@huawei.com>
>Subject: [PATCH v15 0/2] ACPI / APEI: Add support to notify the vendor
>specific HW errors
>
>CPER records describing a firmware-first error are identified by GUID.
>The ghes driver currently logs, but ignores any unknown CPER records.
>This prevents describing errors that can't be represented by a standard entry,
>that would otherwise allow a driver to recover from an error.
>The UEFI spec calls these 'Non-standard Section Body' (N.2.3 of version 2.8).
>
>patch set
>1. add the notifier chain for these non-standard/vendor-records
>   in the ghes driver.
>
>2. add the driver to handle HiSilicon HIP PCIe controller's errors.
>
>Changes:
>
>V15:
>1. Change in the HIP PCIe error handling driver
>   for a comment by Andy Shevchenko.
>   Removed "depends on ACPI" as it already depends on
>   it through ACPI_APEI_GHES.
>
>V14:
>1. Add patch[1] posted by James to the series.
>
>2. Following changes made for Bjorn's comments,
>2.1 Deleted stub code from ghes.h
>2.2 Made CONFIG_PCIE_HISI_ERR depend on CONFIG_ACPI_APEI_GHES.
>
>V13:
>1. Following changes in the HIP PCIe error handling driver.
>1.1 Add Bjorn's acked-by.
>1.2. Address the comments and macros order Bjorn mentioned.
>     Fix the words in the commit.
>
>V12:
>1. Changed the Signed-off-by tag to Co-developed-by tag in the patch
>   "ACPI / APEI: Add a notifier chain for unknown (vendor) CPER records"
>
>V11:
>1. Following modifications made by James Morse in the APEI patch
>   for the vendor error record.
>   - Removed kfifo and ghes_gdata_pool. Expanded commit message.
>
>2. Changes in the HIP PCIe error handling driver
>   for the comments by Andy Shevchenko.
>
>V10:
>1. Changes for Bjorn's comments on HIP PCIe error handler driver
>   and APEI patch.
>
>2. Changes in the HIP PCIe error handler driver
>   for the feedbacks by Andy Shevchenko.
>
>V9:
>1. Fixed 2 improvements suggested by the kbuild test robot.
>1.1 Change ghes_gdata_pool_init() as static function.
>1.2. Removed using buffer to store the error data for
>     logging in the hisi_pcie_handle_error()
>
>V8:
>1. Removed reporting the standard errors through the interface
>   because of the conflict with the recent patches in the
>   memory error handling path.
>2. Fix comments by Dan Carpenter.
>
>V7:
>1. Add changes in the APEI driver suggested by Borislav Petkov, for
>   queuing up all the non-fatal HW errors to the work queue and
>   notify the registered kernel drivers from the bottom half using
>   blocking notifier, common interface for both standard and
>   vendor-specific errors.
>2. Fix for further feedbacks in v5 HIP PCIe error handler driver
>   by Bjorn Helgaas.
>
>V6:
>1. Fix few changes in the patch subject line suggested by Bjorn Helgaas.
>
>V5:
>1. Fix comments from James Morse.
>1.1 Changed the notification method to use the atomic_notifier_chain.
>1.2 Add the error handled status for the user space.
>
>V4:
>1. Fix for the following smatch warning in the PCIe error driver,
>   reported by kbuild test robot <lkp@intel.com>:
>   warn: should '((((1))) << (9 + i))' be a 64 bit type?
>   if (err->val_bits & BIT(HISI_PCIE_LOCAL_VALID_ERR_MISC + i))
>       ^^^ This should be BIT_ULL() because it goes up to 9 + 32.
>
>V3:
>1. Fix the comments from Bjorn Helgaas.
>
>V2:
>1. Changes in the HiSilicon PCIe controller's error handling driver
>   for the comments from Bjorn Helgaas.
>
>2. Changes in the APEI interface to support reporting the vendor error
>   for module with multiple devices, but use the same section type.
>   In the error handler will use socket id/sub module id etc to distinguish
>   the device.
>
>V1:
>1. Fix comments from James Morse.
>
>2. add driver to handle HiSilicon hip08 PCIe controller's errors,
>   which is an application of the above interface.
>
>Shiju Jose (1):
>  ACPI / APEI: Add a notifier chain for unknown (vendor) CPER records
>
>Yicong Yang (1):
>  PCI: hip: Add handling of HiSilicon HIP PCIe controller errors
>
> drivers/acpi/apei/ghes.c                 |  63 +++++
> drivers/pci/controller/Kconfig           |   7 +
> drivers/pci/controller/Makefile          |   1 +
> drivers/pci/controller/pcie-hisi-error.c | 327 +++++++++++++++++++++++
> include/acpi/ghes.h                      |  18 ++
> 5 files changed, 416 insertions(+)
> create mode 100644 drivers/pci/controller/pcie-hisi-error.c
>
>--
>2.17.1
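
For readers new to the idea in the cover letter, the notifier chain for vendor CPER records can be pictured roughly as below. This is a minimal userspace sketch, not the code from the series: every name in it (`struct guid`, `struct vendor_record`, `chain_register()`, `chain_notify()`, `demo_pcie_handler()`, the GUID value) is hypothetical, and the list has no locking, unlike the kernel's notifier chains. It only shows the shape of the mechanism: drivers register a callback, every unknown record is offered to each callback, and a callback claims the record only when the section-type GUID is its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte section-type identifier. */
struct guid {
	uint8_t b[16];
};

/* Hypothetical view of one non-standard (vendor) error record. */
struct vendor_record {
	struct guid section_type;
	const void *payload;
	size_t len;
};

/* One registered listener; "handled" counts the records it claimed. */
struct notifier {
	int (*call)(struct notifier *nb, const struct vendor_record *rec);
	struct notifier *next;
	unsigned int handled;
};

static struct notifier *chain_head;

/* Add a listener at the head of the chain. */
static void chain_register(struct notifier *nb)
{
	assert(nb && nb->call);
	nb->next = chain_head;
	chain_head = nb;
}

/* Remove a listener; a listener that is not on the chain is ignored. */
static void chain_unregister(struct notifier *nb)
{
	for (struct notifier **pp = &chain_head; *pp; pp = &(*pp)->next) {
		if (*pp == nb) {
			*pp = nb->next;
			nb->next = NULL;
			return;
		}
	}
}

/* Offer a record to every listener; return how many claimed it. */
static unsigned int chain_notify(const struct vendor_record *rec)
{
	unsigned int claimed = 0;

	for (struct notifier *nb = chain_head; nb; nb = nb->next)
		claimed += nb->call(nb, rec) ? 1 : 0;
	return claimed;
}

/* Made-up GUID standing in for one vendor's section type. */
static const struct guid demo_pcie_guid = {
	{ 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe,
	  0x01, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef }
};

/* Example listener: claims only records carrying demo_pcie_guid. */
static int demo_pcie_handler(struct notifier *nb,
			     const struct vendor_record *rec)
{
	if (memcmp(&rec->section_type, &demo_pcie_guid,
		   sizeof(demo_pcie_guid)) != 0)
		return 0;
	nb->handled++;
	return 1;
}
```

A caller would initialise a `struct notifier` with `.call = demo_pcie_handler`, pass it to `chain_register()`, and then feed records to `chain_notify()`; records with any other GUID pass through the handler untouched.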
On Mon, Sep 14, 2020 at 2:34 PM Shiju Jose <shiju.jose@huawei.com> wrote:
>
> Hello,
>
> Can you help to merge this series?

Do you want this series to go in through the ACPI tree?
On Tue, Sep 15, 2020 at 8:17 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Tue, Sep 15, 2020 at 07:31:50PM +0200, Rafael J. Wysocki wrote:
> > On Mon, Sep 14, 2020 at 2:34 PM Shiju Jose <shiju.jose@huawei.com> wrote:
> > >
> > > Hello,
> > >
> > > Can you help to merge this series?
> >
> > Do you want this series to go in through the ACPI tree?
>
> It crosses ACPI and vendor-specific PCI, but the bulk of the changes
> are in PCI. I gave some comments on the design and acked the PCI part
> from my point of view. I envisioned Lorenzo merging it since he takes
> care of drivers/pci/controller/*, but either that (given your ack for
> the ACPI part) or ACPI would be OK with me.

Well, please feel free to add an ACK from me to this then.

Cheers!

> > > > drivers/acpi/apei/ghes.c                 |  63 +++++
> > > > drivers/pci/controller/Kconfig           |   7 +
> > > > drivers/pci/controller/Makefile          |   1 +
> > > > drivers/pci/controller/pcie-hisi-error.c | 327 +++++++++++++++++++++++
> > > > include/acpi/ghes.h                      |  18 ++
> > > > 5 files changed, 416 insertions(+)
> > > > create mode 100644 drivers/pci/controller/pcie-hisi-error.c
On Thu, Sep 03, 2020 at 01:34:54PM +0100, Shiju Jose wrote:
> CPER records describing a firmware-first error are identified by GUID.
> The ghes driver currently logs, but ignores any unknown CPER records.
> This prevents describing errors that can't be represented by a standard
> entry, that would otherwise allow a driver to recover from an error.
> The UEFI spec calls these 'Non-standard Section Body' (N.2.3 of
> version 2.8).
>
> patch set
> 1. add the notifier chain for these non-standard/vendor-records
>    in the ghes driver.
>
> 2. add the driver to handle HiSilicon HIP PCIe controller's errors.

[...]

> Shiju Jose (1):
>   ACPI / APEI: Add a notifier chain for unknown (vendor) CPER records
>
> Yicong Yang (1):
>   PCI: hip: Add handling of HiSilicon HIP PCIe controller errors
>
>  drivers/acpi/apei/ghes.c                 |  63 +++++
>  drivers/pci/controller/Kconfig           |   7 +
>  drivers/pci/controller/Makefile          |   1 +
>  drivers/pci/controller/pcie-hisi-error.c | 327 +++++++++++++++++++++++
>  include/acpi/ghes.h                      |  18 ++
>  5 files changed, 416 insertions(+)
>  create mode 100644 drivers/pci/controller/pcie-hisi-error.c

Applied to pci/apei with Bjorn/Rafael ACKs, thanks.

Lorenzo
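
As a side note on the V4 changelog entry in the cover letter (the smatch warning about `BIT()` versus `BIT_ULL()`), the arithmetic can be sketched as follows. This is an illustrative userspace model, not code from the driver: `VALID_ERR_MISC_BASE`, `bit_u64()`, `misc_valid()` and `mask_fits_32()` are hypothetical names, the base value 9 is taken from the `(9 + i)` in the warning text, and a 64-bit validity field is assumed. The point is simply that once the bit index reaches 32, a mask built from a 32-bit `1` can no longer represent it, so the mask has to be built as a 64-bit value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for HISI_PCIE_LOCAL_VALID_ERR_MISC; the value 9
 * comes from the "(9 + i)" shown in the smatch warning. */
#define VALID_ERR_MISC_BASE 9u

/* 64-bit single-bit mask, similar in spirit to the kernel's BIT_ULL(). */
static inline uint64_t bit_u64(unsigned int n)
{
	assert(n < 64);
	return UINT64_C(1) << n;
}

/* True when misc entry i is flagged in an (assumed) 64-bit validity field. */
static inline bool misc_valid(uint64_t val_bits, unsigned int i)
{
	return (val_bits & bit_u64(VALID_ERR_MISC_BASE + i)) != 0;
}

/* True when the mask for misc entry i would still fit in 32 bits. */
static inline bool mask_fits_32(unsigned int i)
{
	return bit_u64(VALID_ERR_MISC_BASE + i) <= UINT32_MAX;
}
```

With a base of 9, entry 22 lands on bit 31 and still fits a 32-bit mask, while entry 23 lands on bit 32 and does not, which is why a 64-bit mask macro is needed where `unsigned long` is 32 bits wide.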