Message ID | 20240509084833.2147767-2-zhenzhong.duan@intel.com |
---|---|
State | Superseded |
Headers | show |
Series | PCI/AER: Handle Advisory Non-Fatal error | expand |
On Thu, 9 May 2024 16:48:31 +0800 Zhenzhong Duan <zhenzhong.duan@intel.com> wrote: > In some cases the detector of a Non-Fatal Error(NFE) is not the most > appropriate agent to determine the type of the error. For example, > when software performs a configuration read from a non-existent > device or Function, completer will send an ERR_NONFATAL Message. > On some platforms, ERR_NONFATAL results in a System Error, which > breaks normal software probing. > > Advisory Non-Fatal Error(ANFE) is a special case that can be used > in above scenario. It is predominantly determined by the role of the > detecting agent (Requester, Completer, or Receiver) and the specific > error. In such cases, an agent with AER signals the NFE (if enabled) > by sending an ERR_COR Message as an advisory to software, instead of > sending ERR_NONFATAL. > > When processing an ANFE, ideally both correctable error(CE) status and > uncorrectable error(UE) status should be cleared. However, there is no > way to fully identify the UE associated with ANFE. Even worse, Non-Fatal > Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as > NFE will reproduce above mentioned issue, i.e., breaking softwore probing; > treating NFE as ANFE will make us ignoring some UEs which need active > recover operation. To avoid clearing UEs that are not ANFE by accident, > the most conservative route is taken here: If any of the NFE Detected > bits is set in Device Status, do not touch UE status, they should be > cleared later by the UE handler. Otherwise, a specific set of UEs that > may be raised as ANFE according to the PCIe specification will be cleared > if their corresponding severity is Non-Fatal. > > To achieve above purpose, store UNCOR_STATUS bits that might be ANFE > in aer_err_info.anfe_status. So that those bits could be printed and > processed later. > > Tested-by: Yudong Wang <yudong.wang@intel.com> > Co-developed-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com> > Signed-off-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com> > Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Not my most confident review ever as this is nasty and gives me a headache but your description is good and I think the implementation looks reasonable. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Hi, On 5/9/24 1:48 AM, Zhenzhong Duan wrote: > In some cases the detector of a Non-Fatal Error(NFE) is not the most > appropriate agent to determine the type of the error. For example, > when software performs a configuration read from a non-existent > device or Function, completer will send an ERR_NONFATAL Message. > On some platforms, ERR_NONFATAL results in a System Error, which > breaks normal software probing. > > Advisory Non-Fatal Error(ANFE) is a special case that can be used > in above scenario. It is predominantly determined by the role of the > detecting agent (Requester, Completer, or Receiver) and the specific > error. In such cases, an agent with AER signals the NFE (if enabled) > by sending an ERR_COR Message as an advisory to software, instead of > sending ERR_NONFATAL. > > When processing an ANFE, ideally both correctable error(CE) status and > uncorrectable error(UE) status should be cleared. However, there is no > way to fully identify the UE associated with ANFE. Even worse, Non-Fatal > Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as > NFE will reproduce above mentioned issue, i.e., breaking softwore probing; > treating NFE as ANFE will make us ignoring some UEs which need active > recover operation. To avoid clearing UEs that are not ANFE by accident, > the most conservative route is taken here: If any of the NFE Detected > bits is set in Device Status, do not touch UE status, they should be > cleared later by the UE handler. Otherwise, a specific set of UEs that > may be raised as ANFE according to the PCIe specification will be cleared > if their corresponding severity is Non-Fatal. > > To achieve above purpose, store UNCOR_STATUS bits that might be ANFE > in aer_err_info.anfe_status. So that those bits could be printed and > processed later. > > Tested-by: Yudong Wang <yudong.wang@intel.com> > Co-developed-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com> > Signed-off-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com> > Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> > --- > drivers/pci/pci.h | 1 + > drivers/pci/pcie/aer.c | 53 ++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 54 insertions(+) > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > index 17fed1846847..3f9eb807f9fd 100644 > --- a/drivers/pci/pci.h > +++ b/drivers/pci/pci.h > @@ -412,6 +412,7 @@ struct aer_err_info { > > unsigned int status; /* COR/UNCOR Error Status */ > unsigned int mask; /* COR/UNCOR Error Mask */ > + unsigned int anfe_status; /* UNCOR Error Status for ANFE */ > struct pcie_tlp_log tlp; /* TLP Header */ > }; > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > index ac6293c24976..f2839b51321a 100644 > --- a/drivers/pci/pcie/aer.c > +++ b/drivers/pci/pcie/aer.c > @@ -107,6 +107,12 @@ struct aer_stats { > PCI_ERR_ROOT_MULTI_COR_RCV | \ > PCI_ERR_ROOT_MULTI_UNCOR_RCV) > > +#define AER_ERR_ANFE_UNC_MASK (PCI_ERR_UNC_POISON_TLP | \ > + PCI_ERR_UNC_COMP_TIME | \ > + PCI_ERR_UNC_COMP_ABORT | \ > + PCI_ERR_UNC_UNX_COMP | \ > + PCI_ERR_UNC_UNSUP) > + > static int pcie_aer_disable; > static pci_ers_result_t aer_root_reset(struct pci_dev *dev); > > @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn, > EXPORT_SYMBOL_GPL(aer_recover_queue); > #endif > > +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info *info) > +{ > + u32 uncor_mask, uncor_status, anfe_status; > + u16 device_status; > + int aer = dev->aer_cap; > + > + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, &uncor_status); > + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK, &uncor_mask); > + /* > + * According to PCIe Base Specification Revision 6.1, > + * Section 6.2.3.2.4, if an UNCOR error is raised as > + * Advisory Non-Fatal error, it will match the following > + * conditions: > + * a. The severity of the error is Non-Fatal. > + * b. The error is one of the following: > + * 1. Poisoned TLP (Section 6.2.3.2.4.3) > + * 2. Completion Timeout (Section 6.2.3.2.4.4) > + * 3. Completer Abort (Section 6.2.3.2.4.1) > + * 4. Unexpected Completion (Section 6.2.3.2.4.5) > + * 5. Unsupported Request (Section 6.2.3.2.4.1) > + */ > + anfe_status = uncor_status & ~uncor_mask & ~info->severity & > + AER_ERR_ANFE_UNC_MASK; > + > + if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &device_status)) > + return; > + /* > + * Take the most conservative route here. If there are Non-Fatal errors > + * detected, do not assume any bit in uncor_status is set by ANFE. > + */ > + if (device_status & PCI_EXP_DEVSTA_NFED) > + return; You can move this check to the top of the function. You don't need to check the rest if NFE error is detected in device status. > + > + /* > + * If there is another ANFE between reading uncor_status and clearing > + * PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE isn't > + * recorded in info->anfe_status. It will be read out as NFE in > + * following uncor_status register reading and processed by NFE > + * handler. > + */ > + info->anfe_status = anfe_status; > +} > + > /** > * aer_get_device_error_info - read error status from dev and store it to info > * @dev: pointer to the device expected to have a error record > @@ -1213,6 +1262,7 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info) > > /* Must reset in this function */ > info->status = 0; > + info->anfe_status = 0; > info->tlp_header_valid = 0; > > /* The device might not support AER */ > @@ -1226,6 +1276,9 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info) > &info->mask); > if (!(info->status & ~info->mask)) > return 0; > + > + if (info->status & PCI_ERR_COR_ADV_NFAT) > + anfe_get_uc_status(dev, info); > } else if (type == PCI_EXP_TYPE_ROOT_PORT || > type == PCI_EXP_TYPE_RC_EC || > type == PCI_EXP_TYPE_DOWNSTREAM ||
Hi >-----Original Message----- >From: Kuppuswamy Sathyanarayanan ><sathyanarayanan.kuppuswamy@linux.intel.com> >Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might >be ANFE in aer_err_info > >Hi, > >On 5/9/24 1:48 AM, Zhenzhong Duan wrote: >> In some cases the detector of a Non-Fatal Error(NFE) is not the most >> appropriate agent to determine the type of the error. For example, >> when software performs a configuration read from a non-existent >> device or Function, completer will send an ERR_NONFATAL Message. >> On some platforms, ERR_NONFATAL results in a System Error, which >> breaks normal software probing. >> >> Advisory Non-Fatal Error(ANFE) is a special case that can be used >> in above scenario. It is predominantly determined by the role of the >> detecting agent (Requester, Completer, or Receiver) and the specific >> error. In such cases, an agent with AER signals the NFE (if enabled) >> by sending an ERR_COR Message as an advisory to software, instead of >> sending ERR_NONFATAL. >> >> When processing an ANFE, ideally both correctable error(CE) status and >> uncorrectable error(UE) status should be cleared. However, there is no >> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal >> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as >> NFE will reproduce above mentioned issue, i.e., breaking softwore probing; >> treating NFE as ANFE will make us ignoring some UEs which need active >> recover operation. To avoid clearing UEs that are not ANFE by accident, >> the most conservative route is taken here: If any of the NFE Detected >> bits is set in Device Status, do not touch UE status, they should be >> cleared later by the UE handler. Otherwise, a specific set of UEs that >> may be raised as ANFE according to the PCIe specification will be cleared >> if their corresponding severity is Non-Fatal. >> >> To achieve above purpose, store UNCOR_STATUS bits that might be ANFE >> in aer_err_info.anfe_status. So that those bits could be printed and >> processed later. >> >> Tested-by: Yudong Wang <yudong.wang@intel.com> >> Co-developed-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com> >> Signed-off-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com> >> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> >> --- >> drivers/pci/pci.h | 1 + >> drivers/pci/pcie/aer.c | 53 >++++++++++++++++++++++++++++++++++++++++++ >> 2 files changed, 54 insertions(+) >> >> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h >> index 17fed1846847..3f9eb807f9fd 100644 >> --- a/drivers/pci/pci.h >> +++ b/drivers/pci/pci.h >> @@ -412,6 +412,7 @@ struct aer_err_info { >> >> unsigned int status; /* COR/UNCOR Error Status */ >> unsigned int mask; /* COR/UNCOR Error Mask */ >> + unsigned int anfe_status; /* UNCOR Error Status for ANFE */ >> struct pcie_tlp_log tlp; /* TLP Header */ >> }; >> >> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c >> index ac6293c24976..f2839b51321a 100644 >> --- a/drivers/pci/pcie/aer.c >> +++ b/drivers/pci/pcie/aer.c >> @@ -107,6 +107,12 @@ struct aer_stats { >> PCI_ERR_ROOT_MULTI_COR_RCV | > \ >> PCI_ERR_ROOT_MULTI_UNCOR_RCV) >> >> +#define AER_ERR_ANFE_UNC_MASK > (PCI_ERR_UNC_POISON_TLP | \ >> + PCI_ERR_UNC_COMP_TIME | > \ >> + PCI_ERR_UNC_COMP_ABORT | > \ >> + PCI_ERR_UNC_UNX_COMP | > \ >> + PCI_ERR_UNC_UNSUP) >> + >> static int pcie_aer_disable; >> static pci_ers_result_t aer_root_reset(struct pci_dev *dev); >> >> @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, unsigned >int bus, unsigned int devfn, >> EXPORT_SYMBOL_GPL(aer_recover_queue); >> #endif >> >> +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info >*info) >> +{ >> + u32 uncor_mask, uncor_status, anfe_status; >> + u16 device_status; >> + int aer = dev->aer_cap; >> + >> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, >&uncor_status); >> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK, >&uncor_mask); >> + /* >> + * According to PCIe Base Specification Revision 6.1, >> + * Section 6.2.3.2.4, if an UNCOR error is raised as >> + * Advisory Non-Fatal error, it will match the following >> + * conditions: >> + * a. The severity of the error is Non-Fatal. >> + * b. The error is one of the following: >> + * 1. Poisoned TLP (Section 6.2.3.2.4.3) >> + * 2. Completion Timeout (Section 6.2.3.2.4.4) >> + * 3. Completer Abort (Section 6.2.3.2.4.1) >> + * 4. Unexpected Completion (Section 6.2.3.2.4.5) >> + * 5. Unsupported Request (Section 6.2.3.2.4.1) >> + */ >> + anfe_status = uncor_status & ~uncor_mask & ~info->severity & >> + AER_ERR_ANFE_UNC_MASK; >> + >> + if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA, >&device_status)) >> + return; >> + /* >> + * Take the most conservative route here. If there are Non-Fatal >errors >> + * detected, do not assume any bit in uncor_status is set by ANFE. >> + */ >> + if (device_status & PCI_EXP_DEVSTA_NFED) >> + return; > >You can move this check to the top of the function. You don't need to check >the rest if NFE error is detected in device status. The v3 just worked that way. Jonathan pointed a race that NFE triggered after the check will be treated as ANFE and cleared. Check it after reading UNCOR_STATUS can avoid the race. See https://lkml.org/lkml/2024/4/22/1011 for discussion details. Thanks Zhenzhong > >> + >> + /* >> + * If there is another ANFE between reading uncor_status and >clearing >> + * PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE >isn't >> + * recorded in info->anfe_status. It will be read out as NFE in >> + * following uncor_status register reading and processed by NFE >> + * handler. >> + */ >> + info->anfe_status = anfe_status; >> +} >> + >> /** >> * aer_get_device_error_info - read error status from dev and store it to >info >> * @dev: pointer to the device expected to have a error record >> @@ -1213,6 +1262,7 @@ int aer_get_device_error_info(struct pci_dev >*dev, struct aer_err_info *info) >> >> /* Must reset in this function */ >> info->status = 0; >> + info->anfe_status = 0; >> info->tlp_header_valid = 0; >> >> /* The device might not support AER */ >> @@ -1226,6 +1276,9 @@ int aer_get_device_error_info(struct pci_dev >*dev, struct aer_err_info *info) >> &info->mask); >> if (!(info->status & ~info->mask)) >> return 0; >> + >> + if (info->status & PCI_ERR_COR_ADV_NFAT) >> + anfe_get_uc_status(dev, info); >> } else if (type == PCI_EXP_TYPE_ROOT_PORT || >> type == PCI_EXP_TYPE_RC_EC || >> type == PCI_EXP_TYPE_DOWNSTREAM || > >-- >Sathyanarayanan Kuppuswamy >Linux Kernel Developer
On 6/13/24 7:39 PM, Duan, Zhenzhong wrote: > Hi > >> -----Original Message----- >> From: Kuppuswamy Sathyanarayanan >> <sathyanarayanan.kuppuswamy@linux.intel.com> >> Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might >> be ANFE in aer_err_info >> >> Hi, >> >> On 5/9/24 1:48 AM, Zhenzhong Duan wrote: >>> In some cases the detector of a Non-Fatal Error(NFE) is not the most >>> appropriate agent to determine the type of the error. For example, >>> when software performs a configuration read from a non-existent >>> device or Function, completer will send an ERR_NONFATAL Message. >>> On some platforms, ERR_NONFATAL results in a System Error, which >>> breaks normal software probing. >>> >>> Advisory Non-Fatal Error(ANFE) is a special case that can be used >>> in above scenario. It is predominantly determined by the role of the >>> detecting agent (Requester, Completer, or Receiver) and the specific >>> error. In such cases, an agent with AER signals the NFE (if enabled) >>> by sending an ERR_COR Message as an advisory to software, instead of >>> sending ERR_NONFATAL. >>> >>> When processing an ANFE, ideally both correctable error(CE) status and >>> uncorrectable error(UE) status should be cleared. However, there is no >>> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal >>> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as >>> NFE will reproduce above mentioned issue, i.e., breaking softwore probing; >>> treating NFE as ANFE will make us ignoring some UEs which need active >>> recover operation. To avoid clearing UEs that are not ANFE by accident, >>> the most conservative route is taken here: If any of the NFE Detected >>> bits is set in Device Status, do not touch UE status, they should be >>> cleared later by the UE handler. Otherwise, a specific set of UEs that >>> may be raised as ANFE according to the PCIe specification will be cleared >>> if their corresponding severity is Non-Fatal. >>> >>> To achieve above purpose, store UNCOR_STATUS bits that might be ANFE >>> in aer_err_info.anfe_status. So that those bits could be printed and >>> processed later. >>> >>> Tested-by: Yudong Wang <yudong.wang@intel.com> >>> Co-developed-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com> >>> Signed-off-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com> >>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> >>> --- >>> drivers/pci/pci.h | 1 + >>> drivers/pci/pcie/aer.c | 53 >> ++++++++++++++++++++++++++++++++++++++++++ >>> 2 files changed, 54 insertions(+) >>> >>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h >>> index 17fed1846847..3f9eb807f9fd 100644 >>> --- a/drivers/pci/pci.h >>> +++ b/drivers/pci/pci.h >>> @@ -412,6 +412,7 @@ struct aer_err_info { >>> >>> unsigned int status; /* COR/UNCOR Error Status */ >>> unsigned int mask; /* COR/UNCOR Error Mask */ >>> + unsigned int anfe_status; /* UNCOR Error Status for ANFE */ >>> struct pcie_tlp_log tlp; /* TLP Header */ >>> }; >>> >>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c >>> index ac6293c24976..f2839b51321a 100644 >>> --- a/drivers/pci/pcie/aer.c >>> +++ b/drivers/pci/pcie/aer.c >>> @@ -107,6 +107,12 @@ struct aer_stats { >>> PCI_ERR_ROOT_MULTI_COR_RCV | >> \ >>> PCI_ERR_ROOT_MULTI_UNCOR_RCV) >>> >>> +#define AER_ERR_ANFE_UNC_MASK >> (PCI_ERR_UNC_POISON_TLP | \ >>> + PCI_ERR_UNC_COMP_TIME | >> \ >>> + PCI_ERR_UNC_COMP_ABORT | >> \ >>> + PCI_ERR_UNC_UNX_COMP | >> \ >>> + PCI_ERR_UNC_UNSUP) >>> + >>> static int pcie_aer_disable; >>> static pci_ers_result_t aer_root_reset(struct pci_dev *dev); >>> >>> @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, unsigned >> int bus, unsigned int devfn, >>> EXPORT_SYMBOL_GPL(aer_recover_queue); >>> #endif >>> >>> +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info >> *info) >>> +{ >>> + u32 uncor_mask, uncor_status, anfe_status; >>> + u16 device_status; >>> + int aer = dev->aer_cap; >>> + >>> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, >> &uncor_status); >>> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK, >> &uncor_mask); >>> + /* >>> + * According to PCIe Base Specification Revision 6.1, >>> + * Section 6.2.3.2.4, if an UNCOR error is raised as >>> + * Advisory Non-Fatal error, it will match the following >>> + * conditions: >>> + * a. The severity of the error is Non-Fatal. >>> + * b. The error is one of the following: >>> + * 1. Poisoned TLP (Section 6.2.3.2.4.3) >>> + * 2. Completion Timeout (Section 6.2.3.2.4.4) >>> + * 3. Completer Abort (Section 6.2.3.2.4.1) >>> + * 4. Unexpected Completion (Section 6.2.3.2.4.5) >>> + * 5. Unsupported Request (Section 6.2.3.2.4.1) >>> + */ >>> + anfe_status = uncor_status & ~uncor_mask & ~info->severity & >>> + AER_ERR_ANFE_UNC_MASK; >>> + >>> + if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA, >> &device_status)) >>> + return; >>> + /* >>> + * Take the most conservative route here. If there are Non-Fatal >> errors >>> + * detected, do not assume any bit in uncor_status is set by ANFE. >>> + */ >>> + if (device_status & PCI_EXP_DEVSTA_NFED) >>> + return; >> You can move this check to the top of the function. You don't need to check >> the rest if NFE error is detected in device status. > The v3 just worked that way. Jonathan pointed a race that NFE triggered after > the check will be treated as ANFE and cleared. Check it after reading UNCOR_STATUS > can avoid the race. > > See https://lkml.org/lkml/2024/4/22/1011 for discussion details. Got it. I would recommend adding a comment about it in handler. May be some thing like, /* * To avoid race between device status read and error status register read, cache * uncorrectable error status before checking for NFE in device status * register. */ > > Thanks > Zhenzhong > >>> + >>> + /* >>> + * If there is another ANFE between reading uncor_status and >> clearing >>> + * PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE >> isn't >>> + * recorded in info->anfe_status. It will be read out as NFE in >>> + * following uncor_status register reading and processed by NFE >>> + * handler. >>> + */ >>> + info->anfe_status = anfe_status; >>> +} >>> + >>> /** >>> * aer_get_device_error_info - read error status from dev and store it to >> info >>> * @dev: pointer to the device expected to have a error record >>> @@ -1213,6 +1262,7 @@ int aer_get_device_error_info(struct pci_dev >> *dev, struct aer_err_info *info) >>> /* Must reset in this function */ >>> info->status = 0; >>> + info->anfe_status = 0; >>> info->tlp_header_valid = 0; >>> >>> /* The device might not support AER */ >>> @@ -1226,6 +1276,9 @@ int aer_get_device_error_info(struct pci_dev >> *dev, struct aer_err_info *info) >>> &info->mask); >>> if (!(info->status & ~info->mask)) >>> return 0; >>> + >>> + if (info->status & PCI_ERR_COR_ADV_NFAT) >>> + anfe_get_uc_status(dev, info); >>> } else if (type == PCI_EXP_TYPE_ROOT_PORT || >>> type == PCI_EXP_TYPE_RC_EC || >>> type == PCI_EXP_TYPE_DOWNSTREAM || >> -- >> Sathyanarayanan Kuppuswamy >> Linux Kernel Developer
>-----Original Message----- >From: Kuppuswamy Sathyanarayanan ><sathyanarayanan.kuppuswamy@linux.intel.com> >Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might >be ANFE in aer_err_info > > >On 6/13/24 7:39 PM, Duan, Zhenzhong wrote: >> Hi >> >>> -----Original Message----- >>> From: Kuppuswamy Sathyanarayanan >>> <sathyanarayanan.kuppuswamy@linux.intel.com> >>> Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that >might >>> be ANFE in aer_err_info >>> >>> Hi, >>> >>> On 5/9/24 1:48 AM, Zhenzhong Duan wrote: >>>> In some cases the detector of a Non-Fatal Error(NFE) is not the most >>>> appropriate agent to determine the type of the error. For example, >>>> when software performs a configuration read from a non-existent >>>> device or Function, completer will send an ERR_NONFATAL Message. >>>> On some platforms, ERR_NONFATAL results in a System Error, which >>>> breaks normal software probing. >>>> >>>> Advisory Non-Fatal Error(ANFE) is a special case that can be used >>>> in above scenario. It is predominantly determined by the role of the >>>> detecting agent (Requester, Completer, or Receiver) and the specific >>>> error. In such cases, an agent with AER signals the NFE (if enabled) >>>> by sending an ERR_COR Message as an advisory to software, instead of >>>> sending ERR_NONFATAL. >>>> >>>> When processing an ANFE, ideally both correctable error(CE) status and >>>> uncorrectable error(UE) status should be cleared. However, there is no >>>> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal >>>> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as >>>> NFE will reproduce above mentioned issue, i.e., breaking softwore >probing; >>>> treating NFE as ANFE will make us ignoring some UEs which need active >>>> recover operation. To avoid clearing UEs that are not ANFE by accident, >>>> the most conservative route is taken here: If any of the NFE Detected >>>> bits is set in Device Status, do not touch UE status, they should be >>>> cleared later by the UE handler. Otherwise, a specific set of UEs that >>>> may be raised as ANFE according to the PCIe specification will be cleared >>>> if their corresponding severity is Non-Fatal. >>>> >>>> To achieve above purpose, store UNCOR_STATUS bits that might be >ANFE >>>> in aer_err_info.anfe_status. So that those bits could be printed and >>>> processed later. >>>> >>>> Tested-by: Yudong Wang <yudong.wang@intel.com> >>>> Co-developed-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com> >>>> Signed-off-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com> >>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> >>>> --- >>>> drivers/pci/pci.h | 1 + >>>> drivers/pci/pcie/aer.c | 53 >>> ++++++++++++++++++++++++++++++++++++++++++ >>>> 2 files changed, 54 insertions(+) >>>> >>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h >>>> index 17fed1846847..3f9eb807f9fd 100644 >>>> --- a/drivers/pci/pci.h >>>> +++ b/drivers/pci/pci.h >>>> @@ -412,6 +412,7 @@ struct aer_err_info { >>>> >>>> unsigned int status; /* COR/UNCOR Error Status */ >>>> unsigned int mask; /* COR/UNCOR Error Mask */ >>>> + unsigned int anfe_status; /* UNCOR Error Status for ANFE */ >>>> struct pcie_tlp_log tlp; /* TLP Header */ >>>> }; >>>> >>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c >>>> index ac6293c24976..f2839b51321a 100644 >>>> --- a/drivers/pci/pcie/aer.c >>>> +++ b/drivers/pci/pcie/aer.c >>>> @@ -107,6 +107,12 @@ struct aer_stats { >>>> PCI_ERR_ROOT_MULTI_COR_RCV | >>> \ >>>> PCI_ERR_ROOT_MULTI_UNCOR_RCV) >>>> >>>> +#define AER_ERR_ANFE_UNC_MASK >>> (PCI_ERR_UNC_POISON_TLP | \ >>>> + PCI_ERR_UNC_COMP_TIME | >>> \ >>>> + PCI_ERR_UNC_COMP_ABORT | >>> \ >>>> + PCI_ERR_UNC_UNX_COMP | >>> \ >>>> + PCI_ERR_UNC_UNSUP) >>>> + >>>> static int pcie_aer_disable; >>>> static pci_ers_result_t aer_root_reset(struct pci_dev *dev); >>>> >>>> @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, >unsigned >>> int bus, unsigned int devfn, >>>> EXPORT_SYMBOL_GPL(aer_recover_queue); >>>> #endif >>>> >>>> +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info >>> *info) >>>> +{ >>>> + u32 uncor_mask, uncor_status, anfe_status; >>>> + u16 device_status; >>>> + int aer = dev->aer_cap; >>>> + >>>> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, >>> &uncor_status); >>>> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK, >>> &uncor_mask); >>>> + /* >>>> + * According to PCIe Base Specification Revision 6.1, >>>> + * Section 6.2.3.2.4, if an UNCOR error is raised as >>>> + * Advisory Non-Fatal error, it will match the following >>>> + * conditions: >>>> + * a. The severity of the error is Non-Fatal. >>>> + * b. The error is one of the following: >>>> + * 1. Poisoned TLP (Section 6.2.3.2.4.3) >>>> + * 2. Completion Timeout (Section 6.2.3.2.4.4) >>>> + * 3. Completer Abort (Section 6.2.3.2.4.1) >>>> + * 4. Unexpected Completion (Section 6.2.3.2.4.5) >>>> + * 5. Unsupported Request (Section 6.2.3.2.4.1) >>>> + */ >>>> + anfe_status = uncor_status & ~uncor_mask & ~info->severity & >>>> + AER_ERR_ANFE_UNC_MASK; >>>> + >>>> + if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA, >>> &device_status)) >>>> + return; >>>> + /* >>>> + * Take the most conservative route here. If there are Non-Fatal >>> errors >>>> + * detected, do not assume any bit in uncor_status is set by ANFE. >>>> + */ >>>> + if (device_status & PCI_EXP_DEVSTA_NFED) >>>> + return; >>> You can move this check to the top of the function. You don't need to >check >>> the rest if NFE error is detected in device status. >> The v3 just worked that way. Jonathan pointed a race that NFE triggered >after >> the check will be treated as ANFE and cleared. Check it after reading >UNCOR_STATUS >> can avoid the race. >> >> See https://lkml.org/lkml/2024/4/22/1011 for discussion details. > >Got it. I would recommend adding a comment about it in handler. May be >some thing like, > >/* > * To avoid race between device status read and error status register read, >cache > * uncorrectable error status before checking for NFE in device status * >register. */ Good suggestion, will add. Thanks Zhenzhong
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 17fed1846847..3f9eb807f9fd 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -412,6 +412,7 @@ struct aer_err_info { unsigned int status; /* COR/UNCOR Error Status */ unsigned int mask; /* COR/UNCOR Error Mask */ + unsigned int anfe_status; /* UNCOR Error Status for ANFE */ struct pcie_tlp_log tlp; /* TLP Header */ }; diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index ac6293c24976..f2839b51321a 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -107,6 +107,12 @@ struct aer_stats { PCI_ERR_ROOT_MULTI_COR_RCV | \ PCI_ERR_ROOT_MULTI_UNCOR_RCV) +#define AER_ERR_ANFE_UNC_MASK (PCI_ERR_UNC_POISON_TLP | \ + PCI_ERR_UNC_COMP_TIME | \ + PCI_ERR_UNC_COMP_ABORT | \ + PCI_ERR_UNC_UNX_COMP | \ + PCI_ERR_UNC_UNSUP) + static int pcie_aer_disable; static pci_ers_result_t aer_root_reset(struct pci_dev *dev); @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn, EXPORT_SYMBOL_GPL(aer_recover_queue); #endif +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info *info) +{ + u32 uncor_mask, uncor_status, anfe_status; + u16 device_status; + int aer = dev->aer_cap; + + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, &uncor_status); + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK, &uncor_mask); + /* + * According to PCIe Base Specification Revision 6.1, + * Section 6.2.3.2.4, if an UNCOR error is raised as + * Advisory Non-Fatal error, it will match the following + * conditions: + * a. The severity of the error is Non-Fatal. + * b. The error is one of the following: + * 1. Poisoned TLP (Section 6.2.3.2.4.3) + * 2. Completion Timeout (Section 6.2.3.2.4.4) + * 3. Completer Abort (Section 6.2.3.2.4.1) + * 4. Unexpected Completion (Section 6.2.3.2.4.5) + * 5. Unsupported Request (Section 6.2.3.2.4.1) + */ + anfe_status = uncor_status & ~uncor_mask & ~info->severity & + AER_ERR_ANFE_UNC_MASK; + + if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &device_status)) + return; + /* + * Take the most conservative route here. If there are Non-Fatal errors + * detected, do not assume any bit in uncor_status is set by ANFE. + */ + if (device_status & PCI_EXP_DEVSTA_NFED) + return; + + /* + * If there is another ANFE between reading uncor_status and clearing + * PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE isn't + * recorded in info->anfe_status. It will be read out as NFE in + * following uncor_status register reading and processed by NFE + * handler. + */ + info->anfe_status = anfe_status; +} + /** * aer_get_device_error_info - read error status from dev and store it to info * @dev: pointer to the device expected to have a error record @@ -1213,6 +1262,7 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info) /* Must reset in this function */ info->status = 0; + info->anfe_status = 0; info->tlp_header_valid = 0; /* The device might not support AER */ @@ -1226,6 +1276,9 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info) &info->mask); if (!(info->status & ~info->mask)) return 0; + + if (info->status & PCI_ERR_COR_ADV_NFAT) + anfe_get_uc_status(dev, info); } else if (type == PCI_EXP_TYPE_ROOT_PORT || type == PCI_EXP_TYPE_RC_EC || type == PCI_EXP_TYPE_DOWNSTREAM ||