Message ID: 20180805225136.5800-2-okaya@kernel.org (mailing list archive)
State: New, archived
Delegated to: Bjorn Helgaas
Series: PCI: Handle conflict between hotplug and error handling
On 2018-08-06 04:21, Sinan Kaya wrote:
> AER/DPC resets are known as warm-resets. HP link recovery is known as
> a cold-reset via power-off and power-on commands to the PCI slot.
>
> In the middle of a warm-reset operation (AER/DPC), we are:
>
> 1. turning off the slot power. Slot power needs to be kept on in order
>    for recovery to succeed.
>
> 2. performing a cold reset, causing Fatal Error recovery to fail.
>
> If the link goes down due to a DPC event, it should be recovered by the
> DPC status trigger. Injecting a cold reset in the middle can cause a HW
> lockup as it is undefined behavior.
>
> Similarly, if the link goes down due to an AER secondary bus reset
> issue, it should be recovered by HW. Injecting a cold reset in the
> middle of a secondary bus reset can cause a HW lockup as it is
> undefined behavior.
>
> 1. HP ISR observes a link down interrupt.
> 2. HP ISR checks that there is a fatal error pending; it doesn't touch
>    the link.
> 3. HP ISR waits until link recovery happens.
> 4. HP ISR calls the read vendor ID function.
> 5. If all fails, try the cold-reset approach.
>
> If a fatal error is pending and a fatal error service such as DPC or
> AER is running, it is the responsibility of the fatal error service to
> recover the link.
> Signed-off-by: Sinan Kaya <okaya@kernel.org>
> ---
>  drivers/pci/hotplug/pciehp_ctrl.c | 19 ++++++++++
>  drivers/pci/pci.h                 |  2 ++
>  drivers/pci/pcie/err.c            | 60 +++++++++++++++++++++++++++++++
>  3 files changed, 81 insertions(+)
>
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index da7c72372ffc..ba8dd51a3f0f 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -222,9 +222,28 @@ void pciehp_handle_disable_request(struct slot *slot)
>  void pciehp_handle_presence_or_link_change(struct slot *slot, u32 events)
>  {
>  	struct controller *ctrl = slot->ctrl;
> +	struct pci_dev *pdev = ctrl->pcie->port;
>  	bool link_active;
>  	u8 present;
>
> +	/* If a fatal error is pending, wait for AER or DPC to handle it. */
> +	if (pcie_fatal_error_pending(pdev, PCI_ERR_UNC_SURPDN)) {
> +		bool recovered;
> +
> +		recovered = pcie_wait_fatal_error_clear(pdev,
> +							PCI_ERR_UNC_SURPDN);
> +
> +		/* If the fatal error is gone and the link is up, return */
> +		if (recovered && pcie_wait_for_link(pdev, true)) {
> +			ctrl_info(ctrl, "Slot(%s): Ignoring Link event due to successful fatal error recovery\n",
> +				  slot_name(slot));
> +			return;
> +		}
> +
> +		ctrl_info(ctrl, "Slot(%s): Fatal error recovery failed for Link event, tryig hotplug reset\n",
> +			  slot_name(slot));
> +	}
> +
>  	/*
>  	 * If the slot is on and presence or link has changed, turn it off.
>  	 * Even if it's occupied again, we cannot assume the card is the same.
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index c358e7a07f3f..2ebb05338368 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -356,6 +356,8 @@ void pci_enable_acs(struct pci_dev *dev);
>  /* PCI error reporting and recovery */
>  void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service);
>  void pcie_do_nonfatal_recovery(struct pci_dev *dev);
> +bool pcie_fatal_error_pending(struct pci_dev *pdev, u32 usrmask);
> +bool pcie_wait_fatal_error_clear(struct pci_dev *pdev, u32 usrmask);
>
>  bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
>  #ifdef CONFIG_PCIEASPM
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index f7ce0cb0b0b7..8ea012dbf063 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -16,6 +16,7 @@
>  #include <linux/kernel.h>
>  #include <linux/errno.h>
>  #include <linux/aer.h>
> +#include <linux/delay.h>
>  #include "portdrv.h"
>  #include "../pci.h"
>
> @@ -386,3 +387,62 @@ void pcie_do_nonfatal_recovery(struct pci_dev *dev)
>  	/* TODO: Should kernel panic here?
>  	 */
>  	pci_info(dev, "AER: Device recovery failed\n");
>  }
> +
> +bool pcie_fatal_error_pending(struct pci_dev *pdev, u32 usrmask)
> +{
> +	u16 err_status = 0;
> +	u32 status, mask, severity;
> +	int rc;
> +
> +	if (!pci_is_pcie(pdev))
> +		return false;
> +
> +	rc = pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &err_status);
> +	if (rc)
> +		return false;
> +
> +	if (!(err_status & PCI_EXP_DEVSTA_FED))
> +		return false;
> +
> +	if (!pdev->aer_cap)
> +		return false;
> +
> +	rc = pci_read_config_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_STATUS,
> +				   &status);
> +	if (rc)
> +		return false;
> +
> +	rc = pci_read_config_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_MASK,
> +				   &mask);
> +	if (rc)
> +		return false;
> +
> +	rc = pci_read_config_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_SEVER,
> +				   &severity);
> +	if (rc)
> +		return false;
> +
> +	status &= mask;
> +	status &= ~usrmask;
> +	status &= severity;
> +
> +	return !!status;
> +}
> +
> +bool pcie_wait_fatal_error_clear(struct pci_dev *pdev, u32 usrmask)
> +{
> +	int timeout = 1000;
> +	bool ret;
> +
> +	for (;;) {
> +		ret = pcie_fatal_error_pending(pdev, usrmask);
> +		if (ret == false)
> +			return true;
> +		if (timeout <= 0)
> +			break;
> +		msleep(20);
> +		timeout -= 20;

I assume that this timeout will come into effect if
1) AER/DPC takes longer than 1 sec for recovery.
2) Let us say both AER and DPC are disabled... are we going to wait for
this timeout before HP can take over?

> +	}
> +
> +	return false;
> +}
On Mon, Aug 06, 2018 at 01:21:03PM +0530, poza@codeaurora.org wrote:
> On 2018-08-06 04:21, Sinan Kaya wrote:
> >+bool pcie_wait_fatal_error_clear(struct pci_dev *pdev, u32 usrmask)
> >+{
> >+	int timeout = 1000;
> >+	bool ret;
> >+
> >+	for (;;) {
> >+		ret = pcie_fatal_error_pending(pdev, usrmask);
> >+		if (ret == false)
> >+			return true;
> >+		if (timeout <= 0)
> >+			break;
> >+		msleep(20);
> >+		timeout -= 20;
> I assume that this timeout will come into effect if
> 1) AER/DPC takes longer time than 1 sec for recovery.
> 2) Lets us say both AER and DPC are disabled....are we going to wait for
> this timeout before HP can take over ?

If CONFIG_PCIEAER is disabled, pdev->aer_cap will not be set because
it is populated in pci_aer_init().

pcie_fatal_error_pending(), as introduced by this patch, returns false
if pdev->aer_cap is not set. So pciehp will fall back to a cold reset
if CONFIG_PCIEAER is disabled.

I'm not seeing a similar check for CONFIG_PCIE_DPC=n in this patch,
but I'm not familiar enough with PCIe error recovery to know if such
a check is at all needed.

Thanks,

Lukas
On Sun, Aug 05, 2018 at 03:51:36PM -0700, Sinan Kaya wrote:
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -222,9 +222,28 @@ void pciehp_handle_disable_request(struct slot *slot)
>  void pciehp_handle_presence_or_link_change(struct slot *slot, u32 events)
>  {
>  	struct controller *ctrl = slot->ctrl;
> +	struct pci_dev *pdev = ctrl->pcie->port;
>  	bool link_active;
>  	u8 present;
>
> +	/* If a fatal error is pending, wait for AER or DPC to handle it. */
> +	if (pcie_fatal_error_pending(pdev, PCI_ERR_UNC_SURPDN)) {
> +		bool recovered;
> +
> +		recovered = pcie_wait_fatal_error_clear(pdev,
> +							PCI_ERR_UNC_SURPDN);
> +
> +		/* If the fatal error is gone and the link is up, return */
> +		if (recovered && pcie_wait_for_link(pdev, true)) {
> +			ctrl_info(ctrl, "Slot(%s): Ignoring Link event due to successful fatal error recovery\n",
> +				  slot_name(slot));
> +			return;
> +		}
> +
> +		ctrl_info(ctrl, "Slot(%s): Fatal error recovery failed for Link event, tryig hotplug reset\n",

Typo tryig -> trying. Otherwise this is

Reviewed-by: Lukas Wunner <lukas@wunner.de>
On 2018-08-06 14:56, Lukas Wunner wrote:
> On Mon, Aug 06, 2018 at 01:21:03PM +0530, poza@codeaurora.org wrote:
>> On 2018-08-06 04:21, Sinan Kaya wrote:
>> >+bool pcie_wait_fatal_error_clear(struct pci_dev *pdev, u32 usrmask)
>> >+{
>> >+	int timeout = 1000;
>> >+	bool ret;
>> >+
>> >+	for (;;) {
>> >+		ret = pcie_fatal_error_pending(pdev, usrmask);
>> >+		if (ret == false)
>> >+			return true;
>> >+		if (timeout <= 0)
>> >+			break;
>> >+		msleep(20);
>> >+		timeout -= 20;
>> I assume that this timeout will come into effect if
>> 1) AER/DPC takes longer time than 1 sec for recovery.
>> 2) Lets us say both AER and DPC are disabled....are we going to wait
>> for this timeout before HP can take over ?
>
> If CONFIG_PCIEAER is disabled, pdev->aer_cap will not be set because
> it is populated in pci_aer_init().
>
> pcie_fatal_error_pending(), as introduced by this patch, returns false
> if pdev->aer_cap is not set. So pciehp will fall back to a cold reset
> if CONFIG_PCIEAER is disabled.
>
> I'm not seeing a similar check for CONFIG_PCIE_DPC=n in this patch,
> but I'm not familiar enough with PCIe error recovery to know if such
> a check is at all needed.

Either AER or DPC would get triggered, not both. In that case, if AER
is disabled, then this code will return false, thinking HP needs to
handle it, but it might be that DPC would be triggering as well. I
don't see a DPC check anywhere; rather, we are relying on
PCI_EXP_DEVSTA and the following condition:

	if (!pdev->aer_cap)
		return false;

So here we don't check anything with respect to DPC capability
(although there is no such thing as dpc_cap), unless I missed
something.

> Thanks,
>
> Lukas
On 8/6/2018 10:55 AM, poza@codeaurora.org wrote:
> Either AER or DPC would get triggered, not both.
> in that case, if AER is disabled, then this code will return false
> thinking HP needs to handle it.
> but it might be that DPC would be triggering as well.
> but I dont see DPC check anywhere, rather we are relying on
> PCI_EXP_DEVSTA and following condition:
> if (!pdev->aer_cap)
>	return false;
> so here we dont check anything with respect to DPC capability (although
> there is no such thing as dpc_cap)
> (except If I missed something)

That's true. We either need to go poll every single source (AER/DPC)
for an error or rely on the DEVSTA. For ease of implementation, I
suggest DEVSTA. Something like this:

+bool pcie_fatal_error_pending(struct pci_dev *pdev, u32 usrmask)
+{
+	u16 err_status = 0;
+	int rc;
+
+	if (!pci_is_pcie(pdev))
+		return false;
+
+	rc = pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &err_status);
+	if (rc)
+		return false;
+
+	if (!(err_status & PCI_EXP_DEVSTA_FED))
+		return false;
+
+	return true;
+}
On 8/6/2018 11:04 AM, Sinan Kaya wrote:
>> Either AER or DPC would get triggered, not both.
>> in that case, if AER is disabled, then this code will return false
>> thinking HP needs to handle it.
>> but it might be that DPC would be triggering as well.
>> but I dont see DPC check anywhere, rather we are relying on
>> PCI_EXP_DEVSTA
>> and following condition:
>> if (!pdev->aer_cap)
>>	return false;
>> so here we dont check anything with respect to DPC capability
>> (although there is no such thing as dpc_cap)
>> (except If I missed something)
>
> That's true. We either need to go poll every single source (AER/DPC)
> for an error or rely on the DEVSTA. For ease of implementation, I
> suggest DEVSTA. Something like this:

Hmm. Too quick...

The reduced set doesn't help. Surprise Link Down is also a fatal error.
That's what I was trying to exclude. We'll have to poll the error
source.
On 2018-08-06 20:36, Sinan Kaya wrote:
> On 8/6/2018 11:04 AM, Sinan Kaya wrote:
>>> Either AER or DPC would get triggered, not both.
>>> in that case, if AER is disabled, then this code will return false
>>> thinking HP needs to handle it.
>>> but it might be that DPC would be triggering as well.
>>> but I dont see DPC check anywhere, rather we are relying on
>>> PCI_EXP_DEVSTA
>>> and following condition:
>>> if (!pdev->aer_cap)
>>>	return false;
>>> so here we dont check anything with respect to DPC capability
>>> (although there is no such thing as dpc_cap)
>>> (except If I missed something)
>>
>> That's true. We either need to go poll every single source (AER/DPC)
>> for an error or rely on the DEVSTA. For ease of implementation, I
>> suggest DEVSTA. Something like this:
>
> Hmm. Too quick...
>
> Reduced set doesn't help. Surprise Link Down is also a fatal error.
> That's what I was trying to exclude. We'll have to poll the error
> source.

How about using pcie_port_find_service()?

Regards,
Oza.
On 8/6/2018 11:12 AM, poza@codeaurora.org wrote:
>>> That's true. We either need to go poll every single source (AER/DPC)
>>> for an error or rely on the DEVSTA. For ease of implementation, I
>>> suggest DEVSTA. Something like this:
>>
>> Hmm. Too quick...
>>
>> Reduced set doesn't help. Surprise Link Down is also a fatal error.
>> That's what I was trying to exclude. We'll have to poll the error
>> source.
>
> How about using pcie_port_find_service()?

pcie_port_find_service() tells you that there is an active driver. We
could potentially use it as follows to query the registers themselves:

	if (pcie_port_find_service(AER)) {
		check AER registers
	}

	if (pcie_port_find_service(DPC)) {
		check DPC registers
	}
On Mon, Aug 06, 2018 at 11:06:48AM -0400, Sinan Kaya wrote:
> Surprise Link Down is also a fatal error.
Seriously? On a Downstream Port which has the Hot-Plug Surprise bit
in the Slot Capabilities register set, Surprise Link Down is a fatal
error? That would seem somewhat contradictory. Do you have a
section number in the PCIe Base Spec for this?
Thanks,
Lukas
On 8/6/2018 12:44 PM, Lukas Wunner wrote:
> On Mon, Aug 06, 2018 at 11:06:48AM -0400, Sinan Kaya wrote:
>> Surprise Link Down is also a fatal error.
>
> Seriously? On a Downstream Port which has the Hot-Plug Surprise bit
> in the Slot Capabilities register set, Surprise Link Down is a fatal
> error? That would seem somewhat contradictory. Do you have a
> section number in the PCIe Base Spec for this?

Spec 3.0, sec 7.10.2, Uncorrectable Error Status Register (Offset 04h),
bit 5, Surprise Down Error Status (Optional).

> Thanks,
>
> Lukas
On 8/6/2018 12:49 PM, Sinan Kaya wrote:
>>> Surprise Link Down is also a fatal error.
>>
>> Seriously? On a Downstream Port which has the Hot-Plug Surprise bit
>> in the Slot Capabilities register set, Surprise Link Down is a fatal
>> error? That would seem somewhat contradictory. Do you have a
>> section number in the PCIe Base Spec for this?
>
> Spec 3.0. 7.10.2. Uncorrectable Error Status Register (Offset 04h)
> bit 5 Surprise Down Error Status (Optional).

I think the hotplug surprise capability takes priority over the one in
the AER register. Other, non-hotplug-capable systems can report a link
down via an AER fatal error and recover via the error handling path.
We should maybe leave this one alone since the hotplug code wouldn't
clear the AER status registers.
On Mon, Aug 06, 2018 at 12:49:04PM -0400, Sinan Kaya wrote:
> On 8/6/2018 12:44 PM, Lukas Wunner wrote:
> > On Mon, Aug 06, 2018 at 11:06:48AM -0400, Sinan Kaya wrote:
> > > Surprise Link Down is also a fatal error.
> >
> > Seriously? On a Downstream Port which has the Hot-Plug Surprise bit
> > in the Slot Capabilities register set, Surprise Link Down is a fatal
> > error? That would seem somewhat contradictory. Do you have a
> > section number in the PCIe Base Spec for this?
>
> Spec 3.0. 7.10.2. Uncorrectable Error Status Register (Offset 04h)
> bit 5 Surprise Down Error Status (Optional).

Okay, thanks. Can we always mask out this bit on a hotplug port
with surprise removal capability? Doesn't make sense to me to let
AER/DPC spring to action in that case.

Lukas
On 8/6/2018 1:05 PM, Lukas Wunner wrote:
> On Mon, Aug 06, 2018 at 12:49:04PM -0400, Sinan Kaya wrote:
>> On 8/6/2018 12:44 PM, Lukas Wunner wrote:
>>> On Mon, Aug 06, 2018 at 11:06:48AM -0400, Sinan Kaya wrote:
>>>> Surprise Link Down is also a fatal error.
>>> Seriously? On a Downstream Port which has the Hot-Plug Surprise bit
>>> in the Slot Capabilities register set, Surprise Link Down is a fatal
>>> error? That would seem somewhat contradictory. Do you have a
>>> section number in the PCIe Base Spec for this?
>> Spec 3.0. 7.10.2. Uncorrectable Error Status Register (Offset 04h)
>> bit 5 Surprise Down Error Status (Optional).
> Okay, thanks. Can we always mask out this bit on a hotplug port
> with surprise removal capability? Doesn't make sense to me to let
> AER/DPC spring to action in that case.

Yep, good idea.
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index da7c72372ffc..ba8dd51a3f0f 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -222,9 +222,28 @@ void pciehp_handle_disable_request(struct slot *slot)
 void pciehp_handle_presence_or_link_change(struct slot *slot, u32 events)
 {
 	struct controller *ctrl = slot->ctrl;
+	struct pci_dev *pdev = ctrl->pcie->port;
 	bool link_active;
 	u8 present;
 
+	/* If a fatal error is pending, wait for AER or DPC to handle it. */
+	if (pcie_fatal_error_pending(pdev, PCI_ERR_UNC_SURPDN)) {
+		bool recovered;
+
+		recovered = pcie_wait_fatal_error_clear(pdev,
+							PCI_ERR_UNC_SURPDN);
+
+		/* If the fatal error is gone and the link is up, return */
+		if (recovered && pcie_wait_for_link(pdev, true)) {
+			ctrl_info(ctrl, "Slot(%s): Ignoring Link event due to successful fatal error recovery\n",
+				  slot_name(slot));
+			return;
+		}
+
+		ctrl_info(ctrl, "Slot(%s): Fatal error recovery failed for Link event, tryig hotplug reset\n",
+			  slot_name(slot));
+	}
+
 	/*
 	 * If the slot is on and presence or link has changed, turn it off.
 	 * Even if it's occupied again, we cannot assume the card is the same.
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c358e7a07f3f..2ebb05338368 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -356,6 +356,8 @@ void pci_enable_acs(struct pci_dev *dev);
 /* PCI error reporting and recovery */
 void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service);
 void pcie_do_nonfatal_recovery(struct pci_dev *dev);
+bool pcie_fatal_error_pending(struct pci_dev *pdev, u32 usrmask);
+bool pcie_wait_fatal_error_clear(struct pci_dev *pdev, u32 usrmask);
 
 bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
 #ifdef CONFIG_PCIEASPM
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index f7ce0cb0b0b7..8ea012dbf063 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -16,6 +16,7 @@
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/aer.h>
+#include <linux/delay.h>
 #include "portdrv.h"
 #include "../pci.h"
 
@@ -386,3 +387,62 @@ void pcie_do_nonfatal_recovery(struct pci_dev *dev)
 	/* TODO: Should kernel panic here?
 	 */
 	pci_info(dev, "AER: Device recovery failed\n");
 }
+
+bool pcie_fatal_error_pending(struct pci_dev *pdev, u32 usrmask)
+{
+	u16 err_status = 0;
+	u32 status, mask, severity;
+	int rc;
+
+	if (!pci_is_pcie(pdev))
+		return false;
+
+	rc = pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &err_status);
+	if (rc)
+		return false;
+
+	if (!(err_status & PCI_EXP_DEVSTA_FED))
+		return false;
+
+	if (!pdev->aer_cap)
+		return false;
+
+	rc = pci_read_config_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_STATUS,
+				   &status);
+	if (rc)
+		return false;
+
+	rc = pci_read_config_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_MASK,
+				   &mask);
+	if (rc)
+		return false;
+
+	rc = pci_read_config_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_SEVER,
+				   &severity);
+	if (rc)
+		return false;
+
+	status &= mask;
+	status &= ~usrmask;
+	status &= severity;
+
+	return !!status;
+}
+
+bool pcie_wait_fatal_error_clear(struct pci_dev *pdev, u32 usrmask)
+{
+	int timeout = 1000;
+	bool ret;
+
+	for (;;) {
+		ret = pcie_fatal_error_pending(pdev, usrmask);
+		if (ret == false)
+			return true;
+		if (timeout <= 0)
+			break;
+		msleep(20);
+		timeout -= 20;
+	}
+
+	return false;
+}
AER/DPC resets are known as warm-resets. HP link recovery is known as
a cold-reset via power-off and power-on commands to the PCI slot.

In the middle of a warm-reset operation (AER/DPC), we are:

1. turning off the slot power. Slot power needs to be kept on in order
   for recovery to succeed.

2. performing a cold reset, causing Fatal Error recovery to fail.

If the link goes down due to a DPC event, it should be recovered by the
DPC status trigger. Injecting a cold reset in the middle can cause a HW
lockup as it is undefined behavior.

Similarly, if the link goes down due to an AER secondary bus reset
issue, it should be recovered by HW. Injecting a cold reset in the
middle of a secondary bus reset can cause a HW lockup as it is
undefined behavior.

1. HP ISR observes a link down interrupt.
2. HP ISR checks that there is a fatal error pending; it doesn't touch
   the link.
3. HP ISR waits until link recovery happens.
4. HP ISR calls the read vendor ID function.
5. If all fails, try the cold-reset approach.

If a fatal error is pending and a fatal error service such as DPC or
AER is running, it is the responsibility of the fatal error service to
recover the link.

Signed-off-by: Sinan Kaya <okaya@kernel.org>
---
 drivers/pci/hotplug/pciehp_ctrl.c | 19 ++++++++++
 drivers/pci/pci.h                 |  2 ++
 drivers/pci/pcie/err.c            | 60 +++++++++++++++++++++++++++++++
 3 files changed, 81 insertions(+)