Message ID | 20190205210701.25387-1-mr.nuke.me@gmail.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | PCI: pciehp: Do not turn off slot if presence comes up after link | expand |
On Tue, Feb 05, 2019 at 03:06:56PM -0600, Alexandru Gagniuc wrote: > According to PCIe 3.0, the presence detect state is a logical OR of > in-band and out-of-band presence. With this, we'd expect the presence > state to always be asserted when the link comes up. Do you have a PCIe 4.0 spec? I think 5.0 is about to come out, so it'd be nice to have at least a 4.0 citation (including section number), if you have one. > Not all hardware follows this, and it is possible for the presence to > come up after the link. In this case, the PCIe device would be > erroneously disabled and re-probed. It is possible to distinguish > between a delayed presence and a card swap by looking at the DLL state > changed bit -- The link has to come down if the card is removed. > > Thus, for a device that is probed, present and has its link active, a > lack of a link state change event guarantees we have the same device, > and shutdown of is not needed. s/of is/is/, I guess? I'm hoping Lukas will chime in here; thanks for cc'ing him already. > Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> > --- > > Following some discussion in > "PCI: hotplug: Erroneous removal of hotplug PCI devices" [1] > It became apparent that we can't fully rely presence detect changed > as an indicator to shutdown a device. And in PCIe 4.0 the "logical OR" > requirement is going away, so we can use a way to slightly decouple > presence detect and link active. > > I think the approach here is the simplest, and least likely to > interfere with other mis-behaving hardware (in the PCIe 3.0 sense). > > [1] https://www.spinics.net/lists/linux-pci/msg79564.html Would you mind updating this reference to the https://lore.kernel.org/linux-pci form that doesn't depend on external organizations? https://www.kernel.org/lore.html has some details, but unfortunately lacks a hint about how to construct the URL. What I do is look up the Message-ID and use, e.g., https://lore.kernel.org/linux-pci/<Message-ID> > drivers/pci/hotplug/pciehp_ctrl.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c > index 3f3df4c29f6e..ea0dd3e956be 100644 > --- a/drivers/pci/hotplug/pciehp_ctrl.c > +++ b/drivers/pci/hotplug/pciehp_ctrl.c > @@ -220,13 +220,23 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events) > /* > * If the slot is on and presence or link has changed, turn it off. > * Even if it's occupied again, we cannot assume the card is the same. > + * When the card is swapped, we also expect a change in link state, > + * without which, it's likely presence became high after link-active. > */ > mutex_lock(&ctrl->state_lock); > + present = pciehp_card_present(ctrl); > + link_active = pciehp_check_link_active(ctrl); > switch (ctrl->state) { > case BLINKINGOFF_STATE: > cancel_delayed_work(&ctrl->button_work); > /* fall through */ > case ON_STATE: > + if (present && link_active && > + !(events & PCI_EXP_SLTSTA_DLLSC)) { > + mutex_unlock(&ctrl->state_lock); > + ctrl_warn(ctrl, "Hardware bug: Presence state came up after link"); I'm not 100% sure this is worth KERN_WARN unless the user might be able to do something about it. > + return; > + } > ctrl->state = POWEROFF_STATE; > mutex_unlock(&ctrl->state_lock); > if (events & PCI_EXP_SLTSTA_DLLSC) > -- > 2.19.2 >
On Tue, Feb 05, 2019 at 03:06:56PM -0600, Alexandru Gagniuc wrote: > According to PCIe 3.0, the presence detect state is a logical OR of > in-band and out-of-band presence. Since Bjorn asked for a spec reference: PCIe r4.0 sec 7.8.11: "Presence Detect State - This bit indicates the presence of an adapter in the slot, reflected by the logical OR of the Physical Layer in-band presence detect mechanism and, if present, any out-of-band presence detect mechanism defined for the slot's corresponding form factor." Table 4-14 in sec 4.2.6 is also important because it shows the difference between Link Up and in-band presence. > With this, we'd expect the presence > state to always be asserted when the link comes up. > > Not all hardware follows this, and it is possible for the presence to > come up after the link. In this case, the PCIe device would be > erroneously disabled and re-probed. It is possible to distinguish > between a delayed presence and a card swap by looking at the DLL state > changed bit -- The link has to come down if the card is removed. So in reality, PDC and DLLSC events rarely come in simultaneously and we do allow some leeway in pcie_wait_for_link(), it's just not enough in your particular case due to the way presence is detected by the platform. I think in this case instead of changing the behavior for everyone, it's more appropriate to add a quirk for your hardware. Two ideas: * Amend pcie_init() to set slot_cap |= PCI_EXP_SLTCAP_ABP. Insert this as a quirk right below the existing Thunderbolt quirk and use PCI vendor/device IDs and/or DMI checks to identify your particular hardware. If the ABP bit is set, PDC events are not enabled by pcie_enable_notification(), so you will solely rely on DLLSC events. Problem solved. (Hopefully.) * Alternatively, add a DECLARE_PCI_FIXUP_FINAL() quirk which sets pdev->link_active_reporting = false. This causes pcie_wait_for_link() to wait 1100 msec before the hot-added device's config space is accessed for the first time. Would this work for you? Thanks, Lukas
On 2/9/19 5:58 AM, Lukas Wunner wrote: > > [EXTERNAL EMAIL] > > On Tue, Feb 05, 2019 at 03:06:56PM -0600, Alexandru Gagniuc wrote: >> According to PCIe 3.0, the presence detect state is a logical OR of >> in-band and out-of-band presence. > > Since Bjorn asked for a spec reference: > > PCIe r4.0 sec 7.8.11: "Presence Detect State - This bit indicates the > presence of an adapter in the slot, reflected by the logical OR of the > Physical Layer in-band presence detect mechanism and, if present, any > out-of-band presence detect mechanism defined for the slot's > corresponding form factor." ECN [1] was approved last November, so it's normative spec text. Sorry if the Ukranians didn't get ahold of it yet. I'll try to explain the delta. There's an in-band PD supported/disable set of bits. If 'supported' is set, then you can set 'disable', which removes the 'logical OR" and now PD is only out-of-band presence. > Table 4-14 in sec 4.2.6 is also important because it shows the difference > between Link Up and in-band presence. So, we'd expect PD to come up at the same time or before DLL? >> With this, we'd expect the presence >> state to always be asserted when the link comes up. >> >> Not all hardware follows this, and it is possible for the presence to >> come up after the link. In this case, the PCIe device would be >> erroneously disabled and re-probed. It is possible to distinguish >> between a delayed presence and a card swap by looking at the DLL state >> changed bit -- The link has to come down if the card is removed. > > So in reality, PDC and DLLSC events rarely come in simultaneously and > we do allow some leeway in pcie_wait_for_link(), it's just not enough > in your particular case due to the way presence is detected by the > platform. > > I think in this case instead of changing the behavior for everyone, > it's more appropriate to add a quirk for your hardware. Two ideas: > > * Amend pcie_init() to set slot_cap |= PCI_EXP_SLTCAP_ABP. Insert > this as a quirk right below the existing Thunderbolt quirk and > use PCI vendor/device IDs and/or DMI checks to identify your > particular hardware. If the ABP bit is set, PDC events are not > enabled by pcie_enable_notification(), so you will solely rely on > DLLSC events. Problem solved. (Hopefully.) > > * Alternatively, add a DECLARE_PCI_FIXUP_FINAL() quirk which sets > pdev->link_active_reporting = false. This causes pcie_wait_for_link() > to wait 1100 msec before the hot-added device's config space is > accessed for the first time. These are both hacks right? We're declaring something untrue about the hardware because we want to obtain a specific side-effect. BUt the side-effects are not guaranteed not to change. > Would this work for you? I'm certain it's doable. In theory one could use the PCI ID of the DP, and the vendor and machine names to filter the quirk. But what happens in situations where the same PCI ID switch is used in different parts of the system? You want the quirk here or not there. It quickly becomes a shartstorm. I'd like a generic solution. I suspect supporting 'in-band PD disabled' and quirking for that would be a saner way to do things. If we support it, then we need to handle PDSC after DLLSC scenarios anyway. Alex [1] PCI-SIG ENGINEERING CHANGE NOTICE Async Hot-Plug Updates Nov 29, 2018
On Mon, Feb 11, 2019 at 11:48:12PM +0000, Alex_Gagniuc@Dellteam.com wrote: > On 2/9/19 5:58 AM, Lukas Wunner wrote: > ECN [1] was approved last November, so it's normative spec text. Sorry > if the Ukranians didn't get ahold of it yet. I'll try to explain the delta. > There's an in-band PD supported/disable set of bits. If 'supported' is > set, then you can set 'disable', which removes the 'logical OR" and now > PD is only out-of-band presence. I see, thanks a lot for relaying this information. The PCI SIG should probably consider granting access to specs to open source developers to ease adoption of new features in Linux et al. > > Table 4-14 in sec 4.2.6 is also important because it shows the difference > > between Link Up and in-band presence. > > So, we'd expect PD to come up at the same time or before DLL? Per table 4-14 and figure 4-23 (immediately below the table) in r4.0, PDC ought to be signaled before DLLSC. As said, Linux can handle PDC after DLLSC if they're not too far apart (100 ms, see pcie_wait_for_link()). > I'd like a generic solution. I suspect supporting 'in-band PD disabled' > and quirking for that would be a saner way to do things. If we support > it, then we need to handle PDSC after DLLSC scenarios anyway. I agree, having support for this new ECN would be great. If you look at struct controller in drivers/pci/hotplug/pciehp.h, there's a section labelled /* capabilities and quirks */. An idea would be to add an unsigned int there to indicate whether in-band presence is disabled. An alternative would be to add a flag to struct pci_dev. Instead of modifying the logic in pciehp_handle_presence_or_link_change(), you could amend pcie_wait_for_link() to poll PDS until it's set, in addition to DLLLA. The rationale would be that although the link is up, the hot-added device cannot really be considered accessible until PDS is also set. Unfortunately we cannot do this for all devices because (as I've said before), some broken devices hardwire PDS to zero. But it should be safe to constrain it to those which can disable in-band presence. Be mindful however that pcie_wait_for_link() is also called from the DPC driver. Keith should be able to judge whether a change to that function breaks DPC. You'll probably also need to add code to disable in-band presence if that is supported, either in pcie_init() in pciehp_hpc.c or in generic PCI code. Thanks, Lukas
On Tue, Feb 12, 2019 at 09:30:31AM +0100, Lukas Wunner wrote: > Instead of modifying the logic in pciehp_handle_presence_or_link_change(), > you could amend pcie_wait_for_link() to poll PDS until it's set, in > addition to DLLLA. The rationale would be that although the link is up, > the hot-added device cannot really be considered accessible until PDS > is also set. Unfortunately we cannot do this for all devices because > (as I've said before), some broken devices hardwire PDS to zero. But > it should be safe to constrain it to those which can disable in-band > presence. An alternative approach would of course be to only enable one of PDC or DLLSC events if in-band presence is disabled. In the case of DLLSC, you'd also need to modify pciehp_reset_slot() in addition to pcie_enable_notification(). Thanks, Lukas
On 2/12/19 2:30 AM, Lukas Wunner wrote: > The PCI SIG > should probably consider granting access to specs to open source > developers to ease adoption of new features in Linux et al. Don't get me started on just how ridiculous I think PCI-SIG's policy is with respect to public availability of standards. >>> Table 4-14 in sec 4.2.6 is also important because it shows the difference >>> between Link Up and in-band presence. >> >> So, we'd expect PD to come up at the same time or before DLL? > > Per table 4-14 and figure 4-23 (immediately below the table) in r4.0, > PDC ought to be signaled before DLLSC. As said, Linux can handle PDC > after DLLSC if they're not too far apart (100 ms, see pcie_wait_for_link()). > > >> I'd like a generic solution. I suspect supporting 'in-band PD disabled' >> and quirking for that would be a saner way to do things. If we support >> it, then we need to handle PDSC after DLLSC scenarios anyway. > > I agree, having support for this new ECN would be great. > > If you look at struct controller in drivers/pci/hotplug/pciehp.h, > there's a section labelled /* capabilities and quirks */. An idea > would be to add an unsigned int there to indicate whether in-band > presence is disabled. An alternative would be to add a flag to > struct pci_dev. > Instead of modifying the logic in pciehp_handle_presence_or_link_change(), > you could amend pcie_wait_for_link() to poll PDS until it's set, in > addition to DLLLA. The rationale would be that although the link is up, > the hot-added device cannot really be considered accessible until PDS > is also set. Unfortunately we cannot do this for all devices because > (as I've said before), some broken devices hardwire PDS to zero. But > it should be safe to constrain it to those which can disable in-band > presence. The recommendation is to set the "in-band PD disable" bit, and it's possible that platform firmware may have set it at boot time (*). I'm not sure that there is a spec-justifiable reason to not access a device whose DLL is up, but PD isn't. (*) A bit hypothetical: There is no hardware yet implementing the ECN. > Be mindful however that pcie_wait_for_link() is also called from the > DPC driver. Keith should be able to judge whether a change to that > function breaks DPC. That's why I went for ammending pciehp_handle_presence_or_link_change(). Smaller bug surface ;). What I'm thinking at this point is, keep the patch as is, but, also check for in-band PD disable before aborting the shutdown. Old behavior stays the same. I'll rework this a little bit for in-band PD disable (what's a good acronym for that, BTW?), and then quirk those machines that are known to 'disable' this in hardware. Alex
On Tue, Feb 12, 2019 at 11:57:55PM +0000, Alex_Gagniuc@Dellteam.com wrote: > The recommendation is to set the "in-band PD disable" bit, and it's > possible that platform firmware may have set it at boot time Ok, then I'd suggest to check in pcie_init() whether the feature is supported and enable it if so. And store that fact in a flag, either in struct pci_dev or struct controller. > (*) A bit hypothetical: There is no hardware yet implementing the ECN. Hm, this contradicts Austin Bolen's e-mail of Jan 25 that "Yes, this platform disables in-band presence" (when asked whether your host controller already adheres to the ECN). > I'm > not sure that there is a spec-justifiable reason to not access a device > whose DLL is up, but PD isn't. Austin asked in an e-mail of Jan 24 whether "the hot-inserted device's config space [is] accessed immediately after waiting this 20 + 100 ms delay", which sounded to me like you'd prefer the device not to be accessed until PDS is 1. This can be achieved by polling PDS in pcie_wait_for_link() if in-band presence is disabled, which is why I suggested it. > > Be mindful however that pcie_wait_for_link() is also called from the > > DPC driver. Keith should be able to judge whether a change to that > > function breaks DPC. > > That's why I went for ammending pciehp_handle_presence_or_link_change(). > Smaller bug surface ;). What I'm thinking at this point is, keep the > patch as is, but, also check for in-band PD disable before aborting the > shutdown. Old behavior stays the same. I'm worried that amending pciehp_handle_presence_or_link_change() makes the event handling logic more difficult to understand. Polling PDS in pcie_wait_for_link() or disabling either PDC or DLLSC if in-band presence is disabled seems simpler to reason about. > in-band PD disable (what's a good acronym for that, BTW?) I don't know, maybe inband_presence_disabled? Thanks, Lukas
On 2/13/19 2:36 AM, Lukas Wunner wrote: >> (*) A bit hypothetical: There is no hardware yet implementing the ECN. > > Hm, this contradicts Austin Bolen's e-mail of Jan 25 that "Yes, this > platform disables in-band presence" (when asked whether your host > controller already adheres to the ECN). Both statements are true. The hardware does indeed disable in-band presence, in a rudimentary way that is not compliant with the ECN -- it doesn't implement the bits required by the ECN. >> I'm >> not sure that there is a spec-justifiable reason to not access a device >> whose DLL is up, but PD isn't. > > Austin asked in an e-mail of Jan 24 whether "the hot-inserted device's > config space [is] accessed immediately after waiting this 20 + 100 ms > delay", which sounded to me like you'd prefer the device not to be > accessed until PDS is 1. "Unless Readiness Notifications mechanisms are used (see Section 6.23), the Root Complex and/or system software must allow at least 1.0 s after a Conventional Reset of a device, before it may determine that a device which fails to return a Successful Completion status for a valid Configuration Request is a broken device. This period is independent of how quickly Link training completes." (Section 6.6) The concern was the one second delay, which is addressed by the use of pci_bus_wait_crs(). >>> Be mindful however that pcie_wait_for_link() is also called from the >>> DPC driver. Keith should be able to judge whether a change to that >>> function breaks DPC. >> >> That's why I went for ammending pciehp_handle_presence_or_link_change(). >> Smaller bug surface ;). What I'm thinking at this point is, keep the >> patch as is, but, also check for in-band PD disable before aborting the >> shutdown. Old behavior stays the same. > > I'm worried that amending pciehp_handle_presence_or_link_change() makes > the event handling logic more difficult to understand. Don't worry. It's already difficult to understand ;) > Polling PDS in > pcie_wait_for_link() or disabling either PDC or DLLSC if in-band presence > is disabled seems simpler to reason about. pcie_wait_for_link() is generic PCIe layer. I don't think mixing hotplug concepts is a good layering violation. Disabling PDC or DLLSC can work. I've sometimes wondered why we even care about PDC. I can imagine situations where platform might want to signal imminent removal by yanking PD. >> in-band PD disable (what's a good acronym for that, BTW?) > > I don't know, maybe inband_presence_disabled? PCI_EXP_SLTCAP2_IBPD ?
On Wed, Feb 13, 2019 at 06:55:46PM +0000, Alex_Gagniuc@Dellteam.com wrote: > On 2/13/19 2:36 AM, Lukas Wunner wrote: > > > (*) A bit hypothetical: There is no hardware yet implementing the ECN. > > > > Hm, this contradicts Austin Bolen's e-mail of Jan 25 that "Yes, this > > platform disables in-band presence" (when asked whether your host > > controller already adheres to the ECN). > > Both statements are true. The hardware does indeed disable in-band > presence, in a rudimentary way that is not compliant with the ECN -- it > doesn't implement the bits required by the ECN. Ugh, can a BIOS update make those machines compliant to the ECN or do we need a quirk specifically for them? > > Polling PDS in > > pcie_wait_for_link() or disabling either PDC or DLLSC if in-band presence > > is disabled seems simpler to reason about. > > pcie_wait_for_link() is generic PCIe layer. I don't think mixing hotplug > concepts is a good layering violation. The function used to live in pciehp_hpc.c, but commits 9f5a70f18c58 and f0157160b359 moved it to generic code to allow code sharing with the DPC driver. That's the only reason it's in generic code AFAICS. > >> in-band PD disable (what's a good acronym for that, BTW?) > > > > I don't know, maybe inband_presence_disabled? > > PCI_EXP_SLTCAP2_IBPD ? Yes, something like that. It should match the spec, which I have no access to. Thanks, Lukas
On 2/14/19 1:01 AM, Lukas Wunner wrote: > On Wed, Feb 13, 2019 at 06:55:46PM +0000, Alex_Gagniuc@Dellteam.com wrote: >> On 2/13/19 2:36 AM, Lukas Wunner wrote: >>>> (*) A bit hypothetical: There is no hardware yet implementing the ECN. >>> >>> Hm, this contradicts Austin Bolen's e-mail of Jan 25 that "Yes, this >>> platform disables in-band presence" (when asked whether your host >>> controller already adheres to the ECN). >> >> Both statements are true. The hardware does indeed disable in-band >> presence, in a rudimentary way that is not compliant with the ECN -- it >> doesn't implement the bits required by the ECN. > > Ugh, can a BIOS update make those machines compliant to the ECN No, that is not possible, I'm told. > or do we need a quirk specifically for them? We might have to. > It should match the spec, which I have no access to. I am as irritated as you that the spec is not readily available for public viewing. Cheers, Alex
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c index 3f3df4c29f6e..ea0dd3e956be 100644 --- a/drivers/pci/hotplug/pciehp_ctrl.c +++ b/drivers/pci/hotplug/pciehp_ctrl.c @@ -220,13 +220,23 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events) /* * If the slot is on and presence or link has changed, turn it off. * Even if it's occupied again, we cannot assume the card is the same. + * When the card is swapped, we also expect a change in link state, + * without which, it's likely presence became high after link-active. */ mutex_lock(&ctrl->state_lock); + present = pciehp_card_present(ctrl); + link_active = pciehp_check_link_active(ctrl); switch (ctrl->state) { case BLINKINGOFF_STATE: cancel_delayed_work(&ctrl->button_work); /* fall through */ case ON_STATE: + if (present && link_active && + !(events & PCI_EXP_SLTSTA_DLLSC)) { + mutex_unlock(&ctrl->state_lock); + ctrl_warn(ctrl, "Hardware bug: Presence state came up after link"); + return; + } ctrl->state = POWEROFF_STATE; mutex_unlock(&ctrl->state_lock); if (events & PCI_EXP_SLTSTA_DLLSC)
According to PCIe 3.0, the presence detect state is a logical OR of in-band and out-of-band presence. With this, we'd expect the presence state to always be asserted when the link comes up. Not all hardware follows this, and it is possible for the presence to come up after the link. In this case, the PCIe device would be erroneously disabled and re-probed. It is possible to distinguish between a delayed presence and a card swap by looking at the DLL state changed bit -- The link has to come down if the card is removed. Thus, for a device that is probed, present and has its link active, a lack of a link state change event guarantees we have the same device, and shutdown of is not needed. Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> --- Following some discussion in "PCI: hotplug: Erroneous removal of hotplug PCI devices" [1] It became apparent that we can't fully rely presence detect changed as an indicator to shutdown a device. And in PCIe 4.0 the "logical OR" requirement is going away, so we can use a way to slightly decouple presence detect and link active. I think the approach here is the simplest, and least likely to interfere with other mis-behaving hardware (in the PCIe 3.0 sense). [1] https://www.spinics.net/lists/linux-pci/msg79564.html drivers/pci/hotplug/pciehp_ctrl.c | 10 ++++++++++ 1 file changed, 10 insertions(+)