Message ID | 20201122014203.4706-1-ashok.raj@intel.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Delegated to: | Bjorn Helgaas |
Series | [v2,1/1] PCI: pciehp: Add support for handling MRL events |
On Sat, Nov 21, 2020 at 05:42:03PM -0800, Ashok Raj wrote:
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
>  void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  {
>  	int present, link_active;
> +	u8 getstatus = 0;
>
>  	/*
>  	 * If the slot is on and presence or link has changed, turn it off.
> @@ -246,6 +259,20 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  		if (events & PCI_EXP_SLTSTA_PDC)
>  			ctrl_info(ctrl, "Slot(%s): Card not present\n",
>  				  slot_name(ctrl));
> +		if (events & PCI_EXP_SLTSTA_MRLSC)
> +			ctrl_info(ctrl, "Slot(%s): Latch %s\n",
> +				  slot_name(ctrl), getstatus ? "Open" : "Closed");

This message will currently always be "Latch closed".  It should be
"Latch open" instead because if the slot was up, the latch must have
been closed.  So an MRLSC event can only mean that the latch is now open.
The "getstatus" variable can be removed.

> +		/*
> +		 * PCIe Base Spec 5.0 Chapter 6.7.1.3 states.
> +		 *
> +		 * If an MRL Sensor is implemented without a corresponding MRL Sensor input
> +		 * on the Hot-Plug Controller, it is recommended that the MRL Sensor be
> +		 * routed to power fault input of the Hot-Plug Controller.
> +		 * This allows an active adapter to be powered off when the MRL is opened."
> +		 *
> +		 * This seems to suggest that the slot should be brought down as soon as MRL
> +		 * is opened.
> +		 */
>  		pciehp_disable_slot(ctrl, SURPRISE_REMOVAL);
>  		break;

The code comment is not wrapped at 80 chars and a bit long.
I'd move it to the commit message and keep only a shortened version here.

The "SURPRISE_REMOVAL" may now be problematic because the card may still
be in the slot (both presence and link still up) with only the MRL open.
My suggestion would be to add a local variable "bool safe_removal"
which is initialized to "SAFE_REMOVAL".  In the two if-clauses for
DLLSC and PDC, it is set to SURPRISE_REMOVAL.

> @@ -275,6 +302,13 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  		if (link_active)
>  			ctrl_info(ctrl, "Slot(%s): Link Up\n",
>  				  slot_name(ctrl));
> +		/*
> +		 * If slot is closed && ATTN button exists
> +		 * don't continue, let the ATTN button
> +		 * drive the hot-plug
> +		 */
> +		if (((events & PCI_EXP_SLTSTA_MRLSC) && ATTN_BUTTN(ctrl)))
> +			return;
>  		ctrl->request_result = pciehp_enable_slot(ctrl);
>  		break;

Hm, if the Attention Button is pressed with MRL still open, the slot is
not brought up.  If the MRL is subsequently closed, it is still not
brought up.  I guess the slot keeps blinking and one has to push the
button to abort the operation, then press it once more to attempt
another slot bringup.  The spec doesn't seem to say how such a situation
should be handled.  Oh well.

I'm wondering if this is the right place to bail out:  Immediately
before the above hunk, the button_work is canceled, so it can't later
trigger bringup of the slot.  Shouldn't the above check be in the
code block with the "Turn the slot on if it's occupied or link is up"
comment?

You're also not unlocking the state_lock here before bailing out of
the function.

> @@ -710,8 +710,10 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
>  	down_read(&ctrl->reset_lock);
>  	if (events & DISABLE_SLOT)
>  		pciehp_handle_disable_request(ctrl);
> -	else if (events & (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC))
> +	else if (events & (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC |
> +			   PCI_EXP_SLTSTA_MRLSC))
>  		pciehp_handle_presence_or_link_change(ctrl, events);
> +
>  	up_read(&ctrl->reset_lock);

Unnecessary newline added.

> @@ -768,6 +770,14 @@ static void pcie_enable_notification(struct controller *ctrl)
>  		cmd |= PCI_EXP_SLTCTL_ABPE;
>  	else
>  		cmd |= PCI_EXP_SLTCTL_PDCE;
> +
> +	/*
> +	 * If MRL sensor is present, then subscribe for MRL
> +	 * Changes to be notified as well.
> +	 */
> +	if (MRL_SENS(ctrl))
> +		cmd |= PCI_EXP_SLTCTL_MRLSCE;
> +

The code comment doesn't add much information, so can probably be
dropped.

You need to add PCI_EXP_SLTCTL_MRLSCE to the "mask" variable in this
function (before PFDE, as in pcie_disable_notification()).
I don't think the interrupt is enabled at all if it's not added to
"mask", has this patch been tested at all?

Something else:  When pciehp probes, it should check whether the slot
is up even though MRL is open.  (E.g. the machine is booted, the card
in the slot was enumerated but the latch is open.)  I think in that
case we need to bring down the slot.  I suggest adding a check to
pciehp_check_presence() whether the latch is open.  If so,
a PCI_EXP_SLTSTA_MRLSC event should be synthesized.  You could rename
the latch_closed() function to pciehp_latch_closed() and remove its
"static" attribute so that you can call it from pciehp_core.c.

Thanks,

Lukas
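To illustrate the review point about "mask", here is a minimal standalone sketch. It assumes the Slot Control write is a read-modify-write in which only the bits named in the mask are taken from cmd. The helper masked_update() and the two MASK_* macros are hypothetical names that only model that assumption; they are not the driver's code. The bit values are those of the PCI_EXP_SLTCTL_* definitions in pci_regs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Slot Control bits, values as in include/uapi/linux/pci_regs.h */
#define PCI_EXP_SLTCTL_ABPE	0x0001	/* Attention Button Pressed Enable */
#define PCI_EXP_SLTCTL_PFDE	0x0002	/* Power Fault Detected Enable */
#define PCI_EXP_SLTCTL_MRLSCE	0x0004	/* MRL Sensor Changed Enable */
#define PCI_EXP_SLTCTL_PDCE	0x0008	/* Presence Detect Changed Enable */
#define PCI_EXP_SLTCTL_CCIE	0x0010	/* Command Completed Interrupt Enable */
#define PCI_EXP_SLTCTL_HPIE	0x0020	/* Hot-Plug Interrupt Enable */
#define PCI_EXP_SLTCTL_DLLSCE	0x1000	/* Data Link Layer State Changed Enable */

/* Assumed mask before the fix: MRLSCE is not part of it. */
#define MASK_WITHOUT_MRLSCE \
	(PCI_EXP_SLTCTL_PDCE | PCI_EXP_SLTCTL_ABPE | PCI_EXP_SLTCTL_PFDE | \
	 PCI_EXP_SLTCTL_HPIE | PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_DLLSCE)

/* Mask with the bit the review asks for. */
#define MASK_WITH_MRLSCE (MASK_WITHOUT_MRLSCE | PCI_EXP_SLTCTL_MRLSCE)

/*
 * Hypothetical model of a masked register update: bits outside the mask
 * keep their old value, bits inside the mask are copied from cmd.
 */
uint16_t masked_update(uint16_t old, uint16_t cmd, uint16_t mask)
{
	return (uint16_t)((old & ~mask) | (cmd & mask));
}
```

Under this model, setting PCI_EXP_SLTCTL_MRLSCE in cmd has no effect unless the same bit is also present in the mask, which is why the notification would never be enabled.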
Hi Lukas

On Sun, Nov 22, 2020 at 10:08:52AM +0100, Lukas Wunner wrote:
> On Sat, Nov 21, 2020 at 05:42:03PM -0800, Ashok Raj wrote:
> > --- a/drivers/pci/hotplug/pciehp_ctrl.c
> > +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> >  void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
> >  {
> >  	int present, link_active;
> > +	u8 getstatus = 0;
> >
> >  	/*
> >  	 * If the slot is on and presence or link has changed, turn it off.
> > @@ -246,6 +259,20 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
> >  		if (events & PCI_EXP_SLTSTA_PDC)
> >  			ctrl_info(ctrl, "Slot(%s): Card not present\n",
> >  				  slot_name(ctrl));
> > +		if (events & PCI_EXP_SLTSTA_MRLSC)
> > +			ctrl_info(ctrl, "Slot(%s): Latch %s\n",
> > +				  slot_name(ctrl), getstatus ? "Open" : "Closed");
>
> This message will currently always be "Latch closed".  It should be
> "Latch open" instead because if the slot was up, the latch must have
> been closed.  So an MRLSC event can only mean that the latch is now open.
> The "getstatus" variable can be removed.

We only report if there was an MRLSC event. What if there is a link event,
but MRL is closed? This just reflects the current state rather than
hardcoding a value which could be wrong in cases where you try to remove
due to a DLLSC event?

>
>
> > +		/*
> > +		 * PCIe Base Spec 5.0 Chapter 6.7.1.3 states.
> > +		 *
> > +		 * If an MRL Sensor is implemented without a corresponding MRL Sensor input
> > +		 * on the Hot-Plug Controller, it is recommended that the MRL Sensor be
> > +		 * routed to power fault input of the Hot-Plug Controller.
> > +		 * This allows an active adapter to be powered off when the MRL is opened."
> > +		 *
> > +		 * This seems to suggest that the slot should be brought down as soon as MRL
> > +		 * is opened.
> > +		 */
> >  		pciehp_disable_slot(ctrl, SURPRISE_REMOVAL);
> >  		break;
>
> The code comment is not wrapped at 80 chars and a bit long.
> I'd move it to the commit message and keep only a shortened version here.

Makes sense. I'll clean this up.

>
> The "SURPRISE_REMOVAL" may now be problematic because the card may still
> be in the slot (both presence and link still up) with only the MRL open.
> My suggestion would be to add a local variable "bool safe_removal"
> which is initialized to "SAFE_REMOVAL".  In the two if-clauses for
> DLLSC and PDC, it is set to SURPRISE_REMOVAL.

I see, so for MRL we want to treat it as safe-removal, for the other two
it's surprise. Got it.

>
>
> > @@ -275,6 +302,13 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
> >  		if (link_active)
> >  			ctrl_info(ctrl, "Slot(%s): Link Up\n",
> >  				  slot_name(ctrl));
> > +		/*
> > +		 * If slot is closed && ATTN button exists
> > +		 * don't continue, let the ATTN button
> > +		 * drive the hot-plug
> > +		 */
> > +		if (((events & PCI_EXP_SLTSTA_MRLSC) && ATTN_BUTTN(ctrl)))
> > +			return;
> >  		ctrl->request_result = pciehp_enable_slot(ctrl);
> >  		break;
>
> Hm, if the Attention Button is pressed with MRL still open, the slot is
> not brought up.  If the MRL is subsequently closed, it is still not
> brought up.  I guess the slot keeps blinking and one has to push the
> button to abort the operation, then press it once more to attempt
> another slot bringup.  The spec doesn't seem to say how such a situation
> should be handled.  Oh well.

Looks like we are in the same boat today even without MRL.

If there is no card in the slot and someone presses ATTN, after the 5 sec
blink, we call the synthetic PDC event. But the check for
present || link_active would fail and return immediately. So the light
would keep blinking until someone presses ATTN to cancel?

Maybe in that case we should simply cancel if it was blinking before we
return?

>
> I'm wondering if this is the right place to bail out:  Immediately
> before the above hunk, the button_work is canceled, so it can't later
> trigger bringup of the slot.  Shouldn't the above check be in the
> code block with the "Turn the slot on if it's occupied or link is up"
> comment?

Or maybe remove the check !latch_closed(ctrl), and let it fall through
anyway.

>
> You're also not unlocking the state_lock here before bailing out of
> the function.
>
>
> > @@ -710,8 +710,10 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
> >  	down_read(&ctrl->reset_lock);
> >  	if (events & DISABLE_SLOT)
> >  		pciehp_handle_disable_request(ctrl);
> > -	else if (events & (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC))
> > +	else if (events & (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC |
> > +			   PCI_EXP_SLTSTA_MRLSC))
> >  		pciehp_handle_presence_or_link_change(ctrl, events);
> > +
> >  	up_read(&ctrl->reset_lock);
>
> Unnecessary newline added.

Will remove.

>
>
> > @@ -768,6 +770,14 @@ static void pcie_enable_notification(struct controller *ctrl)
> >  		cmd |= PCI_EXP_SLTCTL_ABPE;
> >  	else
> >  		cmd |= PCI_EXP_SLTCTL_PDCE;
> > +
> > +	/*
> > +	 * If MRL sensor is present, then subscribe for MRL
> > +	 * Changes to be notified as well.
> > +	 */
> > +	if (MRL_SENS(ctrl))
> > +		cmd |= PCI_EXP_SLTCTL_MRLSCE;
> > +
>
> The code comment doesn't add much information, so can probably be
> dropped.

Makes sense.

>
> You need to add PCI_EXP_SLTCTL_MRLSCE to the "mask" variable in this
> function (before PFDE, as in pcie_disable_notification()).
> I don't think the interrupt is enabled at all if it's not added to
> "mask", has this patch been tested at all?

The first patch was tested, but I didn't have that in the mask variable
even then.

>
> Something else:  When pciehp probes, it should check whether the slot
> is up even though MRL is open.  (E.g. the machine is booted, the card
> in the slot was enumerated but the latch is open.)  I think in that
> case we need to bring down the slot.  I suggest adding a check to
> pciehp_check_presence() whether the latch is open.  If so,
> a PCI_EXP_SLTSTA_MRLSC event should be synthesized.  You could rename
> the latch_closed() function to pciehp_latch_closed() and remove its
> "static" attribute so that you can call it from pciehp_core.c.

Good point. I missed that. I'll have another version spun after a test.
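The "safe_removal" suggestion agreed on above can be sketched in isolation as follows. This is only an illustration under stated assumptions: removal_mode() is a hypothetical helper, not a function in pciehp, and it assumes SAFE_REMOVAL and SURPRISE_REMOVAL are the boolean constants used by the driver. The status bit values are those of the PCI_EXP_SLTSTA_* definitions in pci_regs.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Slot Status change bits, values as in include/uapi/linux/pci_regs.h */
#define PCI_EXP_SLTSTA_MRLSC	0x0004	/* MRL Sensor Changed */
#define PCI_EXP_SLTSTA_PDC	0x0008	/* Presence Detect Changed */
#define PCI_EXP_SLTSTA_DLLSC	0x0100	/* Data Link Layer State Changed */

/* Assumed to mirror the driver's constants. */
#define SAFE_REMOVAL		true
#define SURPRISE_REMOVAL	false

/*
 * Hypothetical helper: start from a safe removal and downgrade to a
 * surprise removal only when the link or presence actually changed.
 * An MRL-only event leaves the card in the slot, so it stays "safe".
 */
bool removal_mode(uint32_t events)
{
	bool safe_removal = SAFE_REMOVAL;

	if (events & PCI_EXP_SLTSTA_DLLSC)
		safe_removal = SURPRISE_REMOVAL;
	if (events & PCI_EXP_SLTSTA_PDC)
		safe_removal = SURPRISE_REMOVAL;

	return safe_removal;
}
```

The result would then be passed to pciehp_disable_slot() in place of the hardcoded SURPRISE_REMOVAL.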
Hi Lukas and Bjorn

On Sun, Nov 22, 2020 at 10:08:52AM +0100, Lukas Wunner wrote:
> > @@ -275,6 +302,13 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
> >  		if (link_active)
> >  			ctrl_info(ctrl, "Slot(%s): Link Up\n",
> >  				  slot_name(ctrl));
> > +		/*
> > +		 * If slot is closed && ATTN button exists
> > +		 * don't continue, let the ATTN button
> > +		 * drive the hot-plug
> > +		 */
> > +		if (((events & PCI_EXP_SLTSTA_MRLSC) && ATTN_BUTTN(ctrl)))
> > +			return;
> >  		ctrl->request_result = pciehp_enable_slot(ctrl);
> >  		break;
>
> Hm, if the Attention Button is pressed with MRL still open, the slot is
> not brought up.  If the MRL is subsequently closed, it is still not
> brought up.  I guess the slot keeps blinking and one has to push the
> button to abort the operation, then press it once more to attempt
> another slot bringup.  The spec doesn't seem to say how such a situation
> should be handled.  Oh well.
>
> I'm wondering if this is the right place to bail out:  Immediately
> before the above hunk, the button_work is canceled, so it can't later
> trigger bringup of the slot.  Shouldn't the above check be in the
> code block with the "Turn the slot on if it's occupied or link is up"
> comment?
>

I have a fix tested on the platform, but I'm wondering if that's exactly
what you had in mind.

Currently we don't subscribe for PDC events when ATTN exists. So the
behavior is almost similar to this MRL case after ATTN, but the slot is
not ready for hot-add.

- Press ATTN,
- Slot is empty
- After 5 seconds synthetic PDC arrives.
  but since no presence and no link active, we leave slot in
  BLINKINGON_STATE, and led keeps blinking

if someone were to add a card after the 5 seconds, no hot-add is processed
since we don't get notifications for PDC events when ATTN exists.

Can we automatically cancel the blinking and return slot back to OFF_STATE?

This way we don't need another button press to first cancel, and restart
add via another button press?

According to section 6.7.1.5 Attention Button:

Once the power indicator begins blinking, a 5 second abort interval exists
during which a second depression of the attention button cancels the
operation.

If the operation initiated by the attention button fails for any reason, it
is recommended that system software present an error message explaining
failure via a software user interface, or add the error message to system
log.

Seems like we can cancel the blinking and return back to power off state,
since the attention button press wasn't successful in adding anything?

Alternately we can also choose to subscribe to PDC, but ignore if slot is
in OFF_STATE. So we let ATTN drive the add. But if PDC happens and we are
in BLINKINGON_STATE, then we can process the hot-add? Spec says a software
recommendation, but I think the cancel after 5 seconds seems better?

Cheers,
Ashok
On Thu, Dec 03, 2020 at 02:51:24PM -0800, Raj, Ashok wrote:
> - Press ATTN,
> - Slot is empty
> - After 5 seconds synthetic PDC arrives.
>   but since no presence and no link active, we leave slot in
>   BLINKINGON_STATE, and led keeps blinking
>
> if someone were to add a card after the 5 seconds, no hot-add is processed
> since we don't get notifications for PDC events when ATTN exists.
>
> Can we automatically cancel the blinking and return slot back to OFF_STATE?

Yes.

> If the operation initiated by the attention button fails for any reason, it
> is recommended that system software present an error message explaining
> failure via a software user interface, or add the error message to system
> log.

Ah so we're supposed to log a message if the slot is empty.
That needs to be added then to the code snippet I sent you
yesterday in response to your off-list e-mail.

> Alternately we can also choose to subscribe to PDC, but ignore if slot is
> in OFF_STATE. So we let ATTN drive the add. But if PDC happens and we are
> in BLINKINGON_STATE, then we can process the hot-add? Spec says a software
> recommendation, but i think the cancel after 5 seconds seems better?

That approach seems more complicated.  It's better to stop blinking
and return to OFF_STATE if after the 5 second interval the slot is
found to be empty.

Thanks,

Lukas
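The behavior agreed on in this exchange can be sketched as a small state transition. This is a rough model, not the snippet mentioned above (which is not part of this thread): blink_interval_expired() is a hypothetical function, the enum is a reduced local stand-in for the driver's slot states, and the log text is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Reduced, local stand-in for the driver's slot states. */
enum slot_state {
	OFF_STATE,
	BLINKINGON_STATE,
	POWERON_STATE,
};

/*
 * Hypothetical model of what happens when the 5 second blink interval
 * after an Attention Button press has elapsed.
 */
enum slot_state blink_interval_expired(enum slot_state state,
				       bool present, bool link_active)
{
	/* Only a slot that is blinking towards power-on is affected. */
	if (state != BLINKINGON_STATE)
		return state;

	if (!present && !link_active) {
		/* Empty slot: log the failure, stop blinking, go back to off. */
		fprintf(stderr, "Slot: no card found after button press, canceling\n");
		return OFF_STATE;
	}

	/* Occupied slot or active link: continue with bring-up. */
	return POWERON_STATE;
}
```

With this transition, an empty slot no longer stays in BLINKINGON_STATE, so a later button press starts a fresh hot-add attempt instead of first having to cancel the old one.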
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index 9f85815b4f53..aa8b187ff769 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -224,9 +224,22 @@ void pciehp_handle_disable_request(struct controller *ctrl)
 	ctrl->request_result = pciehp_disable_slot(ctrl, SAFE_REMOVAL);
 }
 
+static bool latch_closed(struct controller *ctrl)
+{
+	u8 getstatus = 0;
+
+	if (!MRL_SENS(ctrl))
+		return true;
+
+	pciehp_get_latch_status(ctrl, &getstatus);
+
+	return (getstatus == 0 ? true : false);
+}
+
 void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 {
 	int present, link_active;
+	u8 getstatus = 0;
 
 	/*
 	 * If the slot is on and presence or link has changed, turn it off.
@@ -246,6 +259,20 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 		if (events & PCI_EXP_SLTSTA_PDC)
 			ctrl_info(ctrl, "Slot(%s): Card not present\n",
 				  slot_name(ctrl));
+		if (events & PCI_EXP_SLTSTA_MRLSC)
+			ctrl_info(ctrl, "Slot(%s): Latch %s\n",
+				  slot_name(ctrl), getstatus ? "Open" : "Closed");
+		/*
+		 * PCIe Base Spec 5.0 Chapter 6.7.1.3 states.
+		 *
+		 * If an MRL Sensor is implemented without a corresponding MRL Sensor input
+		 * on the Hot-Plug Controller, it is recommended that the MRL Sensor be
+		 * routed to power fault input of the Hot-Plug Controller.
+		 * This allows an active adapter to be powered off when the MRL is opened."
+		 *
+		 * This seems to suggest that the slot should be brought down as soon as MRL
+		 * is opened.
+		 */
 		pciehp_disable_slot(ctrl, SURPRISE_REMOVAL);
 		break;
 	default:
@@ -257,7 +284,7 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 	mutex_lock(&ctrl->state_lock);
 	present = pciehp_card_present(ctrl);
 	link_active = pciehp_check_link_active(ctrl);
-	if (present <= 0 && link_active <= 0) {
+	if ((present <= 0 && link_active <= 0) || !latch_closed(ctrl)) {
 		mutex_unlock(&ctrl->state_lock);
 		return;
 	}
@@ -275,6 +302,13 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 		if (link_active)
 			ctrl_info(ctrl, "Slot(%s): Link Up\n",
 				  slot_name(ctrl));
+		/*
+		 * If slot is closed && ATTN button exists
+		 * don't continue, let the ATTN button
+		 * drive the hot-plug
+		 */
+		if (((events & PCI_EXP_SLTSTA_MRLSC) && ATTN_BUTTN(ctrl)))
+			return;
 		ctrl->request_result = pciehp_enable_slot(ctrl);
 		break;
 	default:
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 53433b37e181..7cfa27bcf951 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -605,7 +605,7 @@ static irqreturn_t pciehp_isr(int irq, void *dev_id)
 	 */
 	status &= PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PFD |
 		  PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_CC |
-		  PCI_EXP_SLTSTA_DLLSC;
+		  PCI_EXP_SLTSTA_DLLSC | PCI_EXP_SLTSTA_MRLSC;
 
 	/*
 	 * If we've already reported a power fault, don't report it again
@@ -710,8 +710,10 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
 	down_read(&ctrl->reset_lock);
 	if (events & DISABLE_SLOT)
 		pciehp_handle_disable_request(ctrl);
-	else if (events & (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC))
+	else if (events & (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC |
+			   PCI_EXP_SLTSTA_MRLSC))
 		pciehp_handle_presence_or_link_change(ctrl, events);
+
 	up_read(&ctrl->reset_lock);
 
 	ret = IRQ_HANDLED;
@@ -768,6 +770,14 @@ static void pcie_enable_notification(struct controller *ctrl)
 		cmd |= PCI_EXP_SLTCTL_ABPE;
 	else
 		cmd |= PCI_EXP_SLTCTL_PDCE;
+
+	/*
+	 * If MRL sensor is present, then subscribe for MRL
+	 * Changes to be notified as well.
+	 */
+	if (MRL_SENS(ctrl))
+		cmd |= PCI_EXP_SLTCTL_MRLSCE;
+
 	if (!pciehp_poll_mode)
 		cmd |= PCI_EXP_SLTCTL_HPIE | PCI_EXP_SLTCTL_CCIE;
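As a reading aid for the latch_closed() helper in the patch above, here is a standalone sketch of the same decision. It is an approximation under stated assumptions: struct toy_ctrl, toy_latch_closed() and toy_latch_name() are hypothetical stand-ins for the driver's controller structure and accessors, and the sketch assumes the latch state is read from the MRL Sensor State bit of the Slot Status register, where a set bit means the latch is open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MRL Sensor State bit in Slot Status (pci_regs.h): set means open. */
#define PCI_EXP_SLTSTA_MRLSS	0x0020

/* Hypothetical stand-in for the parts of struct controller used here. */
struct toy_ctrl {
	bool has_mrl_sensor;	/* models MRL_SENS(ctrl) */
	uint16_t slot_status;	/* models the Slot Status register */
};

/* A slot without an MRL sensor is treated as always closed. */
bool toy_latch_closed(const struct toy_ctrl *ctrl)
{
	if (!ctrl->has_mrl_sensor)
		return true;

	return !(ctrl->slot_status & PCI_EXP_SLTSTA_MRLSS);
}

/* Derive the log text from the sensor state rather than a constant. */
const char *toy_latch_name(const struct toy_ctrl *ctrl)
{
	return toy_latch_closed(ctrl) ? "Closed" : "Open";
}
```

Returning the boolean expression directly avoids the "? true : false" construct, and taking the log text from the sensor state sidesteps the always-zero local variable noted in the review.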
When Mechanical Retention Lock (MRL) is present, Linux doesn't process
those change events.

Support for these can be found starting Icelake Server.

The following changes need to be enabled when MRL is present.

1. Subscribe to MRL change events in SlotControl.
2. When MRL is closed,
   - If there is no ATTN button, then POWER on the slot.
   - If there is ATTN button, and an MRL event pending, ignore
     Presence Detect. Since we want ATTN button to drive the
     hotplug event.
   - If currently slot is powered on, but MRL is open,
     PCIe Base Spec 5.0 Chapter 6.7.1.3 states.

     "If an MRL Sensor is implemented without a corresponding MRL Sensor
     input on the Hot-Plug Controller, it is recommended that the MRL
     Sensor be routed to power fault input of the Hot-Plug Controller.
     This allows an active adapter to be powered off when the MRL is
     opened."

     This seems to suggest that the slot should be brought down as soon
     as MRL is opened.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Co-developed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
Changes since v1:
- Changes suggested by Lukas Wunner
  https://lore.kernel.org/linux-pci/20201119223749.GA103783@otc-nc-03/T/#m1f661ae901e7dedad73dea370bb63abd52c610eb
- Consolidate MRL handling in pciehp_handle_presence_or_link_change()
- Added helper latch_closed()
- Add comments why MRL open should function as hot-remove.
- Don't nuke PDC, it might mask a button PUSH synthesized event after 5 secs.
- Bjorn: Fix Subject to be consistent with other commits.
---
 drivers/pci/hotplug/pciehp_ctrl.c | 36 +++++++++++++++++++++++++++++++++++-
 drivers/pci/hotplug/pciehp_hpc.c  | 14 ++++++++++++--
 2 files changed, 47 insertions(+), 3 deletions(-)