Message ID | 20210823184919.3412-1-jonathan.derrick@intel.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | [v2] PCI: pciehp: Quirk to ignore spurious DLLSC when off | expand |
Hi Jon, Thank you for the patch! Yet something to improve: [auto build test ERROR on pci/next] [also build test ERROR on v5.14-rc7 next-20210823] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Jon-Derrick/PCI-pciehp-Quirk-to-ignore-spurious-DLLSC-when-off/20210824-025108 base: https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next config: arm-randconfig-r021-20210822 (attached as .config) compiler: arm-linux-gnueabi-gcc (GCC) 11.2.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/64e85d3cc3a64aea26c4b89c9661ee4eef99ac5e git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Jon-Derrick/PCI-pciehp-Quirk-to-ignore-spurious-DLLSC-when-off/20210824-025108 git checkout 64e85d3cc3a64aea26c4b89c9661ee4eef99ac5e # save the attached .config to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=arm SHELL=/bin/bash drivers/ If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): drivers/pci/quirks.c: In function 'shared_pcc_and_link_slot': >> drivers/pci/quirks.c:5778:31: error: 'struct pci_dev' has no member named 'shared_pcc_and_link_slot' 5778 | parent->shared_pcc_and_link_slot = 1; | ^~ vim +5778 drivers/pci/quirks.c 5745 5746 static void apex_pci_fixup_class(struct pci_dev *pdev) 5747 { 5748 pdev->class = (PCI_CLASS_SYSTEM_OTHER << 8) | pdev->class; 5749 } 5750 DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a, 5751 PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class); 5752 5753 /* 5754 * This is a special card that sits in a x8 pciehp slot but is bifurcated as 5755 * a x4x4 and manifests as two slots with respect to PCIe hot plug register 5756 * states. However, the hotplug controller treats these slots as a single x8 5757 * slot for link and power. Either one of the two slots can be powered down 5758 * separately but real power and link will be active till the last of the two 5759 * slots is powered down. When the last of the two x4 slots is turned off, 5760 * power and link will be turned off for the x8 slot by the HP controller. 5761 * This configuration causes some interesting behavior in bringup sequence 5762 * 5763 * When the second slot is powered off to remove the card, this will cause 5764 * the link to go down for both x4 slots. So, the x4 that is already powered 5765 * down earlier will see a DLLSC event and attempt to bring itself up (card 5766 * present, link change event, link state is down). Special handling is 5767 * required in pciehp_handle_presence_or_link_change to prevent this unintended 5768 * bring up 5769 * 5770 */ 5771 static void shared_pcc_and_link_slot(struct pci_dev *pdev) 5772 { 5773 struct pci_dev *parent = pci_upstream_bridge(pdev); 5774 5775 if (pdev->subsystem_vendor == 0x108e && 5776 pdev->subsystem_device == 0x487d) { 5777 if (parent) > 5778 parent->shared_pcc_and_link_slot = 1; --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Subject should start with a verb (see history for examples). "when off"? Needs a little more context to make sense by itself. On Mon, Aug 23, 2021 at 12:49:19PM -0600, Jon Derrick wrote: > From: James Puthukattukaran <james.puthukattukaran@oracle.com> > > When a specific x8 CEM card is bifurcated into x4x4 mode, and the > upstream ports both support hotplugging on each respective x4 device, a > slot management system for the CEM card requires both x4 devices to be > sysfs removed from the OS before it can safely turn-off physical power. > The implications are that Slot Control will display Powered Off status > for the device where the device is actually powered until both ports > have powered off. Apparently this is related to a "specific x8 CEM card"? Please identify it (marketing name, model, etc -- some way a user can tell whether this quirk applies to the hardware he/she is holding), describe it, and make a case for why we care about it. E.g., if this is a shipping product, we probably do care; if it's just a lab fixture, maybe not. > When power is removed from the first half, real power and link remains > active while waiting for the second half to have power removed. When > power is then removed from the second half, the first half starts > shutdown sequence and will trigger a DLLSC event. This is misinterpreted > as an enabling event and causes the first half to be re-enabled. > > The spurious enable can be resolved by ignoring link status change > events when no link is active when in the off state. This patch adds a > quirk for the card. > > Acked-by: Jon Derrick <jonathan.derrick@intel.com> > Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com> > --- > v1->v2: Device-specific quirk > > drivers/pci/hotplug/pciehp_ctrl.c | 7 +++++++ > drivers/pci/quirks.c | 30 ++++++++++++++++++++++++++++++ > include/linux/pci.h | 1 + > 3 files changed, 38 insertions(+) > > diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c > index 529c34808440..db41f78bfac8 100644 > --- a/drivers/pci/hotplug/pciehp_ctrl.c > +++ b/drivers/pci/hotplug/pciehp_ctrl.c > @@ -225,6 +225,7 @@ void pciehp_handle_disable_request(struct controller *ctrl) > void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events) > { > int present, link_active; > + struct pci_dev *pdev = ctrl->pcie->port; > > /* > * If the slot is on and presence or link has changed, turn it off. > @@ -265,6 +266,12 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events) > cancel_delayed_work(&ctrl->button_work); > fallthrough; > case OFF_STATE: > + if (pdev->shared_pcc_and_link_slot && > + (events & PCI_EXP_SLTSTA_DLLSC) && !link_active) { > + mutex_unlock(&ctrl->state_lock); > + break; > + } > + > ctrl->state = POWERON_STATE; > mutex_unlock(&ctrl->state_lock); > if (present) > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > index 43fbf55871ef..92a5bae8926e 100644 > --- a/drivers/pci/quirks.c > +++ b/drivers/pci/quirks.c > @@ -5749,3 +5749,33 @@ static void apex_pci_fixup_class(struct pci_dev *pdev) > } > DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a, > PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class); > + > +/* > + * This is a special card that sits in a x8 pciehp slot but is bifurcated as > + * a x4x4 and manifests as two slots with respect to PCIe hot plug register > + * states. However, the hotplug controller treats these slots as a single x8 > + * slot for link and power. Either one of the two slots can be powered down > + * separately but real power and link will be active till the last of the two > + * slots is powered down. When the last of the two x4 slots is turned off, > + * power and link will be turned off for the x8 slot by the HP controller. > + * This configuration causes some interesting behavior in bringup sequence > + * > + * When the second slot is powered off to remove the card, this will cause > + * the link to go down for both x4 slots. So, the x4 that is already powered > + * down earlier will see a DLLSC event and attempt to bring itself up (card > + * present, link change event, link state is down). Special handling is > + * required in pciehp_handle_presence_or_link_change to prevent this unintended > + * bring up > + * > + */ > +static void shared_pcc_and_link_slot(struct pci_dev *pdev) > +{ > + struct pci_dev *parent = pci_upstream_bridge(pdev); > + > + if (pdev->subsystem_vendor == 0x108e && > + pdev->subsystem_device == 0x487d) { > + if (parent) > + parent->shared_pcc_and_link_slot = 1; > + } > +} > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0B60, shared_pcc_and_link_slot); > diff --git a/include/linux/pci.h b/include/linux/pci.h > index e752cc39a1fe..ba84f7c93c31 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -469,6 +469,7 @@ struct pci_dev { > > #ifdef CONFIG_HOTPLUG_PCI_PCIE > unsigned int broken_cmd_compl:1; /* No compl for some cmds */ > + unsigned int shared_pcc_and_link_slot:1; > #endif > #ifdef CONFIG_PCIE_PTM > unsigned int ptm_root:1; > -- > 2.27.0 >
On Mon, Aug 23, 2021 at 12:49:19PM -0600, Derrick, Jonathan wrote: > From: James Puthukattukaran <james.puthukattukaran@oracle.com> > > When a specific x8 CEM card is bifurcated into x4x4 mode, and the > upstream ports both support hotplugging on each respective x4 device, a > slot management system for the CEM card requires both x4 devices to be > sysfs removed from the OS before it can safely turn-off physical power. sysfs removed from the OS seems a bit confusing.. Do you mean removed via sysfs by echo 0 > power? > The implications are that Slot Control will display Powered Off status > for the device where the device is actually powered until both ports > have powered off. > > When power is removed from the first half, real power and link remains > active while waiting for the second half to have power removed. When > power is then removed from the second half, the first half starts > shutdown sequence and will trigger a DLLSC event. This is misinterpreted > as an enabling event and causes the first half to be re-enabled. > > The spurious enable can be resolved by ignoring link status change > events when no link is active when in the off state. This patch adds a > quirk for the card. > > Acked-by: Jon Derrick <jonathan.derrick@intel.com> > Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com> > --- > v1->v2: Device-specific quirk > > drivers/pci/hotplug/pciehp_ctrl.c | 7 +++++++ > drivers/pci/quirks.c | 30 ++++++++++++++++++++++++++++++ > include/linux/pci.h | 1 + > 3 files changed, 38 insertions(+) > > diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c > index 529c34808440..db41f78bfac8 100644 > --- a/drivers/pci/hotplug/pciehp_ctrl.c > +++ b/drivers/pci/hotplug/pciehp_ctrl.c > @@ -225,6 +225,7 @@ void pciehp_handle_disable_request(struct controller *ctrl) > void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events) > { > int present, link_active; > + struct pci_dev *pdev = ctrl->pcie->port; > > /* > * If the slot is on and presence or link has changed, turn it off. > @@ -265,6 +266,12 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events) > cancel_delayed_work(&ctrl->button_work); > fallthrough; > case OFF_STATE: > + if (pdev->shared_pcc_and_link_slot && > + (events & PCI_EXP_SLTSTA_DLLSC) && !link_active) { Is it !link_active? Based on the explanation, until we turn off both slots you would see link is still active correct? > + mutex_unlock(&ctrl->state_lock); > + break; > + } > +
On 8/23/21 6:41 PM, Raj, Ashok wrote: > On Mon, Aug 23, 2021 at 12:49:19PM -0600, Derrick, Jonathan wrote: >> From: James Puthukattukaran <james.puthukattukaran@oracle.com> >> >> When a specific x8 CEM card is bifurcated into x4x4 mode, and the >> upstream ports both support hotplugging on each respective x4 device, a >> slot management system for the CEM card requires both x4 devices to be >> sysfs removed from the OS before it can safely turn-off physical power. > > sysfs removed from the OS seems a bit confusing.. Do you mean removed via > sysfs by echo 0 > power? Yes it seems to be related to slot power and not just virtual removal > >> The implications are that Slot Control will display Powered Off status >> for the device where the device is actually powered until both ports >> have powered off. >> >> When power is removed from the first half, real power and link remains >> active while waiting for the second half to have power removed. When >> power is then removed from the second half, the first half starts >> shutdown sequence and will trigger a DLLSC event. This is misinterpreted >> as an enabling event and causes the first half to be re-enabled. >> >> The spurious enable can be resolved by ignoring link status change >> events when no link is active when in the off state. This patch adds a >> quirk for the card. >> >> Acked-by: Jon Derrick <jonathan.derrick@intel.com> >> Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com> >> --- >> v1->v2: Device-specific quirk >> >> drivers/pci/hotplug/pciehp_ctrl.c | 7 +++++++ >> drivers/pci/quirks.c | 30 ++++++++++++++++++++++++++++++ >> include/linux/pci.h | 1 + >> 3 files changed, 38 insertions(+) >> >> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c >> index 529c34808440..db41f78bfac8 100644 >> --- a/drivers/pci/hotplug/pciehp_ctrl.c >> +++ b/drivers/pci/hotplug/pciehp_ctrl.c >> @@ -225,6 +225,7 @@ void pciehp_handle_disable_request(struct controller *ctrl) >> void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events) >> { >> int present, link_active; >> + struct pci_dev *pdev = ctrl->pcie->port; >> >> /* >> * If the slot is on and presence or link has changed, turn it off. >> @@ -265,6 +266,12 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events) >> cancel_delayed_work(&ctrl->button_work); >> fallthrough; >> case OFF_STATE: >> + if (pdev->shared_pcc_and_link_slot && >> + (events & PCI_EXP_SLTSTA_DLLSC) && !link_active) { > > Is it !link_active? Based on the explanation, until we turn off both slots > you would see link is still active correct? My understanding is the internal link is active but link_active shows negative status while the one side is faking being 'powered off'. Then if a DLLSC occurs while the slot is already OFF_STATE and !link_active, I believe we can assume it was previously !link_active as well and it's an event from the other side. > >> + mutex_unlock(&ctrl->state_lock); >> + break; >> + } >> +
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c index 529c34808440..db41f78bfac8 100644 --- a/drivers/pci/hotplug/pciehp_ctrl.c +++ b/drivers/pci/hotplug/pciehp_ctrl.c @@ -225,6 +225,7 @@ void pciehp_handle_disable_request(struct controller *ctrl) void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events) { int present, link_active; + struct pci_dev *pdev = ctrl->pcie->port; /* * If the slot is on and presence or link has changed, turn it off. @@ -265,6 +266,12 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events) cancel_delayed_work(&ctrl->button_work); fallthrough; case OFF_STATE: + if (pdev->shared_pcc_and_link_slot && + (events & PCI_EXP_SLTSTA_DLLSC) && !link_active) { + mutex_unlock(&ctrl->state_lock); + break; + } + ctrl->state = POWERON_STATE; mutex_unlock(&ctrl->state_lock); if (present) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 43fbf55871ef..92a5bae8926e 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5749,3 +5749,33 @@ static void apex_pci_fixup_class(struct pci_dev *pdev) } DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a, PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class); + +/* + * This is a special card that sits in a x8 pciehp slot but is bifurcated as + * a x4x4 and manifests as two slots with respect to PCIe hot plug register + * states. However, the hotplug controller treats these slots as a single x8 + * slot for link and power. Either one of the two slots can be powered down + * separately but real power and link will be active till the last of the two + * slots is powered down. When the last of the two x4 slots is turned off, + * power and link will be turned off for the x8 slot by the HP controller. + * This configuration causes some interesting behavior in bringup sequence + * + * When the second slot is powered off to remove the card, this will cause + * the link to go down for both x4 slots. So, the x4 that is already powered + * down earlier will see a DLLSC event and attempt to bring itself up (card + * present, link change event, link state is down). Special handling is + * required in pciehp_handle_presence_or_link_change to prevent this unintended + * bring up + * + */ +static void shared_pcc_and_link_slot(struct pci_dev *pdev) +{ + struct pci_dev *parent = pci_upstream_bridge(pdev); + + if (pdev->subsystem_vendor == 0x108e && + pdev->subsystem_device == 0x487d) { + if (parent) + parent->shared_pcc_and_link_slot = 1; + } +} +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0B60, shared_pcc_and_link_slot); diff --git a/include/linux/pci.h b/include/linux/pci.h index e752cc39a1fe..ba84f7c93c31 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -469,6 +469,7 @@ struct pci_dev { #ifdef CONFIG_HOTPLUG_PCI_PCIE unsigned int broken_cmd_compl:1; /* No compl for some cmds */ + unsigned int shared_pcc_and_link_slot:1; #endif #ifdef CONFIG_PCIE_PTM unsigned int ptm_root:1;