Message ID | 20241003132503.2279433-1-ajayagarwal@google.com (mailing list archive) |
---|---|
State | Superseded |
Series | [v2] PCI/ASPM: Disable L1 before disabling L1ss |
On Thu, Oct 03, 2024 at 06:55:03PM +0530, Ajay Agarwal wrote:
> The current sequence in the driver for L1ss update is as follows.
>
> Disable L1ss
> Disable L1
> Enable L1ss as required
> Enable L1 if required
>
> With this sequence, a bus hang is observed during the L1ss
> disable sequence when the RC CPU attempts to clear the RC L1ss
> register after clearing the EP L1ss register.

Thanks for this. What exactly does the bus hang look like to a user?

I guess the problem happens in pcie_config_aspm_l1ss(), where we do:

  pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)
  pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)

where clearing the child (endpoint) PCI_L1SS_CTL1_L1_2_MASK works, but
something goes wrong when clearing the parent (RP) mask? The
clear_and_set will do a read followed by a write, and one of those
causes some kind of error?

> It looks like the
> RC attempts to enter L1ss again and at the same time, access to
> RC L1ss register fails because aux clk is still not active.

I assume "access to RC L1ss register fails" means something like
"reading the Root Port PCI_L1SS_CTL1 register returns ~0" which I
guess would be the read part of the pci_clear_and_set_config_dword()?

~0 data might be returned because of some PCIe error like Unsupported
Request, Completion Timeout, etc? Such an error should be logged in
the AER Capability.

This *sounds* like it would be a hardware defect in the Root Port.
This register is on the upstream end of the link, so I would think it
would be readable no matter what state the link is in.

Sec 5.5.4 requires that L1 be disabled in PCI_EXP_LNKCTL while
*setting* either of the ASPM L1 PM Substates enable bits. I don't see
anything there about requiring that for *clearing* those enable bits.
But maybe it is required, and in any event I guess it's simpler to do
it as you do here and have L1 (indeed *all* ASPM) disabled while
configuring L1 SS.

> PCIe spec r6.2, section 5.5.4, recommends that setting either
> or both of the enable bits for ASPM L1 PM Substates must be done
> while ASPM L1 is disabled. My interpretation here is that
> clearing L1ss should also be done when L1 is disabled. Thereby,
> change the sequence as follows.
>
> Disable L1
> Disable L1ss
> Enable L1ss as required
> Enable L1 if required
>
> Signed-off-by: Ajay Agarwal <ajayagarwal@google.com>
> ---
>  drivers/pci/pcie/aspm.c | 50 ++++++++++++++++++++---------------------
>  1 file changed, 24 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index cee2365e54b8..c172886129f3 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -848,17 +848,13 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
>  /* Configure the ASPM L1 substates */
>  static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
>  {
> -	u32 val, enable_req;
> +	u32 val;
>  	struct pci_dev *child = link->downstream, *parent = link->pdev;
>
> -	enable_req = (link->aspm_enabled ^ state) & state;
> -
>  	/*
> -	 * Here are the rules specified in the PCIe spec for enabling L1SS:
> +	 * Spec r6.2, section 5.5.4, mentions the rules for enabling L1SS:
>  	 * - When enabling L1.x, enable bit at parent first, then at child
>  	 * - When disabling L1.x, disable bit at child first, then at parent
> -	 * - When enabling ASPM L1.x, need to disable L1
> -	 *   (at child followed by parent).
>  	 * - The ASPM/PCIPM L1.2 must be disabled while programming timing
>  	 *   parameters
>  	 *
> @@ -871,16 +867,6 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
>  				       PCI_L1SS_CTL1_L1SS_MASK, 0);
>  	pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
>  				       PCI_L1SS_CTL1_L1SS_MASK, 0);
> -	/*
> -	 * If needed, disable L1, and it gets enabled later
> -	 * in pcie_config_aspm_link().
> -	 */
> -	if (enable_req & (PCIE_LINK_STATE_L1_1 | PCIE_LINK_STATE_L1_2)) {
> -		pcie_capability_clear_word(child, PCI_EXP_LNKCTL,
> -					   PCI_EXP_LNKCTL_ASPM_L1);
> -		pcie_capability_clear_word(parent, PCI_EXP_LNKCTL,
> -					   PCI_EXP_LNKCTL_ASPM_L1);
> -	}
>
>  	val = 0;
>  	if (state & PCIE_LINK_STATE_L1_1)
> @@ -937,21 +923,33 @@ static void pcie_config_aspm_link(struct pcie_link_state *link, u32 state)
>  		dwstream |= PCI_EXP_LNKCTL_ASPM_L1;
>  	}
>
> +	/*
> +	 * Spec r6.2, section 5.5.4, recommends that setting either or both of
> +	 * the enable bits for ASPM L1 PM Substates must be done while ASPM L1
> +	 * is disabled. So disable L1 here, and it gets enabled later after the
> +	 * L1ss configuration has been completed.
> +	 *
> +	 * Spec r6.2, section 7.5.3.7, mentions that ASPM L1 must be enabled by
> +	 * software in the Upstream component on a Link prior to enabling ASPM
> +	 * L1 in the Downstream component on the Link. When disabling L1,
> +	 * software must disable ASPM L1 in the Downstream component on a Link
> +	 * prior to disabling ASPM L1 in the Upstream component on that Link.
> +	 *
> +	 * Spec doesn't mention L0s.
> +	 *
> +	 * Disable L1 and L0s here, and they get enabled later after the L1ss
> +	 * configuration has been completed.
> +	 */
> +	list_for_each_entry(child, &linkbus->devices, bus_list)
> +		pcie_config_aspm_dev(child, 0);
> +	pcie_config_aspm_dev(parent, 0);
> +
>  	if (link->aspm_capable & PCIE_LINK_STATE_L1SS)
>  		pcie_config_aspm_l1ss(link, state);
>
> -	/*
> -	 * Spec 2.0 suggests all functions should be configured the
> -	 * same setting for ASPM. Enabling ASPM L1 should be done in
> -	 * upstream component first and then downstream, and vice
> -	 * versa for disabling ASPM L1. Spec doesn't mention L0S.
> -	 */
> -	if (state & PCIE_LINK_STATE_L1)
> -		pcie_config_aspm_dev(parent, upstream);
> +	pcie_config_aspm_dev(parent, upstream);
>  	list_for_each_entry(child, &linkbus->devices, bus_list)
>  		pcie_config_aspm_dev(child, dwstream);
> -	if (!(state & PCIE_LINK_STATE_L1))
> -		pcie_config_aspm_dev(parent, upstream);

I think the reason for having pcie_config_aspm_dev(parent) both before
and after configuring the children is because pcie_config_aspm_link()
may be called either to enable L1 or to disable it.

I guess your change always disables ASPM completely (disabling the
downstream (child) component first, then the upstream), and here we
are either leaving L1 disabled or enabling it, and in either case it
should be safe to configure the upstream (parent) component first,
then the downstream one.

Of course, we may also enable L0s here, and AFAICS it should always be
safe to do that in the upstream component first, followed by the
downstream one.

Bottom line, this looks good to me, and I think it's nice that this
removes the "parent then child" or "child then parent" logic here.

>  	link->aspm_enabled = state;
>
> --
> 2.46.1.824.gd892dcdcdd-goog
>
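Editorial sketch: the reply above notes that a clear-and-set helper performs a config read followed by a config write, so either access could be the one that stalls. The userspace model below only illustrates that read-modify-write shape. It is not the kernel's pci_clear_and_set_config_dword(); the demo_* names, the counters, and the mask value are assumptions made for illustration.

```c
#include <stdint.h>

/* Illustrative mask covering the L1SS enable bits (assumed value). */
#define DEMO_L1SS_MASK 0x0000000fu

/* Access counters, so the one-read-then-one-write shape is observable. */
static unsigned int demo_reads;
static unsigned int demo_writes;

/* Stand-in for a config read; a real one could stall on the bus. */
static uint32_t demo_read(const uint32_t *reg)
{
	demo_reads++;
	return *reg;
}

/* Stand-in for a config write. */
static void demo_write(uint32_t *reg, uint32_t val)
{
	demo_writes++;
	*reg = val;
}

/* Read the register, clear the 'clear' bits, set the 'set' bits, write back. */
static void demo_clear_and_set(uint32_t *reg, uint32_t clear, uint32_t set)
{
	uint32_t val = demo_read(reg);

	val &= ~clear;
	val |= set;
	demo_write(reg, val);
}
```

In this model, clearing the enable bits is demo_clear_and_set(&reg, DEMO_L1SS_MASK, 0). Per the follow-up in this thread, the access that hangs on the affected hardware is the read half.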
On Thu, Oct 03, 2024 at 12:01:22PM -0500, Bjorn Helgaas wrote:
> On Thu, Oct 03, 2024 at 06:55:03PM +0530, Ajay Agarwal wrote:
> > The current sequence in the driver for L1ss update is as follows.
> >
> > Disable L1ss
> > Disable L1
> > Enable L1ss as required
> > Enable L1 if required
> >
> > With this sequence, a bus hang is observed during the L1ss
> > disable sequence when the RC CPU attempts to clear the RC L1ss
> > register after clearing the EP L1ss register.
>
> Thanks for this. What exactly does the bus hang look like to a user?
>
The CPU is just hung on reading the RC PCI_L1SS_CTL1 register. After
some time, the CPU watchdog expires and the system reboots.

> I guess the problem happens in pcie_config_aspm_l1ss(), where we do:
>
>   pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)
>   pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)
>
> where clearing the child (endpoint) PCI_L1SS_CTL1_L1_2_MASK works, but
> something goes wrong when clearing the parent (RP) mask? The
> clear_and_set will do a read followed by a write, and one of those
> causes some kind of error?
>
During ASPM disable, in pcie_config_aspm_l1ss(), we do:
1. pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)
2. pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)
3. pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)
4. pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)

We observe that the steps 1 and 2 go through just fine. But the read of
PCI_L1SS_CTL1 register in the step 3 hangs. I am not sure why.
The issue is pretty difficult to reproduce, and adding prints around
these steps masks the issue.

> > It looks like the
> > RC attempts to enter L1ss again and at the same time, access to
> > RC L1ss register fails because aux clk is still not active.
>
> I assume "access to RC L1ss register fails" means something like
> "reading the Root Port PCI_L1SS_CTL1 register returns ~0" which I
> guess would be the read part of the pci_clear_and_set_config_dword()?
>
> ~0 data might be returned because of some PCIe error like Unsupported
> Request, Completion Timeout, etc? Such an error should be logged in
> the AER Capability.
>
This is not a PCIe bus transaction. This is CPU on the RC side
accessing the RC side config register, so the link is not involved at
all. Hence, no timeout or other AER errors logged/reported. The
AXI-DBI bus just hangs.

> This *sounds* like it would be a hardware defect in the Root Port.
> This register is on the upstream end of the link, so I would think it
> would be readable no matter what state the link is in.
>
Exactly. As described above, this is not a PCIe transaction.

> Sec 5.5.4 requires that L1 be disabled in PCI_EXP_LNKCTL while
> *setting* either of the ASPM L1 PM Substates enable bits. I don't see
> anything there about requiring that for *clearing* those enable bits.
> But maybe it is required, and in any event I guess it's simpler to do
> it as you do here and have L1 (indeed *all* ASPM) disabled while
> configuring L1 SS.
>
Right. The spec does not talk about the sequence when one wants to
clear these L1ss bits. But I am interpreting the word "setting" as
"setting to 1" as well as "setting to 0".

> > PCIe spec r6.2, section 5.5.4, recommends that setting either
> > or both of the enable bits for ASPM L1 PM Substates must be done
> > while ASPM L1 is disabled. My interpretation here is that
> > clearing L1ss should also be done when L1 is disabled. Thereby,
> > change the sequence as follows.
> >
> > Disable L1
> > Disable L1ss
> > Enable L1ss as required
> > Enable L1 if required
> >
> > Signed-off-by: Ajay Agarwal <ajayagarwal@google.com>
> > ---
> >  drivers/pci/pcie/aspm.c | 50 ++++++++++++++++++++---------------------
> >  1 file changed, 24 insertions(+), 26 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index cee2365e54b8..c172886129f3 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -848,17 +848,13 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> >  /* Configure the ASPM L1 substates */
> >  static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
> >  {
> > -	u32 val, enable_req;
> > +	u32 val;
> >  	struct pci_dev *child = link->downstream, *parent = link->pdev;
> >
> > -	enable_req = (link->aspm_enabled ^ state) & state;
> > -
> >  	/*
> > -	 * Here are the rules specified in the PCIe spec for enabling L1SS:
> > +	 * Spec r6.2, section 5.5.4, mentions the rules for enabling L1SS:
> >  	 * - When enabling L1.x, enable bit at parent first, then at child
> >  	 * - When disabling L1.x, disable bit at child first, then at parent
> > -	 * - When enabling ASPM L1.x, need to disable L1
> > -	 *   (at child followed by parent).
> >  	 * - The ASPM/PCIPM L1.2 must be disabled while programming timing
> >  	 *   parameters
> >  	 *
> > @@ -871,16 +867,6 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
> >  				       PCI_L1SS_CTL1_L1SS_MASK, 0);
> >  	pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
> >  				       PCI_L1SS_CTL1_L1SS_MASK, 0);
> > -	/*
> > -	 * If needed, disable L1, and it gets enabled later
> > -	 * in pcie_config_aspm_link().
> > -	 */
> > -	if (enable_req & (PCIE_LINK_STATE_L1_1 | PCIE_LINK_STATE_L1_2)) {
> > -		pcie_capability_clear_word(child, PCI_EXP_LNKCTL,
> > -					   PCI_EXP_LNKCTL_ASPM_L1);
> > -		pcie_capability_clear_word(parent, PCI_EXP_LNKCTL,
> > -					   PCI_EXP_LNKCTL_ASPM_L1);
> > -	}
> >
> >  	val = 0;
> >  	if (state & PCIE_LINK_STATE_L1_1)
> > @@ -937,21 +923,33 @@ static void pcie_config_aspm_link(struct pcie_link_state *link, u32 state)
> >  		dwstream |= PCI_EXP_LNKCTL_ASPM_L1;
> >  	}
> >
> > +	/*
> > +	 * Spec r6.2, section 5.5.4, recommends that setting either or both of
> > +	 * the enable bits for ASPM L1 PM Substates must be done while ASPM L1
> > +	 * is disabled. So disable L1 here, and it gets enabled later after the
> > +	 * L1ss configuration has been completed.
> > +	 *
> > +	 * Spec r6.2, section 7.5.3.7, mentions that ASPM L1 must be enabled by
> > +	 * software in the Upstream component on a Link prior to enabling ASPM
> > +	 * L1 in the Downstream component on the Link. When disabling L1,
> > +	 * software must disable ASPM L1 in the Downstream component on a Link
> > +	 * prior to disabling ASPM L1 in the Upstream component on that Link.
> > +	 *
> > +	 * Spec doesn't mention L0s.
> > +	 *
> > +	 * Disable L1 and L0s here, and they get enabled later after the L1ss
> > +	 * configuration has been completed.
> > +	 */
> > +	list_for_each_entry(child, &linkbus->devices, bus_list)
> > +		pcie_config_aspm_dev(child, 0);
> > +	pcie_config_aspm_dev(parent, 0);
> > +
> >  	if (link->aspm_capable & PCIE_LINK_STATE_L1SS)
> >  		pcie_config_aspm_l1ss(link, state);
> >
> > -	/*
> > -	 * Spec 2.0 suggests all functions should be configured the
> > -	 * same setting for ASPM. Enabling ASPM L1 should be done in
> > -	 * upstream component first and then downstream, and vice
> > -	 * versa for disabling ASPM L1. Spec doesn't mention L0S.
> > -	 */
> > -	if (state & PCIE_LINK_STATE_L1)
> > -		pcie_config_aspm_dev(parent, upstream);
> > +	pcie_config_aspm_dev(parent, upstream);
> >  	list_for_each_entry(child, &linkbus->devices, bus_list)
> >  		pcie_config_aspm_dev(child, dwstream);
> > -	if (!(state & PCIE_LINK_STATE_L1))
> > -		pcie_config_aspm_dev(parent, upstream);
>
> I think the reason for having pcie_config_aspm_dev(parent) both before
> and after configuring the children is because pcie_config_aspm_link()
> may be called either to enable L1 or to disable it.
>
> I guess your change always disables ASPM completely (disabling the
> downstream (child) component first, then the upstream), and here we
> are either leaving L1 disabled or enabling it, and in either case it
> should be safe to configure the upstream (parent) component first,
> then the downstream one.
>
> Of course, we may also enable L0s here, and AFAICS it should always be
> safe to do that in the upstream component first, followed by the
> downstream one.
>
> Bottom line, this looks good to me, and I think it's nice that this
> removes the "parent then child" or "child then parent" logic here.
>
Agreed with all the points.

> >  	link->aspm_enabled = state;
> >
> > --
> > 2.46.1.824.gd892dcdcdd-goog
> >
On Thu, Oct 03, 2024 at 10:53:58PM +0530, Ajay Agarwal wrote:
> On Thu, Oct 03, 2024 at 12:01:22PM -0500, Bjorn Helgaas wrote:
> > On Thu, Oct 03, 2024 at 06:55:03PM +0530, Ajay Agarwal wrote:
> > > The current sequence in the driver for L1ss update is as follows.
> > >
> > > Disable L1ss
> > > Disable L1
> > > Enable L1ss as required
> > > Enable L1 if required
> > >
> > > With this sequence, a bus hang is observed during the L1ss
> > > disable sequence when the RC CPU attempts to clear the RC L1ss
> > > register after clearing the EP L1ss register.
> >
> > Thanks for this. What exactly does the bus hang look like to a user?
> >
> The CPU is just hung on reading the RC PCI_L1SS_CTL1 register. After
> some time, the CPU watchdog expires and the system reboots.

Wow. Good to know that this is outside the PCIe domain. I think this
is a good change, and since it is partly motivated by hardware
behavior that might be legal but seems somewhat unusual, can we
identify the hardware (CPU and PCIe Root Complex) involved here?

> > I guess the problem happens in pcie_config_aspm_l1ss(), where we do:
> >
> >   pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)
> >   pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)
> >
> > where clearing the child (endpoint) PCI_L1SS_CTL1_L1_2_MASK works, but
> > something goes wrong when clearing the parent (RP) mask? The
> > clear_and_set will do a read followed by a write, and one of those
> > causes some kind of error?
> >
> During ASPM disable, in pcie_config_aspm_l1ss(), we do:
> 1. pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)
> 2. pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)
> 3. pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)
> 4. pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)
>
> We observe that the steps 1 and 2 go through just fine. But the read of
> PCI_L1SS_CTL1 register in the step 3 hangs. I am not sure why.
> The issue is pretty difficult to reproduce, and adding prints around
> these steps masks the issue.

I guess the L1 disable is between 2 and 3, right? And 3 and 4 may
enable L1 SS (using val, not 0)?

  1. same
  2. same
  2.5 pcie_capability_clear_word(child, PCI_EXP_LNKCTL_ASPM_L1)
  2.6 pcie_capability_clear_word(parent, PCI_EXP_LNKCTL_ASPM_L1)
  3. pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... val)
  4. pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... val)

Bjorn
On Thu, Oct 03, 2024 at 03:23:21PM -0500, Bjorn Helgaas wrote:
> On Thu, Oct 03, 2024 at 10:53:58PM +0530, Ajay Agarwal wrote:
> > On Thu, Oct 03, 2024 at 12:01:22PM -0500, Bjorn Helgaas wrote:
> > > On Thu, Oct 03, 2024 at 06:55:03PM +0530, Ajay Agarwal wrote:
> > > > The current sequence in the driver for L1ss update is as follows.
> > > >
> > > > Disable L1ss
> > > > Disable L1
> > > > Enable L1ss as required
> > > > Enable L1 if required
> > > >
> > > > With this sequence, a bus hang is observed during the L1ss
> > > > disable sequence when the RC CPU attempts to clear the RC L1ss
> > > > register after clearing the EP L1ss register.
> > >
> > > Thanks for this. What exactly does the bus hang look like to a user?
> > >
> > The CPU is just hung on reading the RC PCI_L1SS_CTL1 register. After
> > some time, the CPU watchdog expires and the system reboots.
>
> Wow. Good to know that this is outside the PCIe domain. I think this
> is a good change, and since it is partly motivated by hardware
> behavior that might be legal but seems somewhat unusual, can we
> identify the hardware (CPU and PCIe Root Complex) involved here?
>
The CPU is an ARM A-core. The PCIe RC is a Synopsys Designware core.

> > > I guess the problem happens in pcie_config_aspm_l1ss(), where we do:
> > >
> > >   pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)
> > >   pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)
> > >
> > > where clearing the child (endpoint) PCI_L1SS_CTL1_L1_2_MASK works, but
> > > something goes wrong when clearing the parent (RP) mask? The
> > > clear_and_set will do a read followed by a write, and one of those
> > > causes some kind of error?
> > >
> > During ASPM disable, in pcie_config_aspm_l1ss(), we do:
> > 1. pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)
> > 2. pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)
> > 3. pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... 0)
> > 4. pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... 0)
> >
> > We observe that the steps 1 and 2 go through just fine. But the read of
> > PCI_L1SS_CTL1 register in the step 3 hangs. I am not sure why.
> > The issue is pretty difficult to reproduce, and adding prints around
> > these steps masks the issue.
>
> I guess the L1 disable is between 2 and 3, right? And 3 and 4 may
> enable L1 SS (using val, not 0)?
>
>   1. same
>   2. same
>   2.5 pcie_capability_clear_word(child, PCI_EXP_LNKCTL_ASPM_L1)
>   2.6 pcie_capability_clear_word(parent, PCI_EXP_LNKCTL_ASPM_L1)
>   3. pci_clear_and_set_config_dword(parent->l1ss + PCI_L1SS_CTL1, ... val)
>   4. pci_clear_and_set_config_dword(child->l1ss + PCI_L1SS_CTL1, ... val)
>
That's the sequence when L1ss is enabled. When it is disabled, then
steps 2.5 and 2.6 do not run. And 'val' remains 0.

> Bjorn
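Editorial sketch: the remark that steps 2.5 and 2.6 do not run on the disable path follows from the pre-patch expression enable_req = (link->aspm_enabled ^ state) & state, which keeps only the bits that are newly requested. The userspace model below shows that bit arithmetic in isolation. The DEMO_* flag values and demo_* function names are hypothetical and stand in for the kernel's PCIE_LINK_STATE_* definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for PCIE_LINK_STATE_L1_1 / PCIE_LINK_STATE_L1_2. */
#define DEMO_STATE_L1_1 (1u << 0)
#define DEMO_STATE_L1_2 (1u << 1)

/* Bits that are set in 'state' but were not set in 'enabled'. */
static uint32_t demo_enable_req(uint32_t enabled, uint32_t state)
{
	return (enabled ^ state) & state;
}

/*
 * Mirrors the pre-patch condition guarding the L1 disable: true only
 * when an L1 substate is newly being turned on.
 */
static int demo_would_disable_l1(uint32_t enabled, uint32_t state)
{
	return (demo_enable_req(enabled, state) &
		(DEMO_STATE_L1_1 | DEMO_STATE_L1_2)) != 0;
}
```

With both substates enabled and a requested state of 0, demo_would_disable_l1() returns 0, which matches the observation that L1 stays enabled while the L1ss bits are being cleared in the old sequence.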
On Thu, Oct 03, 2024 at 06:55:03PM +0530, Ajay Agarwal wrote:
> The current sequence in the driver for L1ss update is as follows.
>
> Disable L1ss
> Disable L1
> Enable L1ss as required
> Enable L1 if required
>
> With this sequence, a bus hang is observed during the L1ss
> disable sequence when the RC CPU attempts to clear the RC L1ss
> register after clearing the EP L1ss register. It looks like the
> RC attempts to enter L1ss again and at the same time, access to
> RC L1ss register fails because aux clk is still not active.
>
> PCIe spec r6.2, section 5.5.4, recommends that setting either
> or both of the enable bits for ASPM L1 PM Substates must be done
> while ASPM L1 is disabled. My interpretation here is that
> clearing L1ss should also be done when L1 is disabled. Thereby,
> change the sequence as follows.
>
> Disable L1
> Disable L1ss
> Enable L1ss as required
> Enable L1 if required

I think we also write the L1.2 enable bits in PCI_L1SS_CTL1 in
aspm_calc_l12_info() when ASPM L1 may be enabled:

  pcie_aspm_init_link_state
    pcie_aspm_cap_init
      pcie_capability_read_word(PCI_EXP_LNKCTL)
      aspm_l1ss_init
        aspm_calc_l12_info
          pci_clear_and_set_config_dword(PCI_L1SS_CTL1, PCI_L1SS_CTL1_L1_2_MASK)

That looks like another path where we should make a similar change.
What do you think?

Bjorn
On Fri, Oct 04, 2024 at 06:19:28PM -0500, Bjorn Helgaas wrote:
> On Thu, Oct 03, 2024 at 06:55:03PM +0530, Ajay Agarwal wrote:
> > The current sequence in the driver for L1ss update is as follows.
> >
> > Disable L1ss
> > Disable L1
> > Enable L1ss as required
> > Enable L1 if required
> >
> > With this sequence, a bus hang is observed during the L1ss
> > disable sequence when the RC CPU attempts to clear the RC L1ss
> > register after clearing the EP L1ss register. It looks like the
> > RC attempts to enter L1ss again and at the same time, access to
> > RC L1ss register fails because aux clk is still not active.
> >
> > PCIe spec r6.2, section 5.5.4, recommends that setting either
> > or both of the enable bits for ASPM L1 PM Substates must be done
> > while ASPM L1 is disabled. My interpretation here is that
> > clearing L1ss should also be done when L1 is disabled. Thereby,
> > change the sequence as follows.
> >
> > Disable L1
> > Disable L1ss
> > Enable L1ss as required
> > Enable L1 if required
>
> I think we also write the L1.2 enable bits in PCI_L1SS_CTL1 in
> aspm_calc_l12_info() when ASPM L1 may be enabled:
>
>   pcie_aspm_init_link_state
>     pcie_aspm_cap_init
>       pcie_capability_read_word(PCI_EXP_LNKCTL)
>       aspm_l1ss_init
>         aspm_calc_l12_info
>           pci_clear_and_set_config_dword(PCI_L1SS_CTL1, PCI_L1SS_CTL1_L1_2_MASK)
>
> That looks like another path where we should make a similar change.
> What do you think?
>
I agree. We should make a similar change there. Thanks for pointing
out. Will make the change in the next version.

> Bjorn
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index cee2365e54b8..c172886129f3 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -848,17 +848,13 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 /* Configure the ASPM L1 substates */
 static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
 {
-	u32 val, enable_req;
+	u32 val;
 	struct pci_dev *child = link->downstream, *parent = link->pdev;
 
-	enable_req = (link->aspm_enabled ^ state) & state;
-
 	/*
-	 * Here are the rules specified in the PCIe spec for enabling L1SS:
+	 * Spec r6.2, section 5.5.4, mentions the rules for enabling L1SS:
 	 * - When enabling L1.x, enable bit at parent first, then at child
 	 * - When disabling L1.x, disable bit at child first, then at parent
-	 * - When enabling ASPM L1.x, need to disable L1
-	 *   (at child followed by parent).
 	 * - The ASPM/PCIPM L1.2 must be disabled while programming timing
 	 *   parameters
 	 *
@@ -871,16 +867,6 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
 				       PCI_L1SS_CTL1_L1SS_MASK, 0);
 	pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
 				       PCI_L1SS_CTL1_L1SS_MASK, 0);
-	/*
-	 * If needed, disable L1, and it gets enabled later
-	 * in pcie_config_aspm_link().
-	 */
-	if (enable_req & (PCIE_LINK_STATE_L1_1 | PCIE_LINK_STATE_L1_2)) {
-		pcie_capability_clear_word(child, PCI_EXP_LNKCTL,
-					   PCI_EXP_LNKCTL_ASPM_L1);
-		pcie_capability_clear_word(parent, PCI_EXP_LNKCTL,
-					   PCI_EXP_LNKCTL_ASPM_L1);
-	}
 
 	val = 0;
 	if (state & PCIE_LINK_STATE_L1_1)
@@ -937,21 +923,33 @@ static void pcie_config_aspm_link(struct pcie_link_state *link, u32 state)
 		dwstream |= PCI_EXP_LNKCTL_ASPM_L1;
 	}
 
+	/*
+	 * Spec r6.2, section 5.5.4, recommends that setting either or both of
+	 * the enable bits for ASPM L1 PM Substates must be done while ASPM L1
+	 * is disabled. So disable L1 here, and it gets enabled later after the
+	 * L1ss configuration has been completed.
+	 *
+	 * Spec r6.2, section 7.5.3.7, mentions that ASPM L1 must be enabled by
+	 * software in the Upstream component on a Link prior to enabling ASPM
+	 * L1 in the Downstream component on the Link. When disabling L1,
+	 * software must disable ASPM L1 in the Downstream component on a Link
+	 * prior to disabling ASPM L1 in the Upstream component on that Link.
+	 *
+	 * Spec doesn't mention L0s.
+	 *
+	 * Disable L1 and L0s here, and they get enabled later after the L1ss
+	 * configuration has been completed.
+	 */
+	list_for_each_entry(child, &linkbus->devices, bus_list)
+		pcie_config_aspm_dev(child, 0);
+	pcie_config_aspm_dev(parent, 0);
+
 	if (link->aspm_capable & PCIE_LINK_STATE_L1SS)
 		pcie_config_aspm_l1ss(link, state);
 
-	/*
-	 * Spec 2.0 suggests all functions should be configured the
-	 * same setting for ASPM. Enabling ASPM L1 should be done in
-	 * upstream component first and then downstream, and vice
-	 * versa for disabling ASPM L1. Spec doesn't mention L0S.
-	 */
-	if (state & PCIE_LINK_STATE_L1)
-		pcie_config_aspm_dev(parent, upstream);
+	pcie_config_aspm_dev(parent, upstream);
 	list_for_each_entry(child, &linkbus->devices, bus_list)
 		pcie_config_aspm_dev(child, dwstream);
-	if (!(state & PCIE_LINK_STATE_L1))
-		pcie_config_aspm_dev(parent, upstream);
 
 	link->aspm_enabled = state;
The current sequence in the driver for L1ss update is as follows.

Disable L1ss
Disable L1
Enable L1ss as required
Enable L1 if required

With this sequence, a bus hang is observed during the L1ss
disable sequence when the RC CPU attempts to clear the RC L1ss
register after clearing the EP L1ss register. It looks like the
RC attempts to enter L1ss again and at the same time, access to
RC L1ss register fails because aux clk is still not active.

PCIe spec r6.2, section 5.5.4, recommends that setting either
or both of the enable bits for ASPM L1 PM Substates must be done
while ASPM L1 is disabled. My interpretation here is that
clearing L1ss should also be done when L1 is disabled. Thereby,
change the sequence as follows.

Disable L1
Disable L1ss
Enable L1ss as required
Enable L1 if required

Signed-off-by: Ajay Agarwal <ajayagarwal@google.com>
---
 drivers/pci/pcie/aspm.c | 50 ++++++++++++++++++++---------------------
 1 file changed, 24 insertions(+), 26 deletions(-)
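Editorial sketch: the two orderings in the commit message can be compared with a small userspace model that counts how often the L1ss control value is written while L1 is still enabled. This is a simplified illustration of the ordering only, not the driver code. The struct, the demo_* names, and the counter are assumptions, and the model ignores the parent/child split and the hardware behaviour behind the reported hang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimal model of one link's ASPM state (hypothetical, for illustration). */
struct demo_link {
	bool l1_enabled;                     /* models the ASPM L1 enable bit  */
	uint32_t l1ss;                       /* models the L1ss enable bits    */
	unsigned int l1ss_writes_with_l1_on; /* writes made while L1 was on    */
};

static void demo_set_l1(struct demo_link *l, bool on)
{
	l->l1_enabled = on;
}

/* Record any L1ss write that happens while L1 is still enabled. */
static void demo_write_l1ss(struct demo_link *l, uint32_t val)
{
	if (l->l1_enabled)
		l->l1ss_writes_with_l1_on++;
	l->l1ss = val;
}

/* Old order: disable L1ss, disable L1, enable L1ss, enable L1. */
static void demo_old_sequence(struct demo_link *l, uint32_t want_l1ss,
			      bool want_l1)
{
	demo_write_l1ss(l, 0);
	demo_set_l1(l, false);
	if (want_l1ss)
		demo_write_l1ss(l, want_l1ss);
	demo_set_l1(l, want_l1);
}

/* New order: disable L1, disable L1ss, enable L1ss, enable L1. */
static void demo_new_sequence(struct demo_link *l, uint32_t want_l1ss,
			      bool want_l1)
{
	demo_set_l1(l, false);
	demo_write_l1ss(l, 0);
	if (want_l1ss)
		demo_write_l1ss(l, want_l1ss);
	demo_set_l1(l, want_l1);
}
```

Starting from a link with L1 and L1ss both enabled, the old order performs its first L1ss write with L1 still on, while the new order never does; both end in the same final state.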