Message ID | 20230120091540.3305-1-vidyas@nvidia.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Bjorn Helgaas |
Series | [V2] PCI/ASPM: Skip L1SS save/restore if not already enabled |
On 1/20/2023 10:15 AM, Vidya Sagar wrote:
> Skip save and restore of ASPM L1 Sub-States specific registers if they
> are not already enabled in the system. This is to avoid issues observed
> on certain platforms during restoration process, particularly when
> restoring the L1SS registers contents.
>
> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216782
> Signed-off-by: Vidya Sagar <vidyas@nvidia.com>

It would be good if the bug reporters could test this.

When that happens, please feel free to add

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

to it.

> ---
> v2:
> * Address review comments from Kai-Heng Feng and Rafael
>
>  drivers/pci/pcie/aspm.c | 17 ++++++++++++++++-
>  include/linux/pci.h     |  1 +
>  2 files changed, 17 insertions(+), 1 deletion(-)
On Fri, Jan 20, 2023 at 02:45:40PM +0530, Vidya Sagar wrote:
> Skip save and restore of ASPM L1 Sub-States specific registers if they
> are not already enabled in the system. This is to avoid issues observed
> on certain platforms during restoration process, particularly when
> restoring the L1SS registers contents.
>
> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216782
> Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
> ---
> v2:
> * Address review comments from Kai-Heng Feng and Rafael
>
>  drivers/pci/pcie/aspm.c | 17 ++++++++++++++++-
>  include/linux/pci.h     |  1 +
>  2 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 53a1fa306e1e..bd2a922081bd 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -761,11 +761,23 @@ void pci_save_aspm_l1ss_state(struct pci_dev *dev)
>  {
>  	struct pci_cap_saved_state *save_state;
>  	u16 l1ss = dev->l1ss;
> -	u32 *cap;
> +	u32 *cap, val;
>
>  	if (!l1ss)
>  		return;
>
> +	/*
> +	 * Skip save and restore of L1 Sub-States registers if they are not
> +	 * already enabled in the system
> +	 */
> +	pci_read_config_dword(dev, l1ss + PCI_L1SS_CTL1, &val);
> +	if (!(val & PCI_L1SS_CTL1_L1SS_MASK)) {
> +		dev->skip_l1ss_restore = true;
> +		return;
> +	}

I think this fix is still problematic.  PCIe r6.0, sec 5.5.4, requires
that

  If setting either or both of the enable bits for ASPM L1 PM
  Substates, both ports must be configured as described in this
  section while ASPM L1 is disabled.

The current Linux code does not observe this because ASPM L1 is
enabled by PCI_EXP_LNKCTL (in the PCIe Capability Link Control
register), while ASPM L1 PM Substate configuration is in PCI_L1SS_CTL1
(in the L1 PM Substates Capability), and these two things are not
integrated:

  pci_restore_state
    pci_restore_aspm_l1ss_state
      aspm_program_l1ss
        pci_write_config_dword(PCI_L1SS_CTL1, ctl1)         # L1SS restore
    pci_restore_pcie_state
      pcie_capability_write_word(PCI_EXP_LNKCTL, cap[i++])  # L1 restore

So I suspect the problem is that we're writing PCI_L1SS_CTL1 while
ASPM L1 is enabled, and the device gets confused somehow.

I think it would be better to change this restore flow to follow that
spec requirement instead of skipping the save/restore like this.

I hesitate to even include the patch below because it's clearly not a
real fix, but if the system does resume and we see this message, it
would be a good clue that this is what's happening.

Bjorn

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 53a1fa306e1e..c8349b1f982f 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -779,7 +779,7 @@ void pci_restore_aspm_l1ss_state(struct pci_dev *dev)
 {
 	struct pci_cap_saved_state *save_state;
 	u32 *cap, ctl1, ctl2;
-	u16 l1ss = dev->l1ss;
+	u16 ctl, l1ss = dev->l1ss;
 
 	if (!l1ss)
 		return;
@@ -788,6 +788,13 @@ void pci_restore_aspm_l1ss_state(struct pci_dev *dev)
 	if (!save_state)
 		return;
 
+	pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &ctl);
+	if (ctl & PCI_EXP_LNKCTL_ASPM_L1) {
+		pci_info(dev, "ASPM: can't restore L1SS while L1 enabled (%#06x)\n",
+			 ctl);
+		return;
+	}
+
 	cap = (u32 *)&save_state->cap.data[0];
 	ctl2 = *cap++;
 	ctl1 = *cap;
[+cc Thomas]

On Wed, Feb 08, 2023 at 05:42:29PM -0600, Bjorn Helgaas wrote:
> I think this fix is still problematic.  PCIe r6.0, sec 5.5.4, requires
> that
>
>   If setting either or both of the enable bits for ASPM L1 PM
>   Substates, both ports must be configured as described in this
>   section while ASPM L1 is disabled.
>
> The current Linux code does not observe this because ASPM L1 is
> enabled by PCI_EXP_LNKCTL (in the PCIe Capability Link Control
> register), while ASPM L1 PM Substate configuration is in PCI_L1SS_CTL1
> (in the L1 PM Substates Capability), and these two things are not
> integrated:
>
>   pci_restore_state
>     pci_restore_aspm_l1ss_state
>       aspm_program_l1ss
>         pci_write_config_dword(PCI_L1SS_CTL1, ctl1)         # L1SS restore
>     pci_restore_pcie_state
>       pcie_capability_write_word(PCI_EXP_LNKCTL, cap[i++])  # L1 restore
>
> So I suspect the problem is that we're writing PCI_L1SS_CTL1 while
> ASPM L1 is enabled, and the device gets confused somehow.
>
> I think it would be better to change this restore flow to follow that
> spec requirement instead of skipping the save/restore like this.

A revert of 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability
for suspend/resume") has been in linux-next starting with Feb 6.

I originally reverted 5e85eba6f50d ("PCI/ASPM: Refactor L1 PM
Substates Control Register programming") because it broke
suspend/resume differently [1].

I had to revert 4ff116d0d5fd at the same time because 5e85eba6f50d
added aspm_program_l1ss(), which was used by 4ff116d0d5fd.

I don't think Tasev or Mark have directly tested reverting
4ff116d0d5fd to see if it resolves the problem *they* are seeing.  But
that would be good to know so I can update the commit logs.

Bjorn

[1] https://bugzilla.kernel.org/show_bug.cgi?id=216877
Hi,

Resending this in plaintext mode. I apologize for the duplicate mail.

Sorry,
Mark Francis

----------------------

Hello,

I tried the test patch with the "ASPM: can't restore L1SS while L1 enabled" message on the v6.1 tag.

I tried setting the ASPM policy to default rather than powersupersave. Tested twice.
The result is I get to see the messages in the kernel log. The system resumed successfully in all tests.
[ 330.438136] ACPI: PM: Waking up from system sleep state S3
[ 330.445959] ACPI: EC: interrupt unblocked
[ 330.446174] pcieport 0000:00:1c.0: ASPM: can't restore L1SS while L1 enabled (0x0042)
[ 330.446177] pcieport 0000:00:1c.6: ASPM: can't restore L1SS while L1 enabled (0x0002)
[ 330.448354] r8169 0000:03:00.0: ASPM: can't restore L1SS while L1 enabled (0x0142)
[ 330.448368] sdhci-pci 0000:04:00.0: ASPM: can't restore L1SS while L1 enabled (0x0102)
[ 330.448672] pcieport 0000:00:06.0: ASPM: can't restore L1SS while L1 enabled (0x0042)
[ 330.448965] nvme 0000:02:00.0: ASPM: can't restore L1SS while L1 enabled (0x0042)
[ 330.449814] pcieport 0000:00:01.0: ASPM: can't restore L1SS while L1 enabled (0x0042)
[ 330.577111] pci 0000:01:00.0: ASPM: can't restore L1SS while L1 enabled (0x0142)
[ 330.580820] ACPI: EC: event unblocked
[ 330.581066] sd 0:0:0:0: [sda] Starting disk

I also noticed that these messages also pop out when activating a userspace powersave tool (i.e., tlp).
(I was restoring my machine after the test, that is, re-enabling services like tlp.
Then, I accidentally knocked off the wall plug with my foot, causing tlp to activate its battery profile.)
[ 4065.786154] pcieport 0000:00:1c.0: ASPM: can't restore L1SS while L1 enabled (0x0042)
[ 4065.799553] r8169 0000:03:00.0: ASPM: can't restore L1SS while L1 enabled (0x0142)
[ 4065.969703] r8169 0000:03:00.0 enp3s0: Link is Down

I really wish I could also try and speculate other solutions but I am ignorant with respect to the PCIe specifications.

Nevertheless, hope this helps.
Let me know if I also need to test the case where the commits are reverted.

Thanks,
On Fri, Feb 10, 2023 at 09:39:56PM +0800, Mark Enriquez wrote:
> I tried the test patch with the "ASPM: can't restore L1SS while L1
> enabled" message on the v6.1 tag.
>
> I tried setting the ASPM policy to default rather than powersupersave.
> Tested twice.
> The result is I get to see the messages in the kernel log. The system
> resumed successfully in all tests.
> [ 330.438136] ACPI: PM: Waking up from system sleep state S3
> [ 330.445959] ACPI: EC: interrupt unblocked
> [ 330.446174] pcieport 0000:00:1c.0: ASPM: can't restore L1SS while
> L1 enabled (0x0042)
> [ 330.446177] pcieport 0000:00:1c.6: ASPM: can't restore L1SS while
> L1 enabled (0x0002)
> [ 330.448354] r8169 0000:03:00.0: ASPM: can't restore L1SS while L1
> enabled (0x0142)
> [ 330.448368] sdhci-pci 0000:04:00.0: ASPM: can't restore L1SS while
> L1 enabled (0x0102)

That's perfect, thank you very much!  That means we're in dangerous
territory because v6.1 will restore the L1SS state while L1 is
enabled, which the spec doesn't allow for.

If you apply the patch below on vanilla v6.1 (or v6.2-rc, whatever is
more convenient), my hope is that resume will work.

I think we're just going to have to revert for now and give up the
power savings we get from 4ff116d0d5fd.  We'll revisit it later, of
course, to get the power savings back.

commit a6b1e19ca489 ("Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"")
parent 830b3c68c1fb
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Fri Feb 10 07:49:18 2023 -0600

    Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"

    This reverts commit 4ff116d0d5fd8a025604b0802d93a2d5f4e465d1.

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2127aba3550b..92c6f7e5ca2e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1665,7 +1665,6 @@ int pci_save_state(struct pci_dev *dev)
 		return i;
 
 	pci_save_ltr_state(dev);
-	pci_save_aspm_l1ss_state(dev);
 	pci_save_dpc_state(dev);
 	pci_save_aer_state(dev);
 	pci_save_ptm_state(dev);
@@ -1772,7 +1771,6 @@ void pci_restore_state(struct pci_dev *dev)
 	 * LTR itself (in the PCIe capability).
 	 */
 	pci_restore_ltr_state(dev);
-	pci_restore_aspm_l1ss_state(dev);
 
 	pci_restore_pcie_state(dev);
 	pci_restore_pasid_state(dev);
@@ -3465,11 +3463,6 @@ void pci_allocate_cap_save_buffers(struct pci_dev *dev)
 	if (error)
 		pci_err(dev, "unable to allocate suspend buffer for LTR\n");
 
-	error = pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_L1SS,
-					    2 * sizeof(u32));
-	if (error)
-		pci_err(dev, "unable to allocate suspend buffer for ASPM-L1SS\n");
-
 	pci_allocate_vc_save_buffers(dev);
 }
 
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index b1ebb7ab8805..ce169b12a8f6 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -565,14 +565,10 @@ bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
 void pcie_aspm_init_link_state(struct pci_dev *pdev);
 void pcie_aspm_exit_link_state(struct pci_dev *pdev);
 void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
-void pci_save_aspm_l1ss_state(struct pci_dev *dev);
-void pci_restore_aspm_l1ss_state(struct pci_dev *dev);
 #else
 static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
-static inline void pci_save_aspm_l1ss_state(struct pci_dev *dev) { }
-static inline void pci_restore_aspm_l1ss_state(struct pci_dev *dev) { }
 #endif
 
 #ifdef CONFIG_PCIE_ECRC
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 53a1fa306e1e..915cbd939dd9 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -757,43 +757,6 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state)
 				PCI_L1SS_CTL1_L1SS_MASK, val);
 }
 
-void pci_save_aspm_l1ss_state(struct pci_dev *dev)
-{
-	struct pci_cap_saved_state *save_state;
-	u16 l1ss = dev->l1ss;
-	u32 *cap;
-
-	if (!l1ss)
-		return;
-
-	save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_L1SS);
-	if (!save_state)
-		return;
-
-	cap = (u32 *)&save_state->cap.data[0];
-	pci_read_config_dword(dev, l1ss + PCI_L1SS_CTL2, cap++);
-	pci_read_config_dword(dev, l1ss + PCI_L1SS_CTL1, cap++);
-}
-
-void pci_restore_aspm_l1ss_state(struct pci_dev *dev)
-{
-	struct pci_cap_saved_state *save_state;
-	u32 *cap, ctl1, ctl2;
-	u16 l1ss = dev->l1ss;
-
-	if (!l1ss)
-		return;
-
-	save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_L1SS);
-	if (!save_state)
-		return;
-
-	cap = (u32 *)&save_state->cap.data[0];
-	ctl2 = *cap++;
-	ctl1 = *cap;
-	aspm_program_l1ss(dev, ctl1, ctl2);
-}
-
 static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val)
 {
 	pcie_capability_clear_and_set_word(pdev, PCI_EXP_LNKCTL,
I recompiled with this on v6.1 and tested twice (still with all user space power saving tools disabled) on ASPM policy default and powersupersave. The suspend/resume works.
Retried the tests on top of v6.2-rc7 and suspend/resume still works.

It's kind of sad that L1SS will have to be set aside for now.
Though, to be honest, I wasn't really able to measure any power savings from it.
Given that this specific Gigabyte laptop (G5 GD) is notoriously power inefficient (~8 watts on complete idle), I think it should be easier to see if L1SS is shaving off some watts.
Then again, I have no way to verify my measurements or if L1SS is truly being achieved. It could be that the PCIe devices are just chilling on the L1 state.

In any case, thank you so much for this.
I hope that this gets figured out in the near future.
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 53a1fa306e1e..bd2a922081bd 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -761,11 +761,23 @@ void pci_save_aspm_l1ss_state(struct pci_dev *dev)
 {
 	struct pci_cap_saved_state *save_state;
 	u16 l1ss = dev->l1ss;
-	u32 *cap;
+	u32 *cap, val;
 
 	if (!l1ss)
 		return;
 
+	/*
+	 * Skip save and restore of L1 Sub-States registers if they are not
+	 * already enabled in the system
+	 */
+	pci_read_config_dword(dev, l1ss + PCI_L1SS_CTL1, &val);
+	if (!(val & PCI_L1SS_CTL1_L1SS_MASK)) {
+		dev->skip_l1ss_restore = true;
+		return;
+	}
+
+	dev->skip_l1ss_restore = false;
+
 	save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_L1SS);
 	if (!save_state)
 		return;
@@ -784,6 +796,9 @@ void pci_restore_aspm_l1ss_state(struct pci_dev *dev)
 	if (!l1ss)
 		return;
 
+	if (dev->skip_l1ss_restore)
+		return;
+
 	save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_L1SS);
 	if (!save_state)
 		return;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 22319ea71ab0..39534602b55e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -395,6 +395,7 @@ struct pci_dev {
 	unsigned int	ltr_path:1;	/* Latency Tolerance Reporting
 					   supported from root to here */
 	u16		l1ss;		/* L1SS Capability pointer */
+	bool		skip_l1ss_restore;	/* Skip L1SS Save/Restore */
 #endif
 	unsigned int	pasid_no_tlp:1;		/* PASID works without TLP Prefix */
 	unsigned int	eetlp_prefix_path:1;	/* End-to-End TLP Prefix */
Skip save and restore of ASPM L1 Sub-States specific registers if they
are not already enabled in the system. This is to avoid issues observed
on certain platforms during restoration process, particularly when
restoring the L1SS registers contents.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216782
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
---
v2:
* Address review comments from Kai-Heng Feng and Rafael

 drivers/pci/pcie/aspm.c | 17 ++++++++++++++++-
 include/linux/pci.h     |  1 +
 2 files changed, 17 insertions(+), 1 deletion(-)