Message ID | 20240424110223.21799-2-jhp@endlessos.org (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | PCI: vmd: Enable PCI PM's L1 substates of remapped PCIe Root Port and NVMe | expand |
Hi Jian-Hong, On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote: > Currently, when enable link's L1.2 features with __pci_enable_link_state(), > it configs the link directly without ensuring related L1.2 parameters, such > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have been > programmed. > > This leads the link's L1.2 between PCIe Root Port and child device gets > wrong configs when a caller tries to enabled it. > > Here is a failed example on ASUS B1400CEAE with enabled VMD: > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe > Controller (rev 01) (prog-if 00 [Normal decode]) > ... > Capabilities: [200 v1] L1 PM Substates > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ > L1_PM_Substates+ > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > T_CommonMode=45us LTR1.2_Threshold=101376ns > L1SubCtl2: T_PwrOn=50us > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 NVMe > SSD (rev 01) (prog-if 02 [NVM Express]) > ... > Capabilities: [900 v1] L1 PM Substates > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > L1_PM_Substates+ > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > T_CommonMode=0us LTR1.2_Threshold=0ns > L1SubCtl2: T_PwrOn=10us > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the PCIe > Root Port and the child NVMe, they should be programmed with the same > LTR1.2_Threshold value. However, they have different values in this case. > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly before > enable L1.2 bits of L1 PM Substates Control Register in > __pci_enable_link_state(). > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394 > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org> > --- > v2: > - Prepare the PCIe LTR parameters before enable L1 Substates > > v3: > - Only enable supported features for the L1 Substates part > > v4: > - Focus on fixing L1.2 parameters, instead of re-initializing whole L1SS > > v5: > - Fix typo and commit message > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce > aspm_get_l1ss_cap()" > > drivers/pci/pcie/aspm.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > index c55ac11faa73..553327dee991 100644 > --- a/drivers/pci/pcie/aspm.c > +++ b/drivers/pci/pcie/aspm.c > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state); > static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool > locked) > { > struct pcie_link_state *link = pcie_aspm_get_link(pdev); > + struct pci_dev *child = link->downstream, *parent = link->pdev; > + u32 parent_l1ss_cap, child_l1ss_cap; > > if (!link) > return -EINVAL; > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct pci_dev > *pdev, int state, bool locked) > link->aspm_default |= ASPM_STATE_L1_1_PCIPM | ASPM_STATE_L1; > if (state & PCIE_LINK_STATE_L1_2_PCIPM) > link->aspm_default |= ASPM_STATE_L1_2_PCIPM | ASPM_STATE_L1; > + /* > + * Ensure L1.2 parameters: Common_Mode_Restore_Times, T_POWER_ON and > + * LTR_L1.2_THRESHOLD are programmed properly before enable bits for > + * L1.2, per PCIe r6.0, sec 5.5.4. > + */ > + if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) { This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags. 'state' should not even matter. The timings should always be calculated and programmed as long as L1_2 is capable. That way the timings are ready even if L1_2 isn't being enabled now (in case the user enables it later). David > + parent_l1ss_cap = aspm_get_l1ss_cap(parent); > + child_l1ss_cap = aspm_get_l1ss_cap(child); > + aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap); > + } > pcie_config_aspm_link(link, policy_to_aspm_state(link)); > > link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道: > > Hi Jian-Hong, > > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote: > > Currently, when enable link's L1.2 features with __pci_enable_link_state(), > > it configs the link directly without ensuring related L1.2 parameters, such > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have been > > programmed. > > > > This leads the link's L1.2 between PCIe Root Port and child device gets > > wrong configs when a caller tries to enabled it. > > > > Here is a failed example on ASUS B1400CEAE with enabled VMD: > > > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe > > Controller (rev 01) (prog-if 00 [Normal decode]) > > ... > > Capabilities: [200 v1] L1 PM Substates > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ > > L1_PM_Substates+ > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > L1SubCtl2: T_PwrOn=50us > > > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 NVMe > > SSD (rev 01) (prog-if 02 [NVM Express]) > > ... > > Capabilities: [900 v1] L1 PM Substates > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > L1_PM_Substates+ > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > T_CommonMode=0us LTR1.2_Threshold=0ns > > L1SubCtl2: T_PwrOn=10us > > > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the PCIe > > Root Port and the child NVMe, they should be programmed with the same > > LTR1.2_Threshold value. However, they have different values in this case. > > > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly before > > enable L1.2 bits of L1 PM Substates Control Register in > > __pci_enable_link_state(). > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394 > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org> > > --- > > v2: > > - Prepare the PCIe LTR parameters before enable L1 Substates > > > > v3: > > - Only enable supported features for the L1 Substates part > > > > v4: > > - Focus on fixing L1.2 parameters, instead of re-initializing whole L1SS > > > > v5: > > - Fix typo and commit message > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce > > aspm_get_l1ss_cap()" > > > > drivers/pci/pcie/aspm.c | 12 ++++++++++++ > > 1 file changed, 12 insertions(+) > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > index c55ac11faa73..553327dee991 100644 > > --- a/drivers/pci/pcie/aspm.c > > +++ b/drivers/pci/pcie/aspm.c > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state); > > static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool > > locked) > > { > > struct pcie_link_state *link = pcie_aspm_get_link(pdev); > > + struct pci_dev *child = link->downstream, *parent = link->pdev; > > + u32 parent_l1ss_cap, child_l1ss_cap; > > > > if (!link) > > return -EINVAL; > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct pci_dev > > *pdev, int state, bool locked) > > link->aspm_default |= ASPM_STATE_L1_1_PCIPM | ASPM_STATE_L1; > > if (state & PCIE_LINK_STATE_L1_2_PCIPM) > > link->aspm_default |= ASPM_STATE_L1_2_PCIPM | ASPM_STATE_L1; > > + /* > > + * Ensure L1.2 parameters: Common_Mode_Restore_Times, T_POWER_ON and > > + * LTR_L1.2_THRESHOLD are programmed properly before enable bits for > > + * L1.2, per PCIe r6.0, sec 5.5.4. > > + */ > > + if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) { > > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags. Thanks for your review, but I notice some description in PCIe spec, 5.5.4 L1 PM Substates Configuration: "Prior to setting either or both of the enable bits for L1.2, the values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2 Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale fields) must be programmed." => I think this includes both "ASPM L1.2 Enable" and "PCI-PM L1.2 Enable" bits. Jain-Hong Pan > 'state' should not even matter. > The timings should always be calculated and programmed as long > as L1_2 is capable. That way the timings are ready even if L1_2 isn't being > enabled now (in case the user enables it later). > > David > > > + parent_l1ss_cap = aspm_get_l1ss_cap(parent); > > + child_l1ss_cap = aspm_get_l1ss_cap(child); > > + aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap); > > + } > > pcie_config_aspm_link(link, policy_to_aspm_state(link)); > > > > link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0; >
On Tue, 2024-04-30 at 15:46 +0800, Jian-Hong Pan wrote: > David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道: > > > > Hi Jian-Hong, > > > > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote: > > > Currently, when enable link's L1.2 features with > > > __pci_enable_link_state(), > > > it configs the link directly without ensuring related L1.2 parameters, > > > such > > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have been > > > programmed. > > > > > > This leads the link's L1.2 between PCIe Root Port and child device gets > > > wrong configs when a caller tries to enabled it. > > > > > > Here is a failed example on ASUS B1400CEAE with enabled VMD: > > > > > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe > > > Controller (rev 01) (prog-if 00 [Normal decode]) > > > ... > > > Capabilities: [200 v1] L1 PM Substates > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ > > > L1_PM_Substates+ > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > > L1SubCtl2: T_PwrOn=50us > > > > > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 > > > NVMe > > > SSD (rev 01) (prog-if 02 [NVM Express]) > > > ... > > > Capabilities: [900 v1] L1 PM Substates > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > L1_PM_Substates+ > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > T_CommonMode=0us LTR1.2_Threshold=0ns > > > L1SubCtl2: T_PwrOn=10us > > > > > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the PCIe > > > Root Port and the child NVMe, they should be programmed with the same > > > LTR1.2_Threshold value. However, they have different values in this case. > > > > > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly before > > > enable L1.2 bits of L1 PM Substates Control Register in > > > __pci_enable_link_state(). > > > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394 > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org> > > > --- > > > v2: > > > - Prepare the PCIe LTR parameters before enable L1 Substates > > > > > > v3: > > > - Only enable supported features for the L1 Substates part > > > > > > v4: > > > - Focus on fixing L1.2 parameters, instead of re-initializing whole L1SS > > > > > > v5: > > > - Fix typo and commit message > > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce > > > aspm_get_l1ss_cap()" > > > > > > drivers/pci/pcie/aspm.c | 12 ++++++++++++ > > > 1 file changed, 12 insertions(+) > > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > > index c55ac11faa73..553327dee991 100644 > > > --- a/drivers/pci/pcie/aspm.c > > > +++ b/drivers/pci/pcie/aspm.c > > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state); > > > static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool > > > locked) > > > { > > > struct pcie_link_state *link = pcie_aspm_get_link(pdev); > > > + struct pci_dev *child = link->downstream, *parent = link->pdev; > > > + u32 parent_l1ss_cap, child_l1ss_cap; > > > > > > if (!link) > > > return -EINVAL; > > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct pci_dev > > > *pdev, int state, bool locked) > > > link->aspm_default |= ASPM_STATE_L1_1_PCIPM | > > > ASPM_STATE_L1; > > > if (state & PCIE_LINK_STATE_L1_2_PCIPM) > > > link->aspm_default |= ASPM_STATE_L1_2_PCIPM | > > > ASPM_STATE_L1; > > > + /* > > > + * Ensure L1.2 parameters: Common_Mode_Restore_Times, T_POWER_ON > > > and > > > + * LTR_L1.2_THRESHOLD are programmed properly before enable bits > > > for > > > + * L1.2, per PCIe r6.0, sec 5.5.4. > > > + */ > > > + if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) { > > > > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags. > > Thanks for your review, but I notice some description in PCIe spec, > 5.5.4 L1 PM Substates Configuration: > "Prior to setting either or both of the enable bits for L1.2, the > values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2 > Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale > fields) must be programmed." => I think this includes both "ASPM L1.2 > Enable" and "PCI-PM L1.2 Enable" bits. That's fine. While the spec clearly calls out the ASPM L1.2 Enable bit here, I see no harm in including PCI-PM L1.2 in that check. This is what the code already does in aspm_l1ss_init(). The issue is the mixed used of two different types of flags that don't have the same meaning. 'state' contains PCIE_LINK_STATE flags which are part of the caller API to the pci_<enabled/disable>_link_state() functions. The ASPM_STATE flags are used internally to aspm.c to track all states and their meaningful combinations such as ASPM_STATE_L1_2_MASK which includes ASPM L1.2 and PCI-PM L1.2. You should not do bit operations between them. Also, you should not require that the timings be calculated only if L1_2 is enabled. You should calculate the timings as long as it's capable. This is also what aspm_l1ss_init() does. The confusion might be over the fact that you are having __pci_enable_link_state() call aspm_calc_l12_info(). This should have been handled during initialization of the link in aspm_l1ss_init() and I'm not sure why it didn't. Maybe it's because, for VMD, ASPM default state would have started out all disabled and this somehow led to aspm_l1ss_init() not getting called. But looking through the code I don't see it. It would be great if you can confirm why they weren't calculated before. David > > Jain-Hong Pan > > > 'state' should not even matter. > > The timings should always be calculated and programmed as long > > as L1_2 is capable. That way the timings are ready even if L1_2 isn't being > > enabled now (in case the user enables it later). > > > > David > > > > > + parent_l1ss_cap = aspm_get_l1ss_cap(parent); > > > + child_l1ss_cap = aspm_get_l1ss_cap(child); > > > + aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap); > > > + } > > > pcie_config_aspm_link(link, policy_to_aspm_state(link)); > > > > > > link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0; > >
David E. Box <david.e.box@linux.intel.com> 於 2024年5月1日 週三 上午2:26寫道: > > On Tue, 2024-04-30 at 15:46 +0800, Jian-Hong Pan wrote: > > David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道: > > > > > > Hi Jian-Hong, > > > > > > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote: > > > > Currently, when enable link's L1.2 features with > > > > __pci_enable_link_state(), > > > > it configs the link directly without ensuring related L1.2 parameters, > > > > such > > > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have been > > > > programmed. > > > > > > > > This leads the link's L1.2 between PCIe Root Port and child device gets > > > > wrong configs when a caller tries to enabled it. > > > > > > > > Here is a failed example on ASUS B1400CEAE with enabled VMD: > > > > > > > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe > > > > Controller (rev 01) (prog-if 00 [Normal decode]) > > > > ... > > > > Capabilities: [200 v1] L1 PM Substates > > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ > > > > L1_PM_Substates+ > > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > > > L1SubCtl2: T_PwrOn=50us > > > > > > > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 > > > > NVMe > > > > SSD (rev 01) (prog-if 02 [NVM Express]) > > > > ... > > > > Capabilities: [900 v1] L1 PM Substates > > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > L1_PM_Substates+ > > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > T_CommonMode=0us LTR1.2_Threshold=0ns > > > > L1SubCtl2: T_PwrOn=10us > > > > > > > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the PCIe > > > > Root Port and the child NVMe, they should be programmed with the same > > > > LTR1.2_Threshold value. However, they have different values in this case. > > > > > > > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly before > > > > enable L1.2 bits of L1 PM Substates Control Register in > > > > __pci_enable_link_state(). > > > > > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394 > > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org> > > > > --- > > > > v2: > > > > - Prepare the PCIe LTR parameters before enable L1 Substates > > > > > > > > v3: > > > > - Only enable supported features for the L1 Substates part > > > > > > > > v4: > > > > - Focus on fixing L1.2 parameters, instead of re-initializing whole L1SS > > > > > > > > v5: > > > > - Fix typo and commit message > > > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce > > > > aspm_get_l1ss_cap()" > > > > > > > > drivers/pci/pcie/aspm.c | 12 ++++++++++++ > > > > 1 file changed, 12 insertions(+) > > > > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > > > index c55ac11faa73..553327dee991 100644 > > > > --- a/drivers/pci/pcie/aspm.c > > > > +++ b/drivers/pci/pcie/aspm.c > > > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state); > > > > static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool > > > > locked) > > > > { > > > > struct pcie_link_state *link = pcie_aspm_get_link(pdev); > > > > + struct pci_dev *child = link->downstream, *parent = link->pdev; > > > > + u32 parent_l1ss_cap, child_l1ss_cap; > > > > > > > > if (!link) > > > > return -EINVAL; > > > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct pci_dev > > > > *pdev, int state, bool locked) > > > > link->aspm_default |= ASPM_STATE_L1_1_PCIPM | > > > > ASPM_STATE_L1; > > > > if (state & PCIE_LINK_STATE_L1_2_PCIPM) > > > > link->aspm_default |= ASPM_STATE_L1_2_PCIPM | > > > > ASPM_STATE_L1; > > > > + /* > > > > + * Ensure L1.2 parameters: Common_Mode_Restore_Times, T_POWER_ON > > > > and > > > > + * LTR_L1.2_THRESHOLD are programmed properly before enable bits > > > > for > > > > + * L1.2, per PCIe r6.0, sec 5.5.4. > > > > + */ > > > > + if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) { > > > > > > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags. > > > > Thanks for your review, but I notice some description in PCIe spec, > > 5.5.4 L1 PM Substates Configuration: > > "Prior to setting either or both of the enable bits for L1.2, the > > values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2 > > Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale > > fields) must be programmed." => I think this includes both "ASPM L1.2 > > Enable" and "PCI-PM L1.2 Enable" bits. > > That's fine. While the spec clearly calls out the ASPM L1.2 Enable bit here, I > see no harm in including PCI-PM L1.2 in that check. This is what the code > already does in aspm_l1ss_init(). > > The issue is the mixed used of two different types of flags that don't have the > same meaning. 'state' contains PCIE_LINK_STATE flags which are part of the > caller API to the pci_<enabled/disable>_link_state() functions. The ASPM_STATE > flags are used internally to aspm.c to track all states and their meaningful > combinations such as ASPM_STATE_L1_2_MASK which includes ASPM L1.2 and PCI-PM > L1.2. You should not do bit operations between them. > > Also, you should not require that the timings be calculated only if L1_2 is > enabled. You should calculate the timings as long as it's capable. This is also > what aspm_l1ss_init() does. > > The confusion might be over the fact that you are having > __pci_enable_link_state() call aspm_calc_l12_info(). This should have been > handled during initialization of the link in aspm_l1ss_init() and I'm not sure > why it didn't. Maybe it's because, for VMD, ASPM default state would have > started out all disabled and this somehow led to aspm_l1ss_init() not getting > called. But looking through the code I don't see it. It would be great if you > can confirm why they weren't calculated before. I debug it again. If I delete the pci_reset_bus() in vmd controller like: diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c index 87b7856f375a..39bfda4350bf 100644 --- a/drivers/pci/controller/vmd.c +++ b/drivers/pci/controller/vmd.c @@ -930,25 +930,6 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features) pci_scan_child_bus(vmd->bus); vmd_domain_reset(vmd); - /* When Intel VMD is enabled, the OS does not discover the Root Ports - * owned by Intel VMD within the MMCFG space. pci_reset_bus() applies - * a reset to the parent of the PCI device supplied as argument. This - * is why we pass a child device, so the reset can be triggered at - * the Intel bridge level and propagated to all the children in the - * hierarchy. - */ - list_for_each_entry(child, &vmd->bus->children, node) { - if (!list_empty(&child->devices)) { - dev = list_first_entry(&child->devices, - struct pci_dev, bus_list); - ret = pci_reset_bus(dev); - if (ret) - pci_warn(dev, "can't reset device: %d\n", ret); - - break; - } - } - pci_assign_unassigned_bus_resources(vmd->bus); pci_walk_bus(vmd->bus, vmd_pm_enable_quirk, &features); Although PCI-PM_L1.2 is disabled, both PCI bridge and the NVMe's LTR1.2_Threshold are configured as 101376ns: 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode]) ... Capabilities: [200 v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ PortCommonModeRestoreTime=45us PortTPowerOnTime=50us L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- T_CommonMode=45us LTR1.2_Threshold=101376ns L1SubCtl2: T_PwrOn=50us 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express]) ... Capabilities: [900 v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ PortCommonModeRestoreTime=32us PortTPowerOnTime=10us L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- T_CommonMode=0us LTR1.2_Threshold=101376ns L1SubCtl2: T_PwrOn=50us Then, I apply the patch "PCI: vmd: Set PCI devices to D0 before enable PCI PM's L1 substates". Both PCI bridge and the NVMe's PCI-PM_L1.2 is enabled and LTR1.2_Threshold is configured as 101376ns. 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode]) ... Capabilities: [200 v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ PortCommonModeRestoreTime=45us PortTPowerOnTime=50us L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- T_CommonMode=45us LTR1.2_Threshold=101376ns L1SubCtl2: T_PwrOn=50us 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express]) ... Capabilities: [900 v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ PortCommonModeRestoreTime=32us PortTPowerOnTime=10us L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- T_CommonMode=0us LTR1.2_Threshold=101376ns L1SubCtl2: T_PwrOn=50us I do not know VMD very much. However, from the test result, it looks like LTR1.2_Threshold has been configured properly originally. But, LTR1.2_Threshold is misconfigured by 'pci_reset_bus()'. Jian-Hong Pan > > > > Jain-Hong Pan > > > > > 'state' should not even matter. > > > The timings should always be calculated and programmed as long > > > as L1_2 is capable. That way the timings are ready even if L1_2 isn't being > > > enabled now (in case the user enables it later). > > > > > > David > > > > > > > + parent_l1ss_cap = aspm_get_l1ss_cap(parent); > > > > + child_l1ss_cap = aspm_get_l1ss_cap(child); > > > > + aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap); > > > > + } > > > > pcie_config_aspm_link(link, policy_to_aspm_state(link)); > > > > > > > > link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0; > > > >
+Nirmal, Thanks Jian-Hong. This is a good find. On Fri, 2024-05-03 at 17:45 +0800, Jian-Hong Pan wrote: > David E. Box <david.e.box@linux.intel.com> 於 2024年5月1日 週三 上午2:26寫道: > > > > On Tue, 2024-04-30 at 15:46 +0800, Jian-Hong Pan wrote: > > > David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道: > > > > > > > > Hi Jian-Hong, > > > > > > > > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote: > > > > > Currently, when enable link's L1.2 features with > > > > > __pci_enable_link_state(), > > > > > it configs the link directly without ensuring related L1.2 parameters, > > > > > such > > > > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have > > > > > been > > > > > programmed. > > > > > > > > > > This leads the link's L1.2 between PCIe Root Port and child device > > > > > gets > > > > > wrong configs when a caller tries to enabled it. > > > > > > > > > > Here is a failed example on ASUS B1400CEAE with enabled VMD: > > > > > > > > > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor > > > > > PCIe > > > > > Controller (rev 01) (prog-if 00 [Normal decode]) > > > > > ... > > > > > Capabilities: [200 v1] L1 PM Substates > > > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ > > > > > L1_PM_Substates+ > > > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > > > > L1SubCtl2: T_PwrOn=50us > > > > > > > > > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue > > > > > SN550 > > > > > NVMe > > > > > SSD (rev 01) (prog-if 02 [NVM Express]) > > > > > ... > > > > > Capabilities: [900 v1] L1 PM Substates > > > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > > L1_PM_Substates+ > > > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > > T_CommonMode=0us LTR1.2_Threshold=0ns > > > > > L1SubCtl2: T_PwrOn=10us > > > > > > > > > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the > > > > > PCIe > > > > > Root Port and the child NVMe, they should be programmed with the same > > > > > LTR1.2_Threshold value. However, they have different values in this > > > > > case. > > > > > > > > > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly > > > > > before > > > > > enable L1.2 bits of L1 PM Substates Control Register in > > > > > __pci_enable_link_state(). > > > > > > > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394 > > > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org> > > > > > --- > > > > > v2: > > > > > - Prepare the PCIe LTR parameters before enable L1 Substates > > > > > > > > > > v3: > > > > > - Only enable supported features for the L1 Substates part > > > > > > > > > > v4: > > > > > - Focus on fixing L1.2 parameters, instead of re-initializing whole > > > > > L1SS > > > > > > > > > > v5: > > > > > - Fix typo and commit message > > > > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce > > > > > aspm_get_l1ss_cap()" > > > > > > > > > > drivers/pci/pcie/aspm.c | 12 ++++++++++++ > > > > > 1 file changed, 12 insertions(+) > > > > > > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > > > > index c55ac11faa73..553327dee991 100644 > > > > > --- a/drivers/pci/pcie/aspm.c > > > > > +++ b/drivers/pci/pcie/aspm.c > > > > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state); > > > > > static int __pci_enable_link_state(struct pci_dev *pdev, int state, > > > > > bool > > > > > locked) > > > > > { > > > > > struct pcie_link_state *link = pcie_aspm_get_link(pdev); > > > > > + struct pci_dev *child = link->downstream, *parent = link- > > > > > >pdev; > > > > > + u32 parent_l1ss_cap, child_l1ss_cap; > > > > > > > > > > if (!link) > > > > > return -EINVAL; > > > > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct > > > > > pci_dev > > > > > *pdev, int state, bool locked) > > > > > link->aspm_default |= ASPM_STATE_L1_1_PCIPM | > > > > > ASPM_STATE_L1; > > > > > if (state & PCIE_LINK_STATE_L1_2_PCIPM) > > > > > link->aspm_default |= ASPM_STATE_L1_2_PCIPM | > > > > > ASPM_STATE_L1; > > > > > + /* > > > > > + * Ensure L1.2 parameters: Common_Mode_Restore_Times, > > > > > T_POWER_ON > > > > > and > > > > > + * LTR_L1.2_THRESHOLD are programmed properly before enable > > > > > bits > > > > > for > > > > > + * L1.2, per PCIe r6.0, sec 5.5.4. > > > > > + */ > > > > > + if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) { > > > > > > > > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags. > > > > > > Thanks for your review, but I notice some description in PCIe spec, > > > 5.5.4 L1 PM Substates Configuration: > > > "Prior to setting either or both of the enable bits for L1.2, the > > > values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2 > > > Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale > > > fields) must be programmed." => I think this includes both "ASPM L1.2 > > > Enable" and "PCI-PM L1.2 Enable" bits. > > > > That's fine. While the spec clearly calls out the ASPM L1.2 Enable bit here, > > I > > see no harm in including PCI-PM L1.2 in that check. This is what the code > > already does in aspm_l1ss_init(). > > > > The issue is the mixed used of two different types of flags that don't have > > the > > same meaning. 'state' contains PCIE_LINK_STATE flags which are part of the > > caller API to the pci_<enabled/disable>_link_state() functions. The > > ASPM_STATE > > flags are used internally to aspm.c to track all states and their meaningful > > combinations such as ASPM_STATE_L1_2_MASK which includes ASPM L1.2 and PCI- > > PM > > L1.2. You should not do bit operations between them. > > > > Also, you should not require that the timings be calculated only if L1_2 is > > enabled. You should calculate the timings as long as it's capable. This is > > also > > what aspm_l1ss_init() does. > > > > The confusion might be over the fact that you are having > > __pci_enable_link_state() call aspm_calc_l12_info(). This should have been > > handled during initialization of the link in aspm_l1ss_init() and I'm not > > sure > > why it didn't. Maybe it's because, for VMD, ASPM default state would have > > started out all disabled and this somehow led to aspm_l1ss_init() not > > getting > > called. But looking through the code I don't see it. It would be great if > > you > > can confirm why they weren't calculated before. > > I debug it again. If I delete the pci_reset_bus() in vmd controller like: > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c > index 87b7856f375a..39bfda4350bf 100644 > --- a/drivers/pci/controller/vmd.c > +++ b/drivers/pci/controller/vmd.c > @@ -930,25 +930,6 @@ static int vmd_enable_domain(struct vmd_dev *vmd, > unsigned long features) > pci_scan_child_bus(vmd->bus); > vmd_domain_reset(vmd); > > - /* When Intel VMD is enabled, the OS does not discover the Root Ports > - * owned by Intel VMD within the MMCFG space. pci_reset_bus() applies > - * a reset to the parent of the PCI device supplied as argument. This > - * is why we pass a child device, so the reset can be triggered at > - * the Intel bridge level and propagated to all the children in the > - * hierarchy. > - */ > - list_for_each_entry(child, &vmd->bus->children, node) { > - if (!list_empty(&child->devices)) { > - dev = list_first_entry(&child->devices, > - struct pci_dev, bus_list); > - ret = pci_reset_bus(dev); Hi Nirmal. It's not clear to me from the comment why there's a need to do a bus reset. It looks like it is causing misconfiguration of the ASPM L1.2 timings which would have been done above in pci_scan_child_bus(). Jian-Hong discovered that without the above reset code, the timings are correct. This patch recalculates the timings during the call to pci_enable_link_state() which is called during pci_bus_walk() below. Originally I thought the recalculation might have been needed by all callers to pci_enabled_link_state() since it changes the default BIOS configuration. But it looks like the reset is the cause and only the VMD driver would need the recalculation as a result. I don't see qcom doing a reset. Jian-Hong, given this (and assuming the reset is needed) I would not call aspm_calc_l12_info() from pci_enable_link_state() but instead try redoing the whole ASPM initialization right after the resets are done, maybe by calling pci_scan_child_bus() again. What do you think Bjorn? David > - if (ret) > - pci_warn(dev, "can't reset device: %d\n", > ret); > - > - break; > - } > - } > - > pci_assign_unassigned_bus_resources(vmd->bus); > > pci_walk_bus(vmd->bus, vmd_pm_enable_quirk, &features); > > Although PCI-PM_L1.2 is disabled, both PCI bridge and the NVMe's > LTR1.2_Threshold are configured as 101376ns: > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core > Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal > decode]) > ... > Capabilities: [200 v1] L1 PM Substates > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > T_CommonMode=45us LTR1.2_Threshold=101376ns > L1SubCtl2: T_PwrOn=50us > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD > Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express]) > ... > Capabilities: [900 v1] L1 PM Substates > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > T_CommonMode=0us LTR1.2_Threshold=101376ns > L1SubCtl2: T_PwrOn=50us > > Then, I apply the patch "PCI: vmd: Set PCI devices to D0 before enable > PCI PM's L1 substates". Both PCI bridge and the NVMe's PCI-PM_L1.2 is > enabled and LTR1.2_Threshold is configured as 101376ns. > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core > Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal > decode]) > ... > Capabilities: [200 v1] L1 PM Substates > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > T_CommonMode=45us LTR1.2_Threshold=101376ns > L1SubCtl2: T_PwrOn=50us > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD > Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express]) > ... > Capabilities: [900 v1] L1 PM Substates > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > T_CommonMode=0us LTR1.2_Threshold=101376ns > L1SubCtl2: T_PwrOn=50us > > I do not know VMD very much. However, from the test result, it looks > like LTR1.2_Threshold has been configured properly originally. But, > LTR1.2_Threshold is misconfigured by 'pci_reset_bus()'. > > Jian-Hong Pan > > > > > > > > > Jain-Hong Pan > > > > > > > 'state' should not even matter. > > > > The timings should always be calculated and programmed as long > > > > as L1_2 is capable. That way the timings are ready even if L1_2 isn't > > > > being > > > > enabled now (in case the user enables it later). > > > > > > > > David > > > > > > > > > + parent_l1ss_cap = aspm_get_l1ss_cap(parent); > > > > > + child_l1ss_cap = aspm_get_l1ss_cap(child); > > > > > + aspm_calc_l12_info(link, parent_l1ss_cap, > > > > > child_l1ss_cap); > > > > > + } > > > > > pcie_config_aspm_link(link, policy_to_aspm_state(link)); > > > > > > > > > > link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0; > > > > > >
[+cc Francisco] On Fri, May 03, 2024 at 12:15:49PM -0700, David E. Box wrote: > On Fri, 2024-05-03 at 17:45 +0800, Jian-Hong Pan wrote: > > David E. Box <david.e.box@linux.intel.com> 於 2024年5月1日 週三 上午2:26寫道: > > > On Tue, 2024-04-30 at 15:46 +0800, Jian-Hong Pan wrote: > > > > David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道: > > > > > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote: > > > > > > Currently, when enable link's L1.2 features with > > > > > > __pci_enable_link_state(), > > > > > > it configs the link directly without ensuring related L1.2 parameters, > > > > > > such > > > > > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have > > > > > > been > > > > > > programmed. > > > > > > > > > > > > This leads the link's L1.2 between PCIe Root Port and child device > > > > > > gets > > > > > > wrong configs when a caller tries to enabled it. > > > > > > > > > > > > Here is a failed example on ASUS B1400CEAE with enabled VMD: > > > > > > > > > > > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor > > > > > > PCIe > > > > > > Controller (rev 01) (prog-if 00 [Normal decode]) > > > > > > ... > > > > > > Capabilities: [200 v1] L1 PM Substates > > > > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ > > > > > > L1_PM_Substates+ > > > > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > > > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > > > > > L1SubCtl2: T_PwrOn=50us > > > > > > > > > > > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue > > > > > > SN550 > > > > > > NVMe > > > > > > SSD (rev 01) (prog-if 02 [NVM Express]) > > > > > > ... > > > > > > Capabilities: [900 v1] L1 PM Substates > > > > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > > > L1_PM_Substates+ > > > > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > > > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > > > T_CommonMode=0us LTR1.2_Threshold=0ns > > > > > > L1SubCtl2: T_PwrOn=10us > > > > > > > > > > > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the > > > > > > PCIe > > > > > > Root Port and the child NVMe, they should be programmed with the same > > > > > > LTR1.2_Threshold value. However, they have different values in this > > > > > > case. > > > > > > > > > > > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly > > > > > > before > > > > > > enable L1.2 bits of L1 PM Substates Control Register in > > > > > > __pci_enable_link_state(). > > > > > > > > > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394 > > > > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org> > > > > > > --- > > > > > > v2: > > > > > > - Prepare the PCIe LTR parameters before enable L1 Substates > > > > > > > > > > > > v3: > > > > > > - Only enable supported features for the L1 Substates part > > > > > > > > > > > > v4: > > > > > > - Focus on fixing L1.2 parameters, instead of re-initializing whole > > > > > > L1SS > > > > > > > > > > > > v5: > > > > > > - Fix typo and commit message > > > > > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce > > > > > > aspm_get_l1ss_cap()" > > > > > > > > > > > > drivers/pci/pcie/aspm.c | 12 ++++++++++++ > > > > > > 1 file changed, 12 insertions(+) > > > > > > > > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > > > > > index c55ac11faa73..553327dee991 100644 > > > > > > --- a/drivers/pci/pcie/aspm.c > > > > > > +++ b/drivers/pci/pcie/aspm.c > > > > > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state); > > > > > > static int __pci_enable_link_state(struct pci_dev *pdev, int state, > > > > > > bool > > > > > > locked) > > > > > > { > > > > > > struct pcie_link_state *link = pcie_aspm_get_link(pdev); > > > > > > + struct pci_dev *child = link->downstream, *parent = link- > > > > > > >pdev; > > > > > > + u32 parent_l1ss_cap, child_l1ss_cap; > > > > > > > > > > > > if (!link) > > > > > > return -EINVAL; > > > > > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct > > > > > > pci_dev > > > > > > *pdev, int state, bool locked) > > > > > > link->aspm_default |= ASPM_STATE_L1_1_PCIPM | > > > > > > ASPM_STATE_L1; > > > > > > if (state & PCIE_LINK_STATE_L1_2_PCIPM) > > > > > > link->aspm_default |= ASPM_STATE_L1_2_PCIPM | > > > > > > ASPM_STATE_L1; > > > > > > + /* > > > > > > + * Ensure L1.2 parameters: Common_Mode_Restore_Times, > > > > > > T_POWER_ON > > > > > > and > > > > > > + * LTR_L1.2_THRESHOLD are programmed properly before enable > > > > > > bits > > > > > > for > > > > > > + * L1.2, per PCIe r6.0, sec 5.5.4. > > > > > > + */ > > > > > > + if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) { > > > > > > > > > > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags. FWIW, Ilpo has removed the ASPM_STATE flags, so eventually this would have to be updated to apply on the current pci/aspm branch. We're at rc6 already, so likely this will end up being v6.11 material so you'll be able to rebase on v6.10-rc1 when it comes out. > > > > Thanks for your review, but I notice some description in PCIe spec, > > > > 5.5.4 L1 PM Substates Configuration: > > > > "Prior to setting either or both of the enable bits for L1.2, the > > > > values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2 > > > > Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale > > > > fields) must be programmed." => I think this includes both "ASPM L1.2 > > > > Enable" and "PCI-PM L1.2 Enable" bits. > > > > > > That's fine. While the spec clearly calls out the ASPM L1.2 Enable bit here, > > > I > > > see no harm in including PCI-PM L1.2 in that check. This is what the code > > > already does in aspm_l1ss_init(). > > > > > > The issue is the mixed used of two different types of flags that don't have > > > the > > > same meaning. 'state' contains PCIE_LINK_STATE flags which are part of the > > > caller API to the pci_<enabled/disable>_link_state() functions. The > > > ASPM_STATE > > > flags are used internally to aspm.c to track all states and their meaningful > > > combinations such as ASPM_STATE_L1_2_MASK which includes ASPM L1.2 and PCI- > > > PM > > > L1.2. You should not do bit operations between them. > > > > > > Also, you should not require that the timings be calculated only if L1_2 is > > > enabled. You should calculate the timings as long as it's capable. This is > > > also > > > what aspm_l1ss_init() does. > > > > > > The confusion might be over the fact that you are having > > > __pci_enable_link_state() call aspm_calc_l12_info(). This should have been > > > handled during initialization of the link in aspm_l1ss_init() and I'm not > > > sure > > > why it didn't. Maybe it's because, for VMD, ASPM default state would have > > > started out all disabled and this somehow led to aspm_l1ss_init() not > > > getting > > > called. But looking through the code I don't see it. It would be great if > > > you > > > can confirm why they weren't calculated before. > > > > I debug it again. If I delete the pci_reset_bus() in vmd controller like: > > > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c > > index 87b7856f375a..39bfda4350bf 100644 > > --- a/drivers/pci/controller/vmd.c > > +++ b/drivers/pci/controller/vmd.c > > @@ -930,25 +930,6 @@ static int vmd_enable_domain(struct vmd_dev *vmd, > > unsigned long features) > > pci_scan_child_bus(vmd->bus); > > vmd_domain_reset(vmd); > > > > - /* When Intel VMD is enabled, the OS does not discover the Root Ports > > - * owned by Intel VMD within the MMCFG space. pci_reset_bus() applies > > - * a reset to the parent of the PCI device supplied as argument. This > > - * is why we pass a child device, so the reset can be triggered at > > - * the Intel bridge level and propagated to all the children in the > > - * hierarchy. > > - */ > > - list_for_each_entry(child, &vmd->bus->children, node) { > > - if (!list_empty(&child->devices)) { > > - dev = list_first_entry(&child->devices, > > - struct pci_dev, bus_list); > > - ret = pci_reset_bus(dev); > > Hi Nirmal. It's not clear to me from the comment why there's a need to do a bus > reset. It looks like it is causing misconfiguration of the ASPM L1.2 timings > which would have been done above in pci_scan_child_bus(). Jian-Hong discovered > that without the above reset code, the timings are correct. I don't understand that comment either. If we don't enumerate the Root Ports below VMD, it sounds like something is wrong, and reset doesn't seem like the right fix. The reset was added by 0a584655ef89 ("PCI: vmd: Fix secondary bus reset for Intel bridges") for guest reboots. Maybe Francisco can shed more light on it. > This patch recalculates the timings during the call to pci_enable_link_state() > which is called during pci_bus_walk() below. Originally I thought the > recalculation might have been needed by all callers to pci_enabled_link_state() > since it changes the default BIOS configuration. But it looks like the reset is > the cause and only the VMD driver would need the recalculation as a result. I > don't see qcom doing a reset. > > Jian-Hong, given this (and assuming the reset is needed) I would not call > aspm_calc_l12_info() from pci_enable_link_state() but instead try redoing the > whole ASPM initialization right after the resets are done, maybe by calling > pci_scan_child_bus() again. What do you think Bjorn? I would expect pci_reset_bus() to save and restore config space, but if we don't enumerate all the devices correctly, I suppose we wouldn't do that for devices we don't know about. > > - if (ret) > > - pci_warn(dev, "can't reset device: %d\n", > > ret); > > - > > - break; > > - } > > - } > > - > > pci_assign_unassigned_bus_resources(vmd->bus); > > > > pci_walk_bus(vmd->bus, vmd_pm_enable_quirk, &features); > > > > Although PCI-PM_L1.2 is disabled, both PCI bridge and the NVMe's > > LTR1.2_Threshold are configured as 101376ns: > > > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core > > Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal > > decode]) > > ... > > Capabilities: [200 v1] L1 PM Substates > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > L1SubCtl2: T_PwrOn=50us > > > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD > > Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express]) > > ... > > Capabilities: [900 v1] L1 PM Substates > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > T_CommonMode=0us LTR1.2_Threshold=101376ns > > L1SubCtl2: T_PwrOn=50us > > > > Then, I apply the patch "PCI: vmd: Set PCI devices to D0 before enable > > PCI PM's L1 substates". Both PCI bridge and the NVMe's PCI-PM_L1.2 is > > enabled and LTR1.2_Threshold is configured as 101376ns. > > > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core > > Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal > > decode]) > > ... > > Capabilities: [200 v1] L1 PM Substates > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > L1SubCtl2: T_PwrOn=50us > > > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD > > Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express]) > > ... > > Capabilities: [900 v1] L1 PM Substates > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > T_CommonMode=0us LTR1.2_Threshold=101376ns > > L1SubCtl2: T_PwrOn=50us > > > > I do not know VMD very much. However, from the test result, it looks > > like LTR1.2_Threshold has been configured properly originally. But, > > LTR1.2_Threshold is misconfigured by 'pci_reset_bus()'. > > ... > > > > > 'state' should not even matter. > > > > > The timings should always be calculated and programmed as long > > > > > as L1_2 is capable. That way the timings are ready even if L1_2 isn't > > > > > being > > > > > enabled now (in case the user enables it later). > > > > > > > > > > > + parent_l1ss_cap = aspm_get_l1ss_cap(parent); > > > > > > + child_l1ss_cap = aspm_get_l1ss_cap(child); > > > > > > + aspm_calc_l12_info(link, parent_l1ss_cap, > > > > > > child_l1ss_cap); > > > > > > + } > > > > > > pcie_config_aspm_link(link, policy_to_aspm_state(link)); > > > > > > > > > > > > link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0; > > > > > > > > >
Bjorn Helgaas <helgaas@kernel.org> 於 2024年5月4日 週六 上午6:28寫道: > > [+cc Francisco] > > On Fri, May 03, 2024 at 12:15:49PM -0700, David E. Box wrote: > > On Fri, 2024-05-03 at 17:45 +0800, Jian-Hong Pan wrote: > > > David E. Box <david.e.box@linux.intel.com> 於 2024年5月1日 週三 上午2:26寫道: > > > > On Tue, 2024-04-30 at 15:46 +0800, Jian-Hong Pan wrote: > > > > > David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道: > > > > > > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote: > > > > > > > Currently, when enable link's L1.2 features with > > > > > > > __pci_enable_link_state(), > > > > > > > it configs the link directly without ensuring related L1.2 parameters, > > > > > > > such > > > > > > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have > > > > > > > been > > > > > > > programmed. > > > > > > > > > > > > > > This leads the link's L1.2 between PCIe Root Port and child device > > > > > > > gets > > > > > > > wrong configs when a caller tries to enabled it. > > > > > > > > > > > > > > Here is a failed example on ASUS B1400CEAE with enabled VMD: > > > > > > > > > > > > > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor > > > > > > > PCIe > > > > > > > Controller (rev 01) (prog-if 00 [Normal decode]) > > > > > > > ... > > > > > > > Capabilities: [200 v1] L1 PM Substates > > > > > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ > > > > > > > L1_PM_Substates+ > > > > > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > > > > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > > > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > > > > > > L1SubCtl2: T_PwrOn=50us > > > > > > > > > > > > > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue > > > > > > > SN550 > > > > > > > NVMe > > > > > > > SSD (rev 01) (prog-if 02 [NVM Express]) > > > > > > > ... > > > > > > > Capabilities: [900 v1] L1 PM Substates > > > > > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > > > > L1_PM_Substates+ > > > > > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > > > > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > > > > T_CommonMode=0us LTR1.2_Threshold=0ns > > > > > > > L1SubCtl2: T_PwrOn=10us > > > > > > > > > > > > > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the > > > > > > > PCIe > > > > > > > Root Port and the child NVMe, they should be programmed with the same > > > > > > > LTR1.2_Threshold value. However, they have different values in this > > > > > > > case. > > > > > > > > > > > > > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly > > > > > > > before > > > > > > > enable L1.2 bits of L1 PM Substates Control Register in > > > > > > > __pci_enable_link_state(). > > > > > > > > > > > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394 > > > > > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org> > > > > > > > --- > > > > > > > v2: > > > > > > > - Prepare the PCIe LTR parameters before enable L1 Substates > > > > > > > > > > > > > > v3: > > > > > > > - Only enable supported features for the L1 Substates part > > > > > > > > > > > > > > v4: > > > > > > > - Focus on fixing L1.2 parameters, instead of re-initializing whole > > > > > > > L1SS > > > > > > > > > > > > > > v5: > > > > > > > - Fix typo and commit message > > > > > > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce > > > > > > > aspm_get_l1ss_cap()" > > > > > > > > > > > > > > drivers/pci/pcie/aspm.c | 12 ++++++++++++ > > > > > > > 1 file changed, 12 insertions(+) > > > > > > > > > > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > > > > > > index c55ac11faa73..553327dee991 100644 > > > > > > > --- a/drivers/pci/pcie/aspm.c > > > > > > > +++ b/drivers/pci/pcie/aspm.c > > > > > > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state); > > > > > > > static int __pci_enable_link_state(struct pci_dev *pdev, int state, > > > > > > > bool > > > > > > > locked) > > > > > > > { > > > > > > > struct pcie_link_state *link = pcie_aspm_get_link(pdev); > > > > > > > + struct pci_dev *child = link->downstream, *parent = link- > > > > > > > >pdev; > > > > > > > + u32 parent_l1ss_cap, child_l1ss_cap; > > > > > > > > > > > > > > if (!link) > > > > > > > return -EINVAL; > > > > > > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct > > > > > > > pci_dev > > > > > > > *pdev, int state, bool locked) > > > > > > > link->aspm_default |= ASPM_STATE_L1_1_PCIPM | > > > > > > > ASPM_STATE_L1; > > > > > > > if (state & PCIE_LINK_STATE_L1_2_PCIPM) > > > > > > > link->aspm_default |= ASPM_STATE_L1_2_PCIPM | > > > > > > > ASPM_STATE_L1; > > > > > > > + /* > > > > > > > + * Ensure L1.2 parameters: Common_Mode_Restore_Times, > > > > > > > T_POWER_ON > > > > > > > and > > > > > > > + * LTR_L1.2_THRESHOLD are programmed properly before enable > > > > > > > bits > > > > > > > for > > > > > > > + * L1.2, per PCIe r6.0, sec 5.5.4. > > > > > > > + */ > > > > > > > + if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) { > > > > > > > > > > > > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags. > > FWIW, Ilpo has removed the ASPM_STATE flags, so eventually this would > have to be updated to apply on the current pci/aspm branch. We're at > rc6 already, so likely this will end up being v6.11 material so you'll > be able to rebase on v6.10-rc1 when it comes out. > > > > > > Thanks for your review, but I notice some description in PCIe spec, > > > > > 5.5.4 L1 PM Substates Configuration: > > > > > "Prior to setting either or both of the enable bits for L1.2, the > > > > > values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2 > > > > > Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale > > > > > fields) must be programmed." => I think this includes both "ASPM L1.2 > > > > > Enable" and "PCI-PM L1.2 Enable" bits. > > > > > > > > That's fine. While the spec clearly calls out the ASPM L1.2 Enable bit here, > > > > I > > > > see no harm in including PCI-PM L1.2 in that check. This is what the code > > > > already does in aspm_l1ss_init(). > > > > > > > > The issue is the mixed used of two different types of flags that don't have > > > > the > > > > same meaning. 'state' contains PCIE_LINK_STATE flags which are part of the > > > > caller API to the pci_<enabled/disable>_link_state() functions. The > > > > ASPM_STATE > > > > flags are used internally to aspm.c to track all states and their meaningful > > > > combinations such as ASPM_STATE_L1_2_MASK which includes ASPM L1.2 and PCI- > > > > PM > > > > L1.2. You should not do bit operations between them. > > > > > > > > Also, you should not require that the timings be calculated only if L1_2 is > > > > enabled. You should calculate the timings as long as it's capable. This is > > > > also > > > > what aspm_l1ss_init() does. > > > > > > > > The confusion might be over the fact that you are having > > > > __pci_enable_link_state() call aspm_calc_l12_info(). This should have been > > > > handled during initialization of the link in aspm_l1ss_init() and I'm not > > > > sure > > > > why it didn't. Maybe it's because, for VMD, ASPM default state would have > > > > started out all disabled and this somehow led to aspm_l1ss_init() not > > > > getting > > > > called. But looking through the code I don't see it. It would be great if > > > > you > > > > can confirm why they weren't calculated before. > > > > > > I debug it again. If I delete the pci_reset_bus() in vmd controller like: > > > > > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c > > > index 87b7856f375a..39bfda4350bf 100644 > > > --- a/drivers/pci/controller/vmd.c > > > +++ b/drivers/pci/controller/vmd.c > > > @@ -930,25 +930,6 @@ static int vmd_enable_domain(struct vmd_dev *vmd, > > > unsigned long features) > > > pci_scan_child_bus(vmd->bus); > > > vmd_domain_reset(vmd); > > > > > > - /* When Intel VMD is enabled, the OS does not discover the Root Ports > > > - * owned by Intel VMD within the MMCFG space. pci_reset_bus() applies > > > - * a reset to the parent of the PCI device supplied as argument. This > > > - * is why we pass a child device, so the reset can be triggered at > > > - * the Intel bridge level and propagated to all the children in the > > > - * hierarchy. > > > - */ > > > - list_for_each_entry(child, &vmd->bus->children, node) { > > > - if (!list_empty(&child->devices)) { > > > - dev = list_first_entry(&child->devices, > > > - struct pci_dev, bus_list); > > > - ret = pci_reset_bus(dev); > > > > Hi Nirmal. It's not clear to me from the comment why there's a need to do a bus > > reset. It looks like it is causing misconfiguration of the ASPM L1.2 timings > > which would have been done above in pci_scan_child_bus(). Jian-Hong discovered > > that without the above reset code, the timings are correct. > > I don't understand that comment either. If we don't enumerate the > Root Ports below VMD, it sounds like something is wrong, and reset > doesn't seem like the right fix. > > The reset was added by 0a584655ef89 ("PCI: vmd: Fix secondary bus > reset for Intel bridges") for guest reboots. Maybe Francisco can shed > more light on it. > > > This patch recalculates the timings during the call to pci_enable_link_state() > > which is called during pci_bus_walk() below. Originally I thought the > > recalculation might have been needed by all callers to pci_enabled_link_state() > > since it changes the default BIOS configuration. But it looks like the reset is > > the cause and only the VMD driver would need the recalculation as a result. I > > don't see qcom doing a reset. > > > > Jian-Hong, given this (and assuming the reset is needed) I would not call > > aspm_calc_l12_info() from pci_enable_link_state() but instead try redoing the > > whole ASPM initialization right after the resets are done, maybe by calling > > pci_scan_child_bus() again. What do you think Bjorn? > > I would expect pci_reset_bus() to save and restore config space, but > if we don't enumerate all the devices correctly, I suppose we wouldn't > do that for devices we don't know about. According to the discussion above, the misconfiguration of ASPM L1.2 timings comes from pci_reset_bus() which is added by another issue related to VT-d pass-through: * 6aab5622296b ("PCI: vmd: Clean up domain before enumeration") * 0a584655ef89 ("PCI: vmd: Fix secondary bus reset for Intel bridges") However, I do not quite understand the scenario. Should I drop patches "PCI/ASPM: Introduce aspm_get_l1ss_cap()" and "PCI/ASPM: Fix L1.2 parameters when enable link state", then only send "PCI: vmd: Set PCI devices to D0 before enable PCI PM's L1 substates" and "PCI/ASPM: Add notes about enabling PCI-PM L1SS to pci_enable_link_state(_locked)" first? Jian-Hong Pan > > > - if (ret) > > > - pci_warn(dev, "can't reset device: %d\n", > > > ret); > > > - > > > - break; > > > - } > > > - } > > > - > > > pci_assign_unassigned_bus_resources(vmd->bus); > > > > > > pci_walk_bus(vmd->bus, vmd_pm_enable_quirk, &features); > > > > > > Although PCI-PM_L1.2 is disabled, both PCI bridge and the NVMe's > > > LTR1.2_Threshold are configured as 101376ns: > > > > > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core > > > Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal > > > decode]) > > > ... > > > Capabilities: [200 v1] L1 PM Substates > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > > L1SubCtl2: T_PwrOn=50us > > > > > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD > > > Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express]) > > > ... > > > Capabilities: [900 v1] L1 PM Substates > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > T_CommonMode=0us LTR1.2_Threshold=101376ns > > > L1SubCtl2: T_PwrOn=50us > > > > > > Then, I apply the patch "PCI: vmd: Set PCI devices to D0 before enable > > > PCI PM's L1 substates". Both PCI bridge and the NVMe's PCI-PM_L1.2 is > > > enabled and LTR1.2_Threshold is configured as 101376ns. > > > > > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core > > > Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal > > > decode]) > > > ... > > > Capabilities: [200 v1] L1 PM Substates > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > > L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > > L1SubCtl2: T_PwrOn=50us > > > > > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD > > > Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express]) > > > ... > > > Capabilities: [900 v1] L1 PM Substates > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > > L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > T_CommonMode=0us LTR1.2_Threshold=101376ns > > > L1SubCtl2: T_PwrOn=50us > > > > > > I do not know VMD very much. However, from the test result, it looks > > > like LTR1.2_Threshold has been configured properly originally. But, > > > LTR1.2_Threshold is misconfigured by 'pci_reset_bus()'. > > > ... > > > > > > 'state' should not even matter. > > > > > > The timings should always be calculated and programmed as long > > > > > > as L1_2 is capable. That way the timings are ready even if L1_2 isn't > > > > > > being > > > > > > enabled now (in case the user enables it later). > > > > > > > > > > > > > + parent_l1ss_cap = aspm_get_l1ss_cap(parent); > > > > > > > + child_l1ss_cap = aspm_get_l1ss_cap(child); > > > > > > > + aspm_calc_l12_info(link, parent_l1ss_cap, > > > > > > > child_l1ss_cap); > > > > > > > + } > > > > > > > pcie_config_aspm_link(link, policy_to_aspm_state(link)); > > > > > > > > > > > > > > link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0; > > > > > > > > > > > >
Bjorn Helgaas <helgaas@kernel.org> 於 2024年5月4日 週六 上午6:28寫道: > > [+cc Francisco] > > On Fri, May 03, 2024 at 12:15:49PM -0700, David E. Box wrote: > > On Fri, 2024-05-03 at 17:45 +0800, Jian-Hong Pan wrote: > > > David E. Box <david.e.box@linux.intel.com> 於 2024年5月1日 週三 上午2:26寫道: > > > > On Tue, 2024-04-30 at 15:46 +0800, Jian-Hong Pan wrote: > > > > > David E. Box <david.e.box@linux.intel.com> 於 2024年4月27日 週六 上午8:03寫道: > > > > > > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote: > > > > > > > Currently, when enable link's L1.2 features with > > > > > > > __pci_enable_link_state(), > > > > > > > it configs the link directly without ensuring related L1.2 parameters, > > > > > > > such > > > > > > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have > > > > > > > been > > > > > > > programmed. > > > > > > > > > > > > > > This leads the link's L1.2 between PCIe Root Port and child device > > > > > > > gets > > > > > > > wrong configs when a caller tries to enabled it. > > > > > > > > > > > > > > Here is a failed example on ASUS B1400CEAE with enabled VMD: > > > > > > > > > > > > > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor > > > > > > > PCIe > > > > > > > Controller (rev 01) (prog-if 00 [Normal decode]) > > > > > > > ... > > > > > > > Capabilities: [200 v1] L1 PM Substates > > > > > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ > > > > > > > L1_PM_Substates+ > > > > > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > > > > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > > > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > > > > > > L1SubCtl2: T_PwrOn=50us > > > > > > > > > > > > > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue > > > > > > > SN550 > > > > > > > NVMe > > > > > > > SSD (rev 01) (prog-if 02 [NVM Express]) > > > > > > > ... > > > > > > > Capabilities: [900 v1] L1 PM Substates > > > > > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > > > > L1_PM_Substates+ > > > > > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > > > > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > > > > > T_CommonMode=0us LTR1.2_Threshold=0ns > > > > > > > L1SubCtl2: T_PwrOn=10us > > > > > > > > > > > > > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the > > > > > > > PCIe > > > > > > > Root Port and the child NVMe, they should be programmed with the same > > > > > > > LTR1.2_Threshold value. However, they have different values in this > > > > > > > case. > > > > > > > > > > > > > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly > > > > > > > before > > > > > > > enable L1.2 bits of L1 PM Substates Control Register in > > > > > > > __pci_enable_link_state(). > > > > > > > > > > > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394 > > > > > > > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org> > > > > > > > --- > > > > > > > v2: > > > > > > > - Prepare the PCIe LTR parameters before enable L1 Substates > > > > > > > > > > > > > > v3: > > > > > > > - Only enable supported features for the L1 Substates part > > > > > > > > > > > > > > v4: > > > > > > > - Focus on fixing L1.2 parameters, instead of re-initializing whole > > > > > > > L1SS > > > > > > > > > > > > > > v5: > > > > > > > - Fix typo and commit message > > > > > > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce > > > > > > > aspm_get_l1ss_cap()" > > > > > > > > > > > > > > drivers/pci/pcie/aspm.c | 12 ++++++++++++ > > > > > > > 1 file changed, 12 insertions(+) > > > > > > > > > > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > > > > > > index c55ac11faa73..553327dee991 100644 > > > > > > > --- a/drivers/pci/pcie/aspm.c > > > > > > > +++ b/drivers/pci/pcie/aspm.c > > > > > > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state); > > > > > > > static int __pci_enable_link_state(struct pci_dev *pdev, int state, > > > > > > > bool > > > > > > > locked) > > > > > > > { > > > > > > > struct pcie_link_state *link = pcie_aspm_get_link(pdev); > > > > > > > + struct pci_dev *child = link->downstream, *parent = link- > > > > > > > >pdev; > > > > > > > + u32 parent_l1ss_cap, child_l1ss_cap; > > > > > > > > > > > > > > if (!link) > > > > > > > return -EINVAL; > > > > > > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct > > > > > > > pci_dev > > > > > > > *pdev, int state, bool locked) > > > > > > > link->aspm_default |= ASPM_STATE_L1_1_PCIPM | > > > > > > > ASPM_STATE_L1; > > > > > > > if (state & PCIE_LINK_STATE_L1_2_PCIPM) > > > > > > > link->aspm_default |= ASPM_STATE_L1_2_PCIPM | > > > > > > > ASPM_STATE_L1; > > > > > > > + /* > > > > > > > + * Ensure L1.2 parameters: Common_Mode_Restore_Times, > > > > > > > T_POWER_ON > > > > > > > and > > > > > > > + * LTR_L1.2_THRESHOLD are programmed properly before enable > > > > > > > bits > > > > > > > for > > > > > > > + * L1.2, per PCIe r6.0, sec 5.5.4. > > > > > > > + */ > > > > > > > + if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) { > > > > > > > > > > > > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags. > > FWIW, Ilpo has removed the ASPM_STATE flags, so eventually this would > have to be updated to apply on the current pci/aspm branch. We're at > rc6 already, so likely this will end up being v6.11 material so you'll > be able to rebase on v6.10-rc1 when it comes out. > > > > > > Thanks for your review, but I notice some description in PCIe spec, > > > > > 5.5.4 L1 PM Substates Configuration: > > > > > "Prior to setting either or both of the enable bits for L1.2, the > > > > > values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2 > > > > > Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale > > > > > fields) must be programmed." => I think this includes both "ASPM L1.2 > > > > > Enable" and "PCI-PM L1.2 Enable" bits. > > > > > > > > That's fine. While the spec clearly calls out the ASPM L1.2 Enable bit here, > > > > I > > > > see no harm in including PCI-PM L1.2 in that check. This is what the code > > > > already does in aspm_l1ss_init(). > > > > > > > > The issue is the mixed used of two different types of flags that don't have > > > > the > > > > same meaning. 'state' contains PCIE_LINK_STATE flags which are part of the > > > > caller API to the pci_<enabled/disable>_link_state() functions. The > > > > ASPM_STATE > > > > flags are used internally to aspm.c to track all states and their meaningful > > > > combinations such as ASPM_STATE_L1_2_MASK which includes ASPM L1.2 and PCI- > > > > PM > > > > L1.2. You should not do bit operations between them. > > > > > > > > Also, you should not require that the timings be calculated only if L1_2 is > > > > enabled. You should calculate the timings as long as it's capable. This is > > > > also > > > > what aspm_l1ss_init() does. > > > > > > > > The confusion might be over the fact that you are having > > > > __pci_enable_link_state() call aspm_calc_l12_info(). This should have been > > > > handled during initialization of the link in aspm_l1ss_init() and I'm not > > > > sure > > > > why it didn't. Maybe it's because, for VMD, ASPM default state would have > > > > started out all disabled and this somehow led to aspm_l1ss_init() not > > > > getting > > > > called. But looking through the code I don't see it. It would be great if > > > > you > > > > can confirm why they weren't calculated before. > > > > > > I debug it again. If I delete the pci_reset_bus() in vmd controller like: > > > > > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c > > > index 87b7856f375a..39bfda4350bf 100644 > > > --- a/drivers/pci/controller/vmd.c > > > +++ b/drivers/pci/controller/vmd.c > > > @@ -930,25 +930,6 @@ static int vmd_enable_domain(struct vmd_dev *vmd, > > > unsigned long features) > > > pci_scan_child_bus(vmd->bus); > > > vmd_domain_reset(vmd); > > > > > > - /* When Intel VMD is enabled, the OS does not discover the Root Ports > > > - * owned by Intel VMD within the MMCFG space. pci_reset_bus() applies > > > - * a reset to the parent of the PCI device supplied as argument. This > > > - * is why we pass a child device, so the reset can be triggered at > > > - * the Intel bridge level and propagated to all the children in the > > > - * hierarchy. > > > - */ > > > - list_for_each_entry(child, &vmd->bus->children, node) { > > > - if (!list_empty(&child->devices)) { > > > - dev = list_first_entry(&child->devices, > > > - struct pci_dev, bus_list); > > > - ret = pci_reset_bus(dev); > > > > Hi Nirmal. It's not clear to me from the comment why there's a need to do a bus > > reset. It looks like it is causing misconfiguration of the ASPM L1.2 timings > > which would have been done above in pci_scan_child_bus(). Jian-Hong discovered > > that without the above reset code, the timings are correct. > > I don't understand that comment either. If we don't enumerate the > Root Ports below VMD, it sounds like something is wrong, and reset > doesn't seem like the right fix. > > The reset was added by 0a584655ef89 ("PCI: vmd: Fix secondary bus > reset for Intel bridges") for guest reboots. Maybe Francisco can shed > more light on it. Hi Francisco, We found: the VMD remapped PCI devices' (PCIe bridge and NVMe SSD) PCIe LTR1.2_Threshold is correct originally, but becomes misconfigured after "pci_reset_bus()" in "vmd_enable_domain()". Here is an example that the PCI bridge and its child NVMe SSD have different LTR1.2_Threshold: 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe Controller (rev 01) (prog-if 00 [Normal decode]) ... Capabilities: [200 v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ PortCommonModeRestoreTime=45us PortTPowerOnTime=50us L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- T_CommonMode=45us LTR1.2_Threshold=101376ns L1SubCtl2: T_PwrOn=50us 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 NVMe SSD (rev 01) (prog-if 02 [NVM Express]) ... Capabilities: [900 v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ PortCommonModeRestoreTime=32us PortTPowerOnTime=10us L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- T_CommonMode=0us LTR1.2_Threshold=0ns L1SubCtl2: T_PwrOn=10us It will be great to have your knowledge: Is resetting a PCI device at bridge level still needed here? Thanks, Jian-Hong Pan > > This patch recalculates the timings during the call to pci_enable_link_state() > > which is called during pci_bus_walk() below. Originally I thought the > > recalculation might have been needed by all callers to pci_enabled_link_state() > > since it changes the default BIOS configuration. But it looks like the reset is > > the cause and only the VMD driver would need the recalculation as a result. I > > don't see qcom doing a reset. > > > > Jian-Hong, given this (and assuming the reset is needed) I would not call > > aspm_calc_l12_info() from pci_enable_link_state() but instead try redoing the > > whole ASPM initialization right after the resets are done, maybe by calling > > pci_scan_child_bus() again. What do you think Bjorn? > > I would expect pci_reset_bus() to save and restore config space, but > if we don't enumerate all the devices correctly, I suppose we wouldn't > do that for devices we don't know about. > > > > - if (ret) > > > - pci_warn(dev, "can't reset device: %d\n", > > > ret); > > > - > > > - break; > > > - } > > > - } > > > - > > > pci_assign_unassigned_bus_resources(vmd->bus); > > > > > > pci_walk_bus(vmd->bus, vmd_pm_enable_quirk, &features); > > > > > > Although PCI-PM_L1.2 is disabled, both PCI bridge and the NVMe's > > > LTR1.2_Threshold are configured as 101376ns: > > > > > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core > > > Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal > > > decode]) > > > ... > > > Capabilities: [200 v1] L1 PM Substates > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > > L1SubCtl2: T_PwrOn=50us > > > > > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD > > > Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express]) > > > ... > > > Capabilities: [900 v1] L1 PM Substates > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > > L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > T_CommonMode=0us LTR1.2_Threshold=101376ns > > > L1SubCtl2: T_PwrOn=50us > > > > > > Then, I apply the patch "PCI: vmd: Set PCI devices to D0 before enable > > > PCI PM's L1 substates". Both PCI bridge and the NVMe's PCI-PM_L1.2 is > > > enabled and LTR1.2_Threshold is configured as 101376ns. > > > > > > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core > > > Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal > > > decode]) > > > ... > > > Capabilities: [200 v1] L1 PM Substates > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ > > > PortCommonModeRestoreTime=45us PortTPowerOnTime=50us > > > L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > T_CommonMode=45us LTR1.2_Threshold=101376ns > > > L1SubCtl2: T_PwrOn=50us > > > > > > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD > > > Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express]) > > > ... > > > Capabilities: [900 v1] L1 PM Substates > > > L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ > > > PortCommonModeRestoreTime=32us PortTPowerOnTime=10us > > > L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- > > > T_CommonMode=0us LTR1.2_Threshold=101376ns > > > L1SubCtl2: T_PwrOn=50us > > > > > > I do not know VMD very much. However, from the test result, it looks > > > like LTR1.2_Threshold has been configured properly originally. But, > > > LTR1.2_Threshold is misconfigured by 'pci_reset_bus()'. > > > ... > > > > > > 'state' should not even matter. > > > > > > The timings should always be calculated and programmed as long > > > > > > as L1_2 is capable. That way the timings are ready even if L1_2 isn't > > > > > > being > > > > > > enabled now (in case the user enables it later). > > > > > > > > > > > > > + parent_l1ss_cap = aspm_get_l1ss_cap(parent); > > > > > > > + child_l1ss_cap = aspm_get_l1ss_cap(child); > > > > > > > + aspm_calc_l12_info(link, parent_l1ss_cap, > > > > > > > child_l1ss_cap); > > > > > > > + } > > > > > > > pcie_config_aspm_link(link, policy_to_aspm_state(link)); > > > > > > > > > > > > > > link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0; > > > > > > > > > > > >
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index c55ac11faa73..553327dee991 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state); static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool locked) { struct pcie_link_state *link = pcie_aspm_get_link(pdev); + struct pci_dev *child = link->downstream, *parent = link->pdev; + u32 parent_l1ss_cap, child_l1ss_cap; if (!link) return -EINVAL; @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool locked) link->aspm_default |= ASPM_STATE_L1_1_PCIPM | ASPM_STATE_L1; if (state & PCIE_LINK_STATE_L1_2_PCIPM) link->aspm_default |= ASPM_STATE_L1_2_PCIPM | ASPM_STATE_L1; + /* + * Ensure L1.2 parameters: Common_Mode_Restore_Times, T_POWER_ON and + * LTR_L1.2_THRESHOLD are programmed properly before enable bits for + * L1.2, per PCIe r6.0, sec 5.5.4. + */ + if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) { + parent_l1ss_cap = aspm_get_l1ss_cap(parent); + child_l1ss_cap = aspm_get_l1ss_cap(child); + aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap); + } pcie_config_aspm_link(link, policy_to_aspm_state(link)); link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
Currently, when enable link's L1.2 features with __pci_enable_link_state(), it configs the link directly without ensuring related L1.2 parameters, such as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have been programmed. This leads the link's L1.2 between PCIe Root Port and child device gets wrong configs when a caller tries to enabled it. Here is a failed example on ASUS B1400CEAE with enabled VMD: 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe Controller (rev 01) (prog-if 00 [Normal decode]) ... Capabilities: [200 v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ PortCommonModeRestoreTime=45us PortTPowerOnTime=50us L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- T_CommonMode=45us LTR1.2_Threshold=101376ns L1SubCtl2: T_PwrOn=50us 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 NVMe SSD (rev 01) (prog-if 02 [NVM Express]) ... Capabilities: [900 v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ PortCommonModeRestoreTime=32us PortTPowerOnTime=10us L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- T_CommonMode=0us LTR1.2_Threshold=0ns L1SubCtl2: T_PwrOn=10us According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the PCIe Root Port and the child NVMe, they should be programmed with the same LTR1.2_Threshold value. However, they have different values in this case. Invoke aspm_calc_l12_info() to program the L1.2 parameters properly before enable L1.2 bits of L1 PM Substates Control Register in __pci_enable_link_state(). Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394 Signed-off-by: Jian-Hong Pan <jhp@endlessos.org> --- v2: - Prepare the PCIe LTR parameters before enable L1 Substates v3: - Only enable supported features for the L1 Substates part v4: - Focus on fixing L1.2 parameters, instead of re-initializing whole L1SS v5: - Fix typo and commit message - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce aspm_get_l1ss_cap()" drivers/pci/pcie/aspm.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)