Message ID | 20230913040832.114610-3-mario.limonciello@amd.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Add quirk for PCIe root port on AMD systems | expand |
On Tue, Sep 12, 2023 at 11:08:32PM -0500, Mario Limonciello wrote: > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -2752,6 +2752,11 @@ int pci_prepare_to_sleep(struct pci_dev *dev) > if (target_state == PCI_POWER_ERROR) > return -EIO; > > + /* quirk to avoid setting D3 */ > + if (wakeup && dev->dev_flags & PCI_DEV_FLAGS_NO_WAKE_D3 && > + (target_state == PCI_D3hot || target_state == PCI_D3cold)) > + target_state = PCI_D0; > + > pci_enable_wake(dev, target_state, wakeup); > > error = pci_set_power_state(dev, target_state); Would it be possible to just add the affected system to bridge_d3_blacklist[]? Or would that defeat power management of other (non-affected) Root Ports in the same machine? There's already PCI_DEV_FLAGS_NO_D3, would it be possible to just reuse that instead of adding another codepath for D3 quirks? Thanks, Lukas
On 9/12/2023 23:25, Lukas Wunner wrote: > On Tue, Sep 12, 2023 at 11:08:32PM -0500, Mario Limonciello wrote: >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -2752,6 +2752,11 @@ int pci_prepare_to_sleep(struct pci_dev *dev) >> if (target_state == PCI_POWER_ERROR) >> return -EIO; >> >> + /* quirk to avoid setting D3 */ >> + if (wakeup && dev->dev_flags & PCI_DEV_FLAGS_NO_WAKE_D3 && >> + (target_state == PCI_D3hot || target_state == PCI_D3cold)) >> + target_state = PCI_D0; >> + >> pci_enable_wake(dev, target_state, wakeup); >> >> error = pci_set_power_state(dev, target_state); > > Would it be possible to just add the affected system to > bridge_d3_blacklist[]? It's initially reported on Lenovo Z13, but it affects all Rembrandt and Phoenix machines that have USB4 controller enabled. It's reproduced on every OEM system I have access to. > > Or would that defeat power management of other (non-affected) > Root Ports in the same machine? > > There's already PCI_DEV_FLAGS_NO_D3, would it be possible to just > reuse that instead of adding another codepath for D3 quirks? > The root port can handle D3 (including wakeup) at runtime fine. Issue occurs only during s2idle w/ hardware sleep. In v16/v17 (see cover letter for links) Rafael suggested to tie this specifically to suspend behavior and when wakeup flag is set. I didn't think it was appropriate to overload the existing flag because of this difference. > Thanks, > > Lukas
On Wed, Sep 13, 2023 at 6:44 AM Mario Limonciello <mario.limonciello@amd.com> wrote: > > On 9/12/2023 23:25, Lukas Wunner wrote: > > On Tue, Sep 12, 2023 at 11:08:32PM -0500, Mario Limonciello wrote: > >> --- a/drivers/pci/pci.c > >> +++ b/drivers/pci/pci.c > >> @@ -2752,6 +2752,11 @@ int pci_prepare_to_sleep(struct pci_dev *dev) > >> if (target_state == PCI_POWER_ERROR) > >> return -EIO; > >> > >> + /* quirk to avoid setting D3 */ > >> + if (wakeup && dev->dev_flags & PCI_DEV_FLAGS_NO_WAKE_D3 && > >> + (target_state == PCI_D3hot || target_state == PCI_D3cold)) > >> + target_state = PCI_D0; > >> + > >> pci_enable_wake(dev, target_state, wakeup); > >> > >> error = pci_set_power_state(dev, target_state); > > > > Would it be possible to just add the affected system to > > bridge_d3_blacklist[]? > > It's initially reported on Lenovo Z13, but it affects all Rembrandt and > Phoenix machines that have USB4 controller enabled. > > It's reproduced on every OEM system I have access to. > > > > > Or would that defeat power management of other (non-affected) > > Root Ports in the same machine? > > > > There's already PCI_DEV_FLAGS_NO_D3, would it be possible to just > > reuse that instead of adding another codepath for D3 quirks? > > > > The root port can handle D3 (including wakeup) at runtime fine. > Issue occurs only during s2idle w/ hardware sleep. > > In v16/v17 (see cover letter for links) Rafael suggested to tie this > specifically to suspend behavior and when wakeup flag is set. Right, it is not necessary to avoid D3 on those ports for PM-runtime and when there are no system wakeup devices underneath. > I didn't think it was appropriate to overload the existing flag because > of this difference. I agree.
Hi Mario, kernel test robot noticed the following build warnings: [auto build test WARNING on 0bb80ecc33a8fb5a682236443c1e740d5c917d1d] url: https://github.com/intel-lab-lkp/linux/commits/Mario-Limonciello/PCI-Move-the-PCI_CLASS_SERIAL_USB_USB4-definition-to-common-header/20230913-121559 base: 0bb80ecc33a8fb5a682236443c1e740d5c917d1d patch link: https://lore.kernel.org/r/20230913040832.114610-3-mario.limonciello%40amd.com patch subject: [PATCH v18 2/2] PCI: Add a quirk for AMD PCIe root ports w/ USB4 controllers config: riscv-defconfig (https://download.01.org/0day-ci/archive/20230913/202309131736.HcuHnd8S-lkp@intel.com/config) compiler: riscv64-linux-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230913/202309131736.HcuHnd8S-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202309131736.HcuHnd8S-lkp@intel.com/ All warnings (new ones prefixed by >>): drivers/pci/quirks.c: In function 'quirk_ryzen_rp_d3': >> drivers/pci/quirks.c:3890:16: warning: suggest parentheses around assignment used as truth value [-Wparentheses] 3890 | while (child = pci_get_class(PCI_CLASS_SERIAL_USB_USB4, child)) { | ^~~~~ vim +3890 drivers/pci/quirks.c 3775 3776 /* 3777 * Some AMD/ATI GPUS (HD8570 - Oland) report that a D3hot->D0 transition 3778 * causes a reset (i.e., they advertise NoSoftRst-). This transition seems 3779 * to have no effect on the device: it retains the framebuffer contents and 3780 * monitor sync. Advertising this support makes other layers, like VFIO, 3781 * assume pci_reset_function() is viable for this device. Mark it as 3782 * unavailable to skip it when testing reset methods. 3783 */ 3784 DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_ATI, PCI_ANY_ID, 3785 PCI_CLASS_DISPLAY_VGA, 8, quirk_no_pm_reset); 3786 3787 /* 3788 * Thunderbolt controllers with broken MSI hotplug signaling: 3789 * Entire 1st generation (Light Ridge, Eagle Ridge, Light Peak) and part 3790 * of the 2nd generation (Cactus Ridge 4C up to revision 1, Port Ridge). 3791 */ 3792 static void quirk_thunderbolt_hotplug_msi(struct pci_dev *pdev) 3793 { 3794 if (pdev->is_hotplug_bridge && 3795 (pdev->device != PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C || 3796 pdev->revision <= 1)) 3797 pdev->no_msi = 1; 3798 } 3799 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LIGHT_RIDGE, 3800 quirk_thunderbolt_hotplug_msi); 3801 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_EAGLE_RIDGE, 3802 quirk_thunderbolt_hotplug_msi); 3803 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LIGHT_PEAK, 3804 quirk_thunderbolt_hotplug_msi); 3805 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C, 3806 quirk_thunderbolt_hotplug_msi); 3807 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE, 3808 quirk_thunderbolt_hotplug_msi); 3809 3810 #ifdef CONFIG_ACPI 3811 /* 3812 * Apple: Shutdown Cactus Ridge Thunderbolt controller. 3813 * 3814 * On Apple hardware the Cactus Ridge Thunderbolt controller needs to be 3815 * shutdown before suspend. Otherwise the native host interface (NHI) will not 3816 * be present after resume if a device was plugged in before suspend. 3817 * 3818 * The Thunderbolt controller consists of a PCIe switch with downstream 3819 * bridges leading to the NHI and to the tunnel PCI bridges. 3820 * 3821 * This quirk cuts power to the whole chip. Therefore we have to apply it 3822 * during suspend_noirq of the upstream bridge. 3823 * 3824 * Power is automagically restored before resume. No action is needed. 3825 */ 3826 static void quirk_apple_poweroff_thunderbolt(struct pci_dev *dev) 3827 { 3828 acpi_handle bridge, SXIO, SXFP, SXLV; 3829 3830 if (!x86_apple_machine) 3831 return; 3832 if (pci_pcie_type(dev) != PCI_EXP_TYPE_UPSTREAM) 3833 return; 3834 3835 /* 3836 * SXIO/SXFP/SXLF turns off power to the Thunderbolt controller. 3837 * We don't know how to turn it back on again, but firmware does, 3838 * so we can only use SXIO/SXFP/SXLF if we're suspending via 3839 * firmware. 3840 */ 3841 if (!pm_suspend_via_firmware()) 3842 return; 3843 3844 bridge = ACPI_HANDLE(&dev->dev); 3845 if (!bridge) 3846 return; 3847 3848 /* 3849 * SXIO and SXLV are present only on machines requiring this quirk. 3850 * Thunderbolt bridges in external devices might have the same 3851 * device ID as those on the host, but they will not have the 3852 * associated ACPI methods. This implicitly checks that we are at 3853 * the right bridge. 3854 */ 3855 if (ACPI_FAILURE(acpi_get_handle(bridge, "DSB0.NHI0.SXIO", &SXIO)) 3856 || ACPI_FAILURE(acpi_get_handle(bridge, "DSB0.NHI0.SXFP", &SXFP)) 3857 || ACPI_FAILURE(acpi_get_handle(bridge, "DSB0.NHI0.SXLV", &SXLV))) 3858 return; 3859 pci_info(dev, "quirk: cutting power to Thunderbolt controller...\n"); 3860 3861 /* magic sequence */ 3862 acpi_execute_simple_method(SXIO, NULL, 1); 3863 acpi_execute_simple_method(SXFP, NULL, 0); 3864 msleep(300); 3865 acpi_execute_simple_method(SXLV, NULL, 0); 3866 acpi_execute_simple_method(SXIO, NULL, 0); 3867 acpi_execute_simple_method(SXLV, NULL, 0); 3868 } 3869 DECLARE_PCI_FIXUP_SUSPEND_LATE(PCI_VENDOR_ID_INTEL, 3870 PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C, 3871 quirk_apple_poweroff_thunderbolt); 3872 3873 3874 /* 3875 * Putting PCIe root ports on Ryzen SoCs with USB4 controllers into D3 power 3876 * states may cause problems when the system attempts wake up from s2idle. 3877 * 3878 * This manifests as a missing wakeup interrupt on the following systems: 3879 * Family 19h model 44h (PCI 0x14b9) 3880 * Family 19h model 74h (PCI 0x14eb) 3881 * Family 19h model 78h (PCI 0x14eb) 3882 * 3883 * Add a quirk to the root port if a USB4 controller is connected to it 3884 * to avoid D3 power states. 3885 */ 3886 static void quirk_ryzen_rp_d3(struct pci_dev *pdev) 3887 { 3888 struct pci_dev *child = NULL; 3889 > 3890 while (child = pci_get_class(PCI_CLASS_SERIAL_USB_USB4, child)) { 3891 if (pci_upstream_bridge(child) != pdev) 3892 continue; 3893 pdev->dev_flags |= PCI_DEV_FLAGS_NO_WAKE_D3; 3894 pci_info(pdev, "quirk: disabling D3 for wakeup\n"); 3895 break; 3896 } 3897 } 3898 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x14b9, quirk_ryzen_rp_d3); 3899 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x14eb, quirk_ryzen_rp_d3); 3900 #endif 3901
Hi Mario, kernel test robot noticed the following build warnings: [auto build test WARNING on 0bb80ecc33a8fb5a682236443c1e740d5c917d1d] url: https://github.com/intel-lab-lkp/linux/commits/Mario-Limonciello/PCI-Move-the-PCI_CLASS_SERIAL_USB_USB4-definition-to-common-header/20230913-121559 base: 0bb80ecc33a8fb5a682236443c1e740d5c917d1d patch link: https://lore.kernel.org/r/20230913040832.114610-3-mario.limonciello%40amd.com patch subject: [PATCH v18 2/2] PCI: Add a quirk for AMD PCIe root ports w/ USB4 controllers config: i386-randconfig-r023-20230913 (https://download.01.org/0day-ci/archive/20230913/202309131834.q68yWKdZ-lkp@intel.com/config) compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230913/202309131834.q68yWKdZ-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202309131834.q68yWKdZ-lkp@intel.com/ All warnings (new ones prefixed by >>): >> drivers/pci/quirks.c:3890:15: warning: using the result of an assignment as a condition without parentheses [-Wparentheses] while (child = pci_get_class(PCI_CLASS_SERIAL_USB_USB4, child)) { ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/pci/quirks.c:3890:15: note: place parentheses around the assignment to silence this warning while (child = pci_get_class(PCI_CLASS_SERIAL_USB_USB4, child)) { ^ ( ) drivers/pci/quirks.c:3890:15: note: use '==' to turn this assignment into an equality comparison while (child = pci_get_class(PCI_CLASS_SERIAL_USB_USB4, child)) { ^ == 1 warning generated. vim +3890 drivers/pci/quirks.c 3775 3776 /* 3777 * Some AMD/ATI GPUS (HD8570 - Oland) report that a D3hot->D0 transition 3778 * causes a reset (i.e., they advertise NoSoftRst-). This transition seems 3779 * to have no effect on the device: it retains the framebuffer contents and 3780 * monitor sync. Advertising this support makes other layers, like VFIO, 3781 * assume pci_reset_function() is viable for this device. Mark it as 3782 * unavailable to skip it when testing reset methods. 3783 */ 3784 DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_ATI, PCI_ANY_ID, 3785 PCI_CLASS_DISPLAY_VGA, 8, quirk_no_pm_reset); 3786 3787 /* 3788 * Thunderbolt controllers with broken MSI hotplug signaling: 3789 * Entire 1st generation (Light Ridge, Eagle Ridge, Light Peak) and part 3790 * of the 2nd generation (Cactus Ridge 4C up to revision 1, Port Ridge). 3791 */ 3792 static void quirk_thunderbolt_hotplug_msi(struct pci_dev *pdev) 3793 { 3794 if (pdev->is_hotplug_bridge && 3795 (pdev->device != PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C || 3796 pdev->revision <= 1)) 3797 pdev->no_msi = 1; 3798 } 3799 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LIGHT_RIDGE, 3800 quirk_thunderbolt_hotplug_msi); 3801 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_EAGLE_RIDGE, 3802 quirk_thunderbolt_hotplug_msi); 3803 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LIGHT_PEAK, 3804 quirk_thunderbolt_hotplug_msi); 3805 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C, 3806 quirk_thunderbolt_hotplug_msi); 3807 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE, 3808 quirk_thunderbolt_hotplug_msi); 3809 3810 #ifdef CONFIG_ACPI 3811 /* 3812 * Apple: Shutdown Cactus Ridge Thunderbolt controller. 3813 * 3814 * On Apple hardware the Cactus Ridge Thunderbolt controller needs to be 3815 * shutdown before suspend. Otherwise the native host interface (NHI) will not 3816 * be present after resume if a device was plugged in before suspend. 3817 * 3818 * The Thunderbolt controller consists of a PCIe switch with downstream 3819 * bridges leading to the NHI and to the tunnel PCI bridges. 3820 * 3821 * This quirk cuts power to the whole chip. Therefore we have to apply it 3822 * during suspend_noirq of the upstream bridge. 3823 * 3824 * Power is automagically restored before resume. No action is needed. 3825 */ 3826 static void quirk_apple_poweroff_thunderbolt(struct pci_dev *dev) 3827 { 3828 acpi_handle bridge, SXIO, SXFP, SXLV; 3829 3830 if (!x86_apple_machine) 3831 return; 3832 if (pci_pcie_type(dev) != PCI_EXP_TYPE_UPSTREAM) 3833 return; 3834 3835 /* 3836 * SXIO/SXFP/SXLF turns off power to the Thunderbolt controller. 3837 * We don't know how to turn it back on again, but firmware does, 3838 * so we can only use SXIO/SXFP/SXLF if we're suspending via 3839 * firmware. 3840 */ 3841 if (!pm_suspend_via_firmware()) 3842 return; 3843 3844 bridge = ACPI_HANDLE(&dev->dev); 3845 if (!bridge) 3846 return; 3847 3848 /* 3849 * SXIO and SXLV are present only on machines requiring this quirk. 3850 * Thunderbolt bridges in external devices might have the same 3851 * device ID as those on the host, but they will not have the 3852 * associated ACPI methods. This implicitly checks that we are at 3853 * the right bridge. 3854 */ 3855 if (ACPI_FAILURE(acpi_get_handle(bridge, "DSB0.NHI0.SXIO", &SXIO)) 3856 || ACPI_FAILURE(acpi_get_handle(bridge, "DSB0.NHI0.SXFP", &SXFP)) 3857 || ACPI_FAILURE(acpi_get_handle(bridge, "DSB0.NHI0.SXLV", &SXLV))) 3858 return; 3859 pci_info(dev, "quirk: cutting power to Thunderbolt controller...\n"); 3860 3861 /* magic sequence */ 3862 acpi_execute_simple_method(SXIO, NULL, 1); 3863 acpi_execute_simple_method(SXFP, NULL, 0); 3864 msleep(300); 3865 acpi_execute_simple_method(SXLV, NULL, 0); 3866 acpi_execute_simple_method(SXIO, NULL, 0); 3867 acpi_execute_simple_method(SXLV, NULL, 0); 3868 } 3869 DECLARE_PCI_FIXUP_SUSPEND_LATE(PCI_VENDOR_ID_INTEL, 3870 PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C, 3871 quirk_apple_poweroff_thunderbolt); 3872 3873 3874 /* 3875 * Putting PCIe root ports on Ryzen SoCs with USB4 controllers into D3 power 3876 * states may cause problems when the system attempts wake up from s2idle. 3877 * 3878 * This manifests as a missing wakeup interrupt on the following systems: 3879 * Family 19h model 44h (PCI 0x14b9) 3880 * Family 19h model 74h (PCI 0x14eb) 3881 * Family 19h model 78h (PCI 0x14eb) 3882 * 3883 * Add a quirk to the root port if a USB4 controller is connected to it 3884 * to avoid D3 power states. 3885 */ 3886 static void quirk_ryzen_rp_d3(struct pci_dev *pdev) 3887 { 3888 struct pci_dev *child = NULL; 3889 > 3890 while (child = pci_get_class(PCI_CLASS_SERIAL_USB_USB4, child)) { 3891 if (pci_upstream_bridge(child) != pdev) 3892 continue; 3893 pdev->dev_flags |= PCI_DEV_FLAGS_NO_WAKE_D3; 3894 pci_info(pdev, "quirk: disabling D3 for wakeup\n"); 3895 break; 3896 } 3897 } 3898 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x14b9, quirk_ryzen_rp_d3); 3899 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x14eb, quirk_ryzen_rp_d3); 3900 #endif 3901
On Wed, Sep 13, 2023 at 6:11 AM Mario Limonciello <mario.limonciello@amd.com> wrote: > > Iain reports that USB devices can't be used to wake a Lenovo Z13 > from suspend. This is because the PCIe root port has been put > into D3hot and AMD's platform can't handle USB devices waking the > platform from a hardware sleep state in this case. It would be good to mention the PMC involvement, because it is necessary to trigger the issue IIUC. Apparently, if a Root Port is in D3hot at the time the PMC is called to reduce the platform power, the PMC takes that as a license to do something that prevents wakeup signaling from working. > This problem only occurs on Linux, when waking from a platform hardware > sleep state. Comparing the behavior on Windows and Linux, Windows > doesn't put the root ports into D3. > > In Windows systems that support Modern Standby specify hardware > pre-conditions for the SoC to achieve the lowest power state by device > constraints in a SOC specific "Power Engine Plugin" (PEP) [1] [2]. > They can be marked as disabled or enabled and when enabled can specify > the minimum power state required for an ACPI device. > > The policy on Linux does not take constraints into account to decide what > state to put the device into at suspend like Windows does. I'm not sure whether or not it is really clear what happens in Windows nor whether it is relevant for this patch. The relevant information is that Windows keeps these ports in D0 and that demonstrably prevents the PMC from using a platform state in which PCIe wakeup doesn't work. Therefore Linux needs to do the same thing, but only if system wakeup is enabled for them (or the devices underneath). > Rather for > devices that support D3, the target state is selected by this policy: > 1. If platform_pci_power_manageable(): > Use platform_pci_choose_state() > 2. If the device is armed for wakeup: > Select the deepest D-state that supports a PME. > 3. Else: > Use D3hot. > > Devices are considered power manageable by the platform when they have > one or more objects described in the table in section 7.3 of the ACPI 6.5 > specification [3]. > > If devices are not considered power manageable; specs are ambiguous as > to what should happen. In this situation Windows 11 leaves PCIe > ports in D0 while Linux puts them into D3 due to the policy introduced by > commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"). > > As the Windows PEP driver uses constraints to express the desired state > that should be selected for suspend but Linux doesn't introduce a quirk > for the problematic root ports. I would say "but Linux doesn't do that, so ...", because it currently reads like the quirk was not present which is slightly confusing. > > When selecting a target state specifically for sleep in > `pci_prepare_to_sleep` this quirk will prevent the root ports from > selecting D3hot or D3cold if they have been configured as a wakeup source. > > Cc: stable@vger.kernel.org > Link: https://learn.microsoft.com/en-us/windows-hardware/design/device-experiences/platform-design-for-modern-standby#low-power-core-silicon-cpu-soc-dram [1] > Link: https://uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf [2] > Link: https://uefi.org/specs/ACPI/6.5/07_Power_and_Performance_Mgmt.html#device-power-management-objects [3] > Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") > Reported-by: Iain Lane <iain@orangesquash.org.uk> > Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-external-USB-keyboard/m-p/5217121 > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > --- > The same PCI ID is used for multiple different root ports. This problem > only affects the root port that the USB4 controller is connected to. > > drivers/pci/pci.c | 5 +++++ > drivers/pci/quirks.c | 28 ++++++++++++++++++++++++++++ > include/linux/pci.h | 2 ++ > 3 files changed, 35 insertions(+) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 59c01d68c6d5..a113b8941d09 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -2752,6 +2752,11 @@ int pci_prepare_to_sleep(struct pci_dev *dev) > if (target_state == PCI_POWER_ERROR) > return -EIO; > > + /* quirk to avoid setting D3 */ > + if (wakeup && dev->dev_flags & PCI_DEV_FLAGS_NO_WAKE_D3 && > + (target_state == PCI_D3hot || target_state == PCI_D3cold)) > + target_state = PCI_D0; > + > pci_enable_wake(dev, target_state, wakeup); > > error = pci_set_power_state(dev, target_state); > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > index eeec1d6f9023..c6f2c301f62a 100644 > --- a/drivers/pci/quirks.c > +++ b/drivers/pci/quirks.c > @@ -3869,6 +3869,34 @@ static void quirk_apple_poweroff_thunderbolt(struct pci_dev *dev) > DECLARE_PCI_FIXUP_SUSPEND_LATE(PCI_VENDOR_ID_INTEL, > PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C, > quirk_apple_poweroff_thunderbolt); > + > + > +/* > + * Putting PCIe root ports on Ryzen SoCs with USB4 controllers into D3 power > + * states may cause problems when the system attempts wake up from s2idle. > + * > + * This manifests as a missing wakeup interrupt on the following systems: > + * Family 19h model 44h (PCI 0x14b9) > + * Family 19h model 74h (PCI 0x14eb) > + * Family 19h model 78h (PCI 0x14eb) > + * > + * Add a quirk to the root port if a USB4 controller is connected to it > + * to avoid D3 power states. > + */ > +static void quirk_ryzen_rp_d3(struct pci_dev *pdev) > +{ > + struct pci_dev *child = NULL; > + > + while (child = pci_get_class(PCI_CLASS_SERIAL_USB_USB4, child)) { > + if (pci_upstream_bridge(child) != pdev) > + continue; > + pdev->dev_flags |= PCI_DEV_FLAGS_NO_WAKE_D3; > + pci_info(pdev, "quirk: disabling D3 for wakeup\n"); > + break; I'd use pcie_find_root_port() here. > + } > +} > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x14b9, quirk_ryzen_rp_d3); > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x14eb, quirk_ryzen_rp_d3); > #endif > > /* > diff --git a/include/linux/pci.h b/include/linux/pci.h > index 8c7c2c3c6c65..2292fbc20630 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -245,6 +245,8 @@ enum pci_dev_flags { > PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11), > /* Device does honor MSI masking despite saying otherwise */ > PCI_DEV_FLAGS_HAS_MSI_MASKING = (__force pci_dev_flags_t) (1 << 12), > + /* Wakeup events don't work over D3 */ > + PCI_DEV_FLAGS_NO_WAKE_D3 = (__force pci_dev_flags_t) (1 << 13), > }; > > enum pci_irq_reroute_variant { > -- Overall, I think that this is much better than the previous iterations, because it adds the quirk where it is needed and triggers it under the conditions in which it matters. So with the above addressed, please feel free to add Reviewed-by: Rafael J. Wysocki <rafael@kernel.org> to this patch.
On Tue, Sep 12, 2023 at 11:43:53PM -0500, Mario Limonciello wrote: > On 9/12/2023 23:25, Lukas Wunner wrote: > > There's already PCI_DEV_FLAGS_NO_D3, would it be possible to just > > reuse that instead of adding another codepath for D3 quirks? > > > > The root port can handle D3 (including wakeup) at runtime fine. > Issue occurs only during s2idle w/ hardware sleep. I see. If this only affects system sleep, not runtime PM, what you can do is define a DECLARE_PCI_FIXUP_SUSPEND_LATE() which calls pci_d3cold_disable() and also define a DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY() which calls pci_d3cold_enable(). And I think you can make those calls conditional on pm_suspend_no_platform() to constrain to s2idle. User space should still be able to influence runtime PM via the d3cold_allowed flag (unless I'm missing something). Thanks, Lukas
On Wed, Sep 13, 2023 at 12:20:14PM +0200, Rafael J. Wysocki wrote: > On Wed, Sep 13, 2023 at 6:11 AM Mario Limonciello > <mario.limonciello@amd.com> wrote: > > > > Iain reports that USB devices can't be used to wake a Lenovo Z13 > > from suspend. This is because the PCIe root port has been put > > into D3hot and AMD's platform can't handle USB devices waking the > > platform from a hardware sleep state in this case. > > It would be good to mention the PMC involvement, because it is > necessary to trigger the issue IIUC. > > Apparently, if a Root Port is in D3hot at the time the PMC is called > to reduce the platform power, the PMC takes that as a license to do > something that prevents wakeup signaling from working. This absolutely needs to be part of the commit log and the patch. If the device advertises PME_Support for D3hot or D3cold, but we don't actually get those PMEs after putting it in D3hot or D3cold, that's a bug in the device. "AMD's platform can't handle devices waking from hardware sleep" isn't specific enough to help future PCI maintenance because "hardware sleep state" is not a PCI concept. > > This problem only occurs on Linux, when waking from a platform hardware > > sleep state. Comparing the behavior on Windows and Linux, Windows > > doesn't put the root ports into D3. > > > > In Windows systems that support Modern Standby specify hardware > > pre-conditions for the SoC to achieve the lowest power state by device > > constraints in a SOC specific "Power Engine Plugin" (PEP) [1] [2]. > > They can be marked as disabled or enabled and when enabled can specify > > the minimum power state required for an ACPI device. > > > > The policy on Linux does not take constraints into account to decide what > > state to put the device into at suspend like Windows does. > > I'm not sure whether or not it is really clear what happens in Windows > nor whether it is relevant for this patch. > > The relevant information is that Windows keeps these ports in D0 and > that demonstrably prevents the PMC from using a platform state in > which PCIe wakeup doesn't work. Therefore Linux needs to do the same > thing, but only if system wakeup is enabled for them (or the devices > underneath). So it sounds like either of these scenarios would work: A) Root Port stays in D0, PMC selects platform state X, wakeups still work B) Root Port in D3hot, PMC selects platform state Y that doesn't break wakeups, so wakeups still work PCI isn't in a position to pick one over the other because it has no idea what the tradeoffs are. IIUC, this quirk basically forces scenario A (although a naive reading would suggest that we could still put the Root Port in D1 or D2, since the quirk only mentions D3). > > Rather for > > devices that support D3, the target state is selected by this policy: > > 1. If platform_pci_power_manageable(): > > Use platform_pci_choose_state() > > 2. If the device is armed for wakeup: > > Select the deepest D-state that supports a PME. > > 3. Else: > > Use D3hot. > > > > Devices are considered power manageable by the platform when they have > > one or more objects described in the table in section 7.3 of the ACPI 6.5 > > specification [3]. > > > > If devices are not considered power manageable; specs are ambiguous as > > to what should happen. In this situation Windows 11 leaves PCIe > > ports in D0 while Linux puts them into D3 due to the policy introduced by > > commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"). > > > > As the Windows PEP driver uses constraints to express the desired state > > that should be selected for suspend but Linux doesn't introduce a quirk > > for the problematic root ports. > > I would say "but Linux doesn't do that, so ...", because it currently > reads like the quirk was not present which is slightly confusing. > > > When selecting a target state specifically for sleep in > > `pci_prepare_to_sleep` this quirk will prevent the root ports from > > selecting D3hot or D3cold if they have been configured as a wakeup source. > > > > Cc: stable@vger.kernel.org > > Link: https://learn.microsoft.com/en-us/windows-hardware/design/device-experiences/platform-design-for-modern-standby#low-power-core-silicon-cpu-soc-dram [1] > > Link: https://uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf [2] > > Link: https://uefi.org/specs/ACPI/6.5/07_Power_and_Performance_Mgmt.html#device-power-management-objects [3] > > Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") > > Reported-by: Iain Lane <iain@orangesquash.org.uk> > > Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-external-USB-keyboard/m-p/5217121 > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > > --- > > The same PCI ID is used for multiple different root ports. This problem > > only affects the root port that the USB4 controller is connected to. If true, this seems important, not something to discard because it's after "---". > > drivers/pci/pci.c | 5 +++++ > > drivers/pci/quirks.c | 28 ++++++++++++++++++++++++++++ > > include/linux/pci.h | 2 ++ > > 3 files changed, 35 insertions(+) > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > index 59c01d68c6d5..a113b8941d09 100644 > > --- a/drivers/pci/pci.c > > +++ b/drivers/pci/pci.c > > @@ -2752,6 +2752,11 @@ int pci_prepare_to_sleep(struct pci_dev *dev) > > if (target_state == PCI_POWER_ERROR) > > return -EIO; > > > > + /* quirk to avoid setting D3 */ > > + if (wakeup && dev->dev_flags & PCI_DEV_FLAGS_NO_WAKE_D3 && Why did you pick dev_flags? If there's not a reason to prefer that, I'd just add a 1-bit bitfield because that doesn't require a new #define. > > + (target_state == PCI_D3hot || target_state == PCI_D3cold)) > > + target_state = PCI_D0; > > + > > pci_enable_wake(dev, target_state, wakeup); > > > > error = pci_set_power_state(dev, target_state); > > + * Putting PCIe root ports on Ryzen SoCs with USB4 controllers into D3 power > > + * states may cause problems when the system attempts wake up from s2idle. > > + * > > + * This manifests as a missing wakeup interrupt on the following systems: > > + * Family 19h model 44h (PCI 0x14b9) > > + * Family 19h model 74h (PCI 0x14eb) > > + * Family 19h model 78h (PCI 0x14eb) > > + * > > + * Add a quirk to the root port if a USB4 controller is connected to it > > + * to avoid D3 power states. I want to know whether this is D3hot, D3cold, or both. Also in the pci_info() below. Also, do we have some indication that this is specific to Ryzen? If not, I assume this is an ongoing issue, and matching on Device IDs just means we'll have to debug the same problem again and add more IDs. > > +static void quirk_ryzen_rp_d3(struct pci_dev *pdev) > > +{ > > + struct pci_dev *child = NULL; > > + > > + while (child = pci_get_class(PCI_CLASS_SERIAL_USB_USB4, child)) { > > + if (pci_upstream_bridge(child) != pdev) > > + continue; > > + pdev->dev_flags |= PCI_DEV_FLAGS_NO_WAKE_D3; > > + pci_info(pdev, "quirk: disabling D3 for wakeup\n"); I don't remember seeing any evidence that this is a USB4-specific issue. My guess is that it affects wakeups from *any* device below these Root Ports, since I assume the PMEs are bog standard PCIe events, not anything special about USB4. It sounds like this is only an issue when amd_pmc_s2idle_prepare() is involved, right? The PMEs and wakeups work as expected until we tell the PMC to do its magic thing? If so, shouldn't this be conditional on something in amd/pmc.c to connect these pieces together? Looks like amd/pmc.c only works if the platform provides an AMDI0005, AMDI0006, etc., ACPI device? I think it'd be nice if amd_pmc_probe() logged a hint about it being activated. AFAICS it only logs something on errors. This has been incredibly painful to debug. It looks like the PMCs do very subtle power management things, and it'd be nice to have a hint that there's really fancy stuff going on in the background. Bjorn
On 9/13/2023 10:40, Bjorn Helgaas wrote: > On Wed, Sep 13, 2023 at 12:20:14PM +0200, Rafael J. Wysocki wrote: >> On Wed, Sep 13, 2023 at 6:11 AM Mario Limonciello >> <mario.limonciello@amd.com> wrote: >>> >>> Iain reports that USB devices can't be used to wake a Lenovo Z13 >>> from suspend. This is because the PCIe root port has been put >>> into D3hot and AMD's platform can't handle USB devices waking the >>> platform from a hardware sleep state in this case. >> >> It would be good to mention the PMC involvement, because it is >> necessary to trigger the issue IIUC. >> >> Apparently, if a Root Port is in D3hot at the time the PMC is called >> to reduce the platform power, the PMC takes that as a license to do >> something that prevents wakeup signaling from working. > > This absolutely needs to be part of the commit log and the patch. > > If the device advertises PME_Support for D3hot or D3cold, but we don't > actually get those PMEs after putting it in D3hot or D3cold, that's a > bug in the device. "AMD's platform can't handle devices waking from > hardware sleep" isn't specific enough to help future PCI maintenance > because "hardware sleep state" is not a PCI concept. OK. > >>> This problem only occurs on Linux, when waking from a platform hardware >>> sleep state. Comparing the behavior on Windows and Linux, Windows >>> doesn't put the root ports into D3. >>> >>> In Windows systems that support Modern Standby specify hardware >>> pre-conditions for the SoC to achieve the lowest power state by device >>> constraints in a SOC specific "Power Engine Plugin" (PEP) [1] [2]. >>> They can be marked as disabled or enabled and when enabled can specify >>> the minimum power state required for an ACPI device. >>> >>> The policy on Linux does not take constraints into account to decide what >>> state to put the device into at suspend like Windows does. >> >> I'm not sure whether or not it is really clear what happens in Windows >> nor whether it is relevant for this patch. >> >> The relevant information is that Windows keeps these ports in D0 and >> that demonstrably prevents the PMC from using a platform state in >> which PCIe wakeup doesn't work. Therefore Linux needs to do the same >> thing, but only if system wakeup is enabled for them (or the devices >> underneath). > > So it sounds like either of these scenarios would work: > > A) Root Port stays in D0, PMC selects platform state X, wakeups still > work > > B) Root Port in D3hot, PMC selects platform state Y that doesn't > break wakeups, so wakeups still work > > PCI isn't in a position to pick one over the other because it has no > idea what the tradeoffs are. > Right. > IIUC, this quirk basically forces scenario A (although a naive reading > would suggest that we could still put the Root Port in D1 or D2, since > the quirk only mentions D3). > I haven't done any testing with D1 or D2 as Linux doesn't select these states. >>> Rather for >>> devices that support D3, the target state is selected by this policy: >>> 1. If platform_pci_power_manageable(): >>> Use platform_pci_choose_state() >>> 2. If the device is armed for wakeup: >>> Select the deepest D-state that supports a PME. >>> 3. Else: >>> Use D3hot. >>> >>> Devices are considered power manageable by the platform when they have >>> one or more objects described in the table in section 7.3 of the ACPI 6.5 >>> specification [3]. >>> >>> If devices are not considered power manageable; specs are ambiguous as >>> to what should happen. In this situation Windows 11 leaves PCIe >>> ports in D0 while Linux puts them into D3 due to the policy introduced by >>> commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"). >>> >>> As the Windows PEP driver uses constraints to express the desired state >>> that should be selected for suspend but Linux doesn't introduce a quirk >>> for the problematic root ports. >> >> I would say "but Linux doesn't do that, so ...", because it currently >> reads like the quirk was not present which is slightly confusing. >> >>> When selecting a target state specifically for sleep in >>> `pci_prepare_to_sleep` this quirk will prevent the root ports from >>> selecting D3hot or D3cold if they have been configured as a wakeup source. >>> >>> Cc: stable@vger.kernel.org >>> Link: https://learn.microsoft.com/en-us/windows-hardware/design/device-experiences/platform-design-for-modern-standby#low-power-core-silicon-cpu-soc-dram [1] >>> Link: https://uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf [2] >>> Link: https://uefi.org/specs/ACPI/6.5/07_Power_and_Performance_Mgmt.html#device-power-management-objects [3] >>> Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") >>> Reported-by: Iain Lane <iain@orangesquash.org.uk> >>> Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-external-USB-keyboard/m-p/5217121 >>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> >>> --- >>> The same PCI ID is used for multiple different root ports. This problem >>> only affects the root port that the USB4 controller is connected to. > > If true, this seems important, not something to discard because it's > after "---". OK. > >>> drivers/pci/pci.c | 5 +++++ >>> drivers/pci/quirks.c | 28 ++++++++++++++++++++++++++++ >>> include/linux/pci.h | 2 ++ >>> 3 files changed, 35 insertions(+) >>> >>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >>> index 59c01d68c6d5..a113b8941d09 100644 >>> --- a/drivers/pci/pci.c >>> +++ b/drivers/pci/pci.c >>> @@ -2752,6 +2752,11 @@ int pci_prepare_to_sleep(struct pci_dev *dev) >>> if (target_state == PCI_POWER_ERROR) >>> return -EIO; >>> >>> + /* quirk to avoid setting D3 */ >>> + if (wakeup && dev->dev_flags & PCI_DEV_FLAGS_NO_WAKE_D3 && > > Why did you pick dev_flags? If there's not a reason to prefer that, > I'd just add a 1-bit bitfield because that doesn't require a new > #define. > There was no strong reason for it. A 1-bit bitfield struct pci_dev will actually make it easier for this quirk to live in a more proper home for the situation (drivers/platform/x86/amd/pmc/pmc.c). >>> + (target_state == PCI_D3hot || target_state == PCI_D3cold)) >>> + target_state = PCI_D0; >>> + >>> pci_enable_wake(dev, target_state, wakeup); >>> >>> error = pci_set_power_state(dev, target_state); > >>> + * Putting PCIe root ports on Ryzen SoCs with USB4 controllers into D3 power >>> + * states may cause problems when the system attempts wake up from s2idle. >>> + * >>> + * This manifests as a missing wakeup interrupt on the following systems: >>> + * Family 19h model 44h (PCI 0x14b9) >>> + * Family 19h model 74h (PCI 0x14eb) >>> + * Family 19h model 78h (PCI 0x14eb) >>> + * >>> + * Add a quirk to the root port if a USB4 controller is connected to it >>> + * to avoid D3 power states. > > I want to know whether this is D3hot, D3cold, or both. Also in the > pci_info() below. Linux doesn't select D3cold for this root port, but it should affect both. > > Also, do we have some indication that this is specific to Ryzen? If > not, I assume this is an ongoing issue, and matching on Device IDs > just means we'll have to debug the same problem again and add more > IDs. This is why my earlier attempts (v16 and v17) tried to tie it to constraints. These are what the uPEP driver in Windows uses to make the decision of what power state to put integrated devices like the root port into. In Windows if no uPEP driver is installed "Windows internal policy" dictates what happens. If the uPEP driver is installed then it influences the policy based upon the constraints. Rafael had feedback against constraints in v17, which is why I'm back to a quirk for v18. This issue as I've described it is specific to AMD Ryzen. I expect it to be an ongoing issue. I also expect unless we use constraints or convince the firmware team to add a _S0W object with a value of "0" for the sake of Linux that we will be adding IDs every year to wherever this lands as we reproduce it on newer SoCs. > >>> +static void quirk_ryzen_rp_d3(struct pci_dev *pdev) >>> +{ >>> + struct pci_dev *child = NULL; >>> + >>> + while (child = pci_get_class(PCI_CLASS_SERIAL_USB_USB4, child)) { >>> + if (pci_upstream_bridge(child) != pdev) >>> + continue; >>> + pdev->dev_flags |= PCI_DEV_FLAGS_NO_WAKE_D3; >>> + pci_info(pdev, "quirk: disabling D3 for wakeup\n"); > > I don't remember seeing any evidence that this is a USB4-specific > issue. My guess is that it affects wakeups from *any* device below > these Root Ports, since I assume the PMEs are bog standard PCIe > events, not anything special about USB4. The hardware team describes the issue to me as specific to how the internal interrupt routing works with the USB4 controller connected to this root port. > > It sounds like this is only an issue when amd_pmc_s2idle_prepare() is > involved, right? The PMEs and wakeups work as expected until we tell > the PMC to do its magic thing? > > If so, shouldn't this be conditional on something in amd/pmc.c to > connect these pieces together? Looks like amd/pmc.c only works if > the platform provides an AMDI0005, AMDI0006, etc., ACPI device? > > I think it'd be nice if amd_pmc_probe() logged a hint about it being > activated. I personally really thought the constraints approach from v16 and v17 did this well and would have scaled effectively. As Rafael has opposition to it what I'm thinking from everyone's feedback today is to add code into amd_pmc_probe() that twiddles a new bit for the matching device in `struct pci_dev`, maybe called `no_d3_for_wakeup`. Then as we add PMC support for new devices, we can add a new line to a switch/case to set that bit if necessary for the platform. > AFAICS it only logs something on errors. This has been > incredibly painful to debug. It looks like the PMCs do very subtle > power management things, and it'd be nice to have a hint that there's > really fancy stuff going on in the background. > Sure I'll add a dev_info or pci_info when it's set.
On 9/13/2023 09:31, Lukas Wunner wrote: > On Tue, Sep 12, 2023 at 11:43:53PM -0500, Mario Limonciello wrote: >> On 9/12/2023 23:25, Lukas Wunner wrote: >>> There's already PCI_DEV_FLAGS_NO_D3, would it be possible to just >>> reuse that instead of adding another codepath for D3 quirks? >>> >> >> The root port can handle D3 (including wakeup) at runtime fine. >> Issue occurs only during s2idle w/ hardware sleep. > > I see. > > If this only affects system sleep, not runtime PM, what you can do is > define a DECLARE_PCI_FIXUP_SUSPEND_LATE() which calls pci_d3cold_disable() > and also define a DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY() which calls > pci_d3cold_enable(). > > And I think you can make those calls conditional on pm_suspend_no_platform() > to constrain to s2idle. > > User space should still be able to influence runtime PM via the > d3cold_allowed flag (unless I'm missing something). > > Thanks, > > Lukas The part you're missing is that D3hot is affected by this issue too, otherwise it would be a good proposal.
On Wed, Sep 13, 2023 at 6:35 PM Mario Limonciello <mario.limonciello@amd.com> wrote: > > On 9/13/2023 10:40, Bjorn Helgaas wrote: > > On Wed, Sep 13, 2023 at 12:20:14PM +0200, Rafael J. Wysocki wrote: > >> On Wed, Sep 13, 2023 at 6:11 AM Mario Limonciello > >> <mario.limonciello@amd.com> wrote: [cut] > > > > Also, do we have some indication that this is specific to Ryzen? If > > not, I assume this is an ongoing issue, and matching on Device IDs > > just means we'll have to debug the same problem again and add more > > IDs. > > This is why my earlier attempts (v16 and v17) tried to tie it to > constraints. These are what the uPEP driver in Windows uses to make the > decision of what power state to put integrated devices like the root > port into. > > In Windows if no uPEP driver is installed "Windows internal policy" > dictates what happens. If the uPEP driver is installed then it > influences the policy based upon the constraints. > > Rafael had feedback against constraints in v17, which is why I'm back to > a quirk for v18. > > This issue as I've described it is specific to AMD Ryzen. OK, so a quirk is the way to go IMO, because starting to rely on LPI constraints in general retroactively is almost guaranteed to regress things this way or another. Whatever is done, it needs to be Ryzen-specific, unless there is evidence that other (and in particular non-AMD) platforms are affected. > I expect it to be an ongoing issue. I also expect unless we use > constraints or convince the firmware team to add a _S0W object with a > value of "0" for the sake of Linux that we will be adding IDs every year > to wherever this lands as we reproduce it on newer SoCs. So maybe the way to go is to make the AMD PMC driver set a flag for Root Ports on suspend or similar. In any case, I think that it would be good to agree on the approach at the general level before posting any new patches, because we seem to be going back and forth here.
On Wed, Sep 13, 2023 at 07:42:05PM +0200, Rafael J. Wysocki wrote: > On Wed, Sep 13, 2023 at 6:35 PM Mario Limonciello > <mario.limonciello@amd.com> wrote: > > > > On 9/13/2023 10:40, Bjorn Helgaas wrote: > > > On Wed, Sep 13, 2023 at 12:20:14PM +0200, Rafael J. Wysocki wrote: > > >> On Wed, Sep 13, 2023 at 6:11 AM Mario Limonciello > > >> <mario.limonciello@amd.com> wrote: > > [cut] > > > > > > > Also, do we have some indication that this is specific to Ryzen? If > > > not, I assume this is an ongoing issue, and matching on Device IDs > > > just means we'll have to debug the same problem again and add more > > > IDs. > > > > This is why my earlier attempts (v16 and v17) tried to tie it to > > constraints. These are what the uPEP driver in Windows uses to make the > > decision of what power state to put integrated devices like the root > > port into. > > > > In Windows if no uPEP driver is installed "Windows internal policy" > > dictates what happens. If the uPEP driver is installed then it > > influences the policy based upon the constraints. > > > > Rafael had feedback against constraints in v17, which is why I'm back to > > a quirk for v18. > > > > This issue as I've described it is specific to AMD Ryzen. > > OK, so a quirk is the way to go IMO, because starting to rely on LPI > constraints in general retroactively is almost guaranteed to regress > things this way or another. > > Whatever is done, it needs to be Ryzen-specific, unless there is > evidence that other (and in particular non-AMD) platforms are > affected. > > > I expect it to be an ongoing issue. I also expect unless we use > > constraints or convince the firmware team to add a _S0W object with a > > value of "0" for the sake of Linux that we will be adding IDs every year > > to wherever this lands as we reproduce it on newer SoCs. > > So maybe the way to go is to make the AMD PMC driver set a flag for > Root Ports on suspend or similar. I like the quirk approach. When PMC is involved, the device behavior doesn't conform to what it advertised via PME_Support. The v18 quirk isn't connected to PMC at all, so IIUC it avoids D3hot/D3cold unnecessarily when amd/pmc is not loaded. I don't object to avoiding D3hot/D3cold unconditionally. Presumably we *could* save a little power by using them when amd/pci isn't loaded, but amd/pci would have to iterate through all PCI devices when it loads, save previous state, do the quirk, and then restore the previous state on module unload. And it would have to use notifiers or assume no Root Port hotplug. All sounds kind of complicated. Maybe it would even be enough to just clear dev->pme_support so we know wakeups don't work. It would be a pretty big benefit if we didn't have to add another bit and complicate pci_prepare_to_sleep() or pci_target_state(). Bjorn
On 9/13/2023 16:05, Bjorn Helgaas wrote: [cut] >>> I expect it to be an ongoing issue. I also expect unless we use >>> constraints or convince the firmware team to add a _S0W object with a >>> value of "0" for the sake of Linux that we will be adding IDs every year >>> to wherever this lands as we reproduce it on newer SoCs. >> >> So maybe the way to go is to make the AMD PMC driver set a flag for >> Root Ports on suspend or similar. > > I like the quirk approach. When PMC is involved, the device behavior > doesn't conform to what it advertised via PME_Support. > > The v18 quirk isn't connected to PMC at all, so IIUC it avoids > D3hot/D3cold unnecessarily when amd/pmc is not loaded. > Technically someone could; but realistically no one will be using these machines without amd-pmc. The battery life over suspend would be abhorrent. > I don't object to avoiding D3hot/D3cold unconditionally. Presumably > we *could* save a little power by using them when amd/pci isn't > loaded, but amd/pci would have to iterate through all PCI devices when > it loads, save previous state, do the quirk, and then restore the > previous state on module unload. And it would have to use notifiers > or assume no Root Port hotplug. All sounds kind of complicated. > Yeah this does sound needlessly complicated. > Maybe it would even be enough to just clear dev->pme_support so we > know wakeups don't work. It would be a pretty big benefit if we > didn't have to add another bit and complicate pci_prepare_to_sleep() > or pci_target_state(). > I don't think clearing PME support entirely is going to help. The reason is that pci_target_state() will fall back to PCI_D3hot when dev->pme_support is fully cleared. I think that clearing *just the bits* for D3hot and D3cold in PME support should work though. I'll test this. Assuming it works how about if we put the quirk to clear the D3hot/D3cold PME support bit in drivers/platform/x86/amd/pmc/pmc-quirks.c? It's still a quirk file and it makes it very clear that this behavior is caused by what amd-pmc does.
On 9/13/2023 16:16, Mario Limonciello wrote: > On 9/13/2023 16:05, Bjorn Helgaas wrote: > [cut] >>>> I expect it to be an ongoing issue. I also expect unless we use >>>> constraints or convince the firmware team to add a _S0W object with a >>>> value of "0" for the sake of Linux that we will be adding IDs every >>>> year >>>> to wherever this lands as we reproduce it on newer SoCs. >>> >>> So maybe the way to go is to make the AMD PMC driver set a flag for >>> Root Ports on suspend or similar. >> >> I like the quirk approach. When PMC is involved, the device behavior >> doesn't conform to what it advertised via PME_Support. >> >> The v18 quirk isn't connected to PMC at all, so IIUC it avoids >> D3hot/D3cold unnecessarily when amd/pmc is not loaded. >> > > Technically someone could; but realistically no one will be using these > machines without amd-pmc. > > The battery life over suspend would be abhorrent. > >> I don't object to avoiding D3hot/D3cold unconditionally. Presumably >> we *could* save a little power by using them when amd/pci isn't >> loaded, but amd/pci would have to iterate through all PCI devices when >> it loads, save previous state, do the quirk, and then restore the >> previous state on module unload. And it would have to use notifiers >> or assume no Root Port hotplug. All sounds kind of complicated. >> > > Yeah this does sound needlessly complicated. > >> Maybe it would even be enough to just clear dev->pme_support so we >> know wakeups don't work. It would be a pretty big benefit if we >> didn't have to add another bit and complicate pci_prepare_to_sleep() >> or pci_target_state(). >> > > I don't think clearing PME support entirely is going to help. The > reason is that pci_target_state() will fall back to PCI_D3hot when > dev->pme_support is fully cleared. > > I think that clearing *just the bits* for D3hot and D3cold in PME > support should work though. I'll test this. I did confirm this works properly. > > Assuming it works how about if we put the quirk to clear the > D3hot/D3cold PME support bit in drivers/platform/x86/amd/pmc/pmc-quirks.c? > > It's still a quirk file and it makes it very clear that this behavior is > caused by what amd-pmc does. I've got it coded up like this and working, so please let me know if this approach is amenable and I'll drop an updated version. If you would prefer it to be in pci/quirks.c I believe I can do either.
On Wed, Sep 13, 2023 at 11:59:00PM -0500, Mario Limonciello wrote: > On 9/13/2023 16:16, Mario Limonciello wrote: > > On 9/13/2023 16:05, Bjorn Helgaas wrote: > > [cut] > > > > > I expect it to be an ongoing issue. I also expect unless we use > > > > > constraints or convince the firmware team to add a _S0W object with a > > > > > value of "0" for the sake of Linux that we will be adding > > > > > IDs every year > > > > > to wherever this lands as we reproduce it on newer SoCs. > > > > > > > > So maybe the way to go is to make the AMD PMC driver set a flag for > > > > Root Ports on suspend or similar. > > > > > > I like the quirk approach. When PMC is involved, the device behavior > > > doesn't conform to what it advertised via PME_Support. > > > > > > The v18 quirk isn't connected to PMC at all, so IIUC it avoids > > > D3hot/D3cold unnecessarily when amd/pmc is not loaded. > > > > Technically someone could; but realistically no one will be using these > > machines without amd-pmc. > > > > The battery life over suspend would be abhorrent. > > > > > I don't object to avoiding D3hot/D3cold unconditionally. Presumably > > > we *could* save a little power by using them when amd/pci isn't > > > loaded, but amd/pci would have to iterate through all PCI devices when > > > it loads, save previous state, do the quirk, and then restore the > > > previous state on module unload. And it would have to use notifiers > > > or assume no Root Port hotplug. All sounds kind of complicated. > > > > Yeah this does sound needlessly complicated. > > > > > Maybe it would even be enough to just clear dev->pme_support so we > > > know wakeups don't work. It would be a pretty big benefit if we > > > didn't have to add another bit and complicate pci_prepare_to_sleep() > > > or pci_target_state(). > > > > I don't think clearing PME support entirely is going to help. The > > reason is that pci_target_state() will fall back to PCI_D3hot when > > dev->pme_support is fully cleared. > > > > I think that clearing *just the bits* for D3hot and D3cold in PME > > support should work though. I'll test this. > > I did confirm this works properly. > > > Assuming it works how about if we put the quirk to clear the > > D3hot/D3cold PME support bit in > > drivers/platform/x86/amd/pmc/pmc-quirks.c? > > > > It's still a quirk file and it makes it very clear that this behavior is > > caused by what amd-pmc does. > > I've got it coded up like this and working, so please let me know if this > approach is amenable and I'll drop an updated version. > > If you would prefer it to be in pci/quirks.c I believe I can do either. If the quirk is in a loadable module, as opposed to being built-in, does it get applied to the relevant Root Ports when the module is loaded? I didn't look exhaustively, but I don't see a reference to pci_fixup_device() in the module load path. Bjorn
On 9/14/2023 07:32, Bjorn Helgaas wrote: > On Wed, Sep 13, 2023 at 11:59:00PM -0500, Mario Limonciello wrote: >> On 9/13/2023 16:16, Mario Limonciello wrote: >>> On 9/13/2023 16:05, Bjorn Helgaas wrote: >>> [cut] >>>>>> I expect it to be an ongoing issue. I also expect unless we use >>>>>> constraints or convince the firmware team to add a _S0W object with a >>>>>> value of "0" for the sake of Linux that we will be adding >>>>>> IDs every year >>>>>> to wherever this lands as we reproduce it on newer SoCs. >>>>> >>>>> So maybe the way to go is to make the AMD PMC driver set a flag for >>>>> Root Ports on suspend or similar. >>>> >>>> I like the quirk approach. When PMC is involved, the device behavior >>>> doesn't conform to what it advertised via PME_Support. >>>> >>>> The v18 quirk isn't connected to PMC at all, so IIUC it avoids >>>> D3hot/D3cold unnecessarily when amd/pmc is not loaded. >>> >>> Technically someone could; but realistically no one will be using these >>> machines without amd-pmc. >>> >>> The battery life over suspend would be abhorrent. >>> >>>> I don't object to avoiding D3hot/D3cold unconditionally. Presumably >>>> we *could* save a little power by using them when amd/pci isn't >>>> loaded, but amd/pci would have to iterate through all PCI devices when >>>> it loads, save previous state, do the quirk, and then restore the >>>> previous state on module unload. And it would have to use notifiers >>>> or assume no Root Port hotplug. All sounds kind of complicated. >>> >>> Yeah this does sound needlessly complicated. >>> >>>> Maybe it would even be enough to just clear dev->pme_support so we >>>> know wakeups don't work. It would be a pretty big benefit if we >>>> didn't have to add another bit and complicate pci_prepare_to_sleep() >>>> or pci_target_state(). >>> >>> I don't think clearing PME support entirely is going to help. The >>> reason is that pci_target_state() will fall back to PCI_D3hot when >>> dev->pme_support is fully cleared. >>> >>> I think that clearing *just the bits* for D3hot and D3cold in PME >>> support should work though. I'll test this. >> >> I did confirm this works properly. >> >>> Assuming it works how about if we put the quirk to clear the >>> D3hot/D3cold PME support bit in >>> drivers/platform/x86/amd/pmc/pmc-quirks.c? >>> >>> It's still a quirk file and it makes it very clear that this behavior is >>> caused by what amd-pmc does. >> >> I've got it coded up like this and working, so please let me know if this >> approach is amenable and I'll drop an updated version. >> >> If you would prefer it to be in pci/quirks.c I believe I can do either. > > If the quirk is in a loadable module, as opposed to being built-in, > does it get applied to the relevant Root Ports when the module is > loaded? I didn't look exhaustively, but I don't see a reference to > pci_fixup_device() in the module load path. > > Bjorn Right; when done in a module it would be done with code that is part of probe. So it has the implication that it would prevent D3hot/D3cold for this root port at runtime as well. If you think it should be tied to pci_fixup_device() calls then it needs to be built-in.
On Wed, Sep 13, 2023 at 11:36:49AM -0500, Mario Limonciello wrote: > On 9/13/2023 09:31, Lukas Wunner wrote: > > If this only affects system sleep, not runtime PM, what you can do is > > define a DECLARE_PCI_FIXUP_SUSPEND_LATE() which calls pci_d3cold_disable() > > and also define a DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY() which calls > > pci_d3cold_enable(). > > > > And I think you can make those calls conditional on pm_suspend_no_platform() > > to constrain to s2idle. > > > > User space should still be able to influence runtime PM via the > > d3cold_allowed flag (unless I'm missing something). > > The part you're missing is that D3hot is affected by this issue too, > otherwise it would be a good proposal. I recall that in an earlier version of the patch, you solved the issue by amending pci_bridge_d3_possible(). Changing the dev->no_d3cold flag indirectly influences the bridge_d3 flag (through pci_dev_check_d3cold() and pci_bridge_d3_update()). If dev->no_d3cold is set on a device below a port, that port is prevented from entring D3hot because it would result in the device effectively being in D3cold. So you might want to take a closer look at this approach despite the flag suggesting that it only influences D3cold. Thanks, Lukas
On 9/14/2023 09:17, Lukas Wunner wrote: > On Wed, Sep 13, 2023 at 11:36:49AM -0500, Mario Limonciello wrote: >> On 9/13/2023 09:31, Lukas Wunner wrote: >>> If this only affects system sleep, not runtime PM, what you can do is >>> define a DECLARE_PCI_FIXUP_SUSPEND_LATE() which calls pci_d3cold_disable() >>> and also define a DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY() which calls >>> pci_d3cold_enable(). >>> >>> And I think you can make those calls conditional on pm_suspend_no_platform() >>> to constrain to s2idle. >>> >>> User space should still be able to influence runtime PM via the >>> d3cold_allowed flag (unless I'm missing something). >> >> The part you're missing is that D3hot is affected by this issue too, >> otherwise it would be a good proposal. > > I recall that in an earlier version of the patch, you solved the issue > by amending pci_bridge_d3_possible(). > > Changing the dev->no_d3cold flag indirectly influences the bridge_d3 > flag (through pci_dev_check_d3cold() and pci_bridge_d3_update()). > > If dev->no_d3cold is set on a device below a port, that port is > prevented from entring D3hot because it would result in the > device effectively being in D3cold. > > So you might want to take a closer look at this approach despite > the flag suggesting that it only influences D3cold. > Ah; I hadn't considered setting it on a device below the port. In this particular situation the only devices below the root port are USB controllers. If those devices don't go into D3 the system can't enter hardware sleep.
On Thu, Sep 14, 2023 at 09:31:38AM -0500, Mario Limonciello wrote: > On 9/14/2023 09:17, Lukas Wunner wrote: > > On Wed, Sep 13, 2023 at 11:36:49AM -0500, Mario Limonciello wrote: > > > On 9/13/2023 09:31, Lukas Wunner wrote: > > > > If this only affects system sleep, not runtime PM, what you can do is > > > > define a DECLARE_PCI_FIXUP_SUSPEND_LATE() which calls pci_d3cold_disable() > > > > and also define a DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY() which calls > > > > pci_d3cold_enable(). > > > > > > > > And I think you can make those calls conditional on pm_suspend_no_platform() > > > > to constrain to s2idle. > > > > > > > > User space should still be able to influence runtime PM via the > > > > d3cold_allowed flag (unless I'm missing something). > > > > > > The part you're missing is that D3hot is affected by this issue too, > > > otherwise it would be a good proposal. > > > > I recall that in an earlier version of the patch, you solved the issue > > by amending pci_bridge_d3_possible(). > > > > Changing the dev->no_d3cold flag indirectly influences the bridge_d3 > > flag (through pci_dev_check_d3cold() and pci_bridge_d3_update()). > > > > If dev->no_d3cold is set on a device below a port, that port is > > prevented from entring D3hot because it would result in the > > device effectively being in D3cold. > > > > So you might want to take a closer look at this approach despite > > the flag suggesting that it only influences D3cold. > > > > Ah; I hadn't considered setting it on a device below the port. In this > particular situation the only devices below the root port are USB > controllers. > > If those devices don't go into D3 the system can't enter hardware sleep. If you set dev->no_d3cold on the USB controllers, they should still be able to go to D3hot, but not D3cold, which perhaps might be sufficient. It should prevent D3cold *and* D3hot on the Root Port above. And if you set that on system sleep in a quirk and clear it on resume, runtime PM shouldn't be affected. Thanks, Lukas
On Thu, Sep 14, 2023 at 04:53:32PM +0200, Lukas Wunner wrote: > On Thu, Sep 14, 2023 at 09:31:38AM -0500, Mario Limonciello wrote: > > On 9/14/2023 09:17, Lukas Wunner wrote: > > > On Wed, Sep 13, 2023 at 11:36:49AM -0500, Mario Limonciello wrote: > > > > On 9/13/2023 09:31, Lukas Wunner wrote: > > > > > If this only affects system sleep, not runtime PM, what you can do is > > > > > define a DECLARE_PCI_FIXUP_SUSPEND_LATE() which calls pci_d3cold_disable() > > > > > and also define a DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY() which calls > > > > > pci_d3cold_enable(). > > > > > > > > > > And I think you can make those calls conditional on pm_suspend_no_platform() > > > > > to constrain to s2idle. > > > > > > > > > > User space should still be able to influence runtime PM via the > > > > > d3cold_allowed flag (unless I'm missing something). > > > > > > > > The part you're missing is that D3hot is affected by this issue too, > > > > otherwise it would be a good proposal. > > > > > > I recall that in an earlier version of the patch, you solved the issue > > > by amending pci_bridge_d3_possible(). > > > > > > Changing the dev->no_d3cold flag indirectly influences the bridge_d3 > > > flag (through pci_dev_check_d3cold() and pci_bridge_d3_update()). > > > > > > If dev->no_d3cold is set on a device below a port, that port is > > > prevented from entring D3hot because it would result in the > > > device effectively being in D3cold. > > > > > > So you might want to take a closer look at this approach despite > > > the flag suggesting that it only influences D3cold. > > > > Ah; I hadn't considered setting it on a device below the port. In this > > particular situation the only devices below the root port are USB > > controllers. > > > > If those devices don't go into D3 the system can't enter hardware sleep. > > If you set dev->no_d3cold on the USB controllers, they should still > be able to go to D3hot, but not D3cold, which perhaps might be sufficient. > It should prevent D3cold *and* D3hot on the Root Port above. And if you > set that on system sleep in a quirk and clear it on resume, runtime PM > shouldn't be affected. dev->no_d3cold appears to be mainly an administrative policy knob twidded via sysfs. There *are* a few cases where drivers (i915, nouveau, xhci) update it via pci_d3cold_enable() or pci_d3cold_disable(), but they all look vulnerable to issues if people use the sysfs knob, and I'm a little dubious that they're legit in the first place. This AMD Root Port issue is not an administrative choice; it's purely a functional problem of the device advertising that it supports PME# but not actually being able to do it. So if we can do this by fixing dev->pme_support (i.e., the copy of what it advertised), I'd rather do that. Bjorn
On Thu, Sep 14, 2023 at 5:33 PM Bjorn Helgaas <helgaas@kernel.org> wrote: > > On Thu, Sep 14, 2023 at 04:53:32PM +0200, Lukas Wunner wrote: > > On Thu, Sep 14, 2023 at 09:31:38AM -0500, Mario Limonciello wrote: > > > On 9/14/2023 09:17, Lukas Wunner wrote: > > > > On Wed, Sep 13, 2023 at 11:36:49AM -0500, Mario Limonciello wrote: > > > > > On 9/13/2023 09:31, Lukas Wunner wrote: > > > > > > If this only affects system sleep, not runtime PM, what you can do is > > > > > > define a DECLARE_PCI_FIXUP_SUSPEND_LATE() which calls pci_d3cold_disable() > > > > > > and also define a DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY() which calls > > > > > > pci_d3cold_enable(). > > > > > > > > > > > > And I think you can make those calls conditional on pm_suspend_no_platform() > > > > > > to constrain to s2idle. > > > > > > > > > > > > User space should still be able to influence runtime PM via the > > > > > > d3cold_allowed flag (unless I'm missing something). > > > > > > > > > > The part you're missing is that D3hot is affected by this issue too, > > > > > otherwise it would be a good proposal. > > > > > > > > I recall that in an earlier version of the patch, you solved the issue > > > > by amending pci_bridge_d3_possible(). > > > > > > > > Changing the dev->no_d3cold flag indirectly influences the bridge_d3 > > > > flag (through pci_dev_check_d3cold() and pci_bridge_d3_update()). > > > > > > > > If dev->no_d3cold is set on a device below a port, that port is > > > > prevented from entring D3hot because it would result in the > > > > device effectively being in D3cold. > > > > > > > > So you might want to take a closer look at this approach despite > > > > the flag suggesting that it only influences D3cold. > > > > > > Ah; I hadn't considered setting it on a device below the port. In this > > > particular situation the only devices below the root port are USB > > > controllers. > > > > > > If those devices don't go into D3 the system can't enter hardware sleep. > > > > If you set dev->no_d3cold on the USB controllers, they should still > > be able to go to D3hot, but not D3cold, which perhaps might be sufficient. > > It should prevent D3cold *and* D3hot on the Root Port above. And if you > > set that on system sleep in a quirk and clear it on resume, runtime PM > > shouldn't be affected. > > dev->no_d3cold appears to be mainly an administrative policy knob > twidded via sysfs. > > There *are* a few cases where drivers (i915, nouveau, xhci) update it > via pci_d3cold_enable() or pci_d3cold_disable(), but they all look > vulnerable to issues if people use the sysfs knob, and I'm a little > dubious that they're legit in the first place. > > This AMD Root Port issue is not an administrative choice; it's purely > a functional problem of the device advertising that it supports PME# > but not actually being able to do it. So if we can do this by fixing > dev->pme_support (i.e., the copy of what it advertised), I'd rather do > that. Besides, it is not really necessary to prevent D3hot on the Root Port in question in all cases. It can go into D3 just fine if there are no wakeup devices under it and I suppose that the platform can achieve more energy savings (over the case when the port is always held in D0).
On Thu, Sep 14, 2023 at 10:33:03AM -0500, Bjorn Helgaas wrote: > dev->no_d3cold appears to be mainly an administrative policy knob > twidded via sysfs. Actually the user space choice to disable D3cold is stored in a different flag called pdev->d3cold_allowed. The fact that d3cold_allowed_store() indirectly modifies the no_d3cold flag as well looks like a bug that went unnoticed for a couple of years. From a quick look, d3cold_allowed_store() should probably call pci_bridge_d3_update() instead of pci_d3cold_enable() / pci_d3cold_disable(). This was introduced by commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"). Perhaps Mika can chime in whether this is indeed wrong. Thanks, Lukas
On Thu, Sep 14, 2023 at 09:04:29PM +0200, Lukas Wunner wrote: > On Thu, Sep 14, 2023 at 10:33:03AM -0500, Bjorn Helgaas wrote: > > dev->no_d3cold appears to be mainly an administrative policy knob > > twidded via sysfs. > > Actually the user space choice to disable D3cold is stored in a > different flag called pdev->d3cold_allowed. > > The fact that d3cold_allowed_store() indirectly modifies the > no_d3cold flag as well looks like a bug that went unnoticed > for a couple of years. From a quick look, d3cold_allowed_store() > should probably call pci_bridge_d3_update() instead of > pci_d3cold_enable() / pci_d3cold_disable(). This was introduced by > commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"). > Perhaps Mika can chime in whether this is indeed wrong. Note that pci_dev_check_d3cold() checks both the no_d3cold flag (which tells whether the *driver* disabled D3cold) and the d3cold_allowed flag (which tells whether *user space* disabled D3cold). Basically right now we allow user space to override the driver setting, which feels unsafe. (Does user space always know better than the driver whether D3cold can safely be entered? I don't think so.) Thanks, Lukas
On 9/13/2023 23:59, Mario Limonciello wrote: > On 9/13/2023 16:16, Mario Limonciello wrote: >> On 9/13/2023 16:05, Bjorn Helgaas wrote: >> [cut] >>>>> I expect it to be an ongoing issue. I also expect unless we use >>>>> constraints or convince the firmware team to add a _S0W object with a >>>>> value of "0" for the sake of Linux that we will be adding IDs every >>>>> year >>>>> to wherever this lands as we reproduce it on newer SoCs. >>>> >>>> So maybe the way to go is to make the AMD PMC driver set a flag for >>>> Root Ports on suspend or similar. >>> >>> I like the quirk approach. When PMC is involved, the device behavior >>> doesn't conform to what it advertised via PME_Support. >>> >>> The v18 quirk isn't connected to PMC at all, so IIUC it avoids >>> D3hot/D3cold unnecessarily when amd/pmc is not loaded. >>> >> >> Technically someone could; but realistically no one will be using >> these machines without amd-pmc. >> >> The battery life over suspend would be abhorrent. >> >>> I don't object to avoiding D3hot/D3cold unconditionally. Presumably >>> we *could* save a little power by using them when amd/pci isn't >>> loaded, but amd/pci would have to iterate through all PCI devices when >>> it loads, save previous state, do the quirk, and then restore the >>> previous state on module unload. And it would have to use notifiers >>> or assume no Root Port hotplug. All sounds kind of complicated. >>> >> >> Yeah this does sound needlessly complicated. >> >>> Maybe it would even be enough to just clear dev->pme_support so we >>> know wakeups don't work. It would be a pretty big benefit if we >>> didn't have to add another bit and complicate pci_prepare_to_sleep() >>> or pci_target_state(). >>> >> >> I don't think clearing PME support entirely is going to help. The >> reason is that pci_target_state() will fall back to PCI_D3hot when >> dev->pme_support is fully cleared. >> >> I think that clearing *just the bits* for D3hot and D3cold in PME >> support should work though. I'll test this. > > I did confirm this works properly. > >> >> Assuming it works how about if we put the quirk to clear the >> D3hot/D3cold PME support bit in >> drivers/platform/x86/amd/pmc/pmc-quirks.c? >> >> It's still a quirk file and it makes it very clear that this behavior >> is caused by what amd-pmc does. > > I've got it coded up like this and working, so please let me know if > this approach is amenable and I'll drop an updated version. > > If you would prefer it to be in pci/quirks.c I believe I can do either. I've also got a variation with pci/quirks.c working too. Here's the trade offs: pci/quirks.c ------------ * Two lines for every platform affected by this. IE: DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x14b9, quirk_disable_pme_suspend); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x14b9, quirk_reenable_pme); * D3hot/D3cold works at runtime (since PME works at runtime) * Only runs if s2idle is used * Runs whether amd-pmc is bound or not. drivers/platform/x86/amd/pmc/pmc-quirks.c ----------------------------------------- * 1 line for adding new affected platforms * Runs at probe; PME is disabled for D3hot/D3cold at runtime. * Only runs if s2idle is used * Only runs if amd-pmc is bound. Having implemented both ways and given users will effectively always use amd-pmc, I have a preference towards pci/quirks.c which only patches dev->pme_support to drop D3hot/cold at suspend time and restores it at resume. Please let me know which way you prefer.
On Thu, Sep 14, 2023 at 07:55:46PM -0500, Mario Limonciello wrote: > On 9/13/2023 23:59, Mario Limonciello wrote: > > On 9/13/2023 16:16, Mario Limonciello wrote: > > > On 9/13/2023 16:05, Bjorn Helgaas wrote: > > > [cut] > > > > > > I expect it to be an ongoing issue. I also expect unless we use > > > > > > constraints or convince the firmware team to add a _S0W object with a > > > > > > value of "0" for the sake of Linux that we will be > > > > > > adding IDs every year > > > > > > to wherever this lands as we reproduce it on newer SoCs. > > > > > > > > > > So maybe the way to go is to make the AMD PMC driver set a flag for > > > > > Root Ports on suspend or similar. > > > > > > > > I like the quirk approach. When PMC is involved, the device behavior > > > > doesn't conform to what it advertised via PME_Support. > > > > > > > > The v18 quirk isn't connected to PMC at all, so IIUC it avoids > > > > D3hot/D3cold unnecessarily when amd/pmc is not loaded. > > > > > > > > > > Technically someone could; but realistically no one will be using > > > these machines without amd-pmc. > > > > > > The battery life over suspend would be abhorrent. > > > > > > > I don't object to avoiding D3hot/D3cold unconditionally. Presumably > > > > we *could* save a little power by using them when amd/pci isn't > > > > loaded, but amd/pci would have to iterate through all PCI devices when > > > > it loads, save previous state, do the quirk, and then restore the > > > > previous state on module unload. And it would have to use notifiers > > > > or assume no Root Port hotplug. All sounds kind of complicated. > > > > > > > > > > Yeah this does sound needlessly complicated. > > > > > > > Maybe it would even be enough to just clear dev->pme_support so we > > > > know wakeups don't work. It would be a pretty big benefit if we > > > > didn't have to add another bit and complicate pci_prepare_to_sleep() > > > > or pci_target_state(). > > > > > > > > > > I don't think clearing PME support entirely is going to help. The > > > reason is that pci_target_state() will fall back to PCI_D3hot when > > > dev->pme_support is fully cleared. > > > > > > I think that clearing *just the bits* for D3hot and D3cold in PME > > > support should work though. I'll test this. > > > > I did confirm this works properly. > > > > > > > > Assuming it works how about if we put the quirk to clear the > > > D3hot/D3cold PME support bit in > > > drivers/platform/x86/amd/pmc/pmc-quirks.c? > > > > > > It's still a quirk file and it makes it very clear that this > > > behavior is caused by what amd-pmc does. > > > > I've got it coded up like this and working, so please let me know if > > this approach is amenable and I'll drop an updated version. > > > > If you would prefer it to be in pci/quirks.c I believe I can do either. > > I've also got a variation with pci/quirks.c working too. > > Here's the trade offs: > > pci/quirks.c > ------------ > * Two lines for every platform affected by this. IE: > DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x14b9, > quirk_disable_pme_suspend); > DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x14b9, quirk_reenable_pme); > * D3hot/D3cold works at runtime (since PME works at runtime) > * Only runs if s2idle is used > * Runs whether amd-pmc is bound or not. > > drivers/platform/x86/amd/pmc/pmc-quirks.c > ----------------------------------------- > * 1 line for adding new affected platforms > * Runs at probe; PME is disabled for D3hot/D3cold at runtime. > * Only runs if s2idle is used > * Only runs if amd-pmc is bound. > > Having implemented both ways and given users will effectively always use > amd-pmc, I have a preference towards pci/quirks.c which only patches > dev->pme_support to drop D3hot/cold at suspend time and restores it at > resume. > > Please let me know which way you prefer. I think the pci/quirks.c one sounds nicer. I was envisioning a one-time quirk where it'd be easy to log a note about this issue, but I see why the suspend/resume quirk has advantages. I don't really like opaque magic behavior (like D3hot/D3cold not being used when dmesg and lspci claim that PME in those states *should* work), but maybe we can figure out some way to make that visible. Bjorn
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 59c01d68c6d5..a113b8941d09 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -2752,6 +2752,11 @@ int pci_prepare_to_sleep(struct pci_dev *dev) if (target_state == PCI_POWER_ERROR) return -EIO; + /* quirk to avoid setting D3 */ + if (wakeup && dev->dev_flags & PCI_DEV_FLAGS_NO_WAKE_D3 && + (target_state == PCI_D3hot || target_state == PCI_D3cold)) + target_state = PCI_D0; + pci_enable_wake(dev, target_state, wakeup); error = pci_set_power_state(dev, target_state); diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index eeec1d6f9023..c6f2c301f62a 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3869,6 +3869,34 @@ static void quirk_apple_poweroff_thunderbolt(struct pci_dev *dev) DECLARE_PCI_FIXUP_SUSPEND_LATE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C, quirk_apple_poweroff_thunderbolt); + + +/* + * Putting PCIe root ports on Ryzen SoCs with USB4 controllers into D3 power + * states may cause problems when the system attempts wake up from s2idle. + * + * This manifests as a missing wakeup interrupt on the following systems: + * Family 19h model 44h (PCI 0x14b9) + * Family 19h model 74h (PCI 0x14eb) + * Family 19h model 78h (PCI 0x14eb) + * + * Add a quirk to the root port if a USB4 controller is connected to it + * to avoid D3 power states. + */ +static void quirk_ryzen_rp_d3(struct pci_dev *pdev) +{ + struct pci_dev *child = NULL; + + while (child = pci_get_class(PCI_CLASS_SERIAL_USB_USB4, child)) { + if (pci_upstream_bridge(child) != pdev) + continue; + pdev->dev_flags |= PCI_DEV_FLAGS_NO_WAKE_D3; + pci_info(pdev, "quirk: disabling D3 for wakeup\n"); + break; + } +} +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x14b9, quirk_ryzen_rp_d3); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x14eb, quirk_ryzen_rp_d3); #endif /* diff --git a/include/linux/pci.h b/include/linux/pci.h index 8c7c2c3c6c65..2292fbc20630 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -245,6 +245,8 @@ enum pci_dev_flags { PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11), /* Device does honor MSI masking despite saying otherwise */ PCI_DEV_FLAGS_HAS_MSI_MASKING = (__force pci_dev_flags_t) (1 << 12), + /* Wakeup events don't work over D3 */ + PCI_DEV_FLAGS_NO_WAKE_D3 = (__force pci_dev_flags_t) (1 << 13), }; enum pci_irq_reroute_variant {
Iain reports that USB devices can't be used to wake a Lenovo Z13 from suspend. This is because the PCIe root port has been put into D3hot and AMD's platform can't handle USB devices waking the platform from a hardware sleep state in this case. This problem only occurs on Linux, when waking from a platform hardware sleep state. Comparing the behavior on Windows and Linux, Windows doesn't put the root ports into D3. In Windows systems that support Modern Standby specify hardware pre-conditions for the SoC to achieve the lowest power state by device constraints in a SOC specific "Power Engine Plugin" (PEP) [1] [2]. They can be marked as disabled or enabled and when enabled can specify the minimum power state required for an ACPI device. The policy on Linux does not take constraints into account to decide what state to put the device into at suspend like Windows does. Rather for devices that support D3, the target state is selected by this policy: 1. If platform_pci_power_manageable(): Use platform_pci_choose_state() 2. If the device is armed for wakeup: Select the deepest D-state that supports a PME. 3. Else: Use D3hot. Devices are considered power manageable by the platform when they have one or more objects described in the table in section 7.3 of the ACPI 6.5 specification [3]. If devices are not considered power manageable; specs are ambiguous as to what should happen. In this situation Windows 11 leaves PCIe ports in D0 while Linux puts them into D3 due to the policy introduced by commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"). As the Windows PEP driver uses constraints to express the desired state that should be selected for suspend but Linux doesn't introduce a quirk for the problematic root ports. When selecting a target state specifically for sleep in `pci_prepare_to_sleep` this quirk will prevent the root ports from selecting D3hot or D3cold if they have been configured as a wakeup source. Cc: stable@vger.kernel.org Link: https://learn.microsoft.com/en-us/windows-hardware/design/device-experiences/platform-design-for-modern-standby#low-power-core-silicon-cpu-soc-dram [1] Link: https://uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf [2] Link: https://uefi.org/specs/ACPI/6.5/07_Power_and_Performance_Mgmt.html#device-power-management-objects [3] Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Reported-by: Iain Lane <iain@orangesquash.org.uk> Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-external-USB-keyboard/m-p/5217121 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> --- The same PCI ID is used for multiple different root ports. This problem only affects the root port that the USB4 controller is connected to. drivers/pci/pci.c | 5 +++++ drivers/pci/quirks.c | 28 ++++++++++++++++++++++++++++ include/linux/pci.h | 2 ++ 3 files changed, 35 insertions(+)