Message ID | 6401388.2t0qD3iTOr@aspire.rjw.lan (mailing list archive) |
---|---|
State | Mainlined |
Delegated to: | Rafael Wysocki |
Series | PCI / ACPI / PM: Resume all bridges on suspend-to-RAM |
On Thu, Aug 16, 2018 at 12:56:46PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Commit 26112ddc254c (PCI / ACPI / PM: Resume bridges w/o drivers on
> suspend-to-RAM) attempted to fix a functional regression resulting
> from commit c62ec4610c40 (PM / core: Fix direct_complete handling
> for devices with no callbacks) by resuming PCI bridges without
> drivers (that is, "parallel PCI" ones) during system-wide suspend if
> the target system state is not ACPI S0 (working state).
>
> That turns out insufficient, however, as it is reported that, at
> least in one case, the platform firmware gets confused if a PCIe
> root port is suspended before entering the ACPI S3 sleep state.
>
> For this reason, drop the driver check from acpi_pci_need_resume()
> and resume all bridges (including PCIe ports with drivers) during
> system-wide suspend if the target system state is not ACPI S0.
>
> [If the target system state is ACPI S0, it means suspend-to-idle
> and the platform firmware is not going to be invoked to actually
> suspend the system, so there is no need to resume the bridges in
> that case.]
>
> Fixes: c62ec4610c40 (PM / core: Fix direct_complete handling for devices with no callbacks)
> Reported-by: teika kazura <teika@gmx.com>
> Tested-by: teika kazura <teika@gmx.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=200675
> Cc: 4.15+ <stable@vger.kernel.org> # 4.15+: 26112ddc254c (PCI / ACPI / PM: Resume bridges ...)
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
On Thu, Aug 16, 2018 at 12:56:46PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Commit 26112ddc254c (PCI / ACPI / PM: Resume bridges w/o drivers on
> suspend-to-RAM) attempted to fix a functional regression resulting
> from commit c62ec4610c40 (PM / core: Fix direct_complete handling
> for devices with no callbacks) by resuming PCI bridges without
> drivers (that is, "parallel PCI" ones) during system-wide suspend if
> the target system state is not ACPI S0 (working state).
>
> That turns out insufficient, however, as it is reported that, at
> least in one case, the platform firmware gets confused if a PCIe
> root port is suspended before entering the ACPI S3 sleep state.
>
> For this reason, drop the driver check from acpi_pci_need_resume()
> and resume all bridges (including PCIe ports with drivers) during
> system-wide suspend if the target system state is not ACPI S0.
>
> [If the target system state is ACPI S0, it means suspend-to-idle
> and the platform firmware is not going to be invoked to actually
> suspend the system, so there is no need to resume the bridges in
> that case.]
>
> Fixes: c62ec4610c40 (PM / core: Fix direct_complete handling for devices with no callbacks)
> Reported-by: teika kazura <teika@gmx.com>
> Tested-by: teika kazura <teika@gmx.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=200675
> Cc: 4.15+ <stable@vger.kernel.org> # 4.15+: 26112ddc254c (PCI / ACPI / PM: Resume bridges ...)
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Thanks for doing this. I don't like dependencies on the PCIe
PM/AER/hotplug/etc features being implemented as a "driver" because they
could be implemented in the PCI core directly.

> ---
>  drivers/pci/pci-acpi.c |    6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> Index: linux-pm/drivers/pci/pci-acpi.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-acpi.c
> +++ linux-pm/drivers/pci/pci-acpi.c
> @@ -632,13 +632,11 @@ static bool acpi_pci_need_resume(struct
>  	/*
>  	 * In some cases (eg. Samsung 305V4A) leaving a bridge in suspend over
>  	 * system-wide suspend/resume confuses the platform firmware, so avoid
> -	 * doing that, unless the bridge has a driver that should take care of
> -	 * the PM handling. According to Section 16.1.6 of ACPI 6.2, endpoint
> +	 * doing that. According to Section 16.1.6 of ACPI 6.2, endpoint
>  	 * devices are expected to be in D3 before invoking the S3 entry path
>  	 * from the firmware, so they should not be affected by this issue.
>  	 */
> -	if (pci_is_bridge(dev) && !dev->driver &&
> -	    acpi_target_system_state() != ACPI_STATE_S0)
> +	if (pci_is_bridge(dev) && acpi_target_system_state() != ACPI_STATE_S0)
>  		return true;
>
>  	if (!adev || !acpi_device_power_manageable(adev))
>
For the record, a note on the accuracy of the patch description.

The patch mentions the regression caused by commit c62ec4610c40, but that commit is not the cause of the bug (https://bugzilla.kernel.org/show_bug.cgi?id=200675) that I reported; I reverted c62ec4610c40 on linux-4.17.13, and the bug remained.

# Some details: my bug was introduced by commit (i) 877b3729ca0 on Jan 3. Commit (ii) c62ec4610c40 was on May 22. Commit (iii) 26112ddc254c on Jun 30 fixes one problem caused by c62ec4610c40. The present patch modifies the code of commit (iii), so it can be regarded as the completion of commit (iii). At the same time, it fixes my bug, too.

This suggests that the present patch possibly fixes other unknown PM problems; former kernels had some loose end(s). Now this patch puts the kernel in a better position.

I'm a lay Linux user, and don't know if this post helps. If it does, it may be worth mentioning it in the above bugzilla entry.

Dziękuję (thanks), kernel developers.

Best regards,
Teika (Teika kazura)

From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Subject: [PATCH] PCI / ACPI / PM: Resume all bridges on suspend-to-RAM
Date: Thu, 16 Aug 2018 12:56:46 +0200

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Commit 26112ddc254c (PCI / ACPI / PM: Resume bridges w/o drivers on
> suspend-to-RAM) attempted to fix a functional regression resulting
> from commit c62ec4610c40 (PM / core: Fix direct_complete handling
> for devices with no callbacks) by resuming PCI bridges without
> drivers (that is, "parallel PCI" ones) during system-wide suspend if
> the target system state is not ACPI S0 (working state).
>
> That turns out insufficient, however, as it is reported that, at
> least in one case, the platform firmware gets confused if a PCIe
> root port is suspended before entering the ACPI S3 sleep state.
>
> For this reason, drop the driver check from acpi_pci_need_resume()
> and resume all bridges (including PCIe ports with drivers) during
> system-wide suspend if the target system state is not ACPI S0.
>
> [If the target system state is ACPI S0, it means suspend-to-idle
> and the platform firmware is not going to be invoked to actually
> suspend the system, so there is no need to resume the bridges in
> that case.]
>
> Fixes: c62ec4610c40 (PM / core: Fix direct_complete handling for devices with no callbacks)
> Reported-by: teika kazura <teika@gmx.com>
> Tested-by: teika kazura <teika@gmx.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=200675
> Cc: 4.15+ <stable@vger.kernel.org> # 4.15+: 26112ddc254c (PCI / ACPI / PM: Resume bridges ...)
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/pci/pci-acpi.c |    6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> Index: linux-pm/drivers/pci/pci-acpi.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-acpi.c
> +++ linux-pm/drivers/pci/pci-acpi.c
> @@ -632,13 +632,11 @@ static bool acpi_pci_need_resume(struct
>  	/*
>  	 * In some cases (eg. Samsung 305V4A) leaving a bridge in suspend over
>  	 * system-wide suspend/resume confuses the platform firmware, so avoid
> -	 * doing that, unless the bridge has a driver that should take care of
> -	 * the PM handling. According to Section 16.1.6 of ACPI 6.2, endpoint
> +	 * doing that. According to Section 16.1.6 of ACPI 6.2, endpoint
>  	 * devices are expected to be in D3 before invoking the S3 entry path
>  	 * from the firmware, so they should not be affected by this issue.
>  	 */
> -	if (pci_is_bridge(dev) && !dev->driver &&
> -	    acpi_target_system_state() != ACPI_STATE_S0)
> +	if (pci_is_bridge(dev) && acpi_target_system_state() != ACPI_STATE_S0)
>  		return true;
>
>  	if (!adev || !acpi_device_power_manageable(adev))
>
On Fri, Aug 17, 2018 at 7:45 AM Teika Kazura <teika@gmx.com> wrote:
>
> For the record, about the exactness of the patch description.
>
> The patch mentions the regression by the commit c62ec4610c40, but it is not the cause of the bug (https://bugzilla.kernel.org/show_bug.cgi?id=200675)
> reported by me; I reverted c62ec4610c40 on linux-4.17.13, and the bug remained.
>
> # Some details: my bug was introduced by the commit (i) 877b3729ca0 on Jan 3. The commit (ii) c62ec4610c40 was on May 22. The commit (iii) 26112ddc254c
> on Jun 30 fixes one problem caused by c62ec4610c40. The present patch modifies the code of the commit (iii), so it can be said as the completion of the
> commit (iii). It at the same time fixes my bug, too.

You are right, commit 877b3729ca0 introduced the issue for you, but it did
that by exposing the same functional problem in the firmware that was
previously addressed by commit 26112ddc254c in a different case.

> This suggests the present patch possibly fixes other unknown PM problems; former kernels had some loose end(s). Now this patch puts the kernel in a better position.
>
> I'm a lay Linux user, and don't know if this post helps. If it does, it may be worth mentioning it in the above bugzilla entry.

Yes, it does, thanks!

I have updated the tags and the commit log of this patch according to the
information above.

Cheers,
Rafael
Index: linux-pm/drivers/pci/pci-acpi.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-acpi.c
+++ linux-pm/drivers/pci/pci-acpi.c
@@ -632,13 +632,11 @@ static bool acpi_pci_need_resume(struct
 	/*
 	 * In some cases (eg. Samsung 305V4A) leaving a bridge in suspend over
 	 * system-wide suspend/resume confuses the platform firmware, so avoid
-	 * doing that, unless the bridge has a driver that should take care of
-	 * the PM handling. According to Section 16.1.6 of ACPI 6.2, endpoint
+	 * doing that. According to Section 16.1.6 of ACPI 6.2, endpoint
 	 * devices are expected to be in D3 before invoking the S3 entry path
 	 * from the firmware, so they should not be affected by this issue.
 	 */
-	if (pci_is_bridge(dev) && !dev->driver &&
-	    acpi_target_system_state() != ACPI_STATE_S0)
+	if (pci_is_bridge(dev) && acpi_target_system_state() != ACPI_STATE_S0)
 		return true;
 
 	if (!adev || !acpi_device_power_manageable(adev))
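
For readers who want to see the behavioral change in isolation, the following is a minimal user-space sketch of the early-return condition before and after the patch. It is not kernel code: `struct model_dev`, `enum target_state`, and both function names are hypothetical stand-ins invented for illustration, and the real acpi_pci_need_resume() goes on to perform further checks that are not modeled here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the value returned by acpi_target_system_state(). */
enum target_state {
	TARGET_S0,	/* suspend-to-idle: platform firmware is not invoked */
	TARGET_S3	/* suspend-to-RAM: platform firmware is invoked */
};

/* Hypothetical, reduced model of a PCI device: only what the check reads. */
struct model_dev {
	bool is_bridge;		/* models pci_is_bridge(dev) */
	bool has_driver;	/* models dev->driver being non-NULL */
};

/* Condition before the patch: only bridges without a driver are resumed. */
static bool bridge_needs_resume_old(const struct model_dev *dev,
				    enum target_state target)
{
	return dev->is_bridge && !dev->has_driver && target != TARGET_S0;
}

/* Condition after the patch: every bridge is resumed unless the target is S0. */
static bool bridge_needs_resume_new(const struct model_dev *dev,
				    enum target_state target)
{
	return dev->is_bridge && target != TARGET_S0;
}
```

Under this model, a PCIe root port with a driver bound (`is_bridge` and `has_driver` both true) is left suspended by the old condition when the target is S3, but is resumed by the new one. For a target of S0, and for endpoints, both conditions agree and return false.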