
[RFC,1/3] PCI/PM: Fix kexec for D3cold and bridge suspending

Message ID 1347872076-5260-2-git-send-email-ying.huang@intel.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

Huang, Ying Sept. 17, 2012, 8:54 a.m. UTC
If PCI devices are put into D3cold before kexec, their configuration
registers are not accessible.

Likewise, if PCI bridges are put into a low power state before kexec,
the configuration registers of the PCI devices underneath those bridges
are not accessible either.

As a result, some PCI devices cannot be scanned after kexec, so resume
PCI devices in D3cold and PCI bridges in a low power state before
kexec.

Signed-off-by: Huang Ying <ying.huang@intel.com>
---
 drivers/pci/pci-driver.c |    4 ++++
 1 file changed, 4 insertions(+)

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Bjorn Helgaas Sept. 17, 2012, 8:54 p.m. UTC | #1
+cc Eric and kexec list

On Mon, Sep 17, 2012 at 2:54 AM, Huang Ying <ying.huang@intel.com> wrote:
> If PCI devices are put into D3cold before kexec, because the
> configuration registers of PCI devices in D3cold are not accessible.
>
> And if PCI bridges are put into low power state before kexec,
> configuration registers of PCI devices underneath the PCI bridges are
> not accessible too.
>
> These will make some PCI devices can not be scanned after kexec, so
> resume the PCI devices in D3cold or PCI bridges in low power state
> before kexec.

Don't we need to resume the device even without the kexec issue?  And
even if it's in D1 or D2?

It looks to me like pci_msi_shutdown() (and probably drv->shutdown())
depend on the device being in D0.

> Signed-off-by: Huang Ying <ying.huang@intel.com>
> ---
>  drivers/pci/pci-driver.c |    4 ++++
>  1 file changed, 4 insertions(+)
>
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -421,6 +421,10 @@ static void pci_device_shutdown(struct d
>         struct pci_dev *pci_dev = to_pci_dev(dev);
>         struct pci_driver *drv = pci_dev->driver;
>
> +       /* Resume bridges and devices in D3cold for kexec to work properly */
> +       if (pci_dev->current_state == PCI_D3cold || pci_dev->subordinate)
> +               pm_runtime_resume(dev);
> +
>         if (drv && drv->shutdown)
>                 drv->shutdown(pci_dev);
>         pci_msi_shutdown(pci_dev);
Eric W. Biederman Sept. 20, 2012, 7:38 a.m. UTC | #2
Bjorn Helgaas <bhelgaas@google.com> writes:

> +cc Eric and kexec list
>
> On Mon, Sep 17, 2012 at 2:54 AM, Huang Ying <ying.huang@intel.com> wrote:
>> If PCI devices are put into D3cold before kexec, because the
>> configuration registers of PCI devices in D3cold are not accessible.
>>
>> And if PCI bridges are put into low power state before kexec,
>> configuration registers of PCI devices underneath the PCI bridges are
>> not accessible too.
>>
>> These will make some PCI devices can not be scanned after kexec, so
>> resume the PCI devices in D3cold or PCI bridges in low power state
>> before kexec.
>
> Don't we need to resume the device even without the kexec issue?  And
> even if it's in D1 or D2?

The basic requirement is that the device needs to be visible so we can
auto discover it.  As I recall most sleep states don't make the device
invisible and we can handle the rest in the device initialization code.

> It looks to me like pci_msi_shutdown() (and probably drv->shutdown())
> depend on the device being in D0.

There is certainly a dependency on the config registers being visible.
Although I don't know if much will go wrong if they aren't.

Certainly pci_msi_shutdown doesn't have anything to do if the device has
had so much power removed that the device is not even executing.

Eric

>> Signed-off-by: Huang Ying <ying.huang@intel.com>
>> ---
>>  drivers/pci/pci-driver.c |    4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -421,6 +421,10 @@ static void pci_device_shutdown(struct d
>>         struct pci_dev *pci_dev = to_pci_dev(dev);
>>         struct pci_driver *drv = pci_dev->driver;
>>
>> +       /* Resume bridges and devices in D3cold for kexec to work properly */
>> +       if (pci_dev->current_state == PCI_D3cold || pci_dev->subordinate)
>> +               pm_runtime_resume(dev);
>> +
>>         if (drv && drv->shutdown)
>>                 drv->shutdown(pci_dev);
>>         pci_msi_shutdown(pci_dev);
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
Huang, Ying Sept. 20, 2012, 8:19 a.m. UTC | #3
On Thu, 2012-09-20 at 00:38 -0700, Eric W. Biederman wrote:
> Bjorn Helgaas <bhelgaas@google.com> writes:
> 
> > +cc Eric and kexec list
> >
> > On Mon, Sep 17, 2012 at 2:54 AM, Huang Ying <ying.huang@intel.com> wrote:
> >> If PCI devices are put into D3cold before kexec, because the
> >> configuration registers of PCI devices in D3cold are not accessible.
> >>
> >> And if PCI bridges are put into low power state before kexec,
> >> configuration registers of PCI devices underneath the PCI bridges are
> >> not accessible too.
> >>
> >> These will make some PCI devices can not be scanned after kexec, so
> >> resume the PCI devices in D3cold or PCI bridges in low power state
> >> before kexec.
> >
> > Don't we need to resume the device even without the kexec issue?  And
> > even if it's in D1 or D2?
> 
> The basic requirement is that the device needs to be visible so we can
> auto discover it.  As I recall most sleep states don't make the device
> invisible and we can handle the rest in the device initialization code.

PCI devices in D3cold or under a bridge in D3hot will not be visible, so
we must fix that for kexec to run properly.

> > It looks to me like pci_msi_shutdown() (and probably drv->shutdown())
> > depend on the device being in D0.
> 
> There is certainly a dependency on the config registers being visible.
> Although I don't know if much will go wrong if they aren't.
> 
> Certainly pci_msi_shutdown doesn't have anything to do if the device has
> had so much power removed that the device is not even executing.

I don't know which power state a device should be in for
pci_msi_shutdown() etc., but normal shutdown/reboot and kexec appear to
work most of the time so far.  Devices in D3cold and bridges in D3hot
work for normal shutdown/reboot, but not for kexec, so I wrote this fix.

Best Regards,
Huang Ying

> >> Signed-off-by: Huang Ying <ying.huang@intel.com>
> >> ---
> >>  drivers/pci/pci-driver.c |    4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> --- a/drivers/pci/pci-driver.c
> >> +++ b/drivers/pci/pci-driver.c
> >> @@ -421,6 +421,10 @@ static void pci_device_shutdown(struct d
> >>         struct pci_dev *pci_dev = to_pci_dev(dev);
> >>         struct pci_driver *drv = pci_dev->driver;
> >>
> >> +       /* Resume bridges and devices in D3cold for kexec to work properly */
> >> +       if (pci_dev->current_state == PCI_D3cold || pci_dev->subordinate)
> >> +               pm_runtime_resume(dev);
> >> +
> >>         if (drv && drv->shutdown)
> >>                 drv->shutdown(pci_dev);
> >>         pci_msi_shutdown(pci_dev);
> >
Eric W. Biederman Sept. 20, 2012, 8:27 a.m. UTC | #4
Huang Ying <ying.huang@intel.com> writes:

> On Thu, 2012-09-20 at 00:38 -0700, Eric W. Biederman wrote:
>> Bjorn Helgaas <bhelgaas@google.com> writes:
>> 
>> > +cc Eric and kexec list
>> >
>> > On Mon, Sep 17, 2012 at 2:54 AM, Huang Ying <ying.huang@intel.com> wrote:
>> >> If PCI devices are put into D3cold before kexec, because the
>> >> configuration registers of PCI devices in D3cold are not accessible.
>> >>
>> >> And if PCI bridges are put into low power state before kexec,
>> >> configuration registers of PCI devices underneath the PCI bridges are
>> >> not accessible too.
>> >>
>> >> These will make some PCI devices can not be scanned after kexec, so
>> >> resume the PCI devices in D3cold or PCI bridges in low power state
>> >> before kexec.
>> >
>> > Don't we need to resume the device even without the kexec issue?  And
>> > even if it's in D1 or D2?
>> 
>> The basic requirement is that the device needs to be visible so we can
>> auto discover it.  As I recall most sleep states don't make the device
>> invisible and we can handle the rest in the device initialization code.
>
> PCI devices in D3cold or under a bridge in D3hot will not be visible, so
> we must fix that for kexec to run properly.

That seems reasonable to me.

>> > It looks to me like pci_msi_shutdown() (and probably drv->shutdown())
>> > depend on the device being in D0.
>> 
>> There is certainly a dependency on the config registers being visible.
>> Although I don't know if much will go wrong if they aren't.
>> 
>> Certainly pci_msi_shutdown doesn't have anything to do if the device has
>> had so much power removed that the device is not even executing.
>
> I don't know which power state a device should be in for
> pci_msi_shutdown() etc., but normal shutdown/reboot and kexec appear to
> work most of the time so far.  Devices in D3cold and bridges in D3hot
> work for normal shutdown/reboot, but not for kexec, so I wrote this fix.

To be clear: has someone picked up this patch yet?

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>

As for reboot, it tends to work because the firmware code frequently
triggers a soft reset of the entire machine, which puts devices in
their power-on reset state, which is not a suspended state.  That
behavior is not required for a software reboot, though.

Eric



> Best Regards,
> Huang Ying
>
>> >> Signed-off-by: Huang Ying <ying.huang@intel.com>
>> >> ---
>> >>  drivers/pci/pci-driver.c |    4 ++++
>> >>  1 file changed, 4 insertions(+)
>> >>
>> >> --- a/drivers/pci/pci-driver.c
>> >> +++ b/drivers/pci/pci-driver.c
>> >> @@ -421,6 +421,10 @@ static void pci_device_shutdown(struct d
>> >>         struct pci_dev *pci_dev = to_pci_dev(dev);
>> >>         struct pci_driver *drv = pci_dev->driver;
>> >>
>> >> +       /* Resume bridges and devices in D3cold for kexec to work properly */
>> >> +       if (pci_dev->current_state == PCI_D3cold || pci_dev->subordinate)
>> >> +               pm_runtime_resume(dev);
>> >> +
>> >>         if (drv && drv->shutdown)
>> >>                 drv->shutdown(pci_dev);
>> >>         pci_msi_shutdown(pci_dev);
>> >
Rafael Wysocki Sept. 20, 2012, 7:20 p.m. UTC | #5
On Monday, September 17, 2012, Bjorn Helgaas wrote:
> +cc Eric and kexec list
> 
> On Mon, Sep 17, 2012 at 2:54 AM, Huang Ying <ying.huang@intel.com> wrote:
> > If PCI devices are put into D3cold before kexec, because the
> > configuration registers of PCI devices in D3cold are not accessible.
> >
> > And if PCI bridges are put into low power state before kexec,
> > configuration registers of PCI devices underneath the PCI bridges are
> > not accessible too.
> >
> > These will make some PCI devices can not be scanned after kexec, so
> > resume the PCI devices in D3cold or PCI bridges in low power state
> > before kexec.
> 
> Don't we need to resume the device even without the kexec issue?  And
> even if it's in D1 or D2?
> 
> It looks to me like pci_msi_shutdown() (and probably drv->shutdown())
> depend on the device being in D0.

We should in theory, but we didn't do any power management of PCI bridges
before, so this is the first time we have a problem with it.

So I'd say, yeah, let's resume if current_state is between D1 and D3cold
inclusive.  Also, the kexec comment is not very helpful (the problem is
not kexec-specific in general).

Speaking of kexec, it might consider using the hibernation device freeze
instead of device shutdown (as the kexec jump feature does).  I've seen
reports of problems that would most likely be solved this way.

Thanks,
Rafael


> > Signed-off-by: Huang Ying <ying.huang@intel.com>
> > ---
> >  drivers/pci/pci-driver.c |    4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -421,6 +421,10 @@ static void pci_device_shutdown(struct d
> >         struct pci_dev *pci_dev = to_pci_dev(dev);
> >         struct pci_driver *drv = pci_dev->driver;
> >
> > +       /* Resume bridges and devices in D3cold for kexec to work properly */
> > +       if (pci_dev->current_state == PCI_D3cold || pci_dev->subordinate)
> > +               pm_runtime_resume(dev);
> > +
> >         if (drv && drv->shutdown)
> >                 drv->shutdown(pci_dev);
> >         pci_msi_shutdown(pci_dev);
> 
> 

Huang, Ying Sept. 21, 2012, 12:28 a.m. UTC | #6
On Thu, 2012-09-20 at 21:20 +0200, Rafael J. Wysocki wrote:
> On Monday, September 17, 2012, Bjorn Helgaas wrote:
> > +cc Eric and kexec list
> > 
> > On Mon, Sep 17, 2012 at 2:54 AM, Huang Ying <ying.huang@intel.com> wrote:
> > > If PCI devices are put into D3cold before kexec, because the
> > > configuration registers of PCI devices in D3cold are not accessible.
> > >
> > > And if PCI bridges are put into low power state before kexec,
> > > configuration registers of PCI devices underneath the PCI bridges are
> > > not accessible too.
> > >
> > > These will make some PCI devices can not be scanned after kexec, so
> > > resume the PCI devices in D3cold or PCI bridges in low power state
> > > before kexec.
> > 
> > Don't we need to resume the device even without the kexec issue?  And
> > even if it's in D1 or D2?
> > 
> > It looks to me like pci_msi_shutdown() (and probably drv->shutdown())
> > depend on the device being in D0.
> 
> We should in theory, but we didn't do any power management of PCI bridges
> before, so this is the first time we have a problem with it.
> 
> So I'd say, yeah, let's resume if current_state is between D1 and D3cold
> inclusive and the kexec comment is not very helpful (the problem is not
> kexec-specific in general).

Resume from D1 to D3cold for any device or just bridges?

Best Regards,
Huang Ying

> Speaking of kexec, it might consider using the hibernation device freeze
> instead of device shutdown (which the kexec jump feature does).  I've seen
> reports of problems that would be solved this way most likely.
> 
> Thanks,
> Rafael
> 
> 
> > > Signed-off-by: Huang Ying <ying.huang@intel.com>
> > > ---
> > >  drivers/pci/pci-driver.c |    4 ++++
> > >  1 file changed, 4 insertions(+)
> > >
> > > --- a/drivers/pci/pci-driver.c
> > > +++ b/drivers/pci/pci-driver.c
> > > @@ -421,6 +421,10 @@ static void pci_device_shutdown(struct d
> > >         struct pci_dev *pci_dev = to_pci_dev(dev);
> > >         struct pci_driver *drv = pci_dev->driver;
> > >
> > > +       /* Resume bridges and devices in D3cold for kexec to work properly */
> > > +       if (pci_dev->current_state == PCI_D3cold || pci_dev->subordinate)
> > > +               pm_runtime_resume(dev);
> > > +
> > >         if (drv && drv->shutdown)
> > >                 drv->shutdown(pci_dev);
> > >         pci_msi_shutdown(pci_dev);
> > 
> > 
> 


Rafael Wysocki Sept. 21, 2012, 7:30 p.m. UTC | #7
On Friday, September 21, 2012, Huang Ying wrote:
> On Thu, 2012-09-20 at 21:20 +0200, Rafael J. Wysocki wrote:
> > On Monday, September 17, 2012, Bjorn Helgaas wrote:
> > > +cc Eric and kexec list
> > > 
> > > On Mon, Sep 17, 2012 at 2:54 AM, Huang Ying <ying.huang@intel.com> wrote:
> > > > If PCI devices are put into D3cold before kexec, because the
> > > > configuration registers of PCI devices in D3cold are not accessible.
> > > >
> > > > And if PCI bridges are put into low power state before kexec,
> > > > configuration registers of PCI devices underneath the PCI bridges are
> > > > not accessible too.
> > > >
> > > > These will make some PCI devices can not be scanned after kexec, so
> > > > resume the PCI devices in D3cold or PCI bridges in low power state
> > > > before kexec.
> > > 
> > > Don't we need to resume the device even without the kexec issue?  And
> > > even if it's in D1 or D2?
> > > 
> > > It looks to me like pci_msi_shutdown() (and probably drv->shutdown())
> > > depend on the device being in D0.
> > 
> > We should in theory, but we didn't do any power management of PCI bridges
> > before, so this is the first time we have a problem with it.
> > 
> > So I'd say, yeah, let's resume if current_state is between D1 and D3cold
> > inclusive and the kexec comment is not very helpful (the problem is not
> > kexec-specific in general).
> 
> Resume from D1 to D3cold for any device or just bridges?

I'd say every device for starters.

Thanks,
Rafael
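
A minimal userspace sketch of the check discussed here, which resumes
any device whose state is between D1 and D3cold inclusive.  It is not
kernel code: the enum is a hypothetical stand-in that assumes the same
ordering as the kernel's PCI_D0 .. PCI_D3cold values, and
model_needs_resume() is an illustrative name that does not exist in the
kernel.

```c
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel's PCI power states.
 * Assumption: same ordering as PCI_D0 .. PCI_D3cold, PCI_UNKNOWN.
 */
enum model_power_state {
	MODEL_D0,
	MODEL_D1,
	MODEL_D2,
	MODEL_D3HOT,
	MODEL_D3COLD,
	MODEL_UNKNOWN,
};

/*
 * Hypothetical helper: true when the device is in any low power state
 * from D1 to D3cold inclusive, regardless of whether it is a bridge.
 */
static bool model_needs_resume(enum model_power_state state)
{
	return state >= MODEL_D1 && state <= MODEL_D3COLD;
}
```

In the shutdown path, a true result would correspond to calling
pm_runtime_resume(dev) before drv->shutdown() and pci_msi_shutdown().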


> > Speaking of kexec, it might consider using the hibernation device freeze
> > instead of device shutdown (which the kexec jump feature does).  I've seen
> > reports of problems that would be solved this way most likely.
> > 
> > Thanks,
> > Rafael
> > 
> > 
> > > > Signed-off-by: Huang Ying <ying.huang@intel.com>
> > > > ---
> > > >  drivers/pci/pci-driver.c |    4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > >
> > > > --- a/drivers/pci/pci-driver.c
> > > > +++ b/drivers/pci/pci-driver.c
> > > > @@ -421,6 +421,10 @@ static void pci_device_shutdown(struct d
> > > >         struct pci_dev *pci_dev = to_pci_dev(dev);
> > > >         struct pci_driver *drv = pci_dev->driver;
> > > >
> > > > +       /* Resume bridges and devices in D3cold for kexec to work properly */
> > > > +       if (pci_dev->current_state == PCI_D3cold || pci_dev->subordinate)
> > > > +               pm_runtime_resume(dev);
> > > > +
> > > >         if (drv && drv->shutdown)
> > > >                 drv->shutdown(pci_dev);
> > > >         pci_msi_shutdown(pci_dev);
> > > 
> > > 
> > 
> 
> 
> 
> 


Patch

--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -421,6 +421,10 @@  static void pci_device_shutdown(struct d
 	struct pci_dev *pci_dev = to_pci_dev(dev);
 	struct pci_driver *drv = pci_dev->driver;
 
+	/* Resume bridges and devices in D3cold for kexec to work properly */
+	if (pci_dev->current_state == PCI_D3cold || pci_dev->subordinate)
+		pm_runtime_resume(dev);
+
 	if (drv && drv->shutdown)
 		drv->shutdown(pci_dev);
 	pci_msi_shutdown(pci_dev);
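
The visibility argument from the commit message can be sketched in
userspace C.  This is only a model, not kernel code: enum
model_power_state, struct model_pci_dev and model_config_accessible()
are hypothetical names, and the rule that any upstream bridge outside D0
hides the devices below it is taken from the commit message rather than
from the PCI specification.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's PCI power states (assumption). */
enum model_power_state {
	MODEL_D0,
	MODEL_D1,
	MODEL_D2,
	MODEL_D3HOT,
	MODEL_D3COLD,
};

/* Hypothetical device node: its own state plus a link to the bridge above. */
struct model_pci_dev {
	enum model_power_state state;
	const struct model_pci_dev *upstream_bridge;	/* NULL at the root */
};

/*
 * A device can be found by a config-space scan only if it is not in
 * D3cold itself and every bridge between it and the root is in D0.
 */
static bool model_config_accessible(const struct model_pci_dev *dev)
{
	const struct model_pci_dev *bridge;

	if (dev->state == MODEL_D3COLD)
		return false;

	for (bridge = dev->upstream_bridge; bridge; bridge = bridge->upstream_bridge) {
		if (bridge->state != MODEL_D0)
			return false;
	}
	return true;
}
```

Under this model, a device left in D0 below a bridge in D3hot is still
invisible to the kernel started by kexec, which is why the patch resumes
bridges as well as devices in D3cold.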