
PCI/PM: Put devices to low power state on shutdown

Message ID 20240712062411.35732-1-kai.heng.feng@canonical.com (mailing list archive)
State New
Delegated to: Bjorn Helgaas
Series PCI/PM: Put devices to low power state on shutdown

Commit Message

Kai-Heng Feng July 12, 2024, 6:24 a.m. UTC
Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
connected.

The following error message can be found during shutdown:
pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
pcieport 0000:09:04.0:    [ 7] BadDLLP

Calling aer_remove() during shutdown can quiesce the error message,
however the spurious wakeup still happens.

The issue won't happen if the device is in D3 before system shutdown, so
put the device into a low power state before shutdown to solve the issue.

I don't have a sniffer so this is purely guesswork, however I believe
putting the device into a low power state is the right thing to do.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
 drivers/pci/pci-driver.c | 8 ++++++++
 1 file changed, 8 insertions(+)
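
The AER log in the commit message reports status/mask=00000080/00002000.
As a reading aid, here is a minimal userspace sketch that decodes those
two values. The bit values follow the correctable error status layout of
the PCIe AER capability (they should match the PCI_ERR_COR_* constants in
include/uapi/linux/pci_regs.h); the struct, table, and function names are
hypothetical and are not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per correctable error status bit (illustrative table). */
struct cor_bit {
	uint32_t bit;
	const char *name;
};

static const struct cor_bit cor_bits[] = {
	{ 0x00000001u, "Receiver Error" },
	{ 0x00000040u, "Bad TLP" },
	{ 0x00000080u, "Bad DLLP" },
	{ 0x00000100u, "REPLAY_NUM Rollover" },
	{ 0x00001000u, "Replay Timer Timeout" },
	{ 0x00002000u, "Advisory Non-Fatal Error" },
	{ 0x00004000u, "Corrected Internal Error" },
	{ 0x00008000u, "Header Log Overflow" },
};

/* Map a single status bit to its name, or "Unknown" if not in the table. */
const char *aer_cor_name(uint32_t bit)
{
	size_t i;

	for (i = 0; i < sizeof(cor_bits) / sizeof(cor_bits[0]); i++)
		if (cor_bits[i].bit == bit)
			return cor_bits[i].name;
	return "Unknown";
}

/* A set mask bit suppresses reporting of the matching status bit. */
int aer_cor_is_reported(uint32_t status, uint32_t mask)
{
	return (status & ~mask) != 0;
}
```

With the values from the log, status 0x00000080 is bit 7 (Bad DLLP) and
the mask 0x00002000 only covers bit 13 (Advisory Non-Fatal Error), so the
Bad DLLP error is not masked and gets reported.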

Comments

Mario Limonciello July 12, 2024, 2:59 p.m. UTC | #1
On 7/12/2024 1:24, Kai-Heng Feng wrote:
> Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
> connected.
> 
> The following error message can be found during shutdown:
> pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
> pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> pcieport 0000:09:04.0:    [ 7] BadDLLP
> 
> Calling aer_remove() during shutdown can quiesce the error message,
> however the spurious wakeup still happens.
> 
> The issue won't happen if the device is in D3 before system shutdown, so
> putting device to low power state before shutdown to solve the issue.
> 
> I don't have a sniffer so this is purely guesswork, however I believe
> putting device to low power state it's the right thing to do.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

This is directionally very similar to the proposal that I had at the end 
of last year.

https://lore.kernel.org/linux-pci/20231213182656.6165-1-mario.limonciello@amd.com/#t

I definitely think we should be aiming at all devices that don't wake 
the system as being in D3 at shutdown.

> ---
>   drivers/pci/pci-driver.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index af2996d0d17f..4c6f66f3eb54 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
>   	if (drv && drv->shutdown)
>   		drv->shutdown(pci_dev);
>   
> +	/*
> +	 * If driver already changed device's power state, it can mean the
> +	 * wakeup setting is in place, or a workaround is used. Hence keep it
> +	 * as is.
> +	 */
> +	if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
> +		pci_prepare_to_sleep(pci_dev);
> +
>   	/*
>   	 * If this is a kexec reboot, turn off Bus Master bit on the
>   	 * device to tell it to not continue to do DMA. Don't touch
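
The condition added by the hunk above can be modelled outside the kernel
as a pure function. This is only a sketch for reasoning about the
behaviour: the enum and struct below are simplified, hypothetical
stand-ins for pci_power_t and struct pci_dev, and the real code calls
pci_prepare_to_sleep() rather than returning a flag.

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel's PCI power states (hypothetical). */
enum model_power_state {
	MODEL_D0,
	MODEL_D1,
	MODEL_D2,
	MODEL_D3HOT,
	MODEL_D3COLD,
};

/* Only the field the patch looks at (hypothetical). */
struct model_pci_dev {
	enum model_power_state current_state;
};

/*
 * Mirror of the patch's condition: put the device to sleep only when
 * this is not a kexec reboot and the driver's shutdown callback left
 * the device in D0. Any other state is assumed to be a deliberate
 * choice by the driver and is left alone.
 */
bool model_should_sleep_on_shutdown(const struct model_pci_dev *dev,
				    bool kexec_in_progress)
{
	return !kexec_in_progress && dev->current_state == MODEL_D0;
}
```

A device that its driver already moved to D3hot, or any device during a
kexec reboot, is skipped under this model.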
Mario Limonciello Aug. 22, 2024, 7:28 p.m. UTC | #2
On 7/12/2024 01:24, Kai-Heng Feng wrote:
> Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
> connected.
> 
> The following error message can be found during shutdown:
> pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
> pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> pcieport 0000:09:04.0:    [ 7] BadDLLP
> 
> Calling aer_remove() during shutdown can quiesce the error message,
> however the spurious wakeup still happens.
> 
> The issue won't happen if the device is in D3 before system shutdown, so
> putting device to low power state before shutdown to solve the issue.
> 
> I don't have a sniffer so this is purely guesswork, however I believe
> putting device to low power state it's the right thing to do.

KH,

I did testing with your patch along with a few others, and found that it 
does the best job of putting a majority of devices into a low power state 
properly.

I have the details of what happens at S5 outlined on this Gist:
https://gist.github.com/superm1/f8f81e52f5b1d55b64493fdaec38e31c

* KH column is this patch.
* ML column is 
https://lore.kernel.org/linux-usb/43594a1c-c0dd-4ae1-b2c4-f5198e3fe951@amd.com/T/#m03d0b36f86fb4722009b24a8ee547011128db80b
* FS column is 0fab972eef49 being applied again

I also have power testing data from an OEM's system that shows that it 
improves things well enough that a previously failing Energy Star 
certification is now passing.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com>

> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
>   drivers/pci/pci-driver.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index af2996d0d17f..4c6f66f3eb54 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
>   	if (drv && drv->shutdown)
>   		drv->shutdown(pci_dev);
>   
> +	/*
> +	 * If driver already changed device's power state, it can mean the
> +	 * wakeup setting is in place, or a workaround is used. Hence keep it
> +	 * as is.
> +	 */
> +	if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
> +		pci_prepare_to_sleep(pci_dev);
> +
>   	/*
>   	 * If this is a kexec reboot, turn off Bus Master bit on the
>   	 * device to tell it to not continue to do DMA. Don't touch
Kai-Heng Feng Aug. 26, 2024, 12:03 p.m. UTC | #3
On Fri, Aug 23, 2024 at 3:28 AM Mario Limonciello
<mario.limonciello@amd.com> wrote:
>
> On 7/12/2024 01:24, Kai-Heng Feng wrote:
> > Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
> > connected.
> >
> > The following error message can be found during shutdown:
> > pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
> > pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> > pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> > pcieport 0000:09:04.0:    [ 7] BadDLLP
> >
> > Calling aer_remove() during shutdown can quiesce the error message,
> > however the spurious wakeup still happens.
> >
> > The issue won't happen if the device is in D3 before system shutdown, so
> > putting device to low power state before shutdown to solve the issue.
> >
> > I don't have a sniffer so this is purely guesswork, however I believe
> > putting device to low power state it's the right thing to do.
>
> KH,
>
> I did testing with your patch along with a few others, and found that it
> does the best job to put a majority of devices into a low power state
> properly.
>
> I have the details of what happens at S5 outlined on this Gist:
> https://gist.github.com/superm1/f8f81e52f5b1d55b64493fdaec38e31c
>
> * KH column is this patch.
> * ML column is
> https://lore.kernel.org/linux-usb/43594a1c-c0dd-4ae1-b2c4-f5198e3fe951@amd.com/T/#m03d0b36f86fb4722009b24a8ee547011128db80b
> * FS column is 0fab972eef49 being applied again
>
> I also have power testing data from an OEM's system that shows that it
> improves things well enough that a previously failing energy star
> certification is now passing.

Thanks a lot for the testing.

Bjorn, do you think this patch is in good form to get included in next -rc1?

Kai-Heng

>
> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
> Tested-by: Mario Limonciello <mario.limonciello@amd.com>
>
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > ---
> >   drivers/pci/pci-driver.c | 8 ++++++++
> >   1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index af2996d0d17f..4c6f66f3eb54 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
> >       if (drv && drv->shutdown)
> >               drv->shutdown(pci_dev);
> >
> > +     /*
> > +      * If driver already changed device's power state, it can mean the
> > +      * wakeup setting is in place, or a workaround is used. Hence keep it
> > +      * as is.
> > +      */
> > +     if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
> > +             pci_prepare_to_sleep(pci_dev);
> > +
> >       /*
> >        * If this is a kexec reboot, turn off Bus Master bit on the
> >        * device to tell it to not continue to do DMA. Don't touch
>
Mario Limonciello Sept. 11, 2024, 2:08 p.m. UTC | #4
On 8/22/2024 14:28, Mario Limonciello wrote:
> On 7/12/2024 01:24, Kai-Heng Feng wrote:
>> Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
>> connected.
>>
>> The following error message can be found during shutdown:
>> pcieport 0000:00:1d.0: AER: Correctable error message received from 
>> 0000:09:04.0
>> pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data 
>> Link Layer, (Receiver ID)
>> pcieport 0000:09:04.0:   device [8086:0b26] error 
>> status/mask=00000080/00002000
>> pcieport 0000:09:04.0:    [ 7] BadDLLP
>>
>> Calling aer_remove() during shutdown can quiesce the error message,
>> however the spurious wakeup still happens.
>>
>> The issue won't happen if the device is in D3 before system shutdown, so
>> putting device to low power state before shutdown to solve the issue.
>>
>> I don't have a sniffer so this is purely guesswork, however I believe
>> putting device to low power state it's the right thing to do.
> 
> KH,
> 
> I did testing with your patch along with a few others, and found that it 
> does the best job to put a majority of devices into a low power state 
> properly.
> 
> I have the details of what happens at S5 outlined on this Gist:
> https://gist.github.com/superm1/f8f81e52f5b1d55b64493fdaec38e31c
> 
> * KH column is this patch.
> * ML column is 
> https://lore.kernel.org/linux-usb/43594a1c-c0dd-4ae1-b2c4-f5198e3fe951@amd.com/T/#m03d0b36f86fb4722009b24a8ee547011128db80b
> * FS column is 0fab972eef49 being applied again
> 
> I also have power testing data from an OEM's system that shows that it 
> improves things well enough that a previously failing energy star 
> certification is now passing.
> 
> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
> Tested-by: Mario Limonciello <mario.limonciello@amd.com>

Bjorn,

As we're getting close to the merge window, any thoughts about this patch?

Thanks,

> 
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>> ---
>>   drivers/pci/pci-driver.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index af2996d0d17f..4c6f66f3eb54 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
>>       if (drv && drv->shutdown)
>>           drv->shutdown(pci_dev);
>> +    /*
>> +     * If driver already changed device's power state, it can mean the
>> +     * wakeup setting is in place, or a workaround is used. Hence 
>> keep it
>> +     * as is.
>> +     */
>> +    if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
>> +        pci_prepare_to_sleep(pci_dev);
>> +
>>       /*
>>        * If this is a kexec reboot, turn off Bus Master bit on the
>>        * device to tell it to not continue to do DMA. Don't touch
>
Bjorn Helgaas Sept. 11, 2024, 7:05 p.m. UTC | #5
On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
> Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
> connected.
> 
> The following error message can be found during shutdown:
> pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
> pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> pcieport 0000:09:04.0:    [ 7] BadDLLP
> 
> Calling aer_remove() during shutdown can quiesce the error message,
> however the spurious wakeup still happens.
> 
> The issue won't happen if the device is in D3 before system shutdown, so
> putting device to low power state before shutdown to solve the issue.
> 
> I don't have a sniffer so this is purely guesswork, however I believe
> putting device to low power state it's the right thing to do.

My objection here is that we don't have an explanation of why this
should matter or a pointer to any spec language about this situation,
so it feels a little bit random.

I suppose the problem wouldn't happen if AER interrupts were disabled?
We already do disable them in aer_suspend(), but maybe that's not used
in the shutdown path?

My understanding is that .shutdown() should turn off device interrupts
and stop DMA.  So maybe we need an aer_shutdown() that disables
interrupts?
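
For illustration, disabling AER interrupts at a Root Port amounts to
clearing the three error reporting enable bits in the Root Error Command
register. The sketch below models that read-modify-write on a plain
integer; the bit values are intended to match PCI_ERR_ROOT_CMD_* in
include/uapi/linux/pci_regs.h, while the macro and function names are
hypothetical and a real aer_shutdown() would go through the PCI config
accessors instead.

```c
#include <stdint.h>

/* Root Error Command enable bits (names here are illustrative). */
#define ROOT_CMD_COR_EN      0x00000001u /* correctable error reporting */
#define ROOT_CMD_NONFATAL_EN 0x00000002u /* non-fatal error reporting */
#define ROOT_CMD_FATAL_EN    0x00000004u /* fatal error reporting */

/*
 * Return the register value with all three interrupt enables cleared,
 * leaving every other bit untouched.
 */
uint32_t root_cmd_disable_error_irqs(uint32_t root_cmd)
{
	return root_cmd & ~(ROOT_CMD_COR_EN |
			    ROOT_CMD_NONFATAL_EN |
			    ROOT_CMD_FATAL_EN);
}
```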

> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
>  drivers/pci/pci-driver.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index af2996d0d17f..4c6f66f3eb54 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
>  	if (drv && drv->shutdown)
>  		drv->shutdown(pci_dev);
>  
> +	/*
> +	 * If driver already changed device's power state, it can mean the
> +	 * wakeup setting is in place, or a workaround is used. Hence keep it
> +	 * as is.
> +	 */
> +	if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
> +		pci_prepare_to_sleep(pci_dev);
> +
>  	/*
>  	 * If this is a kexec reboot, turn off Bus Master bit on the
>  	 * device to tell it to not continue to do DMA. Don't touch
> -- 
> 2.43.0
>
Mario Limonciello Sept. 11, 2024, 7:16 p.m. UTC | #6
On 9/11/2024 14:05, Bjorn Helgaas wrote:
> On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
>> Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
>> connected.
>>
>> The following error message can be found during shutdown:
>> pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
>> pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
>> pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
>> pcieport 0000:09:04.0:    [ 7] BadDLLP
>>
>> Calling aer_remove() during shutdown can quiesce the error message,
>> however the spurious wakeup still happens.
>>
>> The issue won't happen if the device is in D3 before system shutdown, so
>> putting device to low power state before shutdown to solve the issue.
>>
>> I don't have a sniffer so this is purely guesswork, however I believe
>> putting device to low power state it's the right thing to do.
> 
> My objection here is that we don't have an explanation of why this
> should matter or a pointer to any spec language about this situation,
> so it feels a little bit random.
> 
> I suppose the problem wouldn't happen if AER interrupts were disabled?
> We already do disable them in aer_suspend(), but maybe that's not used
> in the shutdown path?
> 
> My understanding is that .shutdown() should turn off device interrupts
> and stop DMA.  So maybe we need an aer_shutdown() that disables
> interrupts?
> 

IMO I see this commit as two problems with the same solution.

I don't doubt that cleaning up AER interrupts in the shutdown path would 
help AER messages, but you really don't "want" devices to be in D0 when 
the system is "off" because even if the system is off some rails are 
still active and the device might still be powered.

A powered device could cause interrupts (i.e. a spurious wakeup).

>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>> ---
>>   drivers/pci/pci-driver.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index af2996d0d17f..4c6f66f3eb54 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
>>   	if (drv && drv->shutdown)
>>   		drv->shutdown(pci_dev);
>>   
>> +	/*
>> +	 * If driver already changed device's power state, it can mean the
>> +	 * wakeup setting is in place, or a workaround is used. Hence keep it
>> +	 * as is.
>> +	 */
>> +	if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
>> +		pci_prepare_to_sleep(pci_dev);
>> +
>>   	/*
>>   	 * If this is a kexec reboot, turn off Bus Master bit on the
>>   	 * device to tell it to not continue to do DMA. Don't touch
>> -- 
>> 2.43.0
>>
Mario Limonciello Sept. 11, 2024, 7:38 p.m. UTC | #7
On 9/11/2024 14:16, Mario Limonciello wrote:
> On 9/11/2024 14:05, Bjorn Helgaas wrote:
>> On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
>>> Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
>>> connected.
>>>
>>> The following error message can be found during shutdown:
>>> pcieport 0000:00:1d.0: AER: Correctable error message received from 
>>> 0000:09:04.0
>>> pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, 
>>> type=Data Link Layer, (Receiver ID)
>>> pcieport 0000:09:04.0:   device [8086:0b26] error 
>>> status/mask=00000080/00002000
>>> pcieport 0000:09:04.0:    [ 7] BadDLLP
>>>
>>> Calling aer_remove() during shutdown can quiesce the error message,
>>> however the spurious wakeup still happens.
>>>
>>> The issue won't happen if the device is in D3 before system shutdown, so
>>> putting device to low power state before shutdown to solve the issue.
>>>
>>> I don't have a sniffer so this is purely guesswork, however I believe
>>> putting device to low power state it's the right thing to do.
>>
>> My objection here is that we don't have an explanation of why this
>> should matter or a pointer to any spec language about this situation,
>> so it feels a little bit random.
>>
>> I suppose the problem wouldn't happen if AER interrupts were disabled?
>> We already do disable them in aer_suspend(), but maybe that's not used
>> in the shutdown path?
>>
>> My understanding is that .shutdown() should turn off device interrupts
>> and stop DMA.  So maybe we need an aer_shutdown() that disables
>> interrupts?
>>
> 
> IMO I see this commit as two problems with the same solution.
> 
> I don't doubt that cleaning up AER interrupts in the shutdown path would 
> help AER messages, but you really don't "want" devices to be in D0 when 
> the system is "off" because even if the system is off some rails are 
> still active and the device might still be powered.
> 
> A powered device could cause interrupts (IE a spurious wakeup).

It's a bit of a stretch, but ACPI 7.4.2.5 and 7.4.2.6 are the closest 
corollary to a spec I can find.

"Devices states are compatible with the current Power Resource states. 
In other words, all devices are in the D3 state when the system state is 
S4."

https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/07_Power_and_Performance_Mgmt/oem-supplied-system-level-control-methods.html

> 
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
>>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>>> ---
>>>   drivers/pci/pci-driver.c | 8 ++++++++
>>>   1 file changed, 8 insertions(+)
>>>
>>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>>> index af2996d0d17f..4c6f66f3eb54 100644
>>> --- a/drivers/pci/pci-driver.c
>>> +++ b/drivers/pci/pci-driver.c
>>> @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
>>>       if (drv && drv->shutdown)
>>>           drv->shutdown(pci_dev);
>>> +    /*
>>> +     * If driver already changed device's power state, it can mean the
>>> +     * wakeup setting is in place, or a workaround is used. Hence 
>>> keep it
>>> +     * as is.
>>> +     */
>>> +    if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
>>> +        pci_prepare_to_sleep(pci_dev);
>>> +
>>>       /*
>>>        * If this is a kexec reboot, turn off Bus Master bit on the
>>>        * device to tell it to not continue to do DMA. Don't touch
>>> -- 
>>> 2.43.0
>>>
>
Kai-Heng Feng Sept. 12, 2024, 3 a.m. UTC | #8
Hi Bjorn,

On Thu, Sep 12, 2024 at 3:05 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
> > Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
> > connected.
> >
> > The following error message can be found during shutdown:
> > pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
> > pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> > pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> > pcieport 0000:09:04.0:    [ 7] BadDLLP
> >
> > Calling aer_remove() during shutdown can quiesce the error message,
> > however the spurious wakeup still happens.
> >
> > The issue won't happen if the device is in D3 before system shutdown, so
> > putting device to low power state before shutdown to solve the issue.
> >
> > I don't have a sniffer so this is purely guesswork, however I believe
> > putting device to low power state it's the right thing to do.
>
> My objection here is that we don't have an explanation of why this
> should matter or a pointer to any spec language about this situation,
> so it feels a little bit random.

I have the same feeling too. The PCIe spec doesn't specify the correct
power state for shutdown.
So we can only reason "logically" that software should put devices into
a low power state during shutdown.

>
> I suppose the problem wouldn't happen if AER interrupts were disabled?
> We already do disable them in aer_suspend(), but maybe that's not used
> in the shutdown path?

That was my first thought, so I modified pcie_port_shutdown_service()
to disable AER interrupt.
That approach didn't work though.

>
> My understanding is that .shutdown() should turn off device interrupts
> and stop DMA.  So maybe we need an aer_shutdown() that disables
> interrupts?

Logically we should do that. However that approach doesn't solve this issue.

>
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > ---
> >  drivers/pci/pci-driver.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index af2996d0d17f..4c6f66f3eb54 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
> >       if (drv && drv->shutdown)
> >               drv->shutdown(pci_dev);
> >
> > +     /*
> > +      * If driver already changed device's power state, it can mean the
> > +      * wakeup setting is in place, or a workaround is used. Hence keep it
> > +      * as is.
> > +      */
> > +     if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
> > +             pci_prepare_to_sleep(pci_dev);
> > +
> >       /*
> >        * If this is a kexec reboot, turn off Bus Master bit on the
> >        * device to tell it to not continue to do DMA. Don't touch
> > --
> > 2.43.0
> >
Kai-Heng Feng Sept. 12, 2024, 7:02 a.m. UTC | #9
On Thu, Sep 12, 2024 at 3:38 AM Mario Limonciello
<mario.limonciello@amd.com> wrote:
>
> On 9/11/2024 14:16, Mario Limonciello wrote:
> > On 9/11/2024 14:05, Bjorn Helgaas wrote:
> >> On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
> >>> Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
> >>> connected.
> >>>
> >>> The following error message can be found during shutdown:
> >>> pcieport 0000:00:1d.0: AER: Correctable error message received from
> >>> 0000:09:04.0
> >>> pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable,
> >>> type=Data Link Layer, (Receiver ID)
> >>> pcieport 0000:09:04.0:   device [8086:0b26] error
> >>> status/mask=00000080/00002000
> >>> pcieport 0000:09:04.0:    [ 7] BadDLLP
> >>>
> >>> Calling aer_remove() during shutdown can quiesce the error message,
> >>> however the spurious wakeup still happens.
> >>>
> >>> The issue won't happen if the device is in D3 before system shutdown, so
> >>> putting device to low power state before shutdown to solve the issue.
> >>>
> >>> I don't have a sniffer so this is purely guesswork, however I believe
> >>> putting device to low power state it's the right thing to do.
> >>
> >> My objection here is that we don't have an explanation of why this
> >> should matter or a pointer to any spec language about this situation,
> >> so it feels a little bit random.
> >>
> >> I suppose the problem wouldn't happen if AER interrupts were disabled?
> >> We already do disable them in aer_suspend(), but maybe that's not used
> >> in the shutdown path?
> >>
> >> My understanding is that .shutdown() should turn off device interrupts
> >> and stop DMA.  So maybe we need an aer_shutdown() that disables
> >> interrupts?
> >>
> >
> > IMO I see this commit as two problems with the same solution.
> >
> > I don't doubt that cleaning up AER interrupts in the shutdown path would
> > help AER messages, but you really don't "want" devices to be in D0 when
> > the system is "off" because even if the system is off some rails are
> > still active and the device might still be powered.
> >
> > A powered device could cause interrupts (IE a spurious wakeup).
>
> It's a bit of a stretch, but ACPI 7.4.2.5 and 7.4.2.6 are the closest
> corollary to a spec I can find.
>
> "Devices states are compatible with the current Power Resource states.
> In other words, all devices are in the D3 state when the system state is
> S4."

In addition to that, the vendor captured waveforms from the device:
Windows puts the device into D3 while Linux keeps the device in D0, and
that asserted one of the PCIe interrupt lines, causing the system wakeup.

Kai-Heng

>
> https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/07_Power_and_Performance_Mgmt/oem-supplied-system-level-control-methods.html
>
> >
> >>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
> >>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> >>> ---
> >>>   drivers/pci/pci-driver.c | 8 ++++++++
> >>>   1 file changed, 8 insertions(+)
> >>>
> >>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> >>> index af2996d0d17f..4c6f66f3eb54 100644
> >>> --- a/drivers/pci/pci-driver.c
> >>> +++ b/drivers/pci/pci-driver.c
> >>> @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
> >>>       if (drv && drv->shutdown)
> >>>           drv->shutdown(pci_dev);
> >>> +    /*
> >>> +     * If driver already changed device's power state, it can mean the
> >>> +     * wakeup setting is in place, or a workaround is used. Hence
> >>> keep it
> >>> +     * as is.
> >>> +     */
> >>> +    if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
> >>> +        pci_prepare_to_sleep(pci_dev);
> >>> +
> >>>       /*
> >>>        * If this is a kexec reboot, turn off Bus Master bit on the
> >>>        * device to tell it to not continue to do DMA. Don't touch
> >>> --
> >>> 2.43.0
> >>>
> >
>
Mario Limonciello Sept. 12, 2024, 1:10 p.m. UTC | #10
On 9/12/2024 02:02, Kai-Heng Feng wrote:
> On Thu, Sep 12, 2024 at 3:38 AM Mario Limonciello
> <mario.limonciello@amd.com> wrote:
>>
>> On 9/11/2024 14:16, Mario Limonciello wrote:
>>> On 9/11/2024 14:05, Bjorn Helgaas wrote:
>>>> On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
>>>>> Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
>>>>> connected.
>>>>>
>>>>> The following error message can be found during shutdown:
>>>>> pcieport 0000:00:1d.0: AER: Correctable error message received from
>>>>> 0000:09:04.0
>>>>> pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable,
>>>>> type=Data Link Layer, (Receiver ID)
>>>>> pcieport 0000:09:04.0:   device [8086:0b26] error
>>>>> status/mask=00000080/00002000
>>>>> pcieport 0000:09:04.0:    [ 7] BadDLLP
>>>>>
>>>>> Calling aer_remove() during shutdown can quiesce the error message,
>>>>> however the spurious wakeup still happens.
>>>>>
>>>>> The issue won't happen if the device is in D3 before system shutdown, so
>>>>> putting device to low power state before shutdown to solve the issue.
>>>>>
>>>>> I don't have a sniffer so this is purely guesswork, however I believe
>>>>> putting device to low power state it's the right thing to do.
>>>>
>>>> My objection here is that we don't have an explanation of why this
>>>> should matter or a pointer to any spec language about this situation,
>>>> so it feels a little bit random.
>>>>
>>>> I suppose the problem wouldn't happen if AER interrupts were disabled?
>>>> We already do disable them in aer_suspend(), but maybe that's not used
>>>> in the shutdown path?
>>>>
>>>> My understanding is that .shutdown() should turn off device interrupts
>>>> and stop DMA.  So maybe we need an aer_shutdown() that disables
>>>> interrupts?
>>>>
>>>
>>> IMO I see this commit as two problems with the same solution.
>>>
>>> I don't doubt that cleaning up AER interrupts in the shutdown path would
>>> help AER messages, but you really don't "want" devices to be in D0 when
>>> the system is "off" because even if the system is off some rails are
>>> still active and the device might still be powered.
>>>
>>> A powered device could cause interrupts (IE a spurious wakeup).
>>
>> It's a bit of a stretch, but ACPI 7.4.2.5 and 7.4.2.6 are the closest
>> corollary to a spec I can find.
>>
>> "Devices states are compatible with the current Power Resource states.
>> In other words, all devices are in the D3 state when the system state is
>> S4."
> 
> In addition to that, vendor collected the wave form from the device,
> Windows put the device to D3 while Linux kept the device in D0, and
> that asserted one of the PCIe interrupt line to cause system wakeup.
> 

I did the same kind of collection and also confirmed the state of many 
PCI devices on the system, as posted elsewhere in this thread.

https://gist.github.com/superm1/f8f81e52f5b1d55b64493fdaec38e31c
Bjorn Helgaas Sept. 12, 2024, 4:57 p.m. UTC | #11
[+cc Rafael]

On Thu, Sep 12, 2024 at 11:00:43AM +0800, Kai-Heng Feng wrote:
> On Thu, Sep 12, 2024 at 3:05 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
> > > Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
> > > connected.
> > >
> > > The following error message can be found during shutdown:
> > > pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
> > > pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> > > pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> > > pcieport 0000:09:04.0:    [ 7] BadDLLP
> > >
> > > Calling aer_remove() during shutdown can quiesce the error message,
> > > however the spurious wakeup still happens.
> > >
> > > The issue won't happen if the device is in D3 before system shutdown, so
> > > putting device to low power state before shutdown to solve the issue.
> > >
> > > I don't have a sniffer so this is purely guesswork, however I believe
> > > putting device to low power state it's the right thing to do.
> >
> > My objection here is that we don't have an explanation of why this
> > should matter or a pointer to any spec language about this situation,
> > so it feels a little bit random.
> 
> I have the same feeling too. The PCIe spec doesn't specify what's the
> correct power state for shutdown.
> So we can only "logically" think the software should put devices to
> low power state during shutdown.
> 
> > I suppose the problem wouldn't happen if AER interrupts were disabled?
> > We already do disable them in aer_suspend(), but maybe that's not used
> > in the shutdown path?
> 
> That was my first thought, so I modified pcie_port_shutdown_service()
> to disable AER interrupt.
> That approach didn't work though.
> 
> > My understanding is that .shutdown() should turn off device interrupts
> > and stop DMA.  So maybe we need an aer_shutdown() that disables
> > interrupts?
> 
> Logically we should do that. However that approach doesn't solve this issue.

I'm not completely clear on the semantics of the .shutdown()
interface.  The doc at
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/device/driver.h?id=v6.10#n73
says "@shutdown: Called at shut-down time to quiesce the device"

Turning off device interrupts and DMA *would* fit within the idea of
quiescing the device.  Does that also include changing the device
power state?  I dunno.  The power state isn't *mentioned* in the
.shutdown() context, while it *is* mentioned for .suspend().

IIUC, this patch and commit log use "shutdown" to refer to a
system-wide *poweroff*, which is a different concept despite using the
same "shutdown" name.

So should the system poweroff procedure use .suspend()?  Should it use
both .shutdown() and .suspend()?  I think it only uses .shutdown()
today:

  kernel_power_off
    kernel_shutdown_prepare(SYSTEM_POWER_OFF)
      device_shutdown
        while (!list_empty(&devices_kset->list))
          dev->bus->shutdown(dev)
            pci_device_shutdown

There are several driver .shutdown() methods that do things like this:

  e1000_shutdown
    if (system_state == SYSTEM_POWER_OFF)
      pci_set_power_state(pdev, PCI_D3hot)

Maybe that's the right thing and should be done by the PCI core, which
is similar to what you propose here.  But I think it muddies the
definition of .shutdown() a bit by mixing in power management stuff.
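
To make the proposed core behavior easier to reason about, here is a
minimal userspace sketch of the decision the patched
pci_device_shutdown() would make after the driver's .shutdown() has
run.  It is only a model: the enum values and the function name
model_core_shutdown() are made up for illustration, and the returned
actions merely stand in for the kernel's pci_prepare_to_sleep() and
pci_clear_master() calls.

```c
#include <stdbool.h>

/*
 * Simplified stand-ins for the kernel's pci_power_t values.
 * Assumption: only the ordering D0 < D3hot < D3cold < unknown matters here.
 */
enum pci_state { PCI_D0 = 0, PCI_D1, PCI_D2, PCI_D3hot, PCI_D3cold, PCI_UNKNOWN };

/* Hypothetical labels for what the PCI core would do to the device. */
enum shutdown_action {
	ACT_NONE,          /* leave the device as the driver left it */
	ACT_PREPARE_SLEEP, /* models pci_prepare_to_sleep() */
	ACT_CLEAR_MASTER,  /* models pci_clear_master() */
};

/*
 * Model of the two checks in the patched pci_device_shutdown().
 * The checks are mutually exclusive because they test opposite
 * values of kexec_in_progress.
 */
enum shutdown_action model_core_shutdown(bool kexec_in_progress,
					 enum pci_state state)
{
	/* Poweroff/reboot: only touch devices the driver left in D0. */
	if (!kexec_in_progress && state == PCI_D0)
		return ACT_PREPARE_SLEEP;

	/* kexec: stop DMA, but skip devices in D3cold or unknown states. */
	if (kexec_in_progress && state <= PCI_D3hot)
		return ACT_CLEAR_MASTER;

	return ACT_NONE;
}
```

Note that in this model a driver that already moved its device out of
D0 (as e1000_shutdown does on poweroff) is left alone by the core.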

> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
> > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > ---
> > >  drivers/pci/pci-driver.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > > index af2996d0d17f..4c6f66f3eb54 100644
> > > --- a/drivers/pci/pci-driver.c
> > > +++ b/drivers/pci/pci-driver.c
> > > @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
> > >       if (drv && drv->shutdown)
> > >               drv->shutdown(pci_dev);
> > >
> > > +     /*
> > > +      * If driver already changed device's power state, it can mean the
> > > +      * wakeup setting is in place, or a workaround is used. Hence keep it
> > > +      * as is.
> > > +      */
> > > +     if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
> > > +             pci_prepare_to_sleep(pci_dev);
> > > +
> > >       /*
> > >        * If this is a kexec reboot, turn off Bus Master bit on the
> > >        * device to tell it to not continue to do DMA. Don't touch
> > > --
> > > 2.43.0
> > >
Kai-Heng Feng Sept. 13, 2024, 6 a.m. UTC | #12
On Fri, Sep 13, 2024 at 12:57 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Rafael]
>
> On Thu, Sep 12, 2024 at 11:00:43AM +0800, Kai-Heng Feng wrote:
> > On Thu, Sep 12, 2024 at 3:05 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
> > > > Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
> > > > connected.
> > > >
> > > > The following error message can be found during shutdown:
> > > > pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
> > > > pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> > > > pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> > > > pcieport 0000:09:04.0:    [ 7] BadDLLP
> > > >
> > > > Calling aer_remove() during shutdown can quiesce the error message,
> > > > however the spurious wakeup still happens.
> > > >
> > > > The issue won't happen if the device is in D3 before system shutdown, so
> > > > putting device to low power state before shutdown to solve the issue.
> > > >
> > > > I don't have a sniffer so this is purely guesswork, however I believe
> > > > putting device to low power state it's the right thing to do.
> > >
> > > My objection here is that we don't have an explanation of why this
> > > should matter or a pointer to any spec language about this situation,
> > > so it feels a little bit random.
> >
> > I have the same feeling too. The PCIe spec doesn't specify what's the
> > correct power state for shutdown.
> > So we can only "logically" think the software should put devices to
> > low power state during shutdown.
> >
> > > I suppose the problem wouldn't happen if AER interrupts were disabled?
> > > We already do disable them in aer_suspend(), but maybe that's not used
> > > in the shutdown path?
> >
> > That was my first thought, so I modified pcie_port_shutdown_service()
> > to disable AER interrupt.
> > That approach didn't work though.
> >
> > > My understanding is that .shutdown() should turn off device interrupts
> > > and stop DMA.  So maybe we need an aer_shutdown() that disables
> > > interrupts?
> >
> > Logically we should do that. However that approach doesn't solve this issue.
>
> I'm not completely clear on the semantics of the .shutdown()
> interface.  The doc at
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/device/driver.h?id=v6.10#n73
> says "@shutdown: Called at shut-down time to quiesce the device"
>
> Turning off device interrupts and DMA *would* fit within the idea of
> quiescing the device.  Does that also include changing the device
> power state?  I dunno.  The power state isn't *mentioned* in the
> .shutdown() context, while it *is* mentioned for .suspend().

IMO putting a device into a low power state also qualifies as "quiesce
the device".

>
> IIUC, this patch and commit log uses "shutdown" to refer to a
> system-wide *poweroff*, which is a different concept despite using the
> same "shutdown" name.

For ACPI-based systems, there is .suspend for S3/s2idle, .poweroff for
S4, and .shutdown for S5.
Unless we want to introduce a new callback for S5, I think the concept
is quite similar.

For DT-based systems, the OS should also do the same thing, as
there's no firmware to clean up the power state.

We can also move .shutdown to be part of pm_ops, but I don't think
it's necessary.

>
> So should the system poweroff procedure use .suspend()?  Should it use
> both .shutdown() and .suspend()?  I think it only uses .shutdown()
> today:
>
>   kernel_power_off
>     kernel_shutdown_prepare(SYSTEM_POWER_OFF)
>       device_shutdown
>         while (!list_empty(&devices_kset->list))
>           dev->bus->shutdown(dev)
>             pci_device_shutdown
>
> There are several driver .shutdown() methods that do things like this:
>
>   e1000_shutdown
>     if (system_state == SYSTEM_POWER_OFF)
>       pci_set_power_state(pdev, PCI_D3hot)
>
> Maybe that's the right thing and should be done by the PCI core, which
> is similar to what you propose here.  But I think it muddies the
> definition of .shutdown() a bit by mixing in power management stuff.

Do you think adding a new "low power state" callback, to be called
after .shutdown, is a good idea?
That would make the concept of .shutdown different from .suspend and
.poweroff. I personally see .suspend, .poweroff and .shutdown as the same
action targeting different power states.

Kai-Heng

>
> > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
> > > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > > ---
> > > >  drivers/pci/pci-driver.c | 8 ++++++++
> > > >  1 file changed, 8 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > > > index af2996d0d17f..4c6f66f3eb54 100644
> > > > --- a/drivers/pci/pci-driver.c
> > > > +++ b/drivers/pci/pci-driver.c
> > > > @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
> > > >       if (drv && drv->shutdown)
> > > >               drv->shutdown(pci_dev);
> > > >
> > > > +     /*
> > > > +      * If driver already changed device's power state, it can mean the
> > > > +      * wakeup setting is in place, or a workaround is used. Hence keep it
> > > > +      * as is.
> > > > +      */
> > > > +     if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
> > > > +             pci_prepare_to_sleep(pci_dev);
> > > > +
> > > >       /*
> > > >        * If this is a kexec reboot, turn off Bus Master bit on the
> > > >        * device to tell it to not continue to do DMA. Don't touch
> > > > --
> > > > 2.43.0
> > > >
Mika Westerberg Sept. 13, 2024, 8:01 a.m. UTC | #13
Hi,

On Fri, Sep 13, 2024 at 02:00:58PM +0800, Kai-Heng Feng wrote:
> On Fri, Sep 13, 2024 at 12:57 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > [+cc Rafael]
> >
> > On Thu, Sep 12, 2024 at 11:00:43AM +0800, Kai-Heng Feng wrote:
> > > On Thu, Sep 12, 2024 at 3:05 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
> > > > > Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
> > > > > connected.
> > > > >
> > > > > The following error message can be found during shutdown:
> > > > > pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
> > > > > pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> > > > > pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> > > > > pcieport 0000:09:04.0:    [ 7] BadDLLP
> > > > >
> > > > > Calling aer_remove() during shutdown can quiesce the error message,
> > > > > however the spurious wakeup still happens.
> > > > >
> > > > > The issue won't happen if the device is in D3 before system shutdown, so
> > > > > putting device to low power state before shutdown to solve the issue.
> > > > >
> > > > > I don't have a sniffer so this is purely guesswork, however I believe
> > > > > putting device to low power state it's the right thing to do.
> > > >
> > > > My objection here is that we don't have an explanation of why this
> > > > should matter or a pointer to any spec language about this situation,
> > > > so it feels a little bit random.
> > >
> > > I have the same feeling too. The PCIe spec doesn't specify what's the
> > > correct power state for shutdown.
> > > So we can only "logically" think the software should put devices to
> > > low power state during shutdown.
> > >
> > > > I suppose the problem wouldn't happen if AER interrupts were disabled?
> > > > We already do disable them in aer_suspend(), but maybe that's not used
> > > > in the shutdown path?
> > >
> > > That was my first thought, so I modified pcie_port_shutdown_service()
> > > to disable AER interrupt.
> > > That approach didn't work though.
> > >
> > > > My understanding is that .shutdown() should turn off device interrupts
> > > > and stop DMA.  So maybe we need an aer_shutdown() that disables
> > > > interrupts?
> > >
> > > Logically we should do that. However that approach doesn't solve this issue.
> >
> > I'm not completely clear on the semantics of the .shutdown()
> > interface.  The doc at
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/device/driver.h?id=v6.10#n73
> > says "@shutdown: Called at shut-down time to quiesce the device"
> >
> > Turning off device interrupts and DMA *would* fit within the idea of
> > quiescing the device.  Does that also include changing the device
> > power state?  I dunno.  The power state isn't *mentioned* in the
> > .shutdown() context, while it *is* mentioned for .suspend().
> 
> IMO putting a device to low power also qualifies as "quiesce the device".
> 
> >
> > IIUC, this patch and commit log uses "shutdown" to refer to a
> > system-wide *poweroff*, which is a different concept despite using the
> > same "shutdown" name.
> 
> For ACPI based system, there are .suspend for S3/s2idle, .poweroff for
> S4, and .shutdown for S5.
> Unless we want to introduce a new callback for S5, I think the concept
> is quite similar.
> 
> For DT based system, the OS should also perform the same thing, as
> there's no firmware to cleanup the power state.
> 
> We can also move .shutdown to be part of pm_ops, but I don't think
> it's necessary,
> 
> >
> > So should the system poweroff procedure use .suspend()?  Should it use
> > both .shutdown() and .suspend()?  I think it only uses .shutdown()
> > today:
> >
> >   kernel_power_off
> >     kernel_shutdown_prepare(SYSTEM_POWER_OFF)
> >       device_shutdown
> >         while (!list_empty(&devices_kset->list))
> >           dev->bus->shutdown(dev)
> >             pci_device_shutdown
> >
> > There are several driver .shutdown() methods that do things like this:
> >
> >   e1000_shutdown
> >     if (system_state == SYSTEM_POWER_OFF)
> >       pci_set_power_state(pdev, PCI_D3hot)
> >
> > Maybe that's the right thing and should be done by the PCI core, which
> > is similar to what you propose here.  But I think it muddies the
> > definition of .shutdown() a bit by mixing in power management stuff.
> 
> Do you think adding a new "low power state" callback to be called
> after .shutdown a good idea?
> That would make the concept of .shutdown different to .suspend and
> .poweroff. I personally see .suspend, .poweroff and .shutdown the same
> action but target different power states.

I don't mean to confuse you guys but with this one too, I wonder if you
tried to "disable" the device instead of putting it into D3? On another
thread (Mario at least is aware of this) I mentioned that our PCIe SV
folks identified a couple of issues in the Linux implementation around
power management, and one thing that we are missing is disabling I/O and
MMIO upon entering D3.

I know this is about entering S5 (power off) but I wonder if simply
disabling the device (I/O, MMIO and bus mastering) could stop it from
waking up? To my understanding this can be interpreted as quiesce too :)
Something like the below patch (it also includes the runtime suspend
path, which should not matter here; this is similar to the patch I
shared in another thread).

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index f412ef73a6e4..79406814699d 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -514,11 +514,9 @@ static void pci_device_shutdown(struct device *dev)
 	 * If this is a kexec reboot, turn off Bus Master bit on the
 	 * device to tell it to not continue to do DMA. Don't touch
 	 * devices in D3cold or unknown states.
-	 * If it is not a kexec reboot, firmware will hit the PCI
-	 * devices with big hammer and stop their DMA any way.
 	 */
-	if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
-		pci_clear_master(pci_dev);
+	if (pci_dev->current_state <= PCI_D3hot)
+		pci_disable_device(pci_dev);
 }
 
 #ifdef CONFIG_PM_SLEEP
@@ -1332,6 +1330,7 @@ static int pci_pm_runtime_suspend(struct device *dev)
 
 	if (!pci_dev->state_saved) {
 		pci_save_state(pci_dev);
+		pci_pm_default_suspend(pci_dev);
 		pci_finish_runtime_suspend(pci_dev);
 	}
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ffaaca0978cb..91f4e7a03c94 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2218,6 +2218,13 @@ static void do_pci_disable_device(struct pci_dev *dev)
 		pci_command &= ~PCI_COMMAND_MASTER;
 		pci_write_config_word(dev, PCI_COMMAND, pci_command);
 	}
+	/*
+	 * PCI PM 1.2 sec 8.2.2 says that when a function is put into D3
+	 * the OS needs to disable I/O and MMIO space in addition to bus
+	 * mastering so do that here.
+	 */
+	pci_command &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY);
+	pci_write_config_word(dev, PCI_COMMAND, pci_command);
 
 	pcibios_disable_device(dev);
 }
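
For readers unfamiliar with the Command register, the effect of the
do_pci_disable_device() change above can be modeled in userspace as a
plain bit operation.  This is only a sketch: model_disable_command() is
a made-up helper, and the real code reads and writes the register with
pci_read_config_word() / pci_write_config_word().  The three bit values
are the ones defined in the kernel's pci_regs.h.

```c
#include <stdint.h>

/* PCI Command register bits (same values as include/uapi/linux/pci_regs.h). */
#define PCI_COMMAND_IO		0x1	/* respond to I/O space accesses */
#define PCI_COMMAND_MEMORY	0x2	/* respond to memory space accesses */
#define PCI_COMMAND_MASTER	0x4	/* allow the device to master the bus (DMA) */

/*
 * Hypothetical helper: return the Command register value that would
 * remain after clearing bus mastering, I/O and MMIO decoding, leaving
 * all other bits untouched.
 */
uint16_t model_disable_command(uint16_t pci_command)
{
	uint16_t clear = PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER;

	return pci_command & (uint16_t)~clear;
}
```

With all three bits cleared the function no longer decodes I/O or MMIO
accesses and cannot initiate DMA, which is the "disabled" state the
patch aims for.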
Mario Limonciello Sept. 13, 2024, 8:33 p.m. UTC | #14
On 9/13/2024 03:01, Mika Westerberg wrote:
> Hi,
> 
> On Fri, Sep 13, 2024 at 02:00:58PM +0800, Kai-Heng Feng wrote:
>> On Fri, Sep 13, 2024 at 12:57 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>
>>> [+cc Rafael]
>>>
>>> On Thu, Sep 12, 2024 at 11:00:43AM +0800, Kai-Heng Feng wrote:
>>>> On Thu, Sep 12, 2024 at 3:05 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>>>> On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
>>>>>> Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
>>>>>> connected.
>>>>>>
>>>>>> The following error message can be found during shutdown:
>>>>>> pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
>>>>>> pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
>>>>>> pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
>>>>>> pcieport 0000:09:04.0:    [ 7] BadDLLP
>>>>>>
>>>>>> Calling aer_remove() during shutdown can quiesce the error message,
>>>>>> however the spurious wakeup still happens.
>>>>>>
>>>>>> The issue won't happen if the device is in D3 before system shutdown, so
>>>>>> putting device to low power state before shutdown to solve the issue.
>>>>>>
>>>>>> I don't have a sniffer so this is purely guesswork, however I believe
>>>>>> putting device to low power state it's the right thing to do.
>>>>>
>>>>> My objection here is that we don't have an explanation of why this
>>>>> should matter or a pointer to any spec language about this situation,
>>>>> so it feels a little bit random.
>>>>
>>>> I have the same feeling too. The PCIe spec doesn't specify what's the
>>>> correct power state for shutdown.
>>>> So we can only "logically" think the software should put devices to
>>>> low power state during shutdown.
>>>>
>>>>> I suppose the problem wouldn't happen if AER interrupts were disabled?
>>>>> We already do disable them in aer_suspend(), but maybe that's not used
>>>>> in the shutdown path?
>>>>
>>>> That was my first thought, so I modified pcie_port_shutdown_service()
>>>> to disable AER interrupt.
>>>> That approach didn't work though.
>>>>
>>>>> My understanding is that .shutdown() should turn off device interrupts
>>>>> and stop DMA.  So maybe we need an aer_shutdown() that disables
>>>>> interrupts?
>>>>
>>>> Logically we should do that. However that approach doesn't solve this issue.
>>>
>>> I'm not completely clear on the semantics of the .shutdown()
>>> interface.  The doc at
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/device/driver.h?id=v6.10#n73
>>> says "@shutdown: Called at shut-down time to quiesce the device"
>>>
>>> Turning off device interrupts and DMA *would* fit within the idea of
>>> quiescing the device.  Does that also include changing the device
>>> power state?  I dunno.  The power state isn't *mentioned* in the
>>> .shutdown() context, while it *is* mentioned for .suspend().
>>
>> IMO putting a device to low power also qualifies as "quiesce the device".
>>
>>>
>>> IIUC, this patch and commit log uses "shutdown" to refer to a
>>> system-wide *poweroff*, which is a different concept despite using the
>>> same "shutdown" name.
>>
>> For ACPI based system, there are .suspend for S3/s2idle, .poweroff for
>> S4, and .shutdown for S5.
>> Unless we want to introduce a new callback for S5, I think the concept
>> is quite similar.
>>
>> For DT based system, the OS should also perform the same thing, as
>> there's no firmware to cleanup the power state.
>>
>> We can also move .shutdown to be part of pm_ops, but I don't think
>> it's necessary,
>>
>>>
>>> So should the system poweroff procedure use .suspend()?  Should it use
>>> both .shutdown() and .suspend()?  I think it only uses .shutdown()
>>> today:
>>>
>>>    kernel_power_off
>>>      kernel_shutdown_prepare(SYSTEM_POWER_OFF)
>>>        device_shutdown
>>>          while (!list_empty(&devices_kset->list))
>>>            dev->bus->shutdown(dev)
>>>              pci_device_shutdown
>>>
>>> There are several driver .shutdown() methods that do things like this:
>>>
>>>    e1000_shutdown
>>>      if (system_state == SYSTEM_POWER_OFF)
>>>        pci_set_power_state(pdev, PCI_D3hot)
>>>
>>> Maybe that's the right thing and should be done by the PCI core, which
>>> is similar to what you propose here.  But I think it muddies the
>>> definition of .shutdown() a bit by mixing in power management stuff.
>>
>> Do you think adding a new "low power state" callback to be called
>> after .shutdown a good idea?
>> That would make the concept of .shutdown different to .suspend and
>> .poweroff. I personally see .suspend, .poweroff and .shutdown the same
>> action but target different power states.
> 
> I don't mean to confuse you guys but with this one too, I wonder if you
> tried to "disable" the device instead of putting it into D3? On another
> thread (Mario at least is aware of this) I mentioned that our PCIe SV
> folks identified a couple issues in Linux implementation around power
> management and one thing that we are missing is to disable I/O and MMIO
> upon entering D3.
> 
> I know this is about entering S5 (power off) but I wonder if simply
> disabling the device (I/O, MMIO and bus mastering) could stop it from
> waking up? 

To me, it's a two-fold problem.  The device consumes too much power, and
the device issues interrupts when the system is in S5.

Putting it in D3 should nip both; disabling the device might help with
the latter.

I did the same thing a vendor did for KH: I double-checked the
waveform at S5 and could see the devices still in D0.

Or do you think that the device being in D0 but disabled should be
enough to decrease power?

> To my understanding this can be interpreted as quiesce too :)
> Something like the below patch (it also includes the runtime suspend
> path which should not matter here. This is the similar patch I shared in
> another thread).
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index f412ef73a6e4..79406814699d 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -514,11 +514,9 @@ static void pci_device_shutdown(struct device *dev)
>   	 * If this is a kexec reboot, turn off Bus Master bit on the
>   	 * device to tell it to not continue to do DMA. Don't touch
>   	 * devices in D3cold or unknown states.
> -	 * If it is not a kexec reboot, firmware will hit the PCI
> -	 * devices with big hammer and stop their DMA any way.
>   	 */
> -	if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
> -		pci_clear_master(pci_dev);
> +	if (pci_dev->current_state <= PCI_D3hot)
> +		pci_disable_device(pci_dev);
>   }
>   
>   #ifdef CONFIG_PM_SLEEP
> @@ -1332,6 +1330,7 @@ static int pci_pm_runtime_suspend(struct device *dev)
>   
>   	if (!pci_dev->state_saved) {
>   		pci_save_state(pci_dev);
> +		pci_pm_default_suspend(pci_dev);
>   		pci_finish_runtime_suspend(pci_dev);
>   	}
>   
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index ffaaca0978cb..91f4e7a03c94 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2218,6 +2218,13 @@ static void do_pci_disable_device(struct pci_dev *dev)
>   		pci_command &= ~PCI_COMMAND_MASTER;
>   		pci_write_config_word(dev, PCI_COMMAND, pci_command);
>   	}
> +	/*
> +	 * PCI PM 1.2 sec 8.2.2 says that when a function is put into D3
> +	 * the OS needs to disable I/O and MMIO space in addition to bus
> +	 * mastering so do that here.
> +	 */
> +	pci_command &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY);
> +	pci_write_config_word(dev, PCI_COMMAND, pci_command);
>   
>   	pcibios_disable_device(dev);
>   }
Mika Westerberg Sept. 15, 2024, 7:14 a.m. UTC | #15
Hi,

On Fri, Sep 13, 2024 at 03:33:33PM -0500, Mario Limonciello wrote:
> > I know this is about entering S5 (power off) but I wonder if simply
> > disabling the device (I/O, MMIO and bus mastering) could stop it from
> > waking up?
> 
> To me, it's a two-fold problem.  The device consumes too much power, and the
> device issues interrupts when system is in S5.
> 
> Putting it in D3 should nip both, disabling the device might help the
> latter.
> 
> I did the same thing a vendor did for KH where I double checked the waveform
> at S5 and could see the devices still in D0.
> 
> Or do you think that by the device being in D0 but disabled should be enough
> for decreasing power?

No, not about power but just to solve the issue here. I'm not objecting
to putting the device into D3 instead, on the grounds that if Windows
does this then it is probably safe for us to do as well, and it also
avoids possible untested paths on the firmware side.

Strictly based on ACPI spec [1] there is no such requirement though.

[1] https://uefi.org/specs/ACPI/6.5/16_Waking_and_Sleeping.html#transitioning-from-the-working-to-the-soft-off-state
Chia-Lin Kao (AceLan) Oct. 4, 2024, 4:33 a.m. UTC | #16
On Wed, Sep 11, 2024 at 02:38:20PM -0500, Mario Limonciello wrote:
> On 9/11/2024 14:16, Mario Limonciello wrote:
> > On 9/11/2024 14:05, Bjorn Helgaas wrote:
> > > On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
> > > > Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
> > > > connected.
> > > > 
> > > > The following error message can be found during shutdown:
> > > > pcieport 0000:00:1d.0: AER: Correctable error message received
> > > > from 0000:09:04.0
> > > > pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable,
> > > > type=Data Link Layer, (Receiver ID)
> > > > pcieport 0000:09:04.0:   device [8086:0b26] error
> > > > status/mask=00000080/00002000
> > > > pcieport 0000:09:04.0:    [ 7] BadDLLP
> > > > 
> > > > Calling aer_remove() during shutdown can quiesce the error message,
> > > > however the spurious wakeup still happens.
> > > > 
> > > > The issue won't happen if the device is in D3 before system shutdown, so
> > > > putting device to low power state before shutdown to solve the issue.
> > > > 
> > > > I don't have a sniffer so this is purely guesswork, however I believe
> > > > putting device to low power state it's the right thing to do.
> > > 
> > > My objection here is that we don't have an explanation of why this
> > > should matter or a pointer to any spec language about this situation,
> > > so it feels a little bit random.
> > > 
> > > I suppose the problem wouldn't happen if AER interrupts were disabled?
> > > We already do disable them in aer_suspend(), but maybe that's not used
> > > in the shutdown path?
> > > 
> > > My understanding is that .shutdown() should turn off device interrupts
> > > and stop DMA.  So maybe we need an aer_shutdown() that disables
> > > interrupts?
> > > 
> > 
> > IMO I see this commit as two problems with the same solution.
> > 
> > I don't doubt that cleaning up AER interrupts in the shutdown path would
> > help AER messages, but you really don't "want" devices to be in D0 when
> > the system is "off" because even if the system is off some rails are
> > still active and the device might still be powered.
> > 
> > A powered device could cause interrupts (IE a spurious wakeup).
> 
> It's a bit of a stretch, but ACPI 7.4.2.5 and 7.4.2.6 are the closest
> corollary to a spec I can find.
> 
> "Devices states are compatible with the current Power Resource states. In
> other words, all devices are in the D3 state when the system state is S4."
> 
> https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/07_Power_and_Performance_Mgmt/oem-supplied-system-level-control-methods.html
Hi,

I'd like to revive this thread and support this description from the
ACPI spec.

In ACPI 7.4.2.5, it states: "All devices are in the D3 state when the
system state is S4," and in ACPI 7.4.2.6, it says: "The S5 state is
similar to the S4 state except that OSPM does not save any context."

I believe this implies that devices should also be in D3 when the
system is in S5.

> 
> > 
> > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
> > > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > > ---
> > > >   drivers/pci/pci-driver.c | 8 ++++++++
> > > >   1 file changed, 8 insertions(+)
> > > > 
> > > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > > > index af2996d0d17f..4c6f66f3eb54 100644
> > > > --- a/drivers/pci/pci-driver.c
> > > > +++ b/drivers/pci/pci-driver.c
> > > > @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
> > > >       if (drv && drv->shutdown)
> > > >           drv->shutdown(pci_dev);
> > > > +    /*
> > > > +     * If driver already changed device's power state, it can mean the
> > > > +     * wakeup setting is in place, or a workaround is used.
> > > > Hence keep it
> > > > +     * as is.
> > > > +     */
> > > > +    if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
> > > > +        pci_prepare_to_sleep(pci_dev);
> > > > +
> > > >       /*
> > > >        * If this is a kexec reboot, turn off Bus Master bit on the
> > > >        * device to tell it to not continue to do DMA. Don't touch
> > > > -- 
> > > > 2.43.0
> > > > 
> > 
>
Kai-Heng Feng Oct. 4, 2024, 9:26 a.m. UTC | #17
On 2024/10/4 12:33 PM, Chia-Lin Kao (AceLan) wrote:
> On Wed, Sep 11, 2024 at 02:38:20PM -0500, Mario Limonciello wrote:
>> On 9/11/2024 14:16, Mario Limonciello wrote:
>>> On 9/11/2024 14:05, Bjorn Helgaas wrote:
>>>> On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
>>>>> Some laptops wake up after poweroff when HP Thunderbolt Dock G4 is
>>>>> connected.
>>>>>
>>>>> The following error message can be found during shutdown:
>>>>> pcieport 0000:00:1d.0: AER: Correctable error message received
>>>>> from 0000:09:04.0
>>>>> pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable,
>>>>> type=Data Link Layer, (Receiver ID)
>>>>> pcieport 0000:09:04.0:   device [8086:0b26] error
>>>>> status/mask=00000080/00002000
>>>>> pcieport 0000:09:04.0:    [ 7] BadDLLP
>>>>>
>>>>> Calling aer_remove() during shutdown can quiesce the error message,
>>>>> however the spurious wakeup still happens.
>>>>>
>>>>> The issue won't happen if the device is in D3 before system shutdown, so
>>>>> putting device to low power state before shutdown to solve the issue.
>>>>>
>>>>> I don't have a sniffer so this is purely guesswork, however I believe
>>>>> putting device to low power state it's the right thing to do.
>>>>
>>>> My objection here is that we don't have an explanation of why this
>>>> should matter or a pointer to any spec language about this situation,
>>>> so it feels a little bit random.
>>>>
>>>> I suppose the problem wouldn't happen if AER interrupts were disabled?
>>>> We already do disable them in aer_suspend(), but maybe that's not used
>>>> in the shutdown path?
>>>>
>>>> My understanding is that .shutdown() should turn off device interrupts
>>>> and stop DMA.  So maybe we need an aer_shutdown() that disables
>>>> interrupts?
>>>>
>>>
>>> IMO I see this commit as two problems with the same solution.
>>>
>>> I don't doubt that cleaning up AER interrupts in the shutdown path would
>>> help AER messages, but you really don't "want" devices to be in D0 when
>>> the system is "off" because even if the system is off some rails are
>>> still active and the device might still be powered.
>>>
>>> A powered device could cause interrupts (IE a spurious wakeup).
>>
>> It's a bit of a stretch, but ACPI 7.4.2.5 and 7.4.2.6 are the closest
>> corollary to a spec I can find.
>>
>> "Devices states are compatible with the current Power Resource states. In
>> other words, all devices are in the D3 state when the system state is S4."
>>
>> https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/07_Power_and_Performance_Mgmt/oem-supplied-system-level-control-methods.html
> Hi,
> 
> I'd like to revive this thread and support this description from the
> ACPI spec.
> 
> In ACPI 7.4.2.5, it states: "All devices are in the D3 state when the
> system state is S4," and in ACPI 7.4.2.6, it says: "The S5 state is
> similar to the S4 state except that OSPM does not save any context."
> 
> I believe this implies that devices should also be in D3 when the
> system is in S5.

Sorry for the belated response.

I think AceLan found a better explanation of why this patch is needed.

I can resend the patch with a modified commit message if Bjorn agrees.

Kai-Heng

> 
>>
>>>
>>>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219036
>>>>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>>>>> ---
>>>>>    drivers/pci/pci-driver.c | 8 ++++++++
>>>>>    1 file changed, 8 insertions(+)
>>>>>
>>>>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>>>>> index af2996d0d17f..4c6f66f3eb54 100644
>>>>> --- a/drivers/pci/pci-driver.c
>>>>> +++ b/drivers/pci/pci-driver.c
>>>>> @@ -510,6 +510,14 @@ static void pci_device_shutdown(struct device *dev)
>>>>>        if (drv && drv->shutdown)
>>>>>            drv->shutdown(pci_dev);
>>>>> +    /*
>>>>> +     * If driver already changed device's power state, it can mean the
>>>>> +     * wakeup setting is in place, or a workaround is used. Hence keep it
>>>>> +     * as is.
>>>>> +     */
>>>>> +    if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
>>>>> +        pci_prepare_to_sleep(pci_dev);
>>>>> +
>>>>>        /*
>>>>>         * If this is a kexec reboot, turn off Bus Master bit on the
>>>>>         * device to tell it to not continue to do DMA. Don't touch
>>>>> -- 
>>>>> 2.43.0
>>>>>
>>>
>>
Bjorn Helgaas Oct. 9, 2024, 10:24 p.m. UTC | #18
On Fri, Sep 13, 2024 at 11:01:23AM +0300, Mika Westerberg wrote:
> On Fri, Sep 13, 2024 at 02:00:58PM +0800, Kai-Heng Feng wrote:
> > On Fri, Sep 13, 2024 at 12:57 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Thu, Sep 12, 2024 at 11:00:43AM +0800, Kai-Heng Feng wrote:
> > > > On Thu, Sep 12, 2024 at 3:05 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
> > > > > > Some laptops wake up after poweroff when HP Thunderbolt
> > > > > > Dock G4 is connected.
> > > > > >
> > > > > > The following error message can be found during shutdown:
> > > > > > pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
> > > > > > pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> > > > > > pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> > > > > > pcieport 0000:09:04.0:    [ 7] BadDLLP
> > > > > >
> > > > > > Calling aer_remove() during shutdown can quiesce the error
> > > > > > message, however the spurious wakeup still happens.
> > > > > >
> > > > > > The issue won't happen if the device is in D3 before
> > > > > > system shutdown, so putting device to low power state
> > > > > > before shutdown to solve the issue.
> > > > > >
> > > > > > I don't have a sniffer so this is purely guesswork,
> > > > > > however I believe putting device to low power state it's
> > > > > > the right thing to do.
> > > > >
> > > > > My objection here is that we don't have an explanation of
> > > > > why this should matter or a pointer to any spec language
> > > > > about this situation, so it feels a little bit random.
> ...

> I don't mean to confuse you guys but with this one too, I wonder if you
> tried to "disable" the device instead of putting it into D3? On another
> thread (Mario at least is aware of this) I mentioned that our PCIe SV
> folks identified a couple issues in Linux implementation around power
> management and one thing that we are missing is to disable I/O and MMIO
> upon entering D3.
> ...

This is really interesting -- did they discover a functional problem,
or did they just notice that we don't follow the PCI PM spec?

> +++ b/drivers/pci/pci.c
> @@ -2218,6 +2218,13 @@ static void do_pci_disable_device(struct pci_dev *dev)
>  		pci_command &= ~PCI_COMMAND_MASTER;
>  		pci_write_config_word(dev, PCI_COMMAND, pci_command);
>  	}
> +	/*
> +	 * PCI PM 1.2 sec 8.2.2 says that when a function is put into D3
> +	 * the OS needs to disable I/O and MMIO space in addition to bus
> +	 * mastering so do that here.
> +	 */
> +	pci_command &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY);
> +	pci_write_config_word(dev, PCI_COMMAND, pci_command);
>  
>  	pcibios_disable_device(dev);
>  }

This do_pci_disable_device() proposal is interesting.

pci_enable_device() turns on PCI_COMMAND_MEMORY and PCI_COMMAND_IO,
which enables the device to respond to MMIO and I/O port accesses to
its BARs from the driver.  It also makes sure the device is in D0,
because BAR access only works in D0.

pci_set_master() turns on PCI_COMMAND_MASTER, which enables the device
to perform DMA (including generating MSIs).

pci_disable_device() *sounds* like it should be the opposite of
pci_enable_device(), but it's currently basically the same as
pci_clear_master(), which clears PCI_COMMAND_MASTER to prevent DMA.
I didn't know about this text in 8.2.2, and I wish I knew why we
don't currently clear PCI_COMMAND_MEMORY and PCI_COMMAND_IO.

If we want to pursue this, I think it should be split to its own patch
and moved out of pci_disable_device() because I don't think this path
necessarily implies putting the device in D3, so I think it would fit
better with the spec if we cleared PCI_COMMAND_MEMORY and
PCI_COMMAND_IO in a path that explicitly does put it in D3.

I think there's a significant chance of breaking something because
drivers are currently able to access BARs after pci_disable_device(),
and there are a LOT of callers.  But if there's a problem it would
fix, we should definitely explore it.

Bjorn
Mika Westerberg Oct. 10, 2024, 4:52 a.m. UTC | #19
On Wed, Oct 09, 2024 at 05:24:03PM -0500, Bjorn Helgaas wrote:
> On Fri, Sep 13, 2024 at 11:01:23AM +0300, Mika Westerberg wrote:
> > On Fri, Sep 13, 2024 at 02:00:58PM +0800, Kai-Heng Feng wrote:
> > > On Fri, Sep 13, 2024 at 12:57 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Thu, Sep 12, 2024 at 11:00:43AM +0800, Kai-Heng Feng wrote:
> > > > > On Thu, Sep 12, 2024 at 3:05 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > > On Fri, Jul 12, 2024 at 02:24:11PM +0800, Kai-Heng Feng wrote:
> > > > > > > Some laptops wake up after poweroff when HP Thunderbolt
> > > > > > > Dock G4 is connected.
> > > > > > >
> > > > > > > The following error message can be found during shutdown:
> > > > > > > pcieport 0000:00:1d.0: AER: Correctable error message received from 0000:09:04.0
> > > > > > > pcieport 0000:09:04.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> > > > > > > pcieport 0000:09:04.0:   device [8086:0b26] error status/mask=00000080/00002000
> > > > > > > pcieport 0000:09:04.0:    [ 7] BadDLLP
> > > > > > >
> > > > > > > Calling aer_remove() during shutdown can quiesce the error
> > > > > > > message, however the spurious wakeup still happens.
> > > > > > >
> > > > > > > The issue won't happen if the device is in D3 before
> > > > > > > system shutdown, so putting device to low power state
> > > > > > > before shutdown to solve the issue.
> > > > > > >
> > > > > > > I don't have a sniffer so this is purely guesswork,
> > > > > > > however I believe putting device to low power state it's
> > > > > > > the right thing to do.
> > > > > >
> > > > > > My objection here is that we don't have an explanation of
> > > > > > why this should matter or a pointer to any spec language
> > > > > > about this situation, so it feels a little bit random.
> > ...
> 
> > I don't mean to confuse you guys but with this one too, I wonder if you
> > tried to "disable" the device instead of putting it into D3? On another
> > thread (Mario at least is aware of this) I mentioned that our PCIe SV
> > folks identified a couple issues in Linux implementation around power
> > management and one thing that we are missing is to disable I/O and MMIO
> > upon entering D3.
> > ...
> 
> This is really interesting -- did they discover a functional problem,
> or did they just notice that we don't follow the PCI PM spec?

The latter.

> > +++ b/drivers/pci/pci.c
> > @@ -2218,6 +2218,13 @@ static void do_pci_disable_device(struct pci_dev *dev)
> >  		pci_command &= ~PCI_COMMAND_MASTER;
> >  		pci_write_config_word(dev, PCI_COMMAND, pci_command);
> >  	}
> > +	/*
> > +	 * PCI PM 1.2 sec 8.2.2 says that when a function is put into D3
> > +	 * the OS needs to disable I/O and MMIO space in addition to bus
> > +	 * mastering so do that here.
> > +	 */
> > +	pci_command &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY);
> > +	pci_write_config_word(dev, PCI_COMMAND, pci_command);
> >  
> >  	pcibios_disable_device(dev);
> >  }
> 
> This do_pci_disable_device() proposal is interesting.
> 
> pci_enable_device() turns on PCI_COMMAND_MEMORY and PCI_COMMAND_IO,
> which enables the device to respond to MMIO and I/O port accesses to
> its BARs from the driver.  It also makes sure the device is in D0,
> because BAR access only works in D0.
> 
> pci_set_master() turns on PCI_COMMAND_MASTER, which enables the device
> to perform DMA (including generating MSIs).
> 
> pci_disable_device() *sounds* like it should be the opposite of
> pci_enable_device(), but it's currently basically the same as
> pci_clear_master(), which clears PCI_COMMAND_MASTER to prevent DMA.
> I didn't know about this text in 8.2.2, and I wish I knew why we
> don't currently clear PCI_COMMAND_MEMORY and PCI_COMMAND_IO.
> 
> If we want to pursue this, I think it should be split to its own patch
> and moved out of pci_disable_device() because I don't think this path
> necessarily implies putting the device in D3, so I think it would fit
> better with the spec if we cleared PCI_COMMAND_MEMORY and
> PCI_COMMAND_IO in a path that explicitly does put it in D3.
> 
> I think there's a significant chance of breaking something because
> drivers are currently able to access BARs after pci_disable_device(),
> and there are a LOT of callers.  But if there's a problem it would
> fix, we should definitely explore it.

At the moment it does not seem to fix anything as far as I can tell, so
I'm not sure how important it is. Of course, from a spec perspective we
should probably deal with it.

Patch

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index af2996d0d17f..4c6f66f3eb54 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -510,6 +510,14 @@  static void pci_device_shutdown(struct device *dev)
 	if (drv && drv->shutdown)
 		drv->shutdown(pci_dev);
 
+	/*
+	 * If driver already changed device's power state, it can mean the
+	 * wakeup setting is in place, or a workaround is used. Hence keep it
+	 * as is.
+	 */
+	if (!kexec_in_progress && pci_dev->current_state == PCI_D0)
+		pci_prepare_to_sleep(pci_dev);
+
 	/*
 	 * If this is a kexec reboot, turn off Bus Master bit on the
 	 * device to tell it to not continue to do DMA. Don't touch