diff mbox series

PCI/ACPI: Decouple the negotiation of ASPM and other PCIe services

Message ID 20220407131602.14727-1-yangyicong@hisilicon.com (mailing list archive)
State Changes Requested
Headers show
Series PCI/ACPI: Decouple the negotiation of ASPM and other PCIe services | expand

Commit Message

Yicong Yang April 7, 2022, 1:16 p.m. UTC
Currently we regard ASPM as a necessary PCIe service and if it's disabled
by pcie_aspm=off we cannot enable other services like AER and hotplug.
However the ASPM is just one of the PCIe services and other services
mentioned no dependency on this. So this patch decouples the negotiation
of ASPM and other PCIe services, then we can make use of other services
in the absence of ASPM.

Aaron Sierra tried to fix this originally:
https://lore.kernel.org/linux-pci/20190702201318.GC128603@google.com/

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 drivers/acpi/pci_root.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Bjorn Helgaas April 7, 2022, 3:42 p.m. UTC | #1
[+cc Rafael]

On Thu, Apr 07, 2022 at 09:16:02PM +0800, Yicong Yang wrote:
> Currently we regard ASPM as a necessary PCIe service and if it's disabled
> by pcie_aspm=off we cannot enable other services like AER and hotplug.
> However the ASPM is just one of the PCIe services and other services
> mentioned no dependency on this. So this patch decouples the negotiation
> of ASPM and other PCIe services, then we can make use of other services
> in the absence of ASPM.

Why do you want to boot with "pcie_aspm=off"?  If we have to use a
PCI-related parameter to boot, something is already wrong, so if
there's a problem that requires ASPM to be disabled, we should fix
that first.

If there's a known hardware or firmware issue with ASPM, we should
quirk it so users don't have to discover this parameter.

> Aaron Sierra tried to fix this originally:
> https://lore.kernel.org/linux-pci/20190702201318.GC128603@google.com/

Yes.  My question from that review is still open:

  But Rafael added ACPI_PCIE_REQ_SUPPORT with 415e12b23792 ("PCI/ACPI:
  Request _OSC control once for each root bridge (v3)") [1], apparently
  related to a bug [2].  I assume there was some reason for requiring
  all those things together, so I'd really like his comments.

  [1] https://git.kernel.org/linus/415e12b23792
  [2] https://bugzilla.kernel.org/show_bug.cgi?id=20232

Rafael clearly said in [1] that we need to:

  ... check if all of the requisite _OSC support bits are set before
  calling acpi_pci_osc_control_set() for a given root complex.

We would still need to explain why Rafael thought all these _OSC
support bits were required, but now they're not.

_OSC does not negotiate directly for control of ASPM (though of course
it *does* negotiate for control of the PCIe Capability, which contains
the ASPM control bits), but the PCI Firmware spec, r3.3, sec 4.5.3, has
this comment in a sample _OSC implementation:

  // Only allow native hot plug control if the OS supports:
  // * ASPM
  // * Clock PM
  // * MSI/MSI-X

which matches the current ACPI_PCIE_REQ_SUPPORT.

So I think I would approach this by reworking the _OSC negotiation so
we always advertise "OSC_PCI_ASPM_SUPPORT | OSC_PCI_CLOCK_PM_SUPPORT"
if CONFIG_PCIEASPM=y.

Advertising support for ASPM doesn't mean Linux has to actually
*enable* it, so we could make a different mechanism to prevent use of
ASPM if we have a device or platform quirk or we're booting with
"pcie_aspm=off".

> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  drivers/acpi/pci_root.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index 6f9e75d14808..16fa7c5a11ad 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -37,8 +37,6 @@ static int acpi_pci_root_scan_dependent(struct acpi_device *adev)
>  }
>  
>  #define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \
> -				| OSC_PCI_ASPM_SUPPORT \
> -				| OSC_PCI_CLOCK_PM_SUPPORT \
>  				| OSC_PCI_MSI_SUPPORT)
>  
>  static const struct acpi_device_id root_device_ids[] = {
> -- 
> 2.24.0
>
Rafael J. Wysocki April 7, 2022, 4:41 p.m. UTC | #2
On Thu, Apr 7, 2022 at 5:43 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Rafael]
>
> On Thu, Apr 07, 2022 at 09:16:02PM +0800, Yicong Yang wrote:
> > Currently we regard ASPM as a necessary PCIe service and if it's disabled
> > by pcie_aspm=off we cannot enable other services like AER and hotplug.
> > However the ASPM is just one of the PCIe services and other services
> > mentioned no dependency on this. So this patch decouples the negotiation
> > of ASPM and other PCIe services, then we can make use of other services
> > in the absence of ASPM.
>
> Why do you want to boot with "pcie_aspm=off"?  If we have to use a
> PCI-related parameter to boot, something is already wrong, so if
> there's a problem that requires ASPM to be disabled, we should fix
> that first.
>
> If there's a known hardware or firmware issue with ASPM, we should
> quirk it so users don't have to discover this parameter.
>
> > Aaron Sierra tried to fix this originally:
> > https://lore.kernel.org/linux-pci/20190702201318.GC128603@google.com/
>
> Yes.  My question from that review is still open:
>
>   But Rafael added ACPI_PCIE_REQ_SUPPORT with 415e12b23792 ("PCI/ACPI:
>   Request _OSC control once for each root bridge (v3)") [1], apparently
>   related to a bug [2].  I assume there was some reason for requiring
>   all those things together, so I'd really like his comments.

Well, it was quite a few years ago.

>   [1] https://git.kernel.org/linus/415e12b23792
>   [2] https://bugzilla.kernel.org/show_bug.cgi?id=20232
>
> Rafael clearly said in [1] that we need to:
>
>   ... check if all of the requisite _OSC support bits are set before
>   calling acpi_pci_osc_control_set() for a given root complex.

IIRC, the idea was to avoid requesting native control of anything PCIe
if those bits were not set in the mask, because otherwise we wouldn't
be able to get PME and native hotplug control which were not
configurable at that time.  [PME is still not configurable and
potentially related to hotplug, because they may use the same MSI IRQ
in principle, but the native hotplug is configurable now anyway.]

> We would still need to explain why Rafael thought all these _OSC
> support bits were required, but now they're not.
>
> _OSC does not negotiate directly for control of ASPM (though of course
> it *does* negotiate for control of the PCIe Capability, which contains
> the ASPM control bits), but the PCI Firmware spec, r3.3, sec 4.5.3, has
> this comment in a sample _OSC implementation:
>
>   // Only allow native hot plug control if the OS supports:
>   // * ASPM
>   // * Clock PM
>   // * MSI/MSI-X
>
> which matches the current ACPI_PCIE_REQ_SUPPORT.
>
> So I think I would approach this by reworking the _OSC negotiation so
> we always advertise "OSC_PCI_ASPM_SUPPORT | OSC_PCI_CLOCK_PM_SUPPORT"
> if CONFIG_PCIEASPM=y.

That'd be reasonable IMO.

> Advertising support for ASPM doesn't mean Linux has to actually
> *enable* it, so we could make a different mechanism to prevent use of
> ASPM if we have a device or platform quirk or we're booting with
> "pcie_aspm=off".

Right.

Note that if we don't request the native control of a PCIe feature,
this basically gives the BIOS a licence to scribble on the related
device registers and some of the features are not independent, so we
may need to advertise support for two features in order to get control
of just one of them.
Yicong Yang April 11, 2022, 9:30 a.m. UTC | #3
Hi Bjorn, Rafael,

On 2022/4/8 0:41, Rafael J. Wysocki wrote:
> On Thu, Apr 7, 2022 at 5:43 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>
>> [+cc Rafael]
>>
>> On Thu, Apr 07, 2022 at 09:16:02PM +0800, Yicong Yang wrote:
>>> Currently we regard ASPM as a necessary PCIe service and if it's disabled
>>> by pcie_aspm=off we cannot enable other services like AER and hotplug.
>>> However the ASPM is just one of the PCIe services and other services
>>> mentioned no dependency on this. So this patch decouples the negotiation
>>> of ASPM and other PCIe services, then we can make use of other services
>>> in the absence of ASPM.
>>
>> Why do you want to boot with "pcie_aspm=off"?  If we have to use a
>> PCI-related parameter to boot, something is already wrong, so if
>> there's a problem that requires ASPM to be disabled, we should fix
>> that first.
>>

We found this when testing the functions of AER and hotplug. The pcie_aspm=off
is added by the tester who found it affect the function and it makes us think
that it maybe not reasonable to couple these 3 services together.

>> If there's a known hardware or firmware issue with ASPM, we should
>> quirk it so users don't have to discover this parameter.
>>
>>> Aaron Sierra tried to fix this originally:
>>> https://lore.kernel.org/linux-pci/20190702201318.GC128603@google.com/
>>
>> Yes.  My question from that review is still open:
>>
>>   But Rafael added ACPI_PCIE_REQ_SUPPORT with 415e12b23792 ("PCI/ACPI:
>>   Request _OSC control once for each root bridge (v3)") [1], apparently
>>   related to a bug [2].  I assume there was some reason for requiring
>>   all those things together, so I'd really like his comments.
> 
> Well, it was quite a few years ago.
> 
>>   [1] https://git.kernel.org/linus/415e12b23792
>>   [2] https://bugzilla.kernel.org/show_bug.cgi?id=20232
>>
>> Rafael clearly said in [1] that we need to:
>>
>>   ... check if all of the requisite _OSC support bits are set before
>>   calling acpi_pci_osc_control_set() for a given root complex.
> 
> IIRC, the idea was to avoid requesting native control of anything PCIe
> if those bits were not set in the mask, because otherwise we wouldn't
> be able to get PME and native hotplug control which were not
> configurable at that time.  [PME is still not configurable and
> potentially related to hotplug, because they may use the same MSI IRQ
> in principle, but the native hotplug is configurable now anyway.]
> 

I'm a bit confused about the 'configurable' here, is it only about PME
and hotplug share the same interrupt? Currently the PME and hotplug
interrupt is allocated by the pcie port driver and PME can get the irq
if the interrupt is allocated successfully.

>> We would still need to explain why Rafael thought all these _OSC
>> support bits were required, but now they're not.
>>
>> _OSC does not negotiate directly for control of ASPM (though of course
>> it *does* negotiate for control of the PCIe Capability, which contains
>> the ASPM control bits), but the PCI Firmware spec, r3.3, sec 4.5.3, has
>> this comment in a sample _OSC implementation:
>>
>>   // Only allow native hot plug control if the OS supports:
>>   // * ASPM
>>   // * Clock PM
>>   // * MSI/MSI-X
>>
>> which matches the current ACPI_PCIE_REQ_SUPPORT.
>>

thanks for the reference. So it indicates that native hotplug depends
on ASPM? But maybe not AER or PME, as commented following above sample
in the spec:

// Always allow native PME, AER (no dependencies)

So the AER should work on pcie_aspm=off, does it make sense?

>> So I think I would approach this by reworking the _OSC negotiation so
>> we always advertise "OSC_PCI_ASPM_SUPPORT | OSC_PCI_CLOCK_PM_SUPPORT"
>> if CONFIG_PCIEASPM=y.
> 
> That'd be reasonable IMO.
> 
>> Advertising support for ASPM doesn't mean Linux has to actually
>> *enable* it, so we could make a different mechanism to prevent use of
>> ASPM if we have a device or platform quirk or we're booting with
>> "pcie_aspm=off".
> 

I agree on this and I think this approach can resolve the condition here.
If os got the ASPM control but user sepcified pcie_aspm=off, I think we
can stop the ASPM link configuring to avoid enable the ASPM function.

> Right.
> 
> Note that if we don't request the native control of a PCIe feature,
> this basically gives the BIOS a licence to scribble on the related
> device registers and some of the features are not independent, so we
> may need to advertise support for two features in order to get control
> of just one of them.
> .
> 

ok. thanks for the note.

Regards,
Yicong
diff mbox series

Patch

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 6f9e75d14808..16fa7c5a11ad 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -37,8 +37,6 @@  static int acpi_pci_root_scan_dependent(struct acpi_device *adev)
 }
 
 #define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \
-				| OSC_PCI_ASPM_SUPPORT \
-				| OSC_PCI_CLOCK_PM_SUPPORT \
 				| OSC_PCI_MSI_SUPPORT)
 
 static const struct acpi_device_id root_device_ids[] = {