diff mbox

PCI: Add information about describing PCI in ACPI

Message ID 20161117175938.17465.45820.stgit@bhelgaas-glaptop.roam.corp.google.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Bjorn Helgaas Nov. 17, 2016, 5:59 p.m. UTC
Add a writeup about how PCI host bridges should be described in ACPI
using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 Documentation/PCI/00-INDEX      |    2 +
 Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 138 insertions(+)
 create mode 100644 Documentation/PCI/acpi-info.txt


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Gabriele Paoloni Nov. 18, 2016, 5:17 p.m. UTC | #1
Hi Bjorn

Many thanks for putting this together, it really helps!

One thing below..

> -----Original Message-----

> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-

> owner@vger.kernel.org] On Behalf Of Bjorn Helgaas

> Sent: 17 November 2016 18:00

> To: linux-pci@vger.kernel.org

> Cc: linux-acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-

> arm-kernel@lists.infradead.org; linaro-acpi@lists.linaro.org

> Subject: [PATCH] PCI: Add information about describing PCI in ACPI

> 

> Add a writeup about how PCI host bridges should be described in ACPI

> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.

> 

> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

> ---

>  Documentation/PCI/00-INDEX      |    2 +

>  Documentation/PCI/acpi-info.txt |  136

> +++++++++++++++++++++++++++++++++++++++

>  2 files changed, 138 insertions(+)

>  create mode 100644 Documentation/PCI/acpi-info.txt

> 

> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX

> index 147231f..0780280 100644

> --- a/Documentation/PCI/00-INDEX

> +++ b/Documentation/PCI/00-INDEX

> @@ -1,5 +1,7 @@

>  00-INDEX

>  	- this file

> +acpi-info.txt

> +	- info on how PCI host bridges are represented in ACPI

>  MSI-HOWTO.txt

>  	- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and

> FAQ.

>  PCIEBUS-HOWTO.txt

> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-

> info.txt

> new file mode 100644

> index 0000000..ccbcfda

> --- /dev/null

> +++ b/Documentation/PCI/acpi-info.txt

> @@ -0,0 +1,136 @@

> +	    ACPI considerations for PCI host bridges

> +

> +The basic requirement is that the ACPI namespace should describe

> +*everything* that consumes address space unless there's another

> +standard way for the OS to find it [1, 2].  For example, windows that

> +are forwarded to PCI by a PCI host bridge should be described via ACPI

> +devices, since the OS can't locate the host bridge by itself.  PCI

> +devices *below* the host bridge do not need to be described via ACPI,

> +because the resources they consume are inside the host bridge windows,

> +and the OS can discover them via the standard PCI enumeration

> +mechanism (using config accesses to read and size the BARs).

> +

> +This ACPI resource description is done via _CRS methods of devices in

> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:

> +the OS can read _CRS and figure out what resource is being consumed

> +even if it doesn't have a driver for the device [3].  That's important

> +because it means an old OS can work correctly even on a system with

> +new devices unknown to the OS.  The new devices won't do anything, but

> +the OS can at least make sure no resources conflict with them.

> +

> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for

> +reserving address space!  The static tables are for things the OS

> +needs to know early in boot, before it can parse the ACPI namespace.

> +If a new table is defined, an old OS needs to operate correctly even

> +though it ignores the table.  _CRS allows that because it is generic

> +and understood by the old OS; a static table does not.


Right so if my understanding is correct you are saying that resources
described in the MCFG table should also be declared in PNP0C02 devices
so that the PNP driver can reserve these resources.

On the other side the PCI Root bridge driver should not reserve such
resources.

Well if my understanding is correct I think we have a problem here:
http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74

As you can see pci_ecam_create() will conflict with the pnp driver
as it will try to reserve the resources from the MCFG table...

Maybe we need to rework pci_ecam_create() ?

Thanks

Gab

> +

> +If the OS is expected to manage an ACPI device, that device will have

> +a specific _HID/_CID that tells the OS what driver to bind to it, and

> +the _CRS tells the OS and the driver where the device's registers are.

> +

> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no

> +programming model for them other than "don't use these resources for

> +anything else."  So any address space that is (1) not claimed by some

> +other ACPI device and (2) should not be assigned by the OS to

> +something else, should be claimed by a PNP0C02 _CRS method.

> +

> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should

> +describe all the address space they consume.  In principle, this would

> +be all the windows they forward down to the PCI bus, as well as the

> +bridge registers themselves.  The bridge registers include things like

> +secondary/subordinate bus registers that determine the bus range below

> +the bridge, window registers that describe the apertures, etc.  These

> +are all device-specific, non-architected things, so the only way a

> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which

> +contain the device-specific details.  These bridge registers also

> +include ECAM space, since it is consumed by the bridge.

> +

> +ACPI defined a Producer/Consumer bit that was intended to distinguish

> +the bridge apertures from the bridge registers [4, 5].  However,

> +BIOSes didn't use that bit correctly, and the result is that OSes have

> +to assume that everything in a PCI host bridge _CRS is a window.  That

> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08

> +device itself.

> +

> +The workaround is to describe the bridge registers (including ECAM

> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,

> +the bridge register space is device-specific anyway, so the generic

> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For

> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA

> +method.

> +

> +Note that the PCIe spec actually does require ECAM unless there's a

> +standard firmware interface for config access, e.g., the ia64 SAL

> +interface [7].  One reason is that we want a generic host bridge

> +driver (pci_root.c), and a generic driver requires a generic way to

> +access config space.

> +

> +

> +[1] ACPI 6.0, sec 6.1:

> +    For any device that is on a non-enumerable type of bus (for

> +    example, an ISA bus), OSPM enumerates the devices' identifier(s)

> +    and the ACPI system firmware must supply an _HID object ... for

> +    each device to enable OSPM to do that.

> +

> +[2] ACPI 6.0, sec 3.7:

> +    The OS enumerates motherboard devices simply by reading through

> +    the ACPI Namespace looking for devices with hardware IDs.

> +

> +    Each device enumerated by ACPI includes ACPI-defined objects in

> +    the ACPI Namespace that report the hardware resources the device

> +    could occupy [_PRS], an object that reports the resources that are

> +    currently used by the device [_CRS], and objects for configuring

> +    those resources [_SRS].  The information is used by the Plug and

> +    Play OS (OSPM) to configure the devices.

> +

> +[3] ACPI 6.0, sec 6.2:

> +    OSPM uses device configuration objects to configure hardware

> +    resources for devices enumerated via ACPI.  Device configuration

> +    objects provide information about current and possible resource

> +    requirements, the relationship between shared resources, and

> +    methods for configuring hardware resources.

> +

> +    When OSPM enumerates a device, it calls _PRS to determine the

> +    resource requirements of the device.  It may also call _CRS to

> +    find the current resource settings for the device.  Using this

> +    information, the Plug and Play system determines what resources

> +    the device should consume and sets those resources by calling the

> +    device’s _SRS control method.

> +

> +    In ACPI, devices can consume resources (for example, legacy

> +    keyboards), provide resources (for example, a proprietary PCI

> +    bridge), or do both.  Unless otherwise specified, resources for a

> +    device are assumed to be taken from the nearest matching resource

> +    above the device in the device hierarchy.

> +

> +[4] ACPI 6.0, sec 6.4.3.5.4:

> +    Extended Address Space Descriptor

> +    General Flags: Bit [0] Consumer/Producer:

> +	1–This device consumes this resource

> +	0–This device produces and consumes this resource

> +

> +[5] ACPI 6.0, sec 19.6.43:

> +    ResourceUsage specifies whether the Memory range is consumed by

> +    this device (ResourceConsumer) or passed on to child devices

> +    (ResourceProducer).  If nothing is specified, then

> +    ResourceConsumer is assumed.

> +

> +[6] PCI Firmware 3.0, sec 4.1.2:

> +    If the operating system does not natively comprehend reserving the

> +    MMCFG region, the MMCFG region must be reserved by firmware.  The

> +    address range reported in the MCFG table or by _CBA method (see

> +    Section 4.1.3) must be reserved by declaring a motherboard

> +    resource.  For most systems, the motherboard resource would appear

> +    at the root of the ACPI namespace (under \_SB) in a node with a

> +    _HID of EISAID (PNP0C02), and the resources in this case should

> +    not be claimed in the root PCI bus’s _CRS.  The resources can

> +    optionally be returned in Int15 E820 or EFIGetMemoryMap as

> +    reserved memory but must always be reported through ACPI as a

> +    motherboard resource.

> +

> +[7] PCI Express 3.0, sec 7.2.2:

> +    For systems that are PC-compatible, or that do not implement a

> +    processor-architecture-specific firmware interface standard that

> +    allows access to the Configuration Space, the ECAM is required as

> +    defined in this section.
Bjorn Helgaas Nov. 18, 2016, 5:54 p.m. UTC | #2
On Fri, Nov 18, 2016 at 05:17:34PM +0000, Gabriele Paoloni wrote:
> > -----Original Message-----
> > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> > owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > Sent: 17 November 2016 18:00

> > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > +reserving address space!  The static tables are for things the OS
> > +needs to know early in boot, before it can parse the ACPI namespace.
> > +If a new table is defined, an old OS needs to operate correctly even
> > +though it ignores the table.  _CRS allows that because it is generic
> > +and understood by the old OS; a static table does not.
> 
> Right so if my understanding is correct you are saying that resources
> described in the MCFG table should also be declared in PNP0C02 devices
> so that the PNP driver can reserve these resources.

Yes.

> On the other side the PCI Root bridge driver should not reserve such
> resources.
> 
> Well if my understanding is correct I think we have a problem here:
> http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> 
> As you can see pci_ecam_create() will conflict with the pnp driver
> as it will try to reserve the resources from the MCFG table...
> 
> Maybe we need to rework pci_ecam_create() ?

I think it's OK as it is.

The pnp/system.c driver does try to reserve PNP0C02 resources, and it
marks them as "not busy".  That way they appear in /proc/iomem and
won't be allocated for anything else, but they can still be requested
by drivers, e.g., pci/ecam.c, which will mark them "busy".

This is analogous to what the PCI core does in pci_claim_resource().
This is really a function of the ACPI/PNP *core*, which should reserve
all _CRS resources for all devices (not just PNP0C02 devices).  But
it's done by pnp/system.c, and only for PNP0C02, because there's a
bunch of historical baggage there.

You'll also notice that in this case, things are out of order:
logically the pnp/system.c reservation should happen first, but in
fact the pci/ecam.c request happens *before* the pnp/system.c one.
That means the pnp/system.c one might fail and complain "[mem ...]
could not be reserved".

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Nov. 18, 2016, 11:02 p.m. UTC | #3
On Thu, Nov 17, 2016 at 6:59 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Looks great overall, but I have a few comments (below).

> ---
>  Documentation/PCI/00-INDEX      |    2 +
>  Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
>
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>         - this file
> +acpi-info.txt
> +       - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>         - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 0000000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +           ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in

To be painfully precise, those need not be methods.  They may be
static objects too.

> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.
> +
> +If the OS is expected to manage an ACPI device, that device will have

I'm not very keen on using the term "ACPI device" in documentation as
it is not particularly well defined.  As a rule, I prefer to talk
about "non-discoverable devices described via ACPI" or similar.

Accordingly, I'd change the above line to something like "If the OS is
expected to manage a non-discoverable device described via ACPI, that
device will have".

> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to

Here I'd say "any address space that is (1) not claimed by _CRS under
any other device object in the ACPI namespace and (2) ...".

> +something else, should be claimed by a PNP0C02 _CRS method.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gabriele Paoloni Nov. 21, 2016, 8:52 a.m. UTC | #4
Hi Bjorn

> -----Original Message-----
> From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> Sent: 18 November 2016 17:54
> To: Gabriele Paoloni
> Cc: Bjorn Helgaas; linux-pci@vger.kernel.org; linux-
> acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org; linaro-acpi@lists.linaro.org
> Subject: Re: [PATCH] PCI: Add information about describing PCI in ACPI
> 
> On Fri, Nov 18, 2016 at 05:17:34PM +0000, Gabriele Paoloni wrote:
> > > -----Original Message-----
> > > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> > > owner@vger.kernel.org] On Behalf Of Bjorn Helgaas
> > > Sent: 17 November 2016 18:00
> 
> > > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms
> for
> > > +reserving address space!  The static tables are for things the OS
> > > +needs to know early in boot, before it can parse the ACPI
> namespace.
> > > +If a new table is defined, an old OS needs to operate correctly
> even
> > > +though it ignores the table.  _CRS allows that because it is
> generic
> > > +and understood by the old OS; a static table does not.
> >
> > Right so if my understanding is correct you are saying that resources
> > described in the MCFG table should also be declared in PNP0C02
> devices
> > so that the PNP driver can reserve these resources.
> 
> Yes.
> 
> > On the other side the PCI Root bridge driver should not reserve such
> > resources.
> >
> > Well if my understanding is correct I think we have a problem here:
> > http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L74
> >
> > As you can see pci_ecam_create() will conflict with the pnp driver
> > as it will try to reserve the resources from the MCFG table...
> >
> > Maybe we need to rework pci_ecam_create() ?
> 
> I think it's OK as it is.
> 
> The pnp/system.c driver does try to reserve PNP0C02 resources, and it
> marks them as "not busy".  That way they appear in /proc/iomem and
> won't be allocated for anything else, but they can still be requested
> by drivers, e.g., pci/ecam.c, which will mark them "busy".
> 
> This is analogous to what the PCI core does in pci_claim_resource().
> This is really a function of the ACPI/PNP *core*, which should reserve
> all _CRS resources for all devices (not just PNP0C02 devices).  But
> it's done by pnp/system.c, and only for PNP0C02, because there's a
> bunch of historical baggage there.
> 
> You'll also notice that in this case, things are out of order:
> logically the pnp/system.c reservation should happen first, but in
> fact the pci/ecam.c request happens *before* the pnp/system.c one.
> That means the pnp/system.c one might fail and complain "[mem ...]
> could not be reserved".

Correct me if I am wrong...

So currently we are relying on the fact that pci_ecam_create() is called
before the pnp driver.
If the pnp driver came first we would end up in pci_ecam_create() failing
here:
http://lxr.free-electrons.com/source/drivers/pci/ecam.c#L76

I am not sure but it seems to me like a bit weak condition to rely on...
what about removing the error condition in pci_ecam_create() and logging
just a dev_info()?

Thanks

Gab


> 
> Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Nov. 21, 2016, 1:58 p.m. UTC | #5
On Sat, Nov 19, 2016 at 12:02:24AM +0100, Rafael J. Wysocki wrote:
> On Thu, Nov 17, 2016 at 6:59 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> > Add a writeup about how PCI host bridges should be described in ACPI
> > using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> >
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> 
> Looks great overall, but I have a few comments (below).

Thanks a lot for taking a look, Rafael!  I applied all your suggestions.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ard Biesheuvel Nov. 22, 2016, 10:09 a.m. UTC | #6
On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
> Add a writeup about how PCI host bridges should be described in ACPI
> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  Documentation/PCI/00-INDEX      |    2 +
>  Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 138 insertions(+)
>  create mode 100644 Documentation/PCI/acpi-info.txt
>
> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> index 147231f..0780280 100644
> --- a/Documentation/PCI/00-INDEX
> +++ b/Documentation/PCI/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>         - this file
> +acpi-info.txt
> +       - info on how PCI host bridges are represented in ACPI
>  MSI-HOWTO.txt
>         - the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
>  PCIEBUS-HOWTO.txt
> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> new file mode 100644
> index 0000000..ccbcfda
> --- /dev/null
> +++ b/Documentation/PCI/acpi-info.txt
> @@ -0,0 +1,136 @@
> +           ACPI considerations for PCI host bridges
> +
> +The basic requirement is that the ACPI namespace should describe
> +*everything* that consumes address space unless there's another
> +standard way for the OS to find it [1, 2].  For example, windows that
> +are forwarded to PCI by a PCI host bridge should be described via ACPI
> +devices, since the OS can't locate the host bridge by itself.  PCI
> +devices *below* the host bridge do not need to be described via ACPI,
> +because the resources they consume are inside the host bridge windows,
> +and the OS can discover them via the standard PCI enumeration
> +mechanism (using config accesses to read and size the BARs).
> +
> +This ACPI resource description is done via _CRS methods of devices in
> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> +the OS can read _CRS and figure out what resource is being consumed
> +even if it doesn't have a driver for the device [3].  That's important
> +because it means an old OS can work correctly even on a system with
> +new devices unknown to the OS.  The new devices won't do anything, but
> +the OS can at least make sure no resources conflict with them.
> +
> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> +reserving address space!  The static tables are for things the OS
> +needs to know early in boot, before it can parse the ACPI namespace.
> +If a new table is defined, an old OS needs to operate correctly even
> +though it ignores the table.  _CRS allows that because it is generic
> +and understood by the old OS; a static table does not.
> +
> +If the OS is expected to manage an ACPI device, that device will have
> +a specific _HID/_CID that tells the OS what driver to bind to it, and
> +the _CRS tells the OS and the driver where the device's registers are.
> +
> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no
> +programming model for them other than "don't use these resources for
> +anything else."  So any address space that is (1) not claimed by some
> +other ACPI device and (2) should not be assigned by the OS to
> +something else, should be claimed by a PNP0C02 _CRS method.
> +
> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> +describe all the address space they consume.  In principle, this would
> +be all the windows they forward down to the PCI bus, as well as the
> +bridge registers themselves.  The bridge registers include things like
> +secondary/subordinate bus registers that determine the bus range below
> +the bridge, window registers that describe the apertures, etc.  These
> +are all device-specific, non-architected things, so the only way a
> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> +contain the device-specific details.  These bridge registers also
> +include ECAM space, since it is consumed by the bridge.
> +
> +ACPI defined a Producer/Consumer bit that was intended to distinguish
> +the bridge apertures from the bridge registers [4, 5].  However,
> +BIOSes didn't use that bit correctly, and the result is that OSes have
> +to assume that everything in a PCI host bridge _CRS is a window.  That
> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> +device itself.
> +

Is that universally true? Or is it still possible to do the right
thing here on new ACPI architectures such as arm64?

> +The workaround is to describe the bridge registers (including ECAM
> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> +the bridge register space is device-specific anyway, so the generic
> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> +method.
> +
> +Note that the PCIe spec actually does require ECAM unless there's a
> +standard firmware interface for config access, e.g., the ia64 SAL
> +interface [7].  One reason is that we want a generic host bridge
> +driver (pci_root.c), and a generic driver requires a generic way to
> +access config space.
> +
> +
> +[1] ACPI 6.0, sec 6.1:
> +    For any device that is on a non-enumerable type of bus (for
> +    example, an ISA bus), OSPM enumerates the devices' identifier(s)
> +    and the ACPI system firmware must supply an _HID object ... for
> +    each device to enable OSPM to do that.
> +
> +[2] ACPI 6.0, sec 3.7:
> +    The OS enumerates motherboard devices simply by reading through
> +    the ACPI Namespace looking for devices with hardware IDs.
> +
> +    Each device enumerated by ACPI includes ACPI-defined objects in
> +    the ACPI Namespace that report the hardware resources the device
> +    could occupy [_PRS], an object that reports the resources that are
> +    currently used by the device [_CRS], and objects for configuring
> +    those resources [_SRS].  The information is used by the Plug and
> +    Play OS (OSPM) to configure the devices.
> +
> +[3] ACPI 6.0, sec 6.2:
> +    OSPM uses device configuration objects to configure hardware
> +    resources for devices enumerated via ACPI.  Device configuration
> +    objects provide information about current and possible resource
> +    requirements, the relationship between shared resources, and
> +    methods for configuring hardware resources.
> +
> +    When OSPM enumerates a device, it calls _PRS to determine the
> +    resource requirements of the device.  It may also call _CRS to
> +    find the current resource settings for the device.  Using this
> +    information, the Plug and Play system determines what resources
> +    the device should consume and sets those resources by calling the
> +    device’s _SRS control method.
> +
> +    In ACPI, devices can consume resources (for example, legacy
> +    keyboards), provide resources (for example, a proprietary PCI
> +    bridge), or do both.  Unless otherwise specified, resources for a
> +    device are assumed to be taken from the nearest matching resource
> +    above the device in the device hierarchy.
> +
> +[4] ACPI 6.0, sec 6.4.3.5.4:
> +    Extended Address Space Descriptor
> +    General Flags: Bit [0] Consumer/Producer:
> +       1–This device consumes this resource
> +       0–This device produces and consumes this resource
> +
> +[5] ACPI 6.0, sec 19.6.43:
> +    ResourceUsage specifies whether the Memory range is consumed by
> +    this device (ResourceConsumer) or passed on to child devices
> +    (ResourceProducer).  If nothing is specified, then
> +    ResourceConsumer is assumed.
> +
> +[6] PCI Firmware 3.0, sec 4.1.2:
> +    If the operating system does not natively comprehend reserving the
> +    MMCFG region, the MMCFG region must be reserved by firmware.  The
> +    address range reported in the MCFG table or by _CBA method (see
> +    Section 4.1.3) must be reserved by declaring a motherboard
> +    resource.  For most systems, the motherboard resource would appear
> +    at the root of the ACPI namespace (under \_SB) in a node with a
> +    _HID of EISAID (PNP0C02), and the resources in this case should
> +    not be claimed in the root PCI bus’s _CRS.  The resources can
> +    optionally be returned in Int15 E820 or EFIGetMemoryMap as
> +    reserved memory but must always be reported through ACPI as a
> +    motherboard resource.
> +
> +[7] PCI Express 3.0, sec 7.2.2:
> +    For systems that are PC-compatible, or that do not implement a
> +    processor-architecture-specific firmware interface standard that
> +    allows access to the Configuration Space, the ECAM is required as
> +    defined in this section.
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Nov. 23, 2016, 1:06 a.m. UTC | #7
On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:

> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> > +describe all the address space they consume.  In principle, this would
> > +be all the windows they forward down to the PCI bus, as well as the
> > +bridge registers themselves.  The bridge registers include things like
> > +secondary/subordinate bus registers that determine the bus range below
> > +the bridge, window registers that describe the apertures, etc.  These
> > +are all device-specific, non-architected things, so the only way a
> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> > +contain the device-specific details.  These bridge registers also
> > +include ECAM space, since it is consumed by the bridge.
> > +
> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> > +the bridge apertures from the bridge registers [4, 5].  However,
> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> > +device itself.
> 
> Is that universally true? Or is it still possible to do the right
> thing here on new ACPI architectures such as arm64?

That's a very good question.  I had thought that the ACPI spec had
given up on Consumer/Producer completely, but I was wrong.  In the 6.0
spec, the Consumer/Producer bit is still documented in the Extended
Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
"ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).

Linux looks at the producer_consumer bit in acpi_decode_space(), which
I think is used for all these descriptors (QWord, DWord, Word, and
Extended).  This doesn't quite follow the spec -- we probably should
ignore it except for Extended.  In any event, acpi_decode_space() sets
IORESOURCE_WINDOW for Producer descriptors, but we don't test
IORESOURCE_WINDOW in the PCI host bridge code.

x86 and ia64 supply their own pci_acpi_root_prepare_resources()
functions that call acpi_pci_probe_root_resources(), which parses _CRS
and looks at producer_consumer.  Then they do a little arch-specific
stuff on the result.

On arm64 we use acpi_pci_probe_root_resources() directly, with no
arch-specific stuff.

On all three arches, we ignore the Consumer/Producer bit, so all the
resources are treated as Producers, e.g., as bridge windows.

I think we *could* implement an arm64 version of
pci_acpi_root_prepare_resources() that would pay attention to the
Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
compliant, we would have to use Extended descriptors for all bridge
windows, even if they would fit in a DWord or QWord.

Should we do that?  I dunno.  I'd like to hear your opinion(s).

It *would* be nice to have bridge registers in the bridge _CRS.  That
would eliminate the need for looking up the HISI0081/PNP0C02 devices
to find the bridge registers.  Avoiding that lookup is only a
temporary advantage -- the next round of bridges are supposed to fully
implement ECAM, and then we won't need to know where the registers
are.

Apart from the lookup, there's still some advantage in describing the
registers in the PNP0A03 device instead of an unrelated PNP0C02
device, because it makes /proc/iomem more accurate and potentially
makes host bridge hotplug cleaner.  We would have to enhance the host
bridge driver to do the reservations currently done by pnp/system.c.

There's some value in doing it the same way as on x86, even though
that way is somewhat broken.

Whatever we decide, I think it's very important to get it figured out
ASAP because it affects the ECAM quirks that we're trying to merge in
v4.10.

> > +The workaround is to describe the bridge registers (including ECAM
> > +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
> > +the bridge register space is device-specific anyway, so the generic
> > +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
> > +ECAM, pci_root.c learns about the space from either MCFG or the _CBA
> > +method.
> > +
> > +Note that the PCIe spec actually does require ECAM unless there's a
> > +standard firmware interface for config access, e.g., the ia64 SAL
> > +interface [7].  One reason is that we want a generic host bridge
> > +driver (pci_root.c), and a generic driver requires a generic way to
> > +access config space.
> > +
> > +
> > +[1] ACPI 6.0, sec 6.1:
> > +    For any device that is on a non-enumerable type of bus (for
> > +    example, an ISA bus), OSPM enumerates the devices' identifier(s)
> > +    and the ACPI system firmware must supply an _HID object ... for
> > +    each device to enable OSPM to do that.
> > +
> > +[2] ACPI 6.0, sec 3.7:
> > +    The OS enumerates motherboard devices simply by reading through
> > +    the ACPI Namespace looking for devices with hardware IDs.
> > +
> > +    Each device enumerated by ACPI includes ACPI-defined objects in
> > +    the ACPI Namespace that report the hardware resources the device
> > +    could occupy [_PRS], an object that reports the resources that are
> > +    currently used by the device [_CRS], and objects for configuring
> > +    those resources [_SRS].  The information is used by the Plug and
> > +    Play OS (OSPM) to configure the devices.
> > +
> > +[3] ACPI 6.0, sec 6.2:
> > +    OSPM uses device configuration objects to configure hardware
> > +    resources for devices enumerated via ACPI.  Device configuration
> > +    objects provide information about current and possible resource
> > +    requirements, the relationship between shared resources, and
> > +    methods for configuring hardware resources.
> > +
> > +    When OSPM enumerates a device, it calls _PRS to determine the
> > +    resource requirements of the device.  It may also call _CRS to
> > +    find the current resource settings for the device.  Using this
> > +    information, the Plug and Play system determines what resources
> > +    the device should consume and sets those resources by calling the
> > +    device’s _SRS control method.
> > +
> > +    In ACPI, devices can consume resources (for example, legacy
> > +    keyboards), provide resources (for example, a proprietary PCI
> > +    bridge), or do both.  Unless otherwise specified, resources for a
> > +    device are assumed to be taken from the nearest matching resource
> > +    above the device in the device hierarchy.
> > +
> > +[4] ACPI 6.0, sec 6.4.3.5.4:
> > +    Extended Address Space Descriptor
> > +    General Flags: Bit [0] Consumer/Producer:
> > +       1–This device consumes this resource
> > +       0–This device produces and consumes this resource
> > +
> > +[5] ACPI 6.0, sec 19.6.43:
> > +    ResourceUsage specifies whether the Memory range is consumed by
> > +    this device (ResourceConsumer) or passed on to child devices
> > +    (ResourceProducer).  If nothing is specified, then
> > +    ResourceConsumer is assumed.
> > +
> > +[6] PCI Firmware 3.0, sec 4.1.2:
> > +    If the operating system does not natively comprehend reserving the
> > +    MMCFG region, the MMCFG region must be reserved by firmware.  The
> > +    address range reported in the MCFG table or by _CBA method (see
> > +    Section 4.1.3) must be reserved by declaring a motherboard
> > +    resource.  For most systems, the motherboard resource would appear
> > +    at the root of the ACPI namespace (under \_SB) in a node with a
> > +    _HID of EISAID (PNP0C02), and the resources in this case should
> > +    not be claimed in the root PCI bus’s _CRS.  The resources can
> > +    optionally be returned in Int15 E820 or EFIGetMemoryMap as
> > +    reserved memory but must always be reported through ACPI as a
> > +    motherboard resource.
> > +
> > +[7] PCI Express 3.0, sec 7.2.2:
> > +    For systems that are PC-compatible, or that do not implement a
> > +    processor-architecture-specific firmware interface standard that
> > +    allows access to the Configuration Space, the ECAM is required as
> > +    defined in this section.
> >
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lv Zheng Nov. 23, 2016, 3:23 a.m. UTC | #8
Hi, Bjorn

Thanks for the documentation.
It really helps!

However I have a question below.

> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Bjorn

> Helgaas

> Subject: [PATCH] PCI: Add information about describing PCI in ACPI

> 

> Add a writeup about how PCI host bridges should be described in ACPI

> using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.

> 

> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

> ---

>  Documentation/PCI/00-INDEX      |    2 +

>  Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++

>  2 files changed, 138 insertions(+)

>  create mode 100644 Documentation/PCI/acpi-info.txt

> 

> diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX

> index 147231f..0780280 100644

> --- a/Documentation/PCI/00-INDEX

> +++ b/Documentation/PCI/00-INDEX

> @@ -1,5 +1,7 @@

>  00-INDEX

>  	- this file

> +acpi-info.txt

> +	- info on how PCI host bridges are represented in ACPI

>  MSI-HOWTO.txt

>  	- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.

>  PCIEBUS-HOWTO.txt

> diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt

> new file mode 100644

> index 0000000..ccbcfda

> --- /dev/null

> +++ b/Documentation/PCI/acpi-info.txt

> @@ -0,0 +1,136 @@

> +	    ACPI considerations for PCI host bridges

> +

> +The basic requirement is that the ACPI namespace should describe

> +*everything* that consumes address space unless there's another

> +standard way for the OS to find it [1, 2].  For example, windows that

> +are forwarded to PCI by a PCI host bridge should be described via ACPI

> +devices, since the OS can't locate the host bridge by itself.  PCI

> +devices *below* the host bridge do not need to be described via ACPI,

> +because the resources they consume are inside the host bridge windows,

> +and the OS can discover them via the standard PCI enumeration

> +mechanism (using config accesses to read and size the BARs).

> +

> +This ACPI resource description is done via _CRS methods of devices in

> +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:

> +the OS can read _CRS and figure out what resource is being consumed

> +even if it doesn't have a driver for the device [3].  That's important

> +because it means an old OS can work correctly even on a system with

> +new devices unknown to the OS.  The new devices won't do anything, but

> +the OS can at least make sure no resources conflict with them.

> +

> +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for

> +reserving address space!  The static tables are for things the OS

> +needs to know early in boot, before it can parse the ACPI namespace.

> +If a new table is defined, an old OS needs to operate correctly even

> +though it ignores the table.  _CRS allows that because it is generic

> +and understood by the old OS; a static table does not.


The entire document doesn't talk about the details of _CBA.
There is only one line below mentioned _CBA as an example.

> +

> +If the OS is expected to manage an ACPI device, that device will have

> +a specific _HID/_CID that tells the OS what driver to bind to it, and

> +the _CRS tells the OS and the driver where the device's registers are.

> +

> +PNP0C02 "motherboard" devices are basically a catch-all.  There's no

> +programming model for them other than "don't use these resources for

> +anything else."  So any address space that is (1) not claimed by some

> +other ACPI device and (2) should not be assigned by the OS to

> +something else, should be claimed by a PNP0C02 _CRS method.

> +

> +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should

> +describe all the address space they consume.  In principle, this would

> +be all the windows they forward down to the PCI bus, as well as the

> +bridge registers themselves.  The bridge registers include things like

> +secondary/subordinate bus registers that determine the bus range below

> +the bridge, window registers that describe the apertures, etc.  These

> +are all device-specific, non-architected things, so the only way a

> +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which

> +contain the device-specific details.  These bridge registers also

> +include ECAM space, since it is consumed by the bridge.

> +

> +ACPI defined a Producer/Consumer bit that was intended to distinguish

> +the bridge apertures from the bridge registers [4, 5].  However,

> +BIOSes didn't use that bit correctly, and the result is that OSes have

> +to assume that everything in a PCI host bridge _CRS is a window.  That

> +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08

> +device itself.

> +

> +The workaround is to describe the bridge registers (including ECAM

> +space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,

> +the bridge register space is device-specific anyway, so the generic

> +PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For

> +ECAM, pci_root.c learns about the space from either MCFG or the _CBA

> +method.


Should the relationship of MCFG and _CBA be covered in this document?

Thanks and best regards
Lv

> +

> +Note that the PCIe spec actually does require ECAM unless there's a

> +standard firmware interface for config access, e.g., the ia64 SAL

> +interface [7].  One reason is that we want a generic host bridge

> +driver (pci_root.c), and a generic driver requires a generic way to

> +access config space.

> +

> +

> +[1] ACPI 6.0, sec 6.1:

> +    For any device that is on a non-enumerable type of bus (for

> +    example, an ISA bus), OSPM enumerates the devices' identifier(s)

> +    and the ACPI system firmware must supply an _HID object ... for

> +    each device to enable OSPM to do that.

> +

> +[2] ACPI 6.0, sec 3.7:

> +    The OS enumerates motherboard devices simply by reading through

> +    the ACPI Namespace looking for devices with hardware IDs.

> +

> +    Each device enumerated by ACPI includes ACPI-defined objects in

> +    the ACPI Namespace that report the hardware resources the device

> +    could occupy [_PRS], an object that reports the resources that are

> +    currently used by the device [_CRS], and objects for configuring

> +    those resources [_SRS].  The information is used by the Plug and

> +    Play OS (OSPM) to configure the devices.

> +

> +[3] ACPI 6.0, sec 6.2:

> +    OSPM uses device configuration objects to configure hardware

> +    resources for devices enumerated via ACPI.  Device configuration

> +    objects provide information about current and possible resource

> +    requirements, the relationship between shared resources, and

> +    methods for configuring hardware resources.

> +

> +    When OSPM enumerates a device, it calls _PRS to determine the

> +    resource requirements of the device.  It may also call _CRS to

> +    find the current resource settings for the device.  Using this

> +    information, the Plug and Play system determines what resources

> +    the device should consume and sets those resources by calling the

> +    device’s _SRS control method.

> +

> +    In ACPI, devices can consume resources (for example, legacy

> +    keyboards), provide resources (for example, a proprietary PCI

> +    bridge), or do both.  Unless otherwise specified, resources for a

> +    device are assumed to be taken from the nearest matching resource

> +    above the device in the device hierarchy.

> +

> +[4] ACPI 6.0, sec 6.4.3.5.4:

> +    Extended Address Space Descriptor

> +    General Flags: Bit [0] Consumer/Producer:

> +	1–This device consumes this resource

> +	0–This device produces and consumes this resource

> +

> +[5] ACPI 6.0, sec 19.6.43:

> +    ResourceUsage specifies whether the Memory range is consumed by

> +    this device (ResourceConsumer) or passed on to child devices

> +    (ResourceProducer).  If nothing is specified, then

> +    ResourceConsumer is assumed.

> +

> +[6] PCI Firmware 3.0, sec 4.1.2:

> +    If the operating system does not natively comprehend reserving the

> +    MMCFG region, the MMCFG region must be reserved by firmware.  The

> +    address range reported in the MCFG table or by _CBA method (see

> +    Section 4.1.3) must be reserved by declaring a motherboard

> +    resource.  For most systems, the motherboard resource would appear

> +    at the root of the ACPI namespace (under \_SB) in a node with a

> +    _HID of EISAID (PNP0C02), and the resources in this case should

> +    not be claimed in the root PCI bus’s _CRS.  The resources can

> +    optionally be returned in Int15 E820 or EFIGetMemoryMap as

> +    reserved memory but must always be reported through ACPI as a

> +    motherboard resource.

> +

> +[7] PCI Express 3.0, sec 7.2.2:

> +    For systems that are PC-compatible, or that do not implement a

> +    processor-architecture-specific firmware interface standard that

> +    allows access to the Configuration Space, the ECAM is required as

> +    defined in this section.

> 

> --

> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in

> the body of a message to majordomo@vger.kernel.org

> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ard Biesheuvel Nov. 23, 2016, 7:28 a.m. UTC | #9
On 23 November 2016 at 01:06, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
>> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
>
>> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
>> > +describe all the address space they consume.  In principle, this would
>> > +be all the windows they forward down to the PCI bus, as well as the
>> > +bridge registers themselves.  The bridge registers include things like
>> > +secondary/subordinate bus registers that determine the bus range below
>> > +the bridge, window registers that describe the apertures, etc.  These
>> > +are all device-specific, non-architected things, so the only way a
>> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
>> > +contain the device-specific details.  These bridge registers also
>> > +include ECAM space, since it is consumed by the bridge.
>> > +
>> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
>> > +the bridge apertures from the bridge registers [4, 5].  However,
>> > +BIOSes didn't use that bit correctly, and the result is that OSes have
>> > +to assume that everything in a PCI host bridge _CRS is a window.  That
>> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
>> > +device itself.
>>
>> Is that universally true? Or is it still possible to do the right
>> thing here on new ACPI architectures such as arm64?
>
> That's a very good question.  I had thought that the ACPI spec had
> given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> spec, the Consumer/Producer bit is still documented in the Extended
> Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
>
> Linux looks at the producer_consumer bit in acpi_decode_space(), which
> I think is used for all these descriptors (QWord, DWord, Word, and
> Extended).  This doesn't quite follow the spec -- we probably should
> ignore it except for Extended.  In any event, acpi_decode_space() sets
> IORESOURCE_WINDOW for Producer descriptors, but we don't test
> IORESOURCE_WINDOW in the PCI host bridge code.
>
> x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> functions that call acpi_pci_probe_root_resources(), which parses _CRS
> and looks at producer_consumer.  Then they do a little arch-specific
> stuff on the result.
>
> On arm64 we use acpi_pci_probe_root_resources() directly, with no
> arch-specific stuff.
>
> On all three arches, we ignore the Consumer/Producer bit, so all the
> resources are treated as Producers, e.g., as bridge windows.
>
> I think we *could* implement an arm64 version of
> pci_acpi_root_prepare_resources() that would pay attention to the
> Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> compliant, we would have to use Extended descriptors for all bridge
> windows, even if they would fit in a DWord or QWord.
>
> Should we do that?  I dunno.  I'd like to hear your opinion(s).
>

Yes, I think we should. If the spec allows for a way for a PNP0A03
device to describe all of its resources unambiguously, we should not
be relying on workarounds that were designed for another architecture
in another decade (for, presumably, another OS)

Just for my understanding, we will need to use extended descriptors
for all consumed *and* produced regions, even though dword/qword are
implicitly produced-only, due to the fact that the bit is ignored?

> It *would* be nice to have bridge registers in the bridge _CRS.  That
> would eliminate the need for looking up the HISI0081/PNP0C02 devices
> to find the bridge registers.  Avoiding that lookup is only a
> temporary advantage -- the next round of bridges are supposed to fully
> implement ECAM, and then we won't need to know where the registers
> are.
>
> Apart from the lookup, there's still some advantage in describing the
> registers in the PNP0A03 device instead of an unrelated PNP0C02
> device, because it makes /proc/iomem more accurate and potentially
> makes host bridge hotplug cleaner.  We would have to enhance the host
> bridge driver to do the reservations currently done by pnp/system.c.
>
> There's some value in doing it the same way as on x86, even though
> that way is somewhat broken.
>
> Whatever we decide, I think it's very important to get it figured out
> ASAP because it affects the ECAM quirks that we're trying to merge in
> v4.10.
>

I agree. What exactly is the impact for the quirks mechanism as proposed?
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi Nov. 23, 2016, 12:30 p.m. UTC | #10
On Wed, Nov 23, 2016 at 07:28:12AM +0000, Ard Biesheuvel wrote:
> On 23 November 2016 at 01:06, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
> >> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
> >
> >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> >> > +describe all the address space they consume.  In principle, this would
> >> > +be all the windows they forward down to the PCI bus, as well as the
> >> > +bridge registers themselves.  The bridge registers include things like
> >> > +secondary/subordinate bus registers that determine the bus range below
> >> > +the bridge, window registers that describe the apertures, etc.  These
> >> > +are all device-specific, non-architected things, so the only way a
> >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> >> > +contain the device-specific details.  These bridge registers also
> >> > +include ECAM space, since it is consumed by the bridge.
> >> > +
> >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> >> > +the bridge apertures from the bridge registers [4, 5].  However,
> >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> >> > +device itself.
> >>
> >> Is that universally true? Or is it still possible to do the right
> >> thing here on new ACPI architectures such as arm64?
> >
> > That's a very good question.  I had thought that the ACPI spec had
> > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> > spec, the Consumer/Producer bit is still documented in the Extended
> > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
> >
> > Linux looks at the producer_consumer bit in acpi_decode_space(), which
> > I think is used for all these descriptors (QWord, DWord, Word, and
> > Extended).  This doesn't quite follow the spec -- we probably should
> > ignore it except for Extended.  In any event, acpi_decode_space() sets
> > IORESOURCE_WINDOW for Producer descriptors, but we don't test
> > IORESOURCE_WINDOW in the PCI host bridge code.
> >
> > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> > functions that call acpi_pci_probe_root_resources(), which parses _CRS
> > and looks at producer_consumer.  Then they do a little arch-specific
> > stuff on the result.
> >
> > On arm64 we use acpi_pci_probe_root_resources() directly, with no
> > arch-specific stuff.
> >
> > On all three arches, we ignore the Consumer/Producer bit, so all the
> > resources are treated as Producers, e.g., as bridge windows.
> >
> > I think we *could* implement an arm64 version of
> > pci_acpi_root_prepare_resources() that would pay attention to the
> > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> > compliant, we would have to use Extended descriptors for all bridge
> > windows, even if they would fit in a DWord or QWord.
> >
> > Should we do that?  I dunno.  I'd like to hear your opinion(s).
> >
> 
> Yes, I think we should. If the spec allows for a way for a PNP0A03
> device to describe all of its resources unambiguously, we should not
> be relying on workarounds that were designed for another architecture
> in another decade (for, presumably, another OS)

That was the idea I floated at LPC16. We can override the
acpi_pci_root_ops prepare_resources() function pointer with a function
that checks IORESOURCE_WINDOW and filters resources accordingly (and
specific quirk "drivers" may know how to intepret resources that aren't
IORESOURCE_WINDOW - ie they can use it to describe the PCI ECAM config
space quirk region in their _CRS).

In a way that's something that makes sense anyway because given
that we are starting from a clean slate on ARM64 considering resources
that are not IORESOURCE_WINDOW as host bridge windows is just something
we are inheriting from x86, it is not really ACPI specs compliant (is
it ?).

> Just for my understanding, we will need to use extended descriptors
> for all consumed *and* produced regions, even though dword/qword are
> implicitly produced-only, due to the fact that the bit is ignored?

That's something that has to be clarified within the ASWG ie why the
consumer bit is ignored for *some* descriptors and not for others.

As things stand unfortunately the answer seems yes (I do not know
why).

> > It *would* be nice to have bridge registers in the bridge _CRS.  That
> > would eliminate the need for looking up the HISI0081/PNP0C02 devices
> > to find the bridge registers.  Avoiding that lookup is only a
> > temporary advantage -- the next round of bridges are supposed to fully
> > implement ECAM, and then we won't need to know where the registers
> > are.
> >
> > Apart from the lookup, there's still some advantage in describing the
> > registers in the PNP0A03 device instead of an unrelated PNP0C02
> > device, because it makes /proc/iomem more accurate and potentially
> > makes host bridge hotplug cleaner.  We would have to enhance the host
> > bridge driver to do the reservations currently done by pnp/system.c.
> >
> > There's some value in doing it the same way as on x86, even though
> > that way is somewhat broken.
> >
> > Whatever we decide, I think it's very important to get it figured out
> > ASAP because it affects the ECAM quirks that we're trying to merge in
> > v4.10.
> >
> 
> I agree. What exactly is the impact for the quirks mechanism as proposed?
The impact is that we could just use the PNP0A03 _CRS to report the PCI
ECAM config space quirk region through a consumer resource keeping in
mind what I say above (actually I think that's what was done on APM
firmware initially, for the records).

Lorenzo
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Nov. 23, 2016, 3:06 p.m. UTC | #11
On Wed, Nov 23, 2016 at 07:28:12AM +0000, Ard Biesheuvel wrote:
> On 23 November 2016 at 01:06, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
> >> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
> >
> >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> >> > +describe all the address space they consume.  In principle, this would
> >> > +be all the windows they forward down to the PCI bus, as well as the
> >> > +bridge registers themselves.  The bridge registers include things like
> >> > +secondary/subordinate bus registers that determine the bus range below
> >> > +the bridge, window registers that describe the apertures, etc.  These
> >> > +are all device-specific, non-architected things, so the only way a
> >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> >> > +contain the device-specific details.  These bridge registers also
> >> > +include ECAM space, since it is consumed by the bridge.
> >> > +
> >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> >> > +the bridge apertures from the bridge registers [4, 5].  However,
> >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> >> > +device itself.
> >>
> >> Is that universally true? Or is it still possible to do the right
> >> thing here on new ACPI architectures such as arm64?
> >
> > That's a very good question.  I had thought that the ACPI spec had
> > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> > spec, the Consumer/Producer bit is still documented in the Extended
> > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
> >
> > Linux looks at the producer_consumer bit in acpi_decode_space(), which
> > I think is used for all these descriptors (QWord, DWord, Word, and
> > Extended).  This doesn't quite follow the spec -- we probably should
> > ignore it except for Extended.  In any event, acpi_decode_space() sets
> > IORESOURCE_WINDOW for Producer descriptors, but we don't test
> > IORESOURCE_WINDOW in the PCI host bridge code.
> >
> > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> > functions that call acpi_pci_probe_root_resources(), which parses _CRS
> > and looks at producer_consumer.  Then they do a little arch-specific
> > stuff on the result.
> >
> > On arm64 we use acpi_pci_probe_root_resources() directly, with no
> > arch-specific stuff.
> >
> > On all three arches, we ignore the Consumer/Producer bit, so all the
> > resources are treated as Producers, e.g., as bridge windows.
> >
> > I think we *could* implement an arm64 version of
> > pci_acpi_root_prepare_resources() that would pay attention to the
> > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> > compliant, we would have to use Extended descriptors for all bridge
> > windows, even if they would fit in a DWord or QWord.
> >
> > Should we do that?  I dunno.  I'd like to hear your opinion(s).
> >
> 
> Yes, I think we should. If the spec allows for a way for a PNP0A03
> device to describe all of its resources unambiguously, we should not
> be relying on workarounds that were designed for another architecture
> in another decade (for, presumably, another OS)
> 
> Just for my understanding, we will need to use extended descriptors
> for all consumed *and* produced regions, even though dword/qword are
> implicitly produced-only, due to the fact that the bit is ignored?

From an ACPI spec point of view, I would say QWord/DWord/Word
descriptors are implicitly *consumer*-only because ResourceConsumer
is the default and they don't have a bit to indicate otherwise.

The current code assumes all PNP0A03 resources are producers.  If we
implement an arm64 pci_acpi_root_prepare_resources() that pays
attention to the Consumer/Producer bit, we would have to:

  - Reserve all producer regions in the iomem/ioport trees.  This is
    already done via pci_acpi_root_add_resources(), but we might need
    a new check to handle consumers differently.

  - Reserve all consumer regions.  This corresponds to what
    pnp/system.c does for PNP0C02 devices.  This is similar to the
    producer regions, but I think the consumer ones should be marked
    IORESOURCE_BUSY.

  - Use every producer (IORESOURCE_WINDOW) as a host bridge window.

I think it's a bug that acpi_decode_space() looks at producer_consumer
for QWord/DWord/Word descriptors, but I think QWord/DWord/Word
descriptors for consumed regions should be safe, as long as they don't
set the Consumer/Producer bit.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Duc Dang Nov. 23, 2016, 8:52 p.m. UTC | #12
On Wed, Nov 23, 2016 at 4:30 AM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> On Wed, Nov 23, 2016 at 07:28:12AM +0000, Ard Biesheuvel wrote:
>> On 23 November 2016 at 01:06, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> > On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
>> >> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> >
>> >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
>> >> > +describe all the address space they consume.  In principle, this would
>> >> > +be all the windows they forward down to the PCI bus, as well as the
>> >> > +bridge registers themselves.  The bridge registers include things like
>> >> > +secondary/subordinate bus registers that determine the bus range below
>> >> > +the bridge, window registers that describe the apertures, etc.  These
>> >> > +are all device-specific, non-architected things, so the only way a
>> >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
>> >> > +contain the device-specific details.  These bridge registers also
>> >> > +include ECAM space, since it is consumed by the bridge.
>> >> > +
>> >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
>> >> > +the bridge apertures from the bridge registers [4, 5].  However,
>> >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
>> >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
>> >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
>> >> > +device itself.
>> >>
>> >> Is that universally true? Or is it still possible to do the right
>> >> thing here on new ACPI architectures such as arm64?
>> >
>> > That's a very good question.  I had thought that the ACPI spec had
>> > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
>> > spec, the Consumer/Producer bit is still documented in the Extended
>> > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
>> > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
>> >
>> > Linux looks at the producer_consumer bit in acpi_decode_space(), which
>> > I think is used for all these descriptors (QWord, DWord, Word, and
>> > Extended).  This doesn't quite follow the spec -- we probably should
>> > ignore it except for Extended.  In any event, acpi_decode_space() sets
>> > IORESOURCE_WINDOW for Producer descriptors, but we don't test
>> > IORESOURCE_WINDOW in the PCI host bridge code.
>> >
>> > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
>> > functions that call acpi_pci_probe_root_resources(), which parses _CRS
>> > and looks at producer_consumer.  Then they do a little arch-specific
>> > stuff on the result.
>> >
>> > On arm64 we use acpi_pci_probe_root_resources() directly, with no
>> > arch-specific stuff.
>> >
>> > On all three arches, we ignore the Consumer/Producer bit, so all the
>> > resources are treated as Producers, e.g., as bridge windows.
>> >
>> > I think we *could* implement an arm64 version of
>> > pci_acpi_root_prepare_resources() that would pay attention to the
>> > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
>> > compliant, we would have to use Extended descriptors for all bridge
>> > windows, even if they would fit in a DWord or QWord.
>> >
>> > Should we do that?  I dunno.  I'd like to hear your opinion(s).
>> >
>>
>> Yes, I think we should. If the spec allows for a way for a PNP0A03
>> device to describe all of its resources unambiguously, we should not
>> be relying on workarounds that were designed for another architecture
>> in another decade (for, presumably, another OS)
>
> That was the idea I floated at LPC16. We can override the
> acpi_pci_root_ops prepare_resources() function pointer with a function
> that checks IORESOURCE_WINDOW and filters resources accordingly (and
> specific quirk "drivers" may know how to intepret resources that aren't
> IORESOURCE_WINDOW - ie they can use it to describe the PCI ECAM config
> space quirk region in their _CRS).
>
> In a way that's something that makes sense anyway because given
> that we are starting from a clean slate on ARM64 considering resources
> that are not IORESOURCE_WINDOW as host bridge windows is just something
> we are inheriting from x86, it is not really ACPI specs compliant (is
> it ?).
>
>> Just for my understanding, we will need to use extended descriptors
>> for all consumed *and* produced regions, even though dword/qword are
>> implicitly produced-only, due to the fact that the bit is ignored?
>
> That's something that has to be clarified within the ASWG ie why the
> consumer bit is ignored for *some* descriptors and not for others.
>
> As things stand unfortunately the answer seems yes (I do not know
> why).
>
>> > It *would* be nice to have bridge registers in the bridge _CRS.  That
>> > would eliminate the need for looking up the HISI0081/PNP0C02 devices
>> > to find the bridge registers.  Avoiding that lookup is only a
>> > temporary advantage -- the next round of bridges are supposed to fully
>> > implement ECAM, and then we won't need to know where the registers
>> > are.
>> >
>> > Apart from the lookup, there's still some advantage in describing the
>> > registers in the PNP0A03 device instead of an unrelated PNP0C02
>> > device, because it makes /proc/iomem more accurate and potentially
>> > makes host bridge hotplug cleaner.  We would have to enhance the host
>> > bridge driver to do the reservations currently done by pnp/system.c.
>> >
>> > There's some value in doing it the same way as on x86, even though
>> > that way is somewhat broken.
>> >
>> > Whatever we decide, I think it's very important to get it figured out
>> > ASAP because it affects the ECAM quirks that we're trying to merge in
>> > v4.10.
>> >
>>
>> I agree. What exactly is the impact for the quirks mechanism as proposed?
> The impact is that we could just use the PNP0A03 _CRS to report the PCI
> ECAM config space quirk region through a consumer resource keeping in
> mind what I say above (actually I think that's what was done on APM
> firmware initially, for the records).

Just to clarify: APM firmware initially has a _CSR region to declare
the controller register region. We don't know that we need to declare
the reserved space for ECAM until Bjorn pointed out recently (with the
usage of PNP0C02).

I really like this idea about declaring ECAM space and any additional
spaces required for ECAM quirk inside PNP0A03 _CRS. For the firmware
that already shipped, the quirk will need to add additional resources
(for ECAM and other needed regions) into the root-bus. If we decided
to go with this, do we still have time to make additional adjustment
for the current ECAM quirk and the foundation patches before
v4.10-rc1?

>
> Lorenzo
> _______________________________________________
> Linaro-acpi mailing list
> Linaro-acpi@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/linaro-acpi
Regards,
Duc Dang.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Nov. 29, 2016, 6:19 p.m. UTC | #13
On Wed, Nov 23, 2016 at 09:06:33AM -0600, Bjorn Helgaas wrote:
> On Wed, Nov 23, 2016 at 07:28:12AM +0000, Ard Biesheuvel wrote:
> > On 23 November 2016 at 01:06, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Tue, Nov 22, 2016 at 10:09:50AM +0000, Ard Biesheuvel wrote:
> > >> On 17 November 2016 at 17:59, Bjorn Helgaas <bhelgaas@google.com> wrote:
> > >
> > >> > +PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
> > >> > +describe all the address space they consume.  In principle, this would
> > >> > +be all the windows they forward down to the PCI bus, as well as the
> > >> > +bridge registers themselves.  The bridge registers include things like
> > >> > +secondary/subordinate bus registers that determine the bus range below
> > >> > +the bridge, window registers that describe the apertures, etc.  These
> > >> > +are all device-specific, non-architected things, so the only way a
> > >> > +PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
> > >> > +contain the device-specific details.  These bridge registers also
> > >> > +include ECAM space, since it is consumed by the bridge.
> > >> > +
> > >> > +ACPI defined a Producer/Consumer bit that was intended to distinguish
> > >> > +the bridge apertures from the bridge registers [4, 5].  However,
> > >> > +BIOSes didn't use that bit correctly, and the result is that OSes have
> > >> > +to assume that everything in a PCI host bridge _CRS is a window.  That
> > >> > +leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
> > >> > +device itself.
> > >>
> > >> Is that universally true? Or is it still possible to do the right
> > >> thing here on new ACPI architectures such as arm64?
> > >
> > > That's a very good question.  I had thought that the ACPI spec had
> > > given up on Consumer/Producer completely, but I was wrong.  In the 6.0
> > > spec, the Consumer/Producer bit is still documented in the Extended
> > > Address Space Descriptor (sec 6.4.3.5.4).  It is documented as
> > > "ignored" in the QWord, DWord, and Word descriptors (sec 6.4.3.5.1,2,3).
> > >
> > > Linux looks at the producer_consumer bit in acpi_decode_space(), which
> > > I think is used for all these descriptors (QWord, DWord, Word, and
> > > Extended).  This doesn't quite follow the spec -- we probably should
> > > ignore it except for Extended.  In any event, acpi_decode_space() sets
> > > IORESOURCE_WINDOW for Producer descriptors, but we don't test
> > > IORESOURCE_WINDOW in the PCI host bridge code.
> > >
> > > x86 and ia64 supply their own pci_acpi_root_prepare_resources()
> > > functions that call acpi_pci_probe_root_resources(), which parses _CRS
> > > and looks at producer_consumer.  Then they do a little arch-specific
> > > stuff on the result.
> > >
> > > On arm64 we use acpi_pci_probe_root_resources() directly, with no
> > > arch-specific stuff.
> > >
> > > On all three arches, we ignore the Consumer/Producer bit, so all the
> > > resources are treated as Producers, e.g., as bridge windows.
> > >
> > > I think we *could* implement an arm64 version of
> > > pci_acpi_root_prepare_resources() that would pay attention to the
> > > Consumer/Producer bit by checking IORESOURCE_WINDOW.  To be spec
> > > compliant, we would have to use Extended descriptors for all bridge
> > > windows, even if they would fit in a DWord or QWord.
> > >
> > > Should we do that?  I dunno.  I'd like to hear your opinion(s).
> > >
> > 
> > Yes, I think we should. If the spec allows for a way for a PNP0A03
> > device to describe all of its resources unambiguously, we should not
> > be relying on workarounds that were designed for another architecture
> > in another decade (for, presumably, another OS)
> > 
> > Just for my understanding, we will need to use extended descriptors
> > for all consumed *and* produced regions, even though dword/qword are
> > implicitly produced-only, due to the fact that the bit is ignored?
> 
> From an ACPI spec point of view, I would say QWord/DWord/Word
> descriptors are implicitly *consumer*-only because ResourceConsumer
> is the default and they don't have a bit to indicate otherwise.
> 
> The current code assumes all PNP0A03 resources are producers.  If we
> implement an arm64 pci_acpi_root_prepare_resources() that pays
> attention to the Consumer/Producer bit, we would have to:
> 
>   - Reserve all producer regions in the iomem/ioport trees.  This is
>     already done via pci_acpi_root_add_resources(), but we might need
>     a new check to handle consumers differently.
> 
>   - Reserve all consumer regions.  This corresponds to what
>     pnp/system.c does for PNP0C02 devices.  This is similar to the
>     producer regions, but I think the consumer ones should be marked
>     IORESOURCE_BUSY.
> 
>   - Use every producer (IORESOURCE_WINDOW) as a host bridge window.
> 
> I think it's a bug that acpi_decode_space() looks at producer_consumer
> for QWord/DWord/Word descriptors, but I think QWord/DWord/Word
> descriptors for consumed regions should be safe, as long as they don't
> set the Consumer/Producer bit.

I'm going to post a couple very lightly-tested patches that should
make us ignore the Consumer/Producer bit for QWord/DWord/Word.  I'd
appreciate any discussion about whether that's the right approach.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Nov. 29, 2016, 6:20 p.m. UTC | #14
On Wed, Nov 23, 2016 at 03:23:35AM +0000, Zheng, Lv wrote:
> Hi, Bjorn
> 
> Thanks for the documentation.
> It really helps!
> 
> However I have a question below.
> 
> > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Bjorn
> > Helgaas
> > Subject: [PATCH] PCI: Add information about describing PCI in ACPI
> > 
> > Add a writeup about how PCI host bridges should be described in ACPI
> > using PNP0A03/PNP0A08 devices, PNP0C02 devices, and the MCFG table.
> > 
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > ---
> >  Documentation/PCI/00-INDEX      |    2 +
> >  Documentation/PCI/acpi-info.txt |  136 +++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 138 insertions(+)
> >  create mode 100644 Documentation/PCI/acpi-info.txt
> > 
> > diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
> > index 147231f..0780280 100644
> > --- a/Documentation/PCI/00-INDEX
> > +++ b/Documentation/PCI/00-INDEX
> > @@ -1,5 +1,7 @@
> >  00-INDEX
> >  	- this file
> > +acpi-info.txt
> > +	- info on how PCI host bridges are represented in ACPI
> >  MSI-HOWTO.txt
> >  	- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
> >  PCIEBUS-HOWTO.txt
> > diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
> > new file mode 100644
> > index 0000000..ccbcfda
> > --- /dev/null
> > +++ b/Documentation/PCI/acpi-info.txt
> > @@ -0,0 +1,136 @@
> > +	    ACPI considerations for PCI host bridges
> > +
> > +The basic requirement is that the ACPI namespace should describe
> > +*everything* that consumes address space unless there's another
> > +standard way for the OS to find it [1, 2].  For example, windows that
> > +are forwarded to PCI by a PCI host bridge should be described via ACPI
> > +devices, since the OS can't locate the host bridge by itself.  PCI
> > +devices *below* the host bridge do not need to be described via ACPI,
> > +because the resources they consume are inside the host bridge windows,
> > +and the OS can discover them via the standard PCI enumeration
> > +mechanism (using config accesses to read and size the BARs).
> > +
> > +This ACPI resource description is done via _CRS methods of devices in
> > +the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
> > +the OS can read _CRS and figure out what resource is being consumed
> > +even if it doesn't have a driver for the device [3].  That's important
> > +because it means an old OS can work correctly even on a system with
> > +new devices unknown to the OS.  The new devices won't do anything, but
> > +the OS can at least make sure no resources conflict with them.
> > +
> > +Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
> > +reserving address space!  The static tables are for things the OS
> > +needs to know early in boot, before it can parse the ACPI namespace.
> > +If a new table is defined, an old OS needs to operate correctly even
> > +though it ignores the table.  _CRS allows that because it is generic
> > +and understood by the old OS; a static table does not.
> 
> The entire document doesn't talk about the details of _CBA.
> There is only one line below mentioned _CBA as an example.

Yes, that's a good point.  I'll add some more details about MCFG
and _CBA.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
index 147231f..0780280 100644
--- a/Documentation/PCI/00-INDEX
+++ b/Documentation/PCI/00-INDEX
@@ -1,5 +1,7 @@ 
 00-INDEX
 	- this file
+acpi-info.txt
+	- info on how PCI host bridges are represented in ACPI
 MSI-HOWTO.txt
 	- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
 PCIEBUS-HOWTO.txt
diff --git a/Documentation/PCI/acpi-info.txt b/Documentation/PCI/acpi-info.txt
new file mode 100644
index 0000000..ccbcfda
--- /dev/null
+++ b/Documentation/PCI/acpi-info.txt
@@ -0,0 +1,136 @@ 
+	    ACPI considerations for PCI host bridges
+
+The basic requirement is that the ACPI namespace should describe
+*everything* that consumes address space unless there's another
+standard way for the OS to find it [1, 2].  For example, windows that
+are forwarded to PCI by a PCI host bridge should be described via ACPI
+devices, since the OS can't locate the host bridge by itself.  PCI
+devices *below* the host bridge do not need to be described via ACPI,
+because the resources they consume are inside the host bridge windows,
+and the OS can discover them via the standard PCI enumeration
+mechanism (using config accesses to read and size the BARs).
+
+This ACPI resource description is done via _CRS methods of devices in
+the ACPI namespace [2].   _CRS methods are like generalized PCI BARs:
+the OS can read _CRS and figure out what resource is being consumed
+even if it doesn't have a driver for the device [3].  That's important
+because it means an old OS can work correctly even on a system with
+new devices unknown to the OS.  The new devices won't do anything, but
+the OS can at least make sure no resources conflict with them.
+
+Static tables like MCFG, HPET, ECDT, etc., are *not* mechanisms for
+reserving address space!  The static tables are for things the OS
+needs to know early in boot, before it can parse the ACPI namespace.
+If a new table is defined, an old OS needs to operate correctly even
+though it ignores the table.  _CRS allows that because it is generic
+and understood by the old OS; a static table does not.
+
+If the OS is expected to manage an ACPI device, that device will have
+a specific _HID/_CID that tells the OS what driver to bind to it, and
+the _CRS tells the OS and the driver where the device's registers are.
+
+PNP0C02 "motherboard" devices are basically a catch-all.  There's no
+programming model for them other than "don't use these resources for
+anything else."  So any address space that is (1) not claimed by some
+other ACPI device and (2) should not be assigned by the OS to
+something else, should be claimed by a PNP0C02 _CRS method.
+
+PCI host bridges are PNP0A03 or PNP0A08 devices.  Their _CRS should
+describe all the address space they consume.  In principle, this would
+be all the windows they forward down to the PCI bus, as well as the
+bridge registers themselves.  The bridge registers include things like
+secondary/subordinate bus registers that determine the bus range below
+the bridge, window registers that describe the apertures, etc.  These
+are all device-specific, non-architected things, so the only way a
+PNP0A03/PNP0A08 driver can manage them is via _PRS/_CRS/_SRS, which
+contain the device-specific details.  These bridge registers also
+include ECAM space, since it is consumed by the bridge.
+
+ACPI defined a Producer/Consumer bit that was intended to distinguish
+the bridge apertures from the bridge registers [4, 5].  However,
+BIOSes didn't use that bit correctly, and the result is that OSes have
+to assume that everything in a PCI host bridge _CRS is a window.  That
+leaves no way to describe the bridge registers in the PNP0A03/PNP0A08
+device itself.
+
+The workaround is to describe the bridge registers (including ECAM
+space) in PNP0C02 catch-all devices [6].  With the exception of ECAM,
+the bridge register space is device-specific anyway, so the generic
+PNP0A03/PNP0A08 driver (pci_root.c) has no need to know about it.  For
+ECAM, pci_root.c learns about the space from either MCFG or the _CBA
+method.
+
+Note that the PCIe spec actually does require ECAM unless there's a
+standard firmware interface for config access, e.g., the ia64 SAL
+interface [7].  One reason is that we want a generic host bridge
+driver (pci_root.c), and a generic driver requires a generic way to
+access config space.
+
+
+[1] ACPI 6.0, sec 6.1:
+    For any device that is on a non-enumerable type of bus (for
+    example, an ISA bus), OSPM enumerates the devices' identifier(s)
+    and the ACPI system firmware must supply an _HID object ... for
+    each device to enable OSPM to do that.
+
+[2] ACPI 6.0, sec 3.7:
+    The OS enumerates motherboard devices simply by reading through
+    the ACPI Namespace looking for devices with hardware IDs.
+
+    Each device enumerated by ACPI includes ACPI-defined objects in
+    the ACPI Namespace that report the hardware resources the device
+    could occupy [_PRS], an object that reports the resources that are
+    currently used by the device [_CRS], and objects for configuring
+    those resources [_SRS].  The information is used by the Plug and
+    Play OS (OSPM) to configure the devices.
+
+[3] ACPI 6.0, sec 6.2:
+    OSPM uses device configuration objects to configure hardware
+    resources for devices enumerated via ACPI.  Device configuration
+    objects provide information about current and possible resource
+    requirements, the relationship between shared resources, and
+    methods for configuring hardware resources.
+
+    When OSPM enumerates a device, it calls _PRS to determine the
+    resource requirements of the device.  It may also call _CRS to
+    find the current resource settings for the device.  Using this
+    information, the Plug and Play system determines what resources
+    the device should consume and sets those resources by calling the
+    device’s _SRS control method.
+
+    In ACPI, devices can consume resources (for example, legacy
+    keyboards), provide resources (for example, a proprietary PCI
+    bridge), or do both.  Unless otherwise specified, resources for a
+    device are assumed to be taken from the nearest matching resource
+    above the device in the device hierarchy.
+
+[4] ACPI 6.0, sec 6.4.3.5.4:
+    Extended Address Space Descriptor
+    General Flags: Bit [0] Consumer/Producer:
+	1–This device consumes this resource
+	0–This device produces and consumes this resource
+
+[5] ACPI 6.0, sec 19.6.43:
+    ResourceUsage specifies whether the Memory range is consumed by
+    this device (ResourceConsumer) or passed on to child devices
+    (ResourceProducer).  If nothing is specified, then
+    ResourceConsumer is assumed.
+
+[6] PCI Firmware 3.0, sec 4.1.2:
+    If the operating system does not natively comprehend reserving the
+    MMCFG region, the MMCFG region must be reserved by firmware.  The
+    address range reported in the MCFG table or by _CBA method (see
+    Section 4.1.3) must be reserved by declaring a motherboard
+    resource.  For most systems, the motherboard resource would appear
+    at the root of the ACPI namespace (under \_SB) in a node with a
+    _HID of EISAID (PNP0C02), and the resources in this case should
+    not be claimed in the root PCI bus’s _CRS.  The resources can
+    optionally be returned in Int15 E820 or EFIGetMemoryMap as
+    reserved memory but must always be reported through ACPI as a
+    motherboard resource.
+
+[7] PCI Express 3.0, sec 7.2.2:
+    For systems that are PC-compatible, or that do not implement a
+    processor-architecture-specific firmware interface standard that
+    allows access to the Configuration Space, the ECAM is required as
+    defined in this section.