[v2] arm64: PCI: Add quirk for Qualcomm WoA devices

Message ID 20230423030520.9570-1-shawn.guo@linaro.org (mailing list archive)
State Handled Elsewhere, archived

Commit Message

Shawn Guo April 23, 2023, 3:05 a.m. UTC
Commit 8fd4391ee717 ("arm64: PCI: Exclude ACPI "consumer" resources from
host bridge windows") introduced a check to remove host bridge register
resources for all arm64 platforms, with the assumption that the PNP0A03
_CRS resources would always be host bridge registers and never windows
on arm64 platforms.

The assumption held until Qualcomm WoA (Windows on ARM) devices
emerged.  These devices describe host bridge windows in PNP0A03 _CRS
resources instead.  For example, the Microsoft Surface Pro X has host
bridges defined as

    Name (_HID, EisaId ("PNP0A08") /* PCI Express Bus */)  // _HID: Hardware ID
    Name (_CID, EisaId ("PNP0A03") /* PCI Bus */)  // _CID: Compatible ID

    Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
    {
        Name (RBUF, ResourceTemplate ()
        {
            Memory32Fixed (ReadWrite,
                0x60200000,         // Address Base
                0x01DF0000,         // Address Length
                )
            WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
                0x0000,             // Granularity
                0x0000,             // Range Minimum
                0x0001,             // Range Maximum
                0x0000,             // Translation Offset
                0x0002,             // Length
                ,, )
        })
        Return (RBUF) /* \_SB_.PCI0._CRS.RBUF */
    }

The Memory32Fixed holds a host bridge window, but it's not properly
defined as a "producer" resource.  Consequently the resource gets
removed by the kernel, and the BAR allocation fails later on:

    [ 0.150731] pci 0002:00:00.0: BAR 14: no space for [mem size 0x00100000]
    [ 0.150744] pci 0002:00:00.0: BAR 14: failed to assign [mem size 0x00100000]
    [ 0.150758] pci 0002:01:00.0: BAR 0: no space for [mem size 0x00004000 64bit]
    [ 0.150769] pci 0002:01:00.0: BAR 0: failed to assign [mem size 0x00004000 64bit]

This eventually prevents the PCIe NVMe drive from being accessible.

Add a quirk for these devices to avoid the resource being removed.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
Changes for v2:
- Match devices using PPTT instead of DSDT to avoid maintenance burden.
  Hope this is an acceptable compromise.
- Add const declaration to qcom_platlist[].

v1 link:
https://lore.kernel.org/lkml/20230227021221.17980-1-shawn.guo@linaro.org/

 arch/arm64/kernel/pci.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

Comments

Bjorn Helgaas April 28, 2023, 9:30 p.m. UTC | #1
[+cc Andy, Bjorn A, plea for help from Qualcomm firmware folks]

On Sun, Apr 23, 2023 at 11:05:20AM +0800, Shawn Guo wrote:
> Commit 8fd4391ee717 ("arm64: PCI: Exclude ACPI "consumer" resources from
> host bridge windows") introduced a check to remove host bridge register
> resources for all arm64 platforms, with the assumption that the PNP0A03
> _CRS resources would always be host bridge registers and never as windows
> on arm64 platforms.

That's not quite what the commit log says.  The 8fd4391ee717
assumption is that on arm64,

  - _CRS *consumer* resources are host bridge registers
  - _CRS *producer* resources are windows

which I think matches the intent of the ACPI spec.

> The assumption stands true until Qualcomm WoA (Windows on ARM) devices
> emerge.  These devices describe host bridge windows in PNP0A03 _CRS
> resources instead.  For example, the Microsoft Surface Pro X has host
> bridges defined as
> 
>     Name (_CID, EisaId ("PNP0A03") /* PCI Bus */)  // _CID: Compatible ID
> 
>     Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
>     {
>         Name (RBUF, ResourceTemplate ()
>         {
>             Memory32Fixed (ReadWrite,
>                 0x60200000,         // Address Base
>                 0x01DF0000,         // Address Length
>                 )
> ...

> The Memory32Fixed holds a host bridge window, but it's not properly
> defined as a "producer" resource.

I assume you're saying the use of Memory32Fixed for a window is a
firmware defect, right?

(Per ACPI r6.5, sec 19.6.83, the Memory32Fixed descriptor cannot
specify a Producer/Consumer ResourceUsage.  I think that means the
space is assumed to be ResourceConsumer.)
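
To make the consequence concrete, here is a minimal userspace sketch of
the filtering that the thread attributes to 8fd4391ee717.  It is not
kernel code: struct sketch_res, SKETCH_RES_WINDOW and both functions are
hypothetical stand-ins, and it assumes that a ResourceProducer
descriptor ends up flagged as a window while Memory32Fixed does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's IORESOURCE_WINDOW flag. */
#define SKETCH_RES_WINDOW 0x1u

/* Hypothetical, simplified stand-in for one parsed _CRS entry. */
struct sketch_res {
	uint64_t start;
	uint64_t end;		/* inclusive */
	unsigned int flags;
};

/*
 * Keep only the entries flagged as windows, compacting the array in
 * place.  Returns the number of entries kept.
 */
static size_t sketch_keep_windows(struct sketch_res *res, size_t n)
{
	size_t i, kept = 0;

	assert(res != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (res[i].flags & SKETCH_RES_WINDOW)
			res[kept++] = res[i];
	}
	return kept;
}

/*
 * Fill in the two Surface Pro X descriptors quoted in the commit
 * message.  Returns the number of entries written (always 2).
 */
static size_t sketch_surface_crs(struct sketch_res *res)
{
	assert(res != NULL);

	/* Memory32Fixed: base 0x60200000, length 0x01DF0000, no window flag. */
	res[0].start = 0x60200000ull;
	res[0].end = 0x60200000ull + 0x01DF0000ull - 1;
	res[0].flags = 0;

	/* WordBusNumber (ResourceProducer): buses 0x00-0x01 (assumed window). */
	res[1].start = 0x00;
	res[1].end = 0x01;
	res[1].flags = SKETCH_RES_WINDOW;

	return 2;
}
```

Running sketch_keep_windows() over the output of sketch_surface_crs()
leaves only the bus number range, so under these assumptions the memory
range 0x60200000-0x61feffff is no longer available for BAR assignment.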

> Consequently the resource gets removed by kernel, and the BAR
> allocation fails later on:
> 
>     [ 0.150731] pci 0002:00:00.0: BAR 14: no space for [mem size 0x00100000]
>     [ 0.150744] pci 0002:00:00.0: BAR 14: failed to assign [mem size 0x00100000]
>     [ 0.150758] pci 0002:01:00.0: BAR 0: no space for [mem size 0x00004000 64bit]
>     [ 0.150769] pci 0002:01:00.0: BAR 0: failed to assign [mem size 0x00004000 64bit]
> 
> This eventually prevents the PCIe NVME drive from being accessible.
> 
> Add a quirk for these devices to avoid the resource being removed.

Since this is a Windows laptop, I assume this works with Windows and
that Windows will in fact assign BARs in that Memory32Fixed area.

If we knew what the firmware author's intent was, we could probably
make Linux understand it.

Maybe (probably) Windows treats these descriptors the same on arm64 as
on x86, i.e., *everything* in PNP0A03 _CRS is assumed to be "producer"
(at least, that's my experimental observation; I have no actual
knowledge of Windows).

So I guess 8fd4391ee717 must have been motivated by some early arm64
platform that put "consumer" descriptors in PNP0A03 _CRS as Lorenzo
said [1].

In that case I guess our choices are:

  - Add quirks like this and keep adding them for every new arm64
    platform that uses the same "everything in PNP0A03 _CRS is a
    producer" strategy.

  - Remove 8fd4391ee717, break whatever early arm64 platforms needed
    it, and add piecemeal quirks for them.

I hate both, but I think I hate the first more because it has no end,
while the second is painful but limited.

Obviously we would need to do whatever we can to identify and fix
things that depend on 8fd4391ee717 before reverting it.

Bjorn

[1] https://lore.kernel.org/lkml/ZBA2Gl5xCjk7mMoW@lpieralisi/

> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> ---
> Changes for v2:
> - Match devices using PPTT instead of DSDT to avoid maintenance burden.
>   Hope this is an acceptable compromise.
> - Add const declaration to qcom_platlist[].
> 
> v1 link:
> https://lore.kernel.org/lkml/20230227021221.17980-1-shawn.guo@linaro.org/
> 
>  arch/arm64/kernel/pci.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> index 2276689b5411..2ff2f3befa76 100644
> --- a/arch/arm64/kernel/pci.c
> +++ b/arch/arm64/kernel/pci.c
> @@ -109,16 +109,44 @@ int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
>  	return 0;
>  }
>  
> +#define QCOM_PCI_QUIRK "Host bridge windows in PNP0A03 _CRS"
> +
> +/*
> + * Ideally DSDT (Differentiated System Description Table) should be used to
> + * match the platforms, as the quirk is in there. But devices from different
> + * manufacturers usually have different oem_id and oem_table_id in DSDT,
> + * so matching DSDT makes the list a maintenance burden.  As a compromise,
> + * PPTT (Processor Properties Topology Table) is used instead to work
> + * around this quirk for most Qualcomm WoA (Windows on ARM) devices.
> + */
> +static const struct acpi_platform_list qcom_platlist[] = {
> +	{ "QCOM  ", "QCOMEDK2", 0, ACPI_SIG_PPTT, all_versions, QCOM_PCI_QUIRK },
> +	{ }
> +};
> +
>  static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci)
>  {
>  	struct resource_entry *entry, *tmp;
>  	int status;
> +	int idx;
>  
>  	status = acpi_pci_probe_root_resources(ci);
> +
> +	/*
> +	 * Instead of describing host bridge registers in PNP0A03 _CRS
> +	 * resources, Qualcomm WoA devices describe host bridge windows in
> +	 * there.  We do not want to destroy the resources on these platforms.
> +	 */
> +	idx = acpi_match_platform_list(qcom_platlist);
> +	if (idx >= 0)
> +		goto done;
> +
>  	resource_list_for_each_entry_safe(entry, tmp, &ci->resources) {
>  		if (!(entry->res->flags & IORESOURCE_WINDOW))
>  			resource_list_destroy_entry(entry);
>  	}
> +
> +done:
>  	return status;
>  }
>  
> -- 
> 2.17.1
> 
> 
Shawn Guo May 2, 2023, 8:45 a.m. UTC | #2
On Fri, Apr 28, 2023 at 04:30:27PM -0500, Bjorn Helgaas wrote:
> [+cc Andy, Bjorn A, plea for help from Qualcomm firmware folks]
> 
> On Sun, Apr 23, 2023 at 11:05:20AM +0800, Shawn Guo wrote:
> > Commit 8fd4391ee717 ("arm64: PCI: Exclude ACPI "consumer" resources from
> > host bridge windows") introduced a check to remove host bridge register
> > resources for all arm64 platforms, with the assumption that the PNP0A03
> > _CRS resources would always be host bridge registers and never as windows
> > on arm64 platforms.
> 
> That's not quite what the commit log says.  The 8fd4391ee717
> assumption is that on arm64,
> 
>   - _CRS *consumer* resources are host bridge registers
>   - _CRS *producer* resources are windows
> 
> which I think matches the intent of the ACPI spec.

Yes, I will update.

> 
> > The assumption stands true until Qualcomm WoA (Windows on ARM) devices
> > emerge.  These devices describe host bridge windows in PNP0A03 _CRS
> > resources instead.  For example, the Microsoft Surface Pro X has host
> > bridges defined as
> > 
> >     Name (_CID, EisaId ("PNP0A03") /* PCI Bus */)  // _CID: Compatible ID
> > 
> >     Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
> >     {
> >         Name (RBUF, ResourceTemplate ()
> >         {
> >             Memory32Fixed (ReadWrite,
> >                 0x60200000,         // Address Base
> >                 0x01DF0000,         // Address Length
> >                 )
> > ...
> 
> > The Memory32Fixed holds a host bridge window, but it's not properly
> > defined as a "producer" resource.
> 
> I assume you're saying the use of Memory32Fixed for a window is a
> firmware defect, right?

Yes, I will reword.

> 
> (Per ACPI r6.5, sec 19.6.83, the Memory32Fixed descriptor cannot
> specify a Producer/Consumer ResourceUsage.  I think that means the
> space is assumed to be ResourceConsumer.)
> 
> > Consequently the resource gets removed by kernel, and the BAR
> > allocation fails later on:
> > 
> >     [ 0.150731] pci 0002:00:00.0: BAR 14: no space for [mem size 0x00100000]
> >     [ 0.150744] pci 0002:00:00.0: BAR 14: failed to assign [mem size 0x00100000]
> >     [ 0.150758] pci 0002:01:00.0: BAR 0: no space for [mem size 0x00004000 64bit]
> >     [ 0.150769] pci 0002:01:00.0: BAR 0: failed to assign [mem size 0x00004000 64bit]
> > 
> > This eventually prevents the PCIe NVME drive from being accessible.
> > 
> > Add a quirk for these devices to avoid the resource being removed.
> 
> Since this is a Windows laptop, I assume this works with Windows and
> that Windows will in fact assign BARs in that Memory32Fixed area.
> 
> If we knew what the firmware author's intent was, we could probably
> make Linux understand it.
> 
> Maybe (probably) Windows treats these descriptors the same on arm64 as
> on x86, i.e., *everything* in PNP0A03 _CRS is assumed to be "producer"
> (at least, that's my experimental observation; I have no actual
> knowledge of Windows).

That's my bet too.

> 
> So I guess 8fd4391ee717 must have been motivated by some early arm64
> platform that put "consumer" descriptors in PNP0A03 _CRS as Lorenzo
> said [1].
> 
> In that case I guess our choices are:
> 
>   - Add quirks like this and keep adding them for every new arm64
>     platform that uses the same "everything in PNP0A03 _CRS is a
>     producer" strategy.
> 
>   - Remove 8fd4391ee717, break whatever early arm64 platforms needed
>     it, and add piecemeal quirks for them.
> 
> I hate both, but I think I hate the first more because it has no end,
> while the second is painful but limited.

Thanks for your opinion on this!  Let's try to pursue the second then.

> 
> Obviously we would need to do whatever we can to identify and fix
> things that depend on 8fd4391ee717 before reverting it.

Lorenzo,

I have zero experience with any of those early arm64 platforms.  I would
appreciate it if you could give some direction on how to identify them.

Looking at your comment below, I'm wondering whether the firmware on
those early arm64 platforms has no MCFG table but instead provides
root->mcfg_addr via the _CBA method?

"I believe it is because there were arm64 platforms (early) that added a
consumer descriptor in the host bridge CRS with MMIO registers space in
it (I am not sure I can find the bug report - it has been a while,
remember the issue with non-ECAM config space and where to add the MMIO
resource required to "extend" MCFG config space ? I will never forget
that :))."

It would be very helpful if we could find someone running any of those
early platforms, so that we can ask them to dump the ACPI tables and
test things out.

Shawn
Patch

diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
index 2276689b5411..2ff2f3befa76 100644
--- a/arch/arm64/kernel/pci.c
+++ b/arch/arm64/kernel/pci.c
@@ -109,16 +109,44 @@  int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
 	return 0;
 }
 
+#define QCOM_PCI_QUIRK "Host bridge windows in PNP0A03 _CRS"
+
+/*
+ * Ideally DSDT (Differentiated System Description Table) should be used to
+ * match the platforms, as the quirk is in there. But devices from different
+ * manufacturers usually have different oem_id and oem_table_id in DSDT,
+ * so matching DSDT makes the list a maintenance burden.  As a compromise,
+ * PPTT (Processor Properties Topology Table) is used instead to work
+ * around this quirk for most Qualcomm WoA (Windows on ARM) devices.
+ */
+static const struct acpi_platform_list qcom_platlist[] = {
+	{ "QCOM  ", "QCOMEDK2", 0, ACPI_SIG_PPTT, all_versions, QCOM_PCI_QUIRK },
+	{ }
+};
+
 static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci)
 {
 	struct resource_entry *entry, *tmp;
 	int status;
+	int idx;
 
 	status = acpi_pci_probe_root_resources(ci);
+
+	/*
+	 * Instead of describing host bridge registers in PNP0A03 _CRS
+	 * resources, Qualcomm WoA devices describe host bridge windows in
+	 * there.  We do not want to destroy the resources on these platforms.
+	 */
+	idx = acpi_match_platform_list(qcom_platlist);
+	if (idx >= 0)
+		goto done;
+
 	resource_list_for_each_entry_safe(entry, tmp, &ci->resources) {
 		if (!(entry->res->flags & IORESOURCE_WINDOW))
 			resource_list_destroy_entry(entry);
 	}
+
+done:
 	return status;
 }
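
For readers unfamiliar with platform-list matching, the following is a
rough userspace sketch of the idea behind the qcom_platlist[] lookup:
find a table with the requested signature and compare its OEM ID and
OEM table ID against each list entry.  All sketch_* names are
hypothetical, the revision predicate (all_versions in the patch) is left
out, and the sketch returns -1 on a miss, which is not necessarily what
acpi_match_platform_list() returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_SIG_SIZE           4
#define SKETCH_OEM_ID_SIZE        6
#define SKETCH_OEM_TABLE_ID_SIZE  8

/* Hypothetical subset of an ACPI table header (fields are not NUL-terminated). */
struct sketch_table_header {
	char signature[SKETCH_SIG_SIZE];
	char oem_id[SKETCH_OEM_ID_SIZE];
	char oem_table_id[SKETCH_OEM_TABLE_ID_SIZE];
};

/* Hypothetical list entry; an entry with an empty oem_id ends the list. */
struct sketch_platform {
	char oem_id[SKETCH_OEM_ID_SIZE + 1];
	char oem_table_id[SKETCH_OEM_TABLE_ID_SIZE + 1];
	char table[SKETCH_SIG_SIZE + 1];
};

/* Copy fixed-width, space-padded identifiers into a header. */
static void sketch_fill_header(struct sketch_table_header *h, const char *sig,
			       const char *oem_id, const char *oem_table_id)
{
	assert(strlen(sig) == SKETCH_SIG_SIZE);
	assert(strlen(oem_id) == SKETCH_OEM_ID_SIZE);
	assert(strlen(oem_table_id) == SKETCH_OEM_TABLE_ID_SIZE);
	memcpy(h->signature, sig, SKETCH_SIG_SIZE);
	memcpy(h->oem_id, oem_id, SKETCH_OEM_ID_SIZE);
	memcpy(h->oem_table_id, oem_table_id, SKETCH_OEM_TABLE_ID_SIZE);
}

/*
 * Return the index of the first list entry whose table signature, OEM ID
 * and OEM table ID all match one of the given tables, or -1 if none does.
 */
static int sketch_match_platform(const struct sketch_platform *list,
				 const struct sketch_table_header *tables,
				 size_t ntables)
{
	int idx;
	size_t t;

	for (idx = 0; list[idx].oem_id[0] != '\0'; idx++) {
		for (t = 0; t < ntables; t++) {
			if (memcmp(tables[t].signature, list[idx].table,
				   SKETCH_SIG_SIZE))
				continue;
			if (memcmp(tables[t].oem_id, list[idx].oem_id,
				   SKETCH_OEM_ID_SIZE))
				continue;
			if (memcmp(tables[t].oem_table_id, list[idx].oem_table_id,
				   SKETCH_OEM_TABLE_ID_SIZE))
				continue;
			return idx;
		}
	}
	return -1;
}
```

With a list holding { "QCOM  ", "QCOMEDK2", "PPTT" }, a system whose PPTT
carries those identifiers matches at index 0 regardless of what its DSDT
says, which is the compromise described in the v2 changelog.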