[RFC] Revert "arm64: PCI: Exclude ACPI "consumer" resources from host bridge windows"

Message ID 20210510234020.1330087-1-luzmaximilian@gmail.com (mailing list archive)
State New, archived
Delegated to: Lorenzo Pieralisi

Commit Message

Maximilian Luz May 10, 2021, 11:40 p.m. UTC
The Microsoft Surface Pro X has host bridges defined as

    Name (_HID, EisaId ("PNP0A08") /* PCI Express Bus */)  // _HID: Hardware ID
    Name (_CID, EisaId ("PNP0A03") /* PCI Bus */)  // _CID: Compatible ID

    Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
    {
        Name (RBUF, ResourceTemplate ()
        {
            Memory32Fixed (ReadWrite,
                0x60200000,         // Address Base
                0x01DF0000,         // Address Length
                )
            WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
                0x0000,             // Granularity
                0x0000,             // Range Minimum
                0x0001,             // Range Maximum
                0x0000,             // Translation Offset
                0x0002,             // Length
                ,, )
        })
        Return (RBUF) /* \_SB_.PCI0._CRS.RBUF */
    }

meaning that the memory resources aren't (explicitly) defined as
"producers", i.e. host bridge windows.

Commit 8fd4391ee717 ("arm64: PCI: Exclude ACPI "consumer" resources from
host bridge windows") introduced a check that removes such resources,
causing BAR allocation failures later on:

    [ 0.150731] pci 0002:00:00.0: BAR 14: no space for [mem size 0x00100000]
    [ 0.150744] pci 0002:00:00.0: BAR 14: failed to assign [mem size 0x00100000]
    [ 0.150758] pci 0002:01:00.0: BAR 0: no space for [mem size 0x00004000 64bit]
    [ 0.150769] pci 0002:01:00.0: BAR 0: failed to assign [mem size 0x00004000 64bit]

This eventually prevents the PCIe NVMe drive from being accessible.
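
To make the failure mode concrete, here is a minimal user-space sketch of the filtering step, not kernel code. The struct, the flag values, and the helper names are hypothetical stand-ins (RES_WINDOW plays the role of IORESOURCE_WINDOW); only the addresses come from the _CRS above. It assumes the Memory32Fixed entry arrives without the window flag while the WordBusNumber entry, being a ResourceProducer, carries it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel resource flags; values are illustrative. */
#define RES_MEM    0x1u
#define RES_BUS    0x2u
#define RES_WINDOW 0x4u /* plays the role of IORESOURCE_WINDOW ("producer") */

struct res {
    uint64_t start;
    uint64_t end;   /* inclusive */
    unsigned flags;
};

/* Fill 'out' with the two _CRS entries quoted above; returns the count. */
static size_t surface_pro_x_crs(struct res out[2])
{
    /* Memory32Fixed: base 0x60200000, length 0x01DF0000, no producer bit. */
    out[0].start = 0x60200000;
    out[0].end   = out[0].start + 0x01DF0000 - 1;
    out[0].flags = RES_MEM;

    /* WordBusNumber (ResourceProducer): buses 0x00..0x01. */
    out[1].start = 0x0000;
    out[1].end   = 0x0001;
    out[1].flags = RES_BUS | RES_WINDOW;
    return 2;
}

/* Model of the check added by 8fd4391ee717: drop every non-window entry.
 * Compacts the array in place and returns the new count. */
static size_t keep_windows_only(struct res *r, size_t n)
{
    size_t kept = 0;

    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (r[i].flags & RES_WINDOW)
            r[kept++] = r[i];
    return kept;
}

/* Bytes of memory space left in the list that BARs could be placed in. */
static uint64_t mem_bytes(const struct res *r, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        if (r[i].flags & RES_MEM)
            total += r[i].end - r[i].start + 1;
    return total;
}
```

In this model the filter leaves only the bus range, so no memory space remains for the 0x00100000-byte bridge window in the log, whereas the unfiltered list offers 0x01DF0000 bytes.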

On x86 we already skip the check for producer/window due to some history
with negligent firmware. It seems that Microsoft is intent on continuing
that history on their ARM devices, so let's drop that check here too.

Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
---

Please note: I am not sure if this is the right way to fix that, e.g. I
don't know if any additional checks like on IA64 or x86 might be
required instead, or if this might break things on other devices. So
please consider this more as a bug report rather than a fix.

Apologies for the re-send, I seem to have unintentionally added a blank
line before the subject.

---
 arch/arm64/kernel/pci.c | 14 --------------
 1 file changed, 14 deletions(-)

Comments

Will Deacon May 26, 2021, 8:58 p.m. UTC | #1
On Tue, May 11, 2021 at 01:40:20AM +0200, Maximilian Luz wrote:
> The Microsoft Surface Pro X has host bridges defined as
> 
>     Name (_HID, EisaId ("PNP0A08") /* PCI Express Bus */)  // _HID: Hardware ID
>     Name (_CID, EisaId ("PNP0A03") /* PCI Bus */)  // _CID: Compatible ID
> 
>     Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
>     {
>         Name (RBUF, ResourceTemplate ()
>         {
>             Memory32Fixed (ReadWrite,
>                 0x60200000,         // Address Base
>                 0x01DF0000,         // Address Length
>                 )
>             WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
>                 0x0000,             // Granularity
>                 0x0000,             // Range Minimum
>                 0x0001,             // Range Maximum
>                 0x0000,             // Translation Offset
>                 0x0002,             // Length
>                 ,, )
>         })
>         Return (RBUF) /* \_SB_.PCI0._CRS.RBUF */
>     }
> 
> meaning that the memory resources aren't (explicitly) defined as
> "producers", i.e. host bridge windows.
> 
> Commit 8fd4391ee717 ("arm64: PCI: Exclude ACPI "consumer" resources from
> host bridge windows") introduced a check that removes such resources,
> causing BAR allocation failures later on:
> 
>     [ 0.150731] pci 0002:00:00.0: BAR 14: no space for [mem size 0x00100000]
>     [ 0.150744] pci 0002:00:00.0: BAR 14: failed to assign [mem size 0x00100000]
>     [ 0.150758] pci 0002:01:00.0: BAR 0: no space for [mem size 0x00004000 64bit]
>     [ 0.150769] pci 0002:01:00.0: BAR 0: failed to assign [mem size 0x00004000 64bit]
> 
> This eventually prevents the PCIe NVME drive from being accessible.
> 
> On x86 we already skip the check for producer/window due to some history
> with negligent firmware. It seems that Microsoft is intent on continuing
> that history on their ARM devices, so let's drop that check here too.
> 
> Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
> ---
> 
> Please note: I am not sure if this is the right way to fix that, e.g. I
> don't know if any additional checks like on IA64 or x86 might be
> required instead, or if this might break things on other devices. So
> please consider this more as a bug report rather than a fix.
> 
> Apologies for the re-send, I seem to have unintentionally added a blank
> line before the subject.
> 
> ---
>  arch/arm64/kernel/pci.c | 14 --------------
>  1 file changed, 14 deletions(-)

Adding Lorenzo to cc, as he'll have a much better idea about this than me.

This is:

https://lore.kernel.org/r/20210510234020.1330087-1-luzmaximilian@gmail.com

Will

Lorenzo Pieralisi May 27, 2021, 9:32 a.m. UTC | #2
On Wed, May 26, 2021 at 09:58:36PM +0100, Will Deacon wrote:
> On Tue, May 11, 2021 at 01:40:20AM +0200, Maximilian Luz wrote:
> > The Microsoft Surface Pro X has host bridges defined as
> > 
> >     Name (_HID, EisaId ("PNP0A08") /* PCI Express Bus */)  // _HID: Hardware ID
> >     Name (_CID, EisaId ("PNP0A03") /* PCI Bus */)  // _CID: Compatible ID
> > 
> >     Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
> >     {
> >         Name (RBUF, ResourceTemplate ()
> >         {
> >             Memory32Fixed (ReadWrite,
> >                 0x60200000,         // Address Base
> >                 0x01DF0000,         // Address Length
> >                 )
> >             WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
> >                 0x0000,             // Granularity
> >                 0x0000,             // Range Minimum
> >                 0x0001,             // Range Maximum
> >                 0x0000,             // Translation Offset
> >                 0x0002,             // Length
> >                 ,, )
> >         })
> >         Return (RBUF) /* \_SB_.PCI0._CRS.RBUF */
> >     }
> > 
> > meaning that the memory resources aren't (explicitly) defined as
> > "producers", i.e. host bridge windows.
> > 
> > Commit 8fd4391ee717 ("arm64: PCI: Exclude ACPI "consumer" resources from
> > host bridge windows") introduced a check that removes such resources,
> > causing BAR allocation failures later on:
> > 
> >     [ 0.150731] pci 0002:00:00.0: BAR 14: no space for [mem size 0x00100000]
> >     [ 0.150744] pci 0002:00:00.0: BAR 14: failed to assign [mem size 0x00100000]
> >     [ 0.150758] pci 0002:01:00.0: BAR 0: no space for [mem size 0x00004000 64bit]
> >     [ 0.150769] pci 0002:01:00.0: BAR 0: failed to assign [mem size 0x00004000 64bit]
> > 
> > This eventually prevents the PCIe NVME drive from being accessible.
> > 
> > On x86 we already skip the check for producer/window due to some history
> > with negligent firmware. It seems that Microsoft is intent on continuing
> > that history on their ARM devices, so let's drop that check here too.
> > 
> > Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
> > ---
> > 
> > Please note: I am not sure if this is the right way to fix that, e.g. I
> > don't know if any additional checks like on IA64 or x86 might be
> > required instead, or if this might break things on other devices. So
> > please consider this more as a bug report rather than a fix.
> > 
> > Apologies for the re-send, I seem to have unintentionally added a blank
> > line before the subject.
> > 
> > ---
> >  arch/arm64/kernel/pci.c | 14 --------------
> >  1 file changed, 14 deletions(-)
> 
> Adding Lorenzo to cc, as he'll have a much better idea about this than me.
> 
> This is:
> 
> https://lore.kernel.org/r/20210510234020.1330087-1-luzmaximilian@gmail.com

Sigh. We can't apply this patch since it would trigger regressions on
other platforms (IIUC the root complex registers would end up in the
host bridge memory windows).

I am not keen on reverting commit 8fd4391ee717 because it does the
right thing.

I think this requires a quirk and immediate reporting to Microsoft.

Bjorn, what are your thoughts on this?

Thanks,
Lorenzo

Maximilian Luz May 27, 2021, 11:31 a.m. UTC | #3
On 5/27/21 11:32 AM, Lorenzo Pieralisi wrote:
> On Wed, May 26, 2021 at 09:58:36PM +0100, Will Deacon wrote:
>> On Tue, May 11, 2021 at 01:40:20AM +0200, Maximilian Luz wrote:
>>> The Microsoft Surface Pro X has host bridges defined as
>>>
>>>      Name (_HID, EisaId ("PNP0A08") /* PCI Express Bus */)  // _HID: Hardware ID
>>>      Name (_CID, EisaId ("PNP0A03") /* PCI Bus */)  // _CID: Compatible ID
>>>
>>>      Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
>>>      {
>>>          Name (RBUF, ResourceTemplate ()
>>>          {
>>>              Memory32Fixed (ReadWrite,
>>>                  0x60200000,         // Address Base
>>>                  0x01DF0000,         // Address Length
>>>                  )
>>>              WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
>>>                  0x0000,             // Granularity
>>>                  0x0000,             // Range Minimum
>>>                  0x0001,             // Range Maximum
>>>                  0x0000,             // Translation Offset
>>>                  0x0002,             // Length
>>>                  ,, )
>>>          })
>>>          Return (RBUF) /* \_SB_.PCI0._CRS.RBUF */
>>>      }
>>>
>>> meaning that the memory resources aren't (explicitly) defined as
>>> "producers", i.e. host bridge windows.
>>>
>>> Commit 8fd4391ee717 ("arm64: PCI: Exclude ACPI "consumer" resources from
>>> host bridge windows") introduced a check that removes such resources,
>>> causing BAR allocation failures later on:
>>>
>>>      [ 0.150731] pci 0002:00:00.0: BAR 14: no space for [mem size 0x00100000]
>>>      [ 0.150744] pci 0002:00:00.0: BAR 14: failed to assign [mem size 0x00100000]
>>>      [ 0.150758] pci 0002:01:00.0: BAR 0: no space for [mem size 0x00004000 64bit]
>>>      [ 0.150769] pci 0002:01:00.0: BAR 0: failed to assign [mem size 0x00004000 64bit]
>>>
>>> This eventually prevents the PCIe NVME drive from being accessible.
>>>
>>> On x86 we already skip the check for producer/window due to some history
>>> with negligent firmware. It seems that Microsoft is intent on continuing
>>> that history on their ARM devices, so let's drop that check here too.
>>>
>>> Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
>>> ---
>>>
>>> Please note: I am not sure if this is the right way to fix that, e.g. I
>>> don't know if any additional checks like on IA64 or x86 might be
>>> required instead, or if this might break things on other devices. So
>>> please consider this more as a bug report rather than a fix.
>>>
>>> Apologies for the re-send, I seem to have unintentionally added a blank
>>> line before the subject.
>>>
>>> ---
>>>   arch/arm64/kernel/pci.c | 14 --------------
>>>   1 file changed, 14 deletions(-)
>>
>> Adding Lorenzo to cc, as he'll have a much better idea about this than me.
>>
>> This is:
>>
>> https://lore.kernel.org/r/20210510234020.1330087-1-luzmaximilian@gmail.com
> 
> Sigh. We can't apply this patch since it would trigger regressions on
> other platforms (IIUC the root complex registers would end up in the
> host bridge memory windows).
> 
> I am not keen on reverting commit 8fd4391ee717 because it does the
> right thing.
> 
> I think this requires a quirk and immediate reporting to Microsoft.

Since I wrote this, I have found other arm64 devices with the same
problem. I no longer think this is exclusive to Microsoft; it looks
more like a Qualcomm problem (a Qualcomm SoC seems to be the common
thread). See e.g. the DSDTs in [1]. So it should probably be reported
to them.

Regards,
Max

[1]: https://github.com/aarch64-laptops/build/tree/dfce25bc12655713c7e1e0422b191e9c944e4fb2/misc

Bjorn Helgaas May 27, 2021, 4:34 p.m. UTC | #4
On Thu, May 27, 2021 at 10:32:00AM +0100, Lorenzo Pieralisi wrote:
> On Wed, May 26, 2021 at 09:58:36PM +0100, Will Deacon wrote:
> > On Tue, May 11, 2021 at 01:40:20AM +0200, Maximilian Luz wrote:
> > > The Microsoft Surface Pro X has host bridges defined as
> > > 
> > >     Name (_HID, EisaId ("PNP0A08") /* PCI Express Bus */)  // _HID: Hardware ID
> > >     Name (_CID, EisaId ("PNP0A03") /* PCI Bus */)  // _CID: Compatible ID
> > > 
> > >     Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
> > >     {
> > >         Name (RBUF, ResourceTemplate ()
> > >         {
> > >             Memory32Fixed (ReadWrite,
> > >                 0x60200000,         // Address Base
> > >                 0x01DF0000,         // Address Length
> > >                 )
> > >             WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
> > >                 0x0000,             // Granularity
> > >                 0x0000,             // Range Minimum
> > >                 0x0001,             // Range Maximum
> > >                 0x0000,             // Translation Offset
> > >                 0x0002,             // Length
> > >                 ,, )
> > >         })
> > >         Return (RBUF) /* \_SB_.PCI0._CRS.RBUF */
> > >     }
> > > 
> > > meaning that the memory resources aren't (explicitly) defined as
> > > "producers", i.e. host bridge windows.
> > > 
> > > Commit 8fd4391ee717 ("arm64: PCI: Exclude ACPI "consumer" resources from
> > > host bridge windows") introduced a check that removes such resources,
> > > causing BAR allocation failures later on:
> > > 
> > >     [ 0.150731] pci 0002:00:00.0: BAR 14: no space for [mem size 0x00100000]
> > >     [ 0.150744] pci 0002:00:00.0: BAR 14: failed to assign [mem size 0x00100000]
> > >     [ 0.150758] pci 0002:01:00.0: BAR 0: no space for [mem size 0x00004000 64bit]
> > >     [ 0.150769] pci 0002:01:00.0: BAR 0: failed to assign [mem size 0x00004000 64bit]
> > > 
> > > This eventually prevents the PCIe NVME drive from being accessible.
> > > 
> > > On x86 we already skip the check for producer/window due to some history
> > > with negligent firmware. It seems that Microsoft is intent on continuing
> > > that history on their ARM devices, so let's drop that check here too.
> > > 
> > > Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
> > > ---
> > > 
> > > Please note: I am not sure if this is the right way to fix that, e.g. I
> > > don't know if any additional checks like on IA64 or x86 might be
> > > required instead, or if this might break things on other devices. So
> > > please consider this more as a bug report rather than a fix.
> > > 
> > > Apologies for the re-send, I seem to have unintentionally added a blank
> > > line before the subject.
> > > 
> > > ---
> > >  arch/arm64/kernel/pci.c | 14 --------------
> > >  1 file changed, 14 deletions(-)
> > 
> > Adding Lorenzo to cc, as he'll have a much better idea about this than me.
> > 
> > This is:
> > 
> > https://lore.kernel.org/r/20210510234020.1330087-1-luzmaximilian@gmail.com
> 
> Sigh. We can't apply this patch since it would trigger regressions on
> other platforms (IIUC the root complex registers would end up in the
> host bridge memory windows).
> 
> I am not keen on reverting commit 8fd4391ee717 because it does the
> right thing.
> 
> I think this requires a quirk and immediate reporting to Microsoft.
> 
> Bjorn, what are your thoughts on this ?

In retrospect, I think 8fd4391ee717 (which I wrote), was probably a
mistake.

Sure, it's a nice idea to have PNP0A03 _CRS methods that work nicely
as designed, by describing host bridge registers as "consumer"
resources and host bridge windows as "producer" resources, instead of
having the bridge registers in _CRS of an unrelated PNP0C02 device.

But realistically, the PNP0A03/PNP0C02 issue is a solved problem, even
though it's ugly, and I'm not sure why I thought Microsoft would see
value in doing this differently on arm64 than on x86 and ia64.

What would break if we reverted 8fd4391ee717?  I guess any arm64
platforms that described host bridge register space in PNP0A03 _CRS
"consumer" resources?  And Windows probably doesn't work or isn't
supported on those platforms?

Bjorn

Lorenzo Pieralisi May 27, 2021, 4:56 p.m. UTC | #5
On Thu, May 27, 2021 at 11:34:52AM -0500, Bjorn Helgaas wrote:

[...]

> > > https://lore.kernel.org/r/20210510234020.1330087-1-luzmaximilian@gmail.com
> > 
> > Sigh. We can't apply this patch since it would trigger regressions on
> > other platforms (IIUC the root complex registers would end up in the
> > host bridge memory windows).
> > 
> > I am not keen on reverting commit 8fd4391ee717 because it does the
> > right thing.
> > 
> > I think this requires a quirk and immediate reporting to Microsoft.
> > 
> > Bjorn, what are your thoughts on this ?
> 
> In retrospect, I think 8fd4391ee717 (which I wrote), was probably a
> mistake.
> 
> Sure, it's a nice idea to have PNP0A03 _CRS methods that work nicely
> as designed, by describing host bridge registers as "consumer"
> resources and host bridge windows as "producer" resources, instead of
> having the bridge registers in _CRS of an unrelated PNP0C02 device.
> 
> But realistically, the PNP0A03/PNP0C02 issue is a solved problem, even
> though it's ugly, and I'm not sure why I thought Microsoft would see
> value in doing this differently on arm64 than on x86 and ia64.

We hoped we could comply with the specs, given that we were starting
from a clean slate (and not from cut-and-pasted ACPI tables).

> What would break if we reverted 8fd4391ee717?  I guess any arm64
> platforms that described host bridge register space in PNP0A03 _CRS
> "consumer" resources ?

Yes. We would end up with that register space in the host bridge memory
windows - this does not sound right.
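
As a rough illustration of that concern, the sketch below models a hypothetical spec-following _CRS with one "consumer" entry for the bridge registers and one "producer" window. All addresses, flag values, and names are made up for the example; this is user-space C, not the kernel's resource handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; RES_WINDOW stands in for IORESOURCE_WINDOW. */
#define RES_MEM    0x1u
#define RES_WINDOW 0x2u

struct res {
    uint64_t start;
    uint64_t end;   /* inclusive */
    unsigned flags;
};

/* Made-up _CRS of a platform that follows the producer/consumer model. */
static const struct res spec_crs[] = {
    { 0x40000000, 0x40000fff, RES_MEM },              /* consumer: bridge registers */
    { 0x50000000, 0x5fffffff, RES_MEM | RES_WINDOW }, /* producer: bridge window */
};
#define SPEC_CRS_LEN (sizeof(spec_crs) / sizeof(spec_crs[0]))

/* Would 'addr' lie inside space that BAR assignment may draw from?
 * filter_consumers == true models the behaviour with 8fd4391ee717 applied,
 * false models the behaviour after a revert. */
static bool may_hold_bar(const struct res *r, size_t n, uint64_t addr,
                         bool filter_consumers)
{
    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!(r[i].flags & RES_MEM))
            continue;
        if (filter_consumers && !(r[i].flags & RES_WINDOW))
            continue;
        if (addr >= r[i].start && addr <= r[i].end)
            return true;
    }
    return false;
}
```

With the filter, the register page at 0x40000000 is never offered for BARs; without it, the same page is treated as window space, which is the regression described above.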

> And Windows probably doesn't work or isn't supported on those
> platforms?

By the look of it, the answer is yes: Windows was not bootstrapped on
those platforms, given that I *assume* Windows does not discriminate
between producer and consumer resources at all.

Lorenzo

Patch

diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
index 1006ed2d7c60..80f87fe0a2b8 100644
--- a/arch/arm64/kernel/pci.c
+++ b/arch/arm64/kernel/pci.c
@@ -94,19 +94,6 @@  int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
 	return 0;
 }
 
-static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci)
-{
-	struct resource_entry *entry, *tmp;
-	int status;
-
-	status = acpi_pci_probe_root_resources(ci);
-	resource_list_for_each_entry_safe(entry, tmp, &ci->resources) {
-		if (!(entry->res->flags & IORESOURCE_WINDOW))
-			resource_list_destroy_entry(entry);
-	}
-	return status;
-}
-
 /*
  * Lookup the bus range for the domain in MCFG, and set up config space
  * mapping.
@@ -184,7 +171,6 @@  struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
 	}
 
 	root_ops->release_info = pci_acpi_generic_release_info;
-	root_ops->prepare_resources = pci_acpi_root_prepare_resources;
 	root_ops->pci_ops = (struct pci_ops *)&ri->cfg->ops->pci_ops;
 	bus = acpi_pci_root_create(root, root_ops, &ri->common, ri->cfg);
 	if (!bus)