[v7,1/1] x86/PCI: Ignore E820 reservations for bridge windows on newer systems

Message ID: 20220505152016.5059-2-hdegoede@redhat.com
State: Superseded
Delegated to: Bjorn Helgaas
Series: x86/PCI: Ignore E820 reservations for bridge windows on newer systems

Commit Message

Hans de Goede May 5, 2022, 3:20 p.m. UTC
Some BIOS-es contain bugs where they add addresses which are already
used in some other manner to the PCI host bridge window returned by
the ACPI _CRS method. To avoid this, Linux has by default excluded
E820 reservations when allocating addresses since 2010, see:
commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address
space").

Recently (2019) some systems have shown up with E820 reservations which
cover the entire _CRS returned PCI bridge memory window, causing all
attempts to assign memory to PCI BARs which have not been set up by the
BIOS to fail. For example, here are the relevant dmesg bits from a
Lenovo IdeaPad 3 15IIL 81WE:

 [mem 0x000000004bc50000-0x00000000cfffffff] reserved
 pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]

The ACPI specifications appear to allow this new behavior:

The relationship between E820 and ACPI _CRS is not really very clear.
ACPI v6.3, sec 15, table 15-374, says AddressRangeReserved means:

  This range of addresses is in use or reserved by the system and is
  not to be included in the allocatable memory pool of the operating
  system's memory manager.

and it may be used when:

  The address range is in use by a memory-mapped system device.

Furthermore, sec 15.2 says:

  Address ranges defined for baseboard memory-mapped I/O devices, such
  as APICs, are returned as reserved.

A PCI host bridge qualifies as a baseboard memory-mapped I/O device,
and its apertures are in use and certainly should not be included in
the general allocatable pool, so the fact that some BIOS-es report
the PCI aperture as "reserved" in E820 doesn't seem like a BIOS bug.

So it seems that excluding E820 reserved addresses is a mistake.

Ideally Linux would fully stop excluding E820 reserved addresses,
but then various old systems will regress.
Instead keep the old behavior for old systems, while ignoring
the E820 reservations for any systems from now on.

Old systems are defined here as BIOS year < 2018. This was chosen to
make sure that pci_use_e820 will not be set on the currently affected
systems; the oldest known one is from 2019.

Testing has shown that some newer systems also have a bad _CRS return.
The pci_crs_quirks DMI table is used to keep excluding E820 reservations
from the bridge window on these systems.

Also add pci=no_e820 and pci=use_e820 options to allow overriding
the BIOS year + DMI matching logic.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
BugLink: https://bugs.launchpad.net/bugs/1878279
BugLink: https://bugs.launchpad.net/bugs/1931715
BugLink: https://bugs.launchpad.net/bugs/1932069
BugLink: https://bugs.launchpad.net/bugs/1921649
Cc: Benoit Grégoire <benoitg@coeus.ca>
Cc: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
Changes in v7:
- Re-add the pci=use_e820 and pci=no_e820 kernel cmdline options since it
  turns out that some newer laptops still need pci=use_e820
- Add DMI quirks for known newer laptops which need pci=use_e820

Changes in v6:
- Remove the possibility to change the behavior from the commandline
  because of worries that users may use this to paper over other problems

Changes in v5:
- Drop mention of Windows behavior from the commit msg, replace with a
  reference to the specs
- Improve documentation in Documentation/admin-guide/kernel-parameters.txt
- Reword the big comment added, use "PCI host bridge window" in it and drop
  all references to Windows

Changes in v4:
- Rewrap the big comment block to fit in 80 columns
- Add Rafael's Acked-by
- Add Cc: stable@vger.kernel.org

Changes in v3:
- Commit msg tweaks (drop dmesg timestamps, typo fix)
- Use "defined(CONFIG_...)" instead of "defined CONFIG_..."
- Add Mika's Reviewed-by

Changes in v2:
- Replace the per model DMI quirk approach with disabling E820 reservations
  checking for all systems with a BIOS year >= 2018
- Add documentation for the new kernel-parameters to
  Documentation/admin-guide/kernel-parameters.txt
---
 .../admin-guide/kernel-parameters.txt         |  9 +++
 arch/x86/include/asm/pci_x86.h                |  2 +
 arch/x86/pci/acpi.c                           | 74 ++++++++++++++++++-
 arch/x86/pci/common.c                         |  6 ++
 4 files changed, 89 insertions(+), 2 deletions(-)
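
The selection logic described in the commit message (BIOS year cut-off,
DMI quirks, command-line override) can be sketched roughly as follows.
This is a standalone approximation, not the code from the patch: the enum,
the function name and the precedence given to the overrides are
assumptions made for illustration. Only the "year >= 0 && year < 2018"
test is taken from the patch itself.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the pci=use_e820 / pci=no_e820 options. */
enum e820_override {
	E820_DEFAULT,      /* no option given on the command line */
	E820_FORCE_USE,    /* pci=use_e820 */
	E820_FORCE_IGNORE, /* pci=no_e820 */
};

/*
 * bios_year:        BIOS release year, negative when it is unknown.
 * quirk_needs_e820: the model is listed as still needing E820 exclusion.
 * Returns true when E820 reservations should be excluded from the window.
 */
bool should_use_e820(int bios_year, bool quirk_needs_e820,
		     enum e820_override ovr)
{
	bool use_e820 = false;

	/* Old systems keep the behavior they have had since 2010. */
	if (bios_year >= 0 && bios_year < 2018)
		use_e820 = true;

	/* Newer systems known to have a bad _CRS keep it as well. */
	if (quirk_needs_e820)
		use_e820 = true;

	/* Assumption: an explicit command-line option wins over both. */
	if (ovr == E820_FORCE_USE)
		use_e820 = true;
	else if (ovr == E820_FORCE_IGNORE)
		use_e820 = false;

	return use_e820;
}
```

In this sketch a 2019 BIOS with no quirk and no option ends up ignoring
the E820 reservations, which is the case the patch is meant to fix.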

Comments

Bjorn Helgaas May 6, 2022, 4:51 p.m. UTC | #1
On Thu, May 05, 2022 at 05:20:16PM +0200, Hans de Goede wrote:

> +	 * Ideally Linux would fully stop using E820 reservations, but then
> +	 * various old systems will regress. Instead keep the old behavior for
> +	 * old systems + known to be broken newer systems in pci_crs_quirks.
> +	 */
> +	if (year >= 0 && year < 2018)
> +		pci_use_e820 = true;

How did you pick 2018?  Prior to this patch, we used E820 reservations
for all machines.  This patch would change that for 2019-2022
machines, so there's a risk of breaking some of them.

I'm hesitant about changing the behavior for machines already in the
field because if they were tested at all with Linux, it was without
this patch.  So I would lean toward preserving the current behavior
for BIOS year < 2023.

> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 9e1e6b8d8876..7e6f79aab6a8 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -595,6 +595,12 @@ char *__init pcibios_setup(char *str)
>  	} else if (!strcmp(str, "nocrs")) {
>  		pci_probe |= PCI_ROOT_NO_CRS;
>  		return NULL;
> +	} else if (!strcmp(str, "use_e820")) {
> +		pci_probe |= PCI_USE_E820;

I think we should add_taint(TAINT_FIRMWARE_WORKAROUND) for both these
cases.

We probably should do it for *all* the parameters here, but that would
be a separate discussion.

> +		return NULL;
> +	} else if (!strcmp(str, "no_e820")) {
> +		pci_probe |= PCI_NO_E820;
> +		return NULL;
>  #ifdef CONFIG_PHYS_ADDR_T_64BIT
>  	} else if (!strcmp(str, "big_root_window")) {
>  		pci_probe |= PCI_BIG_ROOT_WINDOW;
> -- 
> 2.36.0
>
Hans de Goede May 7, 2022, 10:09 a.m. UTC | #2
Hi Bjorn,

On 5/6/22 18:51, Bjorn Helgaas wrote:
> On Thu, May 05, 2022 at 05:20:16PM +0200, Hans de Goede wrote:
> 
>> +	 * Ideally Linux would fully stop using E820 reservations, but then
>> +	 * various old systems will regress. Instead keep the old behavior for
>> +	 * old systems + known to be broken newer systems in pci_crs_quirks.
>> +	 */
>> +	if (year >= 0 && year < 2018)
>> +		pci_use_e820 = true;
> 
> How did you pick 2018?  Prior to this patch, we used E820 reservations
> for all machines.  This patch would change that for 2019-2022
> machines, so there's a risk of breaking some of them.

Correct. I picked 2018 because the first devices where using E820
reservations are causing issues (i2c controller not getting resources
leading to non working touchpad / thunderbolt hotplug issues) have
BIOS dates starting in 2019. I added a year margin, so we could make
this 2019.

> I'm hesitant about changing the behavior for machines already in the
> field because if they were tested at all with Linux, it was without
> this patch.  So I would lean toward preserving the current behavior
> for BIOS year < 2023.

I see, I presume the idea is to then use DMI to disable E820 clipping
on current devices where this is known to cause problems ?

So for v8 I would:

1. Change the cut-off check to < 2023
2. Drop the DMI quirks I added for models which are known to need E820
   clipping hit by the < 2018 check
3. Add DMI quirks for models for which it is known that we must _not_
   do E820 clipping

Is this the direction you want to go / does that sound right?

Note the DMI list for 3. will initially very likely be incomplete, but
I can ask around for testing once we have settled on this approach
and do one or more follow up patches to extend the list.
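
As a rough illustration of the three steps above (a sketch only, not the
v8 code): the cut-off moves to 2023 and the quirk changes meaning, from
"needs clipping" to "must not clip". The macro and function names are
hypothetical, and leaving systems with an unknown BIOS year unclipped is
an assumption carried over from the "year >= 0" test in v7.

```c
#include <stdbool.h>

/* Cut-off year proposed in this thread; the macro name is made up. */
#define E820_CLIP_CUTOFF_YEAR 2023

/*
 * bios_year:     BIOS release year, negative when it is unknown.
 * quirk_no_clip: the model is listed as one where clipping must not be done.
 * Returns true when the bridge window should be clipped against E820.
 */
bool clip_with_e820(int bios_year, bool quirk_no_clip)
{
	/* Known-bad models are exempted regardless of their BIOS date. */
	if (quirk_no_clip)
		return false;

	/* Everything already in the field keeps the current behavior. */
	return bios_year >= 0 && bios_year < E820_CLIP_CUTOFF_YEAR;
}
```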


>> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
>> index 9e1e6b8d8876..7e6f79aab6a8 100644
>> --- a/arch/x86/pci/common.c
>> +++ b/arch/x86/pci/common.c
>> @@ -595,6 +595,12 @@ char *__init pcibios_setup(char *str)
>>  	} else if (!strcmp(str, "nocrs")) {
>>  		pci_probe |= PCI_ROOT_NO_CRS;
>>  		return NULL;
>> +	} else if (!strcmp(str, "use_e820")) {
>> +		pci_probe |= PCI_USE_E820;
> 
> I think we should add_taint(TAINT_FIRMWARE_WORKAROUND) for both these
> cases.

Ok, I'll add this for v8.

> 
> We probably should do it for *all* the parameters here, but that would
> be a separate discussion.
> 
>> +		return NULL;
>> +	} else if (!strcmp(str, "no_e820")) {
>> +		pci_probe |= PCI_NO_E820;
>> +		return NULL;
>>  #ifdef CONFIG_PHYS_ADDR_T_64BIT
>>  	} else if (!strcmp(str, "big_root_window")) {
>>  		pci_probe |= PCI_BIG_ROOT_WINDOW;
>> -- 
>> 2.36.0
>>
> 


Regards,

Hans
Bjorn Helgaas May 7, 2022, 3:31 p.m. UTC | #3
On Sat, May 07, 2022 at 12:09:03PM +0200, Hans de Goede wrote:
> Hi Bjorn,
> 
> On 5/6/22 18:51, Bjorn Helgaas wrote:
> > On Thu, May 05, 2022 at 05:20:16PM +0200, Hans de Goede wrote:
> > 
> >> +	 * Ideally Linux would fully stop using E820 reservations, but then
> >> +	 * various old systems will regress. Instead keep the old behavior for
> >> +	 * old systems + known to be broken newer systems in pci_crs_quirks.
> >> +	 */
> >> +	if (year >= 0 && year < 2018)
> >> +		pci_use_e820 = true;
> > 
> > How did you pick 2018?  Prior to this patch, we used E820 reservations
> > for all machines.  This patch would change that for 2019-2022
> > machines, so there's a risk of breaking some of them.
> 
> Correct. I picked 2018 because the first devices where using E820
> reservations are causing issues (i2c controller not getting resources
> leading to non working touchpad / thunderbolt hotplug issues) have
> BIOS dates starting in 2019. I added a year margin, so we could make
> this 2019.
> 
> > I'm hesitant about changing the behavior for machines already in the
> > field because if they were tested at all with Linux, it was without
> > this patch.  So I would lean toward preserving the current behavior
> > for BIOS year < 2023.
> 
> I see, I presume the idea is to then use DMI to disable E820 clipping
> on current devices where this is known to cause problems ?
> 
> So for v8 I would:
> 
> 1. Change the cut-off check to < 2023
> 2. Drop the DMI quirks I added for models which are known to need E820
>    clipping hit by the < 2018 check
> 3. Add DMI quirks for models for which it is known that we must _not_
>    do E820 clipping
> 
> Is this the direction you want to go / does that sound right?

Yes, I think that's what we should do.  All the machines in the field
will be unaffected, except that we add quirks for known problems.

Bjorn
Hans de Goede May 9, 2022, 5:33 p.m. UTC | #4
Hi Bjorn,

On 5/7/22 17:31, Bjorn Helgaas wrote:
> On Sat, May 07, 2022 at 12:09:03PM +0200, Hans de Goede wrote:
>> Hi Bjorn,
>>
>> On 5/6/22 18:51, Bjorn Helgaas wrote:
>>> On Thu, May 05, 2022 at 05:20:16PM +0200, Hans de Goede wrote:
>>>
>>>> +	 * Ideally Linux would fully stop using E820 reservations, but then
>>>> +	 * various old systems will regress. Instead keep the old behavior for
>>>> +	 * old systems + known to be broken newer systems in pci_crs_quirks.
>>>> +	 */
>>>> +	if (year >= 0 && year < 2018)
>>>> +		pci_use_e820 = true;
>>>
>>> How did you pick 2018?  Prior to this patch, we used E820 reservations
>>> for all machines.  This patch would change that for 2019-2022
>>> machines, so there's a risk of breaking some of them.
>>
>> Correct. I picked 2018 because the first devices where using E820
>> reservations are causing issues (i2c controller not getting resources
>> leading to non working touchpad / thunderbolt hotplug issues) have
>> BIOS dates starting in 2019. I added a year margin, so we could make
>> this 2019.
>>
>>> I'm hesitant about changing the behavior for machines already in the
>>> field because if they were tested at all with Linux, it was without
>>> this patch.  So I would lean toward preserving the current behavior
>>> for BIOS year < 2023.
>>
>> I see, I presume the idea is to then use DMI to disable E820 clipping
>> on current devices where this is known to cause problems ?
>>
>> So for v8 I would:
>>
>> 1. Change the cut-off check to < 2023
>> 2. Drop the DMI quirks I added for models which are known to need E820
>>    clipping hit by the < 2018 check
>> 3. Add DMI quirks for models for which it is known that we must _not_
>>    do E820 clipping
>>
>> Is this the direction you want to go / does that sound right?
> 
> Yes, I think that's what we should do.  All the machines in the field
> will be unaffected, except that we add quirks for known problems.

I've been working on this today. I've mostly been going through
all the existing bugs about this, to make a list of DMI matches
for devices on which we should _not_ do e820 clipping, to fix the
kernel being unable to assign BARs there.

I've found an interesting pattern there: all affected devices
are Lenovo devices with "IIL" in their device name, e.g.
"IdeaPad 3 15IIL05". I've looked up all Lenovo devices which
have "IIL" as part of their DMI_PRODUCT_VERSION string here:
https://github.com/linuxhw/DMI/

And then looked them up at https://linux-hardware.org/ and checked
their dmesg to see if they have the e820 problem other ideapads
have. I've gone through approx. half the list now and all
except one model seem to have the e820 problem.

So it looks like we might be able to match all problem models
with a single DMI match.
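
To make the idea concrete, here is a minimal userspace sketch of such a
match. It relies on substring matching, which is how the kernel's
DMI_MATCH() compares strings; the helper name is hypothetical, and the
"LENOVO" vendor string is an assumption about what these machines report
in DMI.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when the DMI strings look like one
 * of the Lenovo "IIL" models. Both comparisons are substring matches.
 */
bool is_lenovo_iil(const char *sys_vendor, const char *product_version)
{
	/* A missing DMI string can never match. */
	if (!sys_vendor || !product_version)
		return false;

	return strstr(sys_vendor, "LENOVO") != NULL &&
	       strstr(product_version, "IIL") != NULL;
}
```

Since one of the models checked so far does not show the problem, a bare
substring match like this might also catch a model that does not need it.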

So the problem seems to be limited to one specific device
series / range and this is making me have second thoughts
about doing a date based cut-off at all. Trying to switch
over any models which are new in 2023 is fine. The problem
with a DMI BIOS date approach, though, is that as soon as some
new management-engine CVE comes out we will also see BIOS
updates with a year of 2023 for many existing models, of
up to 3-4 years old at least, and chances are that some of
those older models getting BIOS updates will be bitten by
this change.

So as said I'm having second thoughts about the date based
approach. Bjorn, what do you think of just using DMI quirks
to disable e820 clipping on known problematic models and
otherwise keeping things as is?

Note I'm also fine with going with the 2023 date based
approach, I'm just wondering if that will be a good idea
and not something which we might regret later.

Regards,

Hans


p.s.

I've seen your email about the Acer laptop; I'll take
a look at that coming Wednesday.
Hans de Goede May 9, 2022, 6:21 p.m. UTC | #5
Hi,

On 5/9/22 19:33, Hans de Goede wrote:
> Hi Bjorn,
> 
> On 5/7/22 17:31, Bjorn Helgaas wrote:
>> On Sat, May 07, 2022 at 12:09:03PM +0200, Hans de Goede wrote:
>>> Hi Bjorn,
>>>
>>> On 5/6/22 18:51, Bjorn Helgaas wrote:
>>>> On Thu, May 05, 2022 at 05:20:16PM +0200, Hans de Goede wrote:
>>>>
>>>>> +	 * Ideally Linux would fully stop using E820 reservations, but then
>>>>> +	 * various old systems will regress. Instead keep the old behavior for
>>>>> +	 * old systems + known to be broken newer systems in pci_crs_quirks.
>>>>> +	 */
>>>>> +	if (year >= 0 && year < 2018)
>>>>> +		pci_use_e820 = true;
>>>>
>>>> How did you pick 2018?  Prior to this patch, we used E820 reservations
>>>> for all machines.  This patch would change that for 2019-2022
>>>> machines, so there's a risk of breaking some of them.
>>>
>>> Correct. I picked 2018 because the first devices where using E820
>>> reservations are causing issues (i2c controller not getting resources
>>> leading to non working touchpad / thunderbolt hotplug issues) have
>>> BIOS dates starting in 2019. I added a year margin, so we could make
>>> this 2019.
>>>
>>>> I'm hesitant about changing the behavior for machines already in the
>>>> field because if they were tested at all with Linux, it was without
>>>> this patch.  So I would lean toward preserving the current behavior
>>>> for BIOS year < 2023.
>>>
>>> I see, I presume the idea is to then use DMI to disable E820 clipping
>>> on current devices where this is known to cause problems ?
>>>
>>> So for v8 I would:
>>>
>>> 1. Change the cut-off check to < 2023
>>> 2. Drop the DMI quirks I added for models which are known to need E820
>>>    clipping hit by the < 2018 check
>>> 3. Add DMI quirks for models for which it is known that we must _not_
>>>    do E820 clipping
>>>
>>> Is this the direction you want to go / does that sound right?
>>
>> Yes, I think that's what we should do.  All the machines in the field
>> will be unaffected, except that we add quirks for known problems.
> 
> I've been working on this today. I've mostly been going through
> all the existing bugs about this, to make a list of DMI matches
> for devices on which we should _not_ do e820 clipping to fix the
> kernel being unable to assign BARs there.
> 
> I've found an interesting pattern there, all affected devices
> are Lenovo devices with "IIL" in their device name, e.g.:
> "IdeaPad 3 15IIL05". I've looked up all Lenovo devices which
> have "IIL" as part of their DMI_PRODUCT_VERSION string here:
> https://github.com/linuxhw/DMI/
> 
> And then looked them up at https://linux-hardware.org/ and checked
> their dmesg to see if they have the e820 problem other ideapads
> have. I've gone through approx. half the list now and all
> except one model seem to have the e820 problem.
> 
> So it looks like we might be able to match all problem models
> with a single DMI match.
> 
> So the problem seems to be limited to one specific device
> series / range and this is making me have second thoughts
> about doing a date based cut-off at all. Trying to switch
> over any models which are new in 2023 is fine, the problem
> with a DMI BIOS date approach though is that as soon as some
> new management-engine CVE comes out we will also see BIOS
> updates with a year of 2023 for many existing models, of
> up to 3-4 years old at least; and chances are that some of
> those older models getting BIOS updates will be bitten by
> this change.
> 
> So as said I'm having second thoughts about the date based
> approach. Bjorn, what do you think of just using DMI quirks
> to disable e820 clipping on known problematic models and
> otherwise keeping things as is ?
> 
> Note I'm also fine with going with the 2023 date based
> approach, I'm just wondering if that will be a good idea
> and not something which we might regret later.
> 
> Regards,
> 
> Hans
> 
> 
> p.s.
> 
> I've seen your email about the Acer laptop; I'll take
> a look at that coming Wednesday.

So I couldn't help myself and took a quick peek. This is
definitely the same issue as on the various Lenovo devices:
an E820 reserved region covers the entire bridge window, so
assigning unassigned BARs, such as the one for the touchpad's
I2C controller, fails.

Interestingly enough, this is the first non-Lenovo
machine with this issue I have heard about.

Regards,

Hans
Bjorn Helgaas May 9, 2022, 7:36 p.m. UTC | #6
On Mon, May 09, 2022 at 07:33:27PM +0200, Hans de Goede wrote:
> Hi Bjorn,
> 
> On 5/7/22 17:31, Bjorn Helgaas wrote:
> > On Sat, May 07, 2022 at 12:09:03PM +0200, Hans de Goede wrote:
> >> Hi Bjorn,
> >>
> >> On 5/6/22 18:51, Bjorn Helgaas wrote:
> >>> On Thu, May 05, 2022 at 05:20:16PM +0200, Hans de Goede wrote:
> >>>> Some BIOS-es contain bugs where they add addresses which are already
> >>>> used in some other manner to the PCI host bridge window returned by
> >>>> the ACPI _CRS method. To avoid this Linux by default excludes
> >>>> E820 reservations when allocating addresses since 2010, see:
> >>>> commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address
> >>>> space").
> >>>>
> >>>> Recently (2019) some systems have shown-up with E820 reservations which
> >>>> cover the entire _CRS returned PCI bridge memory window, causing all
> >>>> attempts to assign memory to PCI BARs which have not been setup by the
> >>>> BIOS to fail. For example here are the relevant dmesg bits from a
> >>>> Lenovo IdeaPad 3 15IIL 81WE:
> >>>>
> >>>>  [mem 0x000000004bc50000-0x00000000cfffffff] reserved
> >>>>  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
> >>>>
> >>>> The ACPI specifications appear to allow this new behavior:
> >>>>
> >>>> The relationship between E820 and ACPI _CRS is not really very clear.
> >>>> ACPI v6.3, sec 15, table 15-374, says AddressRangeReserved means:
> >>>>
> >>>>   This range of addresses is in use or reserved by the system and is
> >>>>   not to be included in the allocatable memory pool of the operating
> >>>>   system's memory manager.
> >>>>
> >>>> and it may be used when:
> >>>>
> >>>>   The address range is in use by a memory-mapped system device.
> >>>>
> >>>> Furthermore, sec 15.2 says:
> >>>>
> >>>>   Address ranges defined for baseboard memory-mapped I/O devices, such
> >>>>   as APICs, are returned as reserved.
> >>>>
> >>>> A PCI host bridge qualifies as a baseboard memory-mapped I/O device,
> >>>> and its apertures are in use and certainly should not be included in
> >>>> the general allocatable pool, so the fact that some BIOS-es reports
> >>>> the PCI aperture as "reserved" in E820 doesn't seem like a BIOS bug.
> >>>>
> >>>> So it seems that the excluding of E820 reserved addresses is a mistake.
> >>>>
> >>>> Ideally Linux would fully stop excluding E820 reserved addresses,
> >>>> but then various old systems will regress.
> >>>> Instead keep the old behavior for old systems, while ignoring
> >>>> the E820 reservations for any systems from now on.
> >>>>
> >>>> Old systems are defined here as BIOS year < 2018, this was chosen to
> >>>> make sure that pci_use_e820 will not be set on the currently affected
> >>>> systems, the oldest known one is from 2019.
> >>>>
> >>>> Testing has shown that some newer systems also have a bad _CRS return.
> >>>> The pci_crs_quirks DMI table is used to keep excluding E820 reservations
> >>>> from the bridge window on these systems.
> >>>>
> >>>> Also add pci=no_e820 and pci=use_e820 options to allow overriding
> >>>> the BIOS year + DMI matching logic.
> >>>>
> >>>> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
> >>>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
> >>>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
> >>>> BugLink: https://bugs.launchpad.net/bugs/1878279
> >>>> BugLink: https://bugs.launchpad.net/bugs/1931715
> >>>> BugLink: https://bugs.launchpad.net/bugs/1932069
> >>>> BugLink: https://bugs.launchpad.net/bugs/1921649
> >>>> Cc: Benoit Grégoire <benoitg@coeus.ca>
> >>>> Cc: Hui Wang <hui.wang@canonical.com>
> >>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> >>>
> >>>> +	 * Ideally Linux would fully stop using E820 reservations, but then
> >>>> +	 * various old systems will regress. Instead keep the old behavior for
> >>>> +	 * old systems + known to be broken newer systems in pci_crs_quirks.
> >>>> +	 */
> >>>> +	if (year >= 0 && year < 2018)
> >>>> +		pci_use_e820 = true;
> >>>
> >>> How did you pick 2018?  Prior to this patch, we used E820 reservations
> >>> for all machines.  This patch would change that for 2019-2022
> >>> machines, so there's a risk of breaking some of them.
> >>
> >> Correct. I picked 2018 because the first devices where using E820
> >> reservations are causing issues (i2c controller not getting resources
> >> leading to non working touchpad / thunderbolt hotplug issues) have
> >> BIOS dates starting in 2019. I added a year margin, so we could make
> >> this 2019.
> >>
> >>> I'm hesitant about changing the behavior for machines already in the
> >>> field because if they were tested at all with Linux, it was without
> >>> this patch.  So I would lean toward preserving the current behavior
> >>> for BIOS year < 2023.
> >>
> >> I see, I presume the idea is to then use DMI to disable E820 clipping
> >> on current devices where this is known to cause problems ?
> >>
> >> So for v8 I would:
> >>
> >> 1. Change the cut-off check to < 2023
> >> 2. Drop the DMI quirks I added for models which are known to need E820
> >>    clipping hit by the < 2018 check
> >> 3. Add DMI quirks for models for which it is known that we must _not_
> >>    do E820 clipping
> >>
> >> Is this the direction you want to go / does that sound right?
> > 
> > Yes, I think that's what we should do.  All the machines in the field
> > will be unaffected, except that we add quirks for known problems.
> 
> I've been working on this today. I've mostly been going through
> all the existing bugs about this, to make a list of DMI matches
> for devices on which we should _not_ do e820 clipping to fix the
> kernel being unable to assign BARs there.
> 
> I've found an interesting pattern there, all affected devices
> are Lenovo devices with "IIL" in their device name, e.g.:
> "IdeaPad 3 15IIL05". I've looked up all Lenovo devices which
> have "IIL" as part of their DMI_PRODUCT_VERSION string here:
> https://github.com/linuxhw/DMI/
> 
> And then looked them up at https://linux-hardware.org/ and checked
> their dmesg to see if they have the e820 problem other ideapads
> have. I've gone through approx. half the list now and all
> except one model seem to have the e820 problem.
> 
> So it looks like we might be able to match all problem models
> with a single DMI match.

That sounds reasonable.  I assume that if we skip the clipping for
every platform that matches "IIL", we can also add exceptions for the
inevitable "IIL" platforms that do need the clip?  E.g., specific
entries at the end of the list that override the previous generic
match?

> So the problem seems to be limited to one specific device
> series / range and this is making me have second thoughts
> about doing a date based cut-off at all. Trying to switch
> over any models which are new in 2023 is fine, the problem
> with a DMI BIOS date approach though is that as soon as some
> new management-engine CVE comes out we will also see BIOS
> updates with a year of 2023 for many existing models, of
> up to 3-4 years old at least; and chances are that some of
> those older models getting BIOS updates will be bitten by
> this change.

That's a good point and sounds fairly painful when that happens,
but I don't see a nice way out of this.

> So as said I'm having second thoughts about the date based
> approach. Bjorn, what do you think of just using DMI quirks
> to disable e820 clipping on known problematic models and
> otherwise keeping things as is ?

I think we need a long-term strategy that can be clearly expressed 
in a sentence or two and is consistent with the ACPI and PCI specs,
and I don't think the current strategy is it.  Clipping with E820
regions happened to work for some machines, but there's no reason to
think it will work in general.

> Note I'm also fine with going with the 2023 date based
> approach, I'm just wondering if that will be a good idea
> and not something which we might regret later.
> 
> Regards,
> 
> Hans
> 
> 
> p.s.
> 
> I've seen your email about the Acer laptop; I'll take
> a look at that coming Wednesday.
>
Hans de Goede May 12, 2022, 7:15 p.m. UTC | #7
Hi,

On 5/9/22 21:36, Bjorn Helgaas wrote:
> On Mon, May 09, 2022 at 07:33:27PM +0200, Hans de Goede wrote:
>> Hi Bjorn,
>>
>> On 5/7/22 17:31, Bjorn Helgaas wrote:
>>> On Sat, May 07, 2022 at 12:09:03PM +0200, Hans de Goede wrote:
>>>> Hi Bjorn,
>>>>
>>>> On 5/6/22 18:51, Bjorn Helgaas wrote:
>>>>> On Thu, May 05, 2022 at 05:20:16PM +0200, Hans de Goede wrote:
>>>>>> Some BIOS-es contain bugs where they add addresses which are already
>>>>>> used in some other manner to the PCI host bridge window returned by
>>>>>> the ACPI _CRS method. To avoid this Linux by default excludes
>>>>>> E820 reservations when allocating addresses since 2010, see:
>>>>>> commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address
>>>>>> space").
>>>>>>
>>>>>> Recently (2019) some systems have shown-up with E820 reservations which
>>>>>> cover the entire _CRS returned PCI bridge memory window, causing all
>>>>>> attempts to assign memory to PCI BARs which have not been setup by the
>>>>>> BIOS to fail. For example here are the relevant dmesg bits from a
>>>>>> Lenovo IdeaPad 3 15IIL 81WE:
>>>>>>
>>>>>>  [mem 0x000000004bc50000-0x00000000cfffffff] reserved
>>>>>>  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
>>>>>>
>>>>>> The ACPI specifications appear to allow this new behavior:
>>>>>>
>>>>>> The relationship between E820 and ACPI _CRS is not really very clear.
>>>>>> ACPI v6.3, sec 15, table 15-374, says AddressRangeReserved means:
>>>>>>
>>>>>>   This range of addresses is in use or reserved by the system and is
>>>>>>   not to be included in the allocatable memory pool of the operating
>>>>>>   system's memory manager.
>>>>>>
>>>>>> and it may be used when:
>>>>>>
>>>>>>   The address range is in use by a memory-mapped system device.
>>>>>>
>>>>>> Furthermore, sec 15.2 says:
>>>>>>
>>>>>>   Address ranges defined for baseboard memory-mapped I/O devices, such
>>>>>>   as APICs, are returned as reserved.
>>>>>>
>>>>>> A PCI host bridge qualifies as a baseboard memory-mapped I/O device,
>>>>>> and its apertures are in use and certainly should not be included in
>>>>>> the general allocatable pool, so the fact that some BIOS-es reports
>>>>>> the PCI aperture as "reserved" in E820 doesn't seem like a BIOS bug.
>>>>>>
>>>>>> So it seems that the excluding of E820 reserved addresses is a mistake.
>>>>>>
>>>>>> Ideally Linux would fully stop excluding E820 reserved addresses,
>>>>>> but then various old systems will regress.
>>>>>> Instead keep the old behavior for old systems, while ignoring
>>>>>> the E820 reservations for any systems from now on.
>>>>>>
>>>>>> Old systems are defined here as BIOS year < 2018, this was chosen to
>>>>>> make sure that pci_use_e820 will not be set on the currently affected
>>>>>> systems, the oldest known one is from 2019.
>>>>>>
>>>>>> Testing has shown that some newer systems also have a bad _CRS return.
>>>>>> The pci_crs_quirks DMI table is used to keep excluding E820 reservations
>>>>>> from the bridge window on these systems.
>>>>>>
>>>>>> Also add pci=no_e820 and pci=use_e820 options to allow overriding
>>>>>> the BIOS year + DMI matching logic.
>>>>>>
>>>>>> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
>>>>>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
>>>>>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
>>>>>> BugLink: https://bugs.launchpad.net/bugs/1878279
>>>>>> BugLink: https://bugs.launchpad.net/bugs/1931715
>>>>>> BugLink: https://bugs.launchpad.net/bugs/1932069
>>>>>> BugLink: https://bugs.launchpad.net/bugs/1921649
>>>>>> Cc: Benoit Grégoire <benoitg@coeus.ca>
>>>>>> Cc: Hui Wang <hui.wang@canonical.com>
>>>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>>>>
>>>>>> +	 * Ideally Linux would fully stop using E820 reservations, but then
>>>>>> +	 * various old systems will regress. Instead keep the old behavior for
>>>>>> +	 * old systems + known to be broken newer systems in pci_crs_quirks.
>>>>>> +	 */
>>>>>> +	if (year >= 0 && year < 2018)
>>>>>> +		pci_use_e820 = true;
>>>>>
>>>>> How did you pick 2018?  Prior to this patch, we used E820 reservations
>>>>> for all machines.  This patch would change that for 2019-2022
>>>>> machines, so there's a risk of breaking some of them.
>>>>
>>>> Correct. I picked 2018 because the first devices where using E820
>>>> reservations are causing issues (i2c controller not getting resources
>>>> leading to non working touchpad / thunderbolt hotplug issues) have
>>>> BIOS dates starting in 2019. I added a year margin, so we could make
>>>> this 2019.
>>>>
>>>>> I'm hesitant about changing the behavior for machines already in the
>>>>> field because if they were tested at all with Linux, it was without
>>>>> this patch.  So I would lean toward preserving the current behavior
>>>>> for BIOS year < 2023.
>>>>
>>>> I see, I presume the idea is to then use DMI to disable E820 clipping
>>>> on current devices where this is known to cause problems ?
>>>>
>>>> So for v8 I would:
>>>>
>>>> 1. Change the cut-off check to < 2023
>>>> 2. Drop the DMI quirks I added for models which are known to need E820
>>>>    clipping hit by the < 2018 check
>>>> 3. Add DMI quirks for models for which it is known that we must _not_
>>>>    do E820 clipping
>>>>
>>>> Is this the direction you want to go / does that sound right?
>>>
>>> Yes, I think that's what we should do.  All the machines in the field
>>> will be unaffected, except that we add quirks for known problems.
>>
>> I've been working on this today. I've mostly been going through
>> all the existing bugs about this, to make a list of DMI matches
>> for devices on which we should _not_ do e820 clipping to fix the
>> kernel being unable to assign BARs there.
>>
>> I've found an interesting pattern there, all affected devices
>> are Lenovo devices with "IIL" in their device name, e.g.:
>> "IdeaPad 3 15IIL05". I've looked up all Lenovo devices which
>> have "IIL" as part of their DMI_PRODUCT_VERSION string here:
>> https://github.com/linuxhw/DMI/
>>
>> And then looked them up at https://linux-hardware.org/ and checked
>> their dmesg to see if they have the e820 problem other ideapads
>> have. I've gone through approx. half the list now and all
>> except one model seem to have the e820 problem.
>>
>> So it looks like we might be able to match all problem models
>> with a single DMI match.
> 
> That sounds reasonable.  I assume that if we skip the clipping for
> every platform that matches "IIL", we can also add exceptions for the
> inevitable "IIL" platforms that do need the clip?

Yes, we can add a more specific match higher up in the pci_crs_quirks[]
array and then use a callback which returns non-zero to make
dmi_check_system() stop checking the rest of the array.
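
To illustrate the ordering, here is a minimal user-space sketch. It is
not kernel code: struct quirk, check_system(), clipping_enabled_for()
and the "99IIL99" model string are all hypothetical stand-ins. The only
behaviour it assumes from the kernel is that the table walk calls the
callback of each matching entry and stops as soon as one returns
non-zero, and that the match is a substring match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for a DMI quirk table entry. */
struct quirk {
	const char *ident;
	const char *version_substr;	/* matched as a substring */
	int (*callback)(const struct quirk *q);
};

/* Models pci_use_e820; clipping stays on unless a quirk says otherwise. */
static bool use_e820 = true;

/* Specific exception: keep clipping and return non-zero to stop the walk. */
static int set_use_e820(const struct quirk *q)
{
	(void)q;
	use_e820 = true;
	return 1;
}

/* Generic match: disable clipping and let the walk continue. */
static int set_no_e820(const struct quirk *q)
{
	(void)q;
	use_e820 = false;
	return 0;
}

/* The specific exception sits above the generic "IIL" entry. */
static const struct quirk quirks[] = {
	{ "made-up IIL model that needs clipping", "99IIL99", set_use_e820 },
	{ "Lenovo *IIL* product version", "IIL", set_no_e820 },
	{ NULL, NULL, NULL }
};

/*
 * Walk the table: run the callback of every matching entry and stop
 * once a callback returns non-zero. Returns the number of matches seen.
 */
int check_system(const struct quirk *list, const char *product_version)
{
	int count = 0;

	assert(list && product_version);
	for (const struct quirk *q = list; q->ident; q++) {
		if (!strstr(product_version, q->version_substr))
			continue;
		count++;
		if (q->callback && q->callback(q))
			break;
	}
	return count;
}

/* Reset to the default, apply the quirks, and report the outcome. */
bool clipping_enabled_for(const char *product_version)
{
	use_e820 = true;
	check_system(quirks, product_version);
	return use_e820;
}
```

With this table, clipping_enabled_for("IdeaPad 3 15IIL05") only hits
the generic entry and ends up with clipping disabled, while the made-up
"99IIL99" model stops at the first entry and keeps clipping.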

> E.g., specific
> entries at the end of the list that override the previous generic
> match?
> 
>> So the problem seems to be limited to one specific device
>> series / range and this is making me have second thoughts
>> about doing a date based cut-off at all. Trying to switch
>> over any models which are new in 2023 is fine, the problem
>> with a DMI BIOS date approach though is that as soon as some
>> new management-engine CVE comes out we will also see BIOS
>> updates with a year of 2023 for many existing models, of
>> up to 3-4 years old at least; and chances are that some of
>> those older models getting BIOS updates will be bitten by
>> this change.
> 
> That's a good point and sounds fairly painful when that happens,
> but I don't see a nice way out of this.
> 
>> So as said I'm having second thoughts about the date based
>> approach. Bjorn, what do you think of just using DMI quirks
>> to disable e820 clipping on known problematic models and
>> otherwise keeping things as is ?
> 
> I think we need a long-term strategy that can be clearly expressed 
> in a sentence or two and is consistent with the ACPI and PCI specs,
> and I don't think the current strategy is it.  Clipping with E820
> regions happened to work for some machines, but there's no reason to
> think it will work in general.

Ok, so what I'm reading between the lines here is that, despite
the concerns which I've voiced, you want to continue with
disabling e820 clipping by default for machines with a
DMI_BIOS_DATE year of 2023 or newer. That is fine by me; I
just wanted to get my concerns out there.

I'm almost done prepping a v8 now. I had to do some other
stuff and spent a lot of time checking dmesg output for
all the Lenovo *IIL* models.

For v8 I've also added a quirk for the Acer model you pointed
me to in another email.

Regards,

Hans
Bjorn Helgaas May 18, 2022, 10:07 p.m. UTC | #8
On Mon, May 09, 2022 at 07:33:27PM +0200, Hans de Goede wrote:
> On 5/7/22 17:31, Bjorn Helgaas wrote:
>> On Sat, May 07, 2022 at 12:09:03PM +0200, Hans de Goede wrote:
>>> On 5/6/22 18:51, Bjorn Helgaas wrote:
>>>> On Thu, May 05, 2022 at 05:20:16PM +0200, Hans de Goede wrote:
>>>>> Some BIOS-es contain bugs where they add addresses which are already
>>>>> used in some other manner to the PCI host bridge window returned by
>>>>> the ACPI _CRS method. To avoid this Linux by default excludes
>>>>> E820 reservations when allocating addresses since 2010, see:
>>>>> commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address
>>>>> space").
>>>>>
>>>>> Recently (2019) some systems have shown-up with E820 reservations which
>>>>> cover the entire _CRS returned PCI bridge memory window, causing all
>>>>> attempts to assign memory to PCI BARs which have not been setup by the
>>>>> BIOS to fail. For example here are the relevant dmesg bits from a
>>>>> Lenovo IdeaPad 3 15IIL 81WE:
>>>>>
>>>>>  [mem 0x000000004bc50000-0x00000000cfffffff] reserved
>>>>>  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
>>>>>
>>>>> The ACPI specifications appear to allow this new behavior:
>>>>>
>>>>> The relationship between E820 and ACPI _CRS is not really very clear.
>>>>> ACPI v6.3, sec 15, table 15-374, says AddressRangeReserved means:
>>>>>
>>>>>   This range of addresses is in use or reserved by the system and is
>>>>>   not to be included in the allocatable memory pool of the operating
>>>>>   system's memory manager.
>>>>>
>>>>> and it may be used when:
>>>>>
>>>>>   The address range is in use by a memory-mapped system device.
>>>>>
>>>>> Furthermore, sec 15.2 says:
>>>>>
>>>>>   Address ranges defined for baseboard memory-mapped I/O devices, such
>>>>>   as APICs, are returned as reserved.
>>>>>
>>>>> A PCI host bridge qualifies as a baseboard memory-mapped I/O device,
>>>>> and its apertures are in use and certainly should not be included in
>>>>> the general allocatable pool, so the fact that some BIOS-es reports
>>>>> the PCI aperture as "reserved" in E820 doesn't seem like a BIOS bug.
>>>>>
>>>>> So it seems that the excluding of E820 reserved addresses is a mistake.
>>>>>
>>>>> Ideally Linux would fully stop excluding E820 reserved addresses,
>>>>> but then various old systems will regress.
>>>>> Instead keep the old behavior for old systems, while ignoring
>>>>> the E820 reservations for any systems from now on.
>>>>>
>>>>> Old systems are defined here as BIOS year < 2018, this was chosen to
>>>>> make sure that pci_use_e820 will not be set on the currently affected
>>>>> systems, the oldest known one is from 2019.
>>>>>
>>>>> Testing has shown that some newer systems also have a bad _CRS return.
>>>>> The pci_crs_quirks DMI table is used to keep excluding E820 reservations
>>>>> from the bridge window on these systems.
>>>>>
>>>>> Also add pci=no_e820 and pci=use_e820 options to allow overriding
>>>>> the BIOS year + DMI matching logic.
>>>>>
>>>>> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
>>>>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
>>>>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
>>>>> BugLink: https://bugs.launchpad.net/bugs/1878279
>>>>> BugLink: https://bugs.launchpad.net/bugs/1931715
>>>>> BugLink: https://bugs.launchpad.net/bugs/1932069
>>>>> BugLink: https://bugs.launchpad.net/bugs/1921649
>>>>> Cc: Benoit Grégoire <benoitg@coeus.ca>
>>>>> Cc: Hui Wang <hui.wang@canonical.com>
>>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>>>
>>>>> +	 * Ideally Linux would fully stop using E820 reservations, but then
>>>>> +	 * various old systems will regress. Instead keep the old behavior for
>>>>> +	 * old systems + known to be broken newer systems in pci_crs_quirks.
>>>>> +	 */
>>>>> +	if (year >= 0 && year < 2018)
>>>>> +		pci_use_e820 = true;
>>>>
>>>> How did you pick 2018?  Prior to this patch, we used E820 reservations
>>>> for all machines.  This patch would change that for 2019-2022
>>>> machines, so there's a risk of breaking some of them.
>>>
>>> Correct. I picked 2018 because the first devices where using E820
>>> reservations are causing issues (i2c controller not getting resources
>>> leading to non working touchpad / thunderbolt hotplug issues) have
>>> BIOS dates starting in 2019. I added a year margin, so we could make
>>> this 2019.
>>>
>>>> I'm hesitant about changing the behavior for machines already in the
>>>> field because if they were tested at all with Linux, it was without
>>>> this patch.  So I would lean toward preserving the current behavior
>>>> for BIOS year < 2023.
>>>
>>> I see, I presume the idea is to then use DMI to disable E820 clipping
>>> on current devices where this is known to cause problems ?
>>>
>>> So for v8 I would:
>>>
>>> 1. Change the cut-off check to < 2023
>>> 2. Drop the DMI quirks I added for models which are known to need E820
>>>    clipping hit by the < 2018 check
>>> 3. Add DMI quirks for models for which it is known that we must _not_
>>>    do E820 clipping
>>>
>>> Is this the direction you want to go / does that sound right?
>> 
>> Yes, I think that's what we should do.  All the machines in the field
>> will be unaffected, except that we add quirks for known problems.
> 
> I've been working on this today. I've mostly been going through
> all the existing bugs about this, to make a list of DMI matches
> for devices on which we should _not_ do e820 clipping to fix the
> kernel being unable to assign BARs there.
> 
> I've found an interesting pattern there, all affected devices
> are Lenovo devices with "IIL" in their device name, e.g.:
> "IdeaPad 3 15IIL05". I've looked up all Lenovo devices which
> have "IIL" as part of their DMI_PRODUCT_VERSION string here:
> https://github.com/linuxhw/DMI/
> 
> And then looked them up at https://linux-hardware.org/ and checked
> their dmesg to see if they have the e820 problem other ideapads
> have. I've gone through approx. half the list now and all
> except one model seem to have the e820 problem.
> 
> So it looks like we might be able to match all problem models
> with a single DMI match.
> 

> So the problem seems to be limited to one specific device
> series / range and this is making me have second thoughts
> about doing a date based cut-off at all. Trying to switch
> over any models which are new in 2023 is fine, the problem
> with a DMI BIOS date approach though is that as soon as some
> new management-engine CVE comes out we will also see BIOS
> updates with a year of 2023 for many existing models, of
> up to 3-4 years old at least; and chances are that some of
> those older models getting BIOS updates will be bitten by
> this change.
> 
> So as said I'm having second thoughts about the date based
> approach. Bjorn, what do you think of just using DMI quirks
> to disable e820 clipping on known problematic models and
> otherwise keeping things as is ?

The current v8 patch [1] adds quirks to disable clipping for Lenovo
"*IIL*" and Acer Spin 5.  I think we also need to add one for Clevo
Barebones [2], don't we?

Here's how I think about the date-based approach.  See if it seems
sensible to you.

Without a date check, we'll continue clipping by default:

  - Future systems like Lenovo *IIL*, Acer Spin, and Clevo Barebones
    will require new quirks to disable clipping.

  - The problem here is E820 entries that cover entire _CRS windows
    that should not be clipped out.

  - I think these E820 entries are legal per spec, and it would be
    hard to get BIOS vendors to change them.

  - We will discover new systems that need clipping disabled piecemeal
    as they are released.

  - Future systems like Lenovo X1 Carbon and the Chromebooks (probably
    anything using coreboot) will just work and we will not notice new
    ones that rely on the clipping.

  - BIOS updates will not require new quirks unless they change the
    DMI model string.

With a date check that disables clipping, e.g., "no clipping when
date > 2022":

  - Future systems like Lenovo *IIL*, Acer Spin, and Clevo Barebones
    will just work without new quirks.

  - Future systems like Lenovo X1 Carbon and the Chromebooks will
    require new quirks to *enable* clipping.

  - The problem here is that _CRS contains regions that are not usable
    by PCI devices, and we rely on the E820 kludge to clip them out.

  - I think this use of E820 is clearly a firmware bug, so we have a
    fighting chance of getting it changed eventually.

  - BIOS updates after the cutoff date *will* require quirks, but only
    for systems like Lenovo X1 Carbon and Chromebooks that we already
    think have broken firmware.

Is that a fair summary?

If so, it still seems to me like it's better to add quirks for
firmware that we think is broken than for firmware that seems unusual
but correct.
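
As a rough sketch of how the pieces would combine (the names
should_clip_e820(), enum e820_quirk and enum e820_cmdline are made up
for illustration, and the 2023 cut-off is only the value discussed in
this thread): the BIOS year picks the default, a DMI quirk overrides
the default in either direction, and pci=use_e820 / pci=no_e820
override both.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the pci=use_e820 / pci=no_e820 options. */
enum e820_cmdline { E820_CMDLINE_NONE, E820_CMDLINE_USE, E820_CMDLINE_NO };

/* Hypothetical result of walking the DMI quirk table. */
enum e820_quirk { E820_QUIRK_NONE, E820_QUIRK_CLIP, E820_QUIRK_NO_CLIP };

/* Cut-off year under discussion, not a settled value. */
#define E820_CUTOFF_YEAR 2023

bool should_clip_e820(int bios_year, enum e820_quirk quirk,
		      enum e820_cmdline cmdline)
{
	bool clip;

	/*
	 * Date-based default: keep clipping for BIOS years before the
	 * cut-off. A negative year models "BIOS date unknown" and, as in
	 * the v7 check, does not enable clipping.
	 */
	clip = bios_year >= 0 && bios_year < E820_CUTOFF_YEAR;

	/* A DMI quirk overrides the date-based default in either direction. */
	if (quirk == E820_QUIRK_CLIP)
		clip = true;
	else if (quirk == E820_QUIRK_NO_CLIP)
		clip = false;

	/* The command line overrides both the date check and the quirks. */
	if (cmdline == E820_CMDLINE_USE)
		clip = true;
	else if (cmdline == E820_CMDLINE_NO)
		clip = false;

	return clip;
}
```

In this model a 2020 Lenovo *IIL* machine needs an E820_QUIRK_NO_CLIP
entry today, whereas a post-cut-off machine with X1-Carbon-style
firmware would need E820_QUIRK_CLIP.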

But I do think maybe we should split the date check to a separate
patch.  The rationale for the date check (to put the quirk burden on
systems with buggy firmware) is a little different from the "disable
E820 clipping" reason for the quirks.

I think the "disable clipping" quirks by themselves should fix all the
known broken systems, shouldn't they?  I'd really, really like to get
these quirks in for v5.18 if possible.

For future reference, I'm attaching below all the bugs I know about
and the details of what's in E820 and _CRS.

Bjorn

[1] https://lore.kernel.org/r/20220512202511.34197-1-hdegoede@redhat.com
[2] https://bugzilla.kernel.org/show_bug.cgi?id=214259


https://bugzilla.kernel.org/show_bug.cgi?id=16228       Dell Precision T3500
  BIOS-e820:                                   0xbfe4dc00-0xc0000000 (reserved)
  pci_root PNP0A03:00: host bridge window [mem 0xbff00000-0xdfffffff]
  pci 0000:00:1f.2: BAR 5: assigned [mem 0xbff00000-0xbff007ff]
  ahci 0000:00:1f.2: controller reset failed (0xffffffff)
  E820 covers part of _CRS window, 4dc2287c1805 clips that out.
  Without 4dc2287c1805, doesn't boot because AHCI at
  [mem 0xbff00000-0xbff007ff] is unusable.

https://bugzilla.kernel.org/show_bug.cgi?id=206459      Lenovo Yoga C940-14IIL
  DMI: LENOVO 81Q9/LNVNB161216, BIOS AUCN54WW 01/09/2020
  BIOS-e820:                         [mem 0x4bc50000-0xcfffffff] reserved
  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
  pci 0000:2b:00.0: BAR 14: no space for [mem size 0x0c200000]
  E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
  With 4dc2287c1805, Thunderbolt dock hot-add can't allocate MMIO
  space.

https://bugzilla.kernel.org/show_bug.cgi?id=214259      Clevo X170KM Barebone
  DMI: TUXEDO TUXEDO Book XUX7 - Gen12/X170KM-G, BIOS 1.07.05RTR1 02/01/2021
  BIOS-e820:                         [mem 0x6bc00000-0xefffffff] reserved
  pci_bus 0000:00: root bus resource [mem 0x71000000-0xdfffffff window]
  BIOS-e820:                         [mem 0x0100000000-0x48effffff] usable
  pci_bus 0000:00: root bus resource [mem 0x4000000000-0x7fffffffff window]
  pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit] # intel-lpss
  pci 0000:00:15.1: reg 0x10: [mem 0x00000000-0x00000fff 64bit] # intel-lpss
  pci 0000:00:15.2: reg 0x10: [mem 0x00000000-0x00000fff 64bit] # intel-lpss
  pci 0000:00:19.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit] # intel-lpss
  pci 0000:00:1f.5: reg 0x10: [mem 0xfe010000-0xfe010fff]       # intel_spi_pci, no window
  pci 0000:00:15.0: BAR 0: assigned [mem 0x420210f000-0x420210ffff 64bit]
  pci 0000:00:15.1: BAR 0: assigned [mem 0x4202111000-0x4202111fff 64bit]
  pci 0000:00:15.2: BAR 0: assigned [mem 0x4202112000-0x4202112fff 64bit]
  pci 0000:00:19.0: BAR 0: assigned [mem 0x4202113000-0x4202113fff 64bit]
  pci 0000:00:1f.5: BAR 0: no space for [mem size 0x00001000]
  E820 covers entire _CRS window.  4dc2287c1805 clips it all out.

https://bugzilla.redhat.com/show_bug.cgi?id=1868899     Lenovo IdeaPad 3 15IIL05
  DMI: LENOVO 81WE/LNVNB161216, BIOS EMCN44WW 12/23/2020
  BIOS-e820:                         [mem 0x4bc50000-0xcfffffff] reserved
  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
  pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
  pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
  E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
  With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
  allocate).

https://bugzilla.redhat.com/show_bug.cgi?id=1871793     Lenovo IdeaPad 5 14IIL05
  DMI: LENOVO 81YH/LNVNB161216, BIOS DSCN22WW(V1.08) 03/06/2020
  BIOS-e820:                         [mem 0x5bc50000-0xcfffffff] reserved
  pci_bus 0000:00: root bus resource [mem 0x6d400000-0xbfffffff window]
  pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
  pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
  E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
  With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
  allocate).

https://bugzilla.redhat.com/show_bug.cgi?id=2029207     Lenovo X1 Carbon (20A7)
  DMI: LENOVO 20A7008CUK/20A7008CUK, BIOS GRET63WW (1.40 ) 03/27/2020
  BIOS-e820:                         [mem 0xdceff000-0xdfa0ffff] reserved
  pci_bus 0000:00: root bus resource [mem 0xdfa00000-0xfebfffff window]
  pci 0000:00:1c.0: BAR 14: assigned [mem 0xdfa00000-0xdfbfffff]
  E820 covers part of _CRS window, 4dc2287c1805 clips that out.
  Without 4dc2287c1805, doesn't wake from suspend because
  [mem 0xdfa00000-0xdfa0ffff] unusable.

https://bugs.launchpad.net/bugs/1878279 Lenovo IdeaPad 5 14IIL05
  DMI: LENOVO 81YH/LNVNB161216, BIOS DSCN23WW(V1.09) 03/25/2020
  BIOS-e820:                         [mem 0x5bc50000-0xcfffffff] reserved
  pci_bus 0000:00: root bus resource [mem 0x6d400000-0xbfffffff window]
  pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
  pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
  E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
  With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
  allocate).

https://bugs.launchpad.net/bugs/1880172 Lenovo IdeaPad 3 14IIL05 Core i3-1005G1 (81WD004MGE)
  DMI: LENOVO 81WD/LNVNB161216, BIOS EMCN13WW 03/06/2020
  BIOS-e820:                         [mem 0x6b000000-0xcfffffff] reserved
  pci_bus 0000:00: root bus resource [mem 0x70400000-0xbfffffff window]
  pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
  pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
  E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
  With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
  allocate).

https://bugs.launchpad.net/bugs/1884232 Acer Spin SP513-54N
  DMI: Acer Spin SP513-54N/Caboom_IL, BIOS V1.00 02/21/2020
  BIOS-e820:                         [mem 0x39e00000-0xcfffffff] reserved
  pci_bus 0000:00: root bus resource [mem 0x3f800000-0xbfffffff window]
  pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
  pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
  E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
  With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
  allocate).

https://bugs.launchpad.net/bugs/1921649 Lenovo IdeaPad S145 82DJ0000BR
  DMI: LENOVO 82DJ/LNVNB161216, BIOS DKCN48WW 07/22/2020
  BIOS-e820:                         [mem 0x6b000000-0xcfffffff] reserved
  pci_bus 0000:00: root bus resource [mem 0x70400000-0xbfffffff window]
  pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
  pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
  E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
  With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
  allocate).

https://bugs.launchpad.net/bugs/1931715 Lenovo IdeaPad S145 82DJ0001BR
  DMI: LENOVO 82DJ/LNVNB161216, BIOS DKCN51WW 12/23/2020
  BIOS-e820:                         [mem 0x6b000000-0xcfffffff] reserved
  pci_bus 0000:00: root bus resource [mem 0x70400000-0xbfffffff window]
  pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
  E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
  With 4dc2287c1805, touchpad broken, no dmesg from broken case.

https://bugs.launchpad.net/bugs/1932069 Lenovo BS145-15IIL
  DMI: LENOVO 82HB/LNVNB161216, BIOS DKCN51WW 12/23/2020
  BIOS-e820:                         [mem 0x4bc50000-0xcfffffff] reserved
  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
  pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
  pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
  E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
  With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
  allocate).

https://lore.kernel.org/r/4e9fca2f-0af1-3684-6c97-4c35befd5019@redhat.com
Google internal report b/148759816
  Chromebook asus-C523NA-A20057-coral
  BIOS-e820:                         [mem 0x7b000000-0x7fffffff] reserved
  BIOS-e820:                         [mem 0xd0000000-0xd0ffffff]
  acpi PNP0A08:00: ... [mem 0x7b800000-0x7fffffff window] ...
  acpi PNP0A08:00: ... [mem 0x80000000-0xe0000000 window] ...
  pci_bus 0000:00: root bus resource [mem 0x7b800000-0xe0000000 window]
  E820 covers [mem 0x7b800000-0x7fffffff] of _CRS window.
  E820 punches [mem 0xd0000000-0xd0ffffff] hole for hidden device.
  4dc2287c1805 clips these out.  Without 4dc2287c1805, boot fails.
  Hidden device is a landmine unless we trim it out.

  But 4c5e242d3e93 ("x86/PCI: Clip only host bridge windows for E820
  regions") changes the time at which clipping is done.

  If clipping preserves resources *completely* covered by E820, and we
  clip *before* coalescing, [mem 0x7b800000-0x7fffffff window] is
  preserved and [mem 0xd0000000-0xd0ffffff] is clipped out, resulting
  in [mem 0x7b800000-0xcfffffff window], where 0x7b800000-0x7fffffff is
  not usable.

  If clipping preserves resources *completely* covered by E820, and we
  clip *after* coalescing, we clip out [mem 0x7b000000-0x7fffffff]
  and [mem 0xd0000000-0xd0ffffff], resulting in [mem
  0x80000000-0xcfffffff window], which is usable.
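
  The ordering effect above can be reproduced with a small userspace
  model.  This is only a sketch: struct range, clip_preserving_covered()
  and the other names are hypothetical and not kernel code.  It assumes
  that a partial clip keeps the larger remaining side of the window and
  that a window completely covered by an E820 entry is left alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range, like the [mem start-end] notation in dmesg. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical clip helper: a window that is *completely* covered by an
 * E820 entry is preserved; otherwise the overlapping part is removed and
 * the larger remaining side of the window is kept.
 */
static void clip_preserving_covered(struct range *win, struct range e820)
{
	uint64_t low = 0, high = 0;

	assert(win->start <= win->end && e820.start <= e820.end);

	if (win->end < e820.start || win->start > e820.end)
		return;		/* no overlap */
	if (e820.start <= win->start && e820.end >= win->end)
		return;		/* completely covered: preserve */

	if (win->start < e820.start)
		low = e820.start - win->start;
	if (win->end > e820.end)
		high = win->end - e820.end;

	if (low > high)
		win->end = e820.start - 1;
	else
		win->start = e820.end + 1;
}

/* The two E820 entries from the coral report above. */
static const struct range coral_e820[] = {
	{ 0x7b000000, 0x7fffffff },
	{ 0xd0000000, 0xd0ffffff },
};

static void clip_all(struct range *win)
{
	for (size_t i = 0; i < sizeof(coral_e820) / sizeof(coral_e820[0]); i++)
		clip_preserving_covered(win, coral_e820[i]);
}

/* Clip each _CRS window first, then merge the two adjacent windows. */
static struct range clip_then_coalesce(void)
{
	struct range a = { 0x7b800000, 0x7fffffff };
	struct range b = { 0x80000000, 0xe0000000 };

	clip_all(&a);
	clip_all(&b);
	assert(a.end + 1 == b.start);	/* still adjacent, so they merge */
	return (struct range){ a.start, b.end };
}

/* Merge the two _CRS windows first, then clip the merged window. */
static struct range coalesce_then_clip(void)
{
	struct range merged = { 0x7b800000, 0xe0000000 };

	clip_all(&merged);
	return merged;
}
```

  Under these assumptions clip_then_coalesce() yields
  [mem 0x7b800000-0xcfffffff] and coalesce_then_clip() yields
  [mem 0x80000000-0xcfffffff], matching the two cases described above.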

Chromebook hp-x360-12b-n4000-octopus similar to asus-C523NA-A20057-coral

Summary:
  These are broken by 4dc2287c1805:
    https://bugzilla.kernel.org/show_bug.cgi?id=206459  01/09/2020 LENOVO
    https://bugzilla.redhat.com/show_bug.cgi?id=1868899 12/23/2020 LENOVO
    https://bugzilla.redhat.com/show_bug.cgi?id=1871793 03/06/2020 LENOVO
    https://bugs.launchpad.net/bugs/1878279             03/25/2020 LENOVO
    https://bugs.launchpad.net/bugs/1880172             03/06/2020 LENOVO
    https://bugs.launchpad.net/bugs/1884232             02/21/2020 Acer
    https://bugs.launchpad.net/bugs/1921649             07/22/2020 LENOVO
    https://bugs.launchpad.net/bugs/1931715             12/23/2020 LENOVO
    https://bugs.launchpad.net/bugs/1932069             12/23/2020 LENOVO
      For all of the above, E820 covers entire _CRS window.  Trimming
      the entire window means we can't assign any MMIO space.

  These require 4dc2287c1805:
    https://bugzilla.kernel.org/show_bug.cgi?id=16228   03/09/2009 Dell
    https://bugzilla.redhat.com/show_bug.cgi?id=2029207 03/27/2020 LENOVO
    b/148759816 Chromebooks
      For the above, E820 covers part of _CRS window.  Failure to trim
      that part means we assign MMIO space that doesn't work.
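
  The two groups in the summary differ only in how the E820 entry
  overlaps the _CRS window.  A minimal sketch of that classification
  follows; the enum and classify_overlap() are hypothetical names used
  for illustration, not existing kernel interfaces, and all ranges are
  inclusive as in the dmesg output.

```c
#include <assert.h>
#include <stdint.h>

enum e820_overlap {
	E820_NO_OVERLAP,	/* E820 entry does not touch the window */
	E820_COVERS_PART,	/* clipping leaves usable window space */
	E820_COVERS_ALL,	/* clipping removes the whole window */
};

/* Classify how an E820 entry overlaps a host bridge window. */
static enum e820_overlap classify_overlap(uint64_t win_start, uint64_t win_end,
					  uint64_t e820_start, uint64_t e820_end)
{
	assert(win_start <= win_end && e820_start <= e820_end);

	if (e820_end < win_start || e820_start > win_end)
		return E820_NO_OVERLAP;
	if (e820_start <= win_start && e820_end >= win_end)
		return E820_COVERS_ALL;
	return E820_COVERS_PART;
}
```

  With the values listed above, the Yoga C940 window
  [mem 0x65400000-0xbfffffff] against E820 [mem 0x4bc50000-0xcfffffff]
  classifies as E820_COVERS_ALL, while the X1 Carbon window
  [mem 0xdfa00000-0xfebfffff] against E820 [mem 0xdceff000-0xdfa0ffff]
  classifies as E820_COVERS_PART.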
Hans de Goede May 19, 2022, 2:01 p.m. UTC | #9
Hi Bjorn,

On 5/19/22 00:07, Bjorn Helgaas wrote:
> On Mon, May 09, 2022 at 07:33:27PM +0200, Hans de Goede wrote:
>> On 5/7/22 17:31, Bjorn Helgaas wrote:
>>> On Sat, May 07, 2022 at 12:09:03PM +0200, Hans de Goede wrote:
>>>> On 5/6/22 18:51, Bjorn Helgaas wrote:
>>>>> On Thu, May 05, 2022 at 05:20:16PM +0200, Hans de Goede wrote:
>>>>>> Some BIOS-es contain bugs where they add addresses which are already
>>>>>> used in some other manner to the PCI host bridge window returned by
>>>>>> the ACPI _CRS method. To avoid this Linux by default excludes
>>>>>> E820 reservations when allocating addresses since 2010, see:
>>>>>> commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address
>>>>>> space").
>>>>>>
>>>>>> Recently (2019) some systems have shown up with E820 reservations which
>>>>>> cover the entire _CRS returned PCI bridge memory window, causing all
>>>>>> attempts to assign memory to PCI BARs which have not been set up by the
>>>>>> BIOS to fail. For example here are the relevant dmesg bits from a
>>>>>> Lenovo IdeaPad 3 15IIL 81WE:
>>>>>>
>>>>>>  [mem 0x000000004bc50000-0x00000000cfffffff] reserved
>>>>>>  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
>>>>>>
>>>>>> The ACPI specifications appear to allow this new behavior:
>>>>>>
>>>>>> The relationship between E820 and ACPI _CRS is not really very clear.
>>>>>> ACPI v6.3, sec 15, table 15-374, says AddressRangeReserved means:
>>>>>>
>>>>>>   This range of addresses is in use or reserved by the system and is
>>>>>>   not to be included in the allocatable memory pool of the operating
>>>>>>   system's memory manager.
>>>>>>
>>>>>> and it may be used when:
>>>>>>
>>>>>>   The address range is in use by a memory-mapped system device.
>>>>>>
>>>>>> Furthermore, sec 15.2 says:
>>>>>>
>>>>>>   Address ranges defined for baseboard memory-mapped I/O devices, such
>>>>>>   as APICs, are returned as reserved.
>>>>>>
>>>>>> A PCI host bridge qualifies as a baseboard memory-mapped I/O device,
>>>>>> and its apertures are in use and certainly should not be included in
>>>>>> the general allocatable pool, so the fact that some BIOS-es report
>>>>>> the PCI aperture as "reserved" in E820 doesn't seem like a BIOS bug.
>>>>>>
>>>>>> So it seems that the excluding of E820 reserved addresses is a mistake.
>>>>>>
>>>>>> Ideally Linux would fully stop excluding E820 reserved addresses,
>>>>>> but then various old systems will regress.
>>>>>> Instead keep the old behavior for old systems, while ignoring
>>>>>> the E820 reservations for any systems from now on.
>>>>>>
>>>>>> Old systems are defined here as BIOS year < 2018, this was chosen to
>>>>>> make sure that pci_use_e820 will not be set on the currently affected
>>>>>> systems, the oldest known one is from 2019.
>>>>>>
>>>>>> Testing has shown that some newer systems also have a bad _CRS return.
>>>>>> The pci_crs_quirks DMI table is used to keep excluding E820 reservations
>>>>>> from the bridge window on these systems.
>>>>>>
>>>>>> Also add pci=no_e820 and pci=use_e820 options to allow overriding
>>>>>> the BIOS year + DMI matching logic.
>>>>>>
>>>>>> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
>>>>>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
>>>>>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
>>>>>> BugLink: https://bugs.launchpad.net/bugs/1878279
>>>>>> BugLink: https://bugs.launchpad.net/bugs/1931715
>>>>>> BugLink: https://bugs.launchpad.net/bugs/1932069
>>>>>> BugLink: https://bugs.launchpad.net/bugs/1921649
>>>>>> Cc: Benoit Grégoire <benoitg@coeus.ca>
>>>>>> Cc: Hui Wang <hui.wang@canonical.com>
>>>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>>>>
>>>>>> +	 * Ideally Linux would fully stop using E820 reservations, but then
>>>>>> +	 * various old systems will regress. Instead keep the old behavior for
>>>>>> +	 * old systems + known to be broken newer systems in pci_crs_quirks.
>>>>>> +	 */
>>>>>> +	if (year >= 0 && year < 2018)
>>>>>> +		pci_use_e820 = true;
>>>>>
>>>>> How did you pick 2018?  Prior to this patch, we used E820 reservations
>>>>> for all machines.  This patch would change that for 2019-2022
>>>>> machines, so there's a risk of breaking some of them.
>>>>
>>>> Correct. I picked 2018 because the first devices where using E820
>>>> reservations are causing issues (i2c controller not getting resources
>>>> leading to non working touchpad / thunderbolt hotplug issues) have
>>>> BIOS dates starting in 2019. I added a year margin, so we could make
>>>> this 2019.
>>>>
>>>>> I'm hesitant about changing the behavior for machines already in the
>>>>> field because if they were tested at all with Linux, it was without
>>>>> this patch.  So I would lean toward preserving the current behavior
>>>>> for BIOS year < 2023.
>>>>
>>>> I see, I presume the idea is to then use DMI to disable E820 clipping
>>>> on current devices where this is known to cause problems ?
>>>>
>>>> So for v8 I would:
>>>>
>>>> 1. Change the cut-off check to < 2023
>>>> 2. Drop the DMI quirks I added for models which are known to need E820
>>>>    clipping hit by the < 2018 check
>>>> 3. Add DMI quirks for models for which it is known that we must _not_
>>>>    do E820 clipping
>>>>
>>>> Is this the direction you want to go / does that sound right?
>>>
>>> Yes, I think that's what we should do.  All the machines in the field
>>> will be unaffected, except that we add quirks for known problems.
>>
>> I've been working on this today. I've mostly been going through
>> all the existing bugs about this, to make a list of DMI matches
>> for devices on which we should _not_ do e820 clipping to fix the
>> kernel being unable to assign BARs there.
>>
>> I've found an interesting pattern there, all affected devices
>> are Lenovo devices with "IIL" in their device name, e.g.:
>> "IdeaPad 3 15IIL05". I've looked up all Lenovo devices which
>> have "IIL" as part of their DMI_PRODUCT_VERSION string here:
>> https://github.com/linuxhw/DMI/
>>
>> And then looked them up at https://linux-hardware.org/ and checked
>> their dmesg to see if they have the e820 problem other ideapads
>> have. I've gone through approx. half the list now and all
>> except one model seem to have the e820 problem.
>>
>> So it looks like we might be able to match all problem models
>> with a single DMI match.
>>
> 
>> So the problem seems to be limited to one specific device
>> series / range and this is making me have second thoughts
>> about doing a date based cut-off at all. Trying to switch
>> over any models which are new in 2023 is fine, the problem
>> with a DMI BIOS date approach though is that as soon as some
>> new management-engine CVE comes out we will also see BIOS
>> updates with a year of 2023 for many existing models, of
>> up to 3-4 years old at least; and chances are that some of
>> those older models getting BIOS updates will be bitten by
>> this change.
>>
>> So as said I'm having second thoughts about the date based
>> approach. Bjorn, what do you think of just using DMI quirks
>> to disable e820 clipping on known problematic models and
>> otherwise keeping things as is ?
> 
> The current v8 patch [1] adds quirks to disable clipping for Lenovo
> "*IIL*" and Acer Spin 5.  I think we also need to add one for Clevo
> Barebones [2], don't we?

Right, looking at the dmesg output that one needs to be added too;
I've done that for the v9 which I'm preparing.
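
To illustrate why a single entry can cover the whole Lenovo "IIL"
family: DMI_MATCH() entries in the kernel are compared as substrings,
so one match on "IIL" in the product version hits every such model.
Below is a userspace sketch of that idea; struct dmi_strings and
matches_iil_quirk() are hypothetical names for illustration only and
this is not the actual quirk table entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the DMI fields a quirk would look at. */
struct dmi_strings {
	const char *sys_vendor;
	const char *product_version;
};

/*
 * Substring match on both fields, modelling how a non-exact DMI match
 * behaves: any Lenovo model whose version string contains "IIL" hits.
 */
static bool matches_iil_quirk(const struct dmi_strings *dmi)
{
	assert(dmi && dmi->sys_vendor && dmi->product_version);

	return strstr(dmi->sys_vendor, "LENOVO") != NULL &&
	       strstr(dmi->product_version, "IIL") != NULL;
}
```

A version string such as "IdeaPad 3 15IIL05" matches, while a Lenovo
model without "IIL" in its version string does not.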

> Here's how I think about the date-based approach.  See if it seems
> sensible to you.
> 
> Without a date check, we'll continue clipping by default:
> 
>   - Future systems like Lenovo *IIL*, Acer Spin, and Clevo Barebones
>     will require new quirks to disable clipping.
> 
>   - The problem here is E820 entries that cover entire _CRS windows
>     that should not be clipped out.
> 
>   - I think these E820 entries are legal per spec, and it would be
>     hard to get BIOS vendors to change them.
> 
>   - We will discover new systems that need clipping disabled piecemeal
>     as they are released.
> 
>   - Future systems like Lenovo X1 Carbon and the Chromebooks (probably
>     anything using coreboot) will just work and we will not notice new
>     ones that rely on the clipping.
> 
>   - BIOS updates will not require new quirks unless they change the
>     DMI model string.
> 
> With a date check that disables clipping, e.g., "no clipping when
> date > 2022":
> 
>   - Future systems like Lenovo *IIL*, Acer Spin, and Clevo Barebones
>     will just work without new quirks.
> 
>   - Future systems like Lenovo X1 Carbon and the Chromebooks will
>     require new quirks to *enable* clipping.
> 
>   - The problem here is that _CRS contains regions that are not usable
>     by PCI devices, and we rely on the E820 kludge to clip them out.
> 
>   - I think this use of E820 is clearly a firmware bug, so we have a
>     fighting chance of getting it changed eventually.
> 
>   - BIOS updates after the cutoff date *will* require quirks, but only
>     for systems like Lenovo X1 Carbon and Chromebooks that we already
>     think have broken firmware.
> 
> Is that a fair summary?

Yes that seems a good summary to me. Actually it is so good I'm going
to steal it for the commit msg of the split out date check patch :)
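
For reference, the precedence being discussed can be sketched as
below.  This is only a model under stated assumptions: the names are
hypothetical, the cutoff is taken from the "no clipping when
date > 2022" example above, an unknown BIOS year (negative value) is
assumed to keep the historical clipping behavior, and the command-line
option is assumed to win over both the date check and the DMI quirks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state for a DMI quirk or a pci= command-line option. */
enum e820_override {
	E820_DEFAULT,		/* no opinion */
	E820_FORCE_USE,		/* clip windows using E820 */
	E820_FORCE_IGNORE,	/* do not clip */
};

/*
 * Decide whether host bridge windows get clipped by E820 entries.
 * Order: date-based default, then DMI quirk, then command line.
 */
static bool should_clip_e820(int bios_year, enum e820_override quirk,
			     enum e820_override cmdline)
{
	bool clip = true;		/* historical default since 2010 */

	if (bios_year >= 2023)		/* "no clipping when date > 2022" */
		clip = false;

	if (quirk == E820_FORCE_USE)
		clip = true;
	else if (quirk == E820_FORCE_IGNORE)
		clip = false;

	if (cmdline == E820_FORCE_USE)
		clip = true;
	else if (cmdline == E820_FORCE_IGNORE)
		clip = false;

	return clip;
}
```

In this model a 2020 Lenovo *IIL* system needs an E820_FORCE_IGNORE
quirk, whereas a post-cutoff system that relies on clipping would need
an E820_FORCE_USE quirk.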

> If so, it still seems to me like it's better to add quirks for
> firmware that we think is broken than for firmware that seems unusual
> but correct.

Ack.

> But I do think maybe we should split the date check to a separate
> patch.  The rationale for the date check (to put the quirk burden on
> systems with buggy firmware) is a little different from the "disable
> E820 clipping" reason for the quirks.
> 
> I think the "disable clipping" quirks by themselves should fix all the
> known broken systems, shouldn't they?

Yes they should.

> I'd really, really like to get
> these quirks in for v5.18 if possible.

Ok, I'll go and prepare a v9 and I will submit that later today.

> For future reference, I'm attaching below all the bugs I know about
> and the details of what's in E820 and _CRS.
> 
> Bjorn
> 
> [1] https://lore.kernel.org/r/20220512202511.34197-1-hdegoede@redhat.com
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=214259
> 
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=16228       Dell Precision T3500
>   BIOS-e820:                                   0xbfe4dc00-0xc0000000 (reserved)
>   pci_root PNP0A03:00: host bridge window [mem 0xbff00000-0xdfffffff]
>   pci 0000:00:1f.2: BAR 5: assigned [mem 0xbff00000-0xbff007ff]
>   ahci 0000:00:1f.2: controller reset failed (0xffffffff)
>   E820 covers part of _CRS window, 4dc2287c1805 clips that out.
>   Without 4dc2287c1805, doesn't boot because AHCI at
>   [mem 0xbff00000-0xbff007ff] is unusable.
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=206459      Lenovo Yoga C940-14IIL
>   DMI: LENOVO 81Q9/LNVNB161216, BIOS AUCN54WW 01/09/2020
>   BIOS-e820:                         [mem 0x4bc50000-0xcfffffff] reserved
>   pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
>   pci 0000:2b:00.0: BAR 14: no space for [mem size 0x0c200000]
>   E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
>   With 4dc2287c1805, Thunderbolt dock hot-add can't allocate MMIO
>   space.
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=214259      Clevo X170KM Barebone
>   DMI: TUXEDO TUXEDO Book XUX7 - Gen12/X170KM-G, BIOS 1.07.05RTR1 02/01/2021
>   BIOS-e820:                         [mem 0x6bc00000-0xefffffff] reserved
>   pci_bus 0000:00: root bus resource [mem 0x71000000-0xdfffffff window]
>   BIOS-e820:                         [mem 0x0100000000-0x48effffff] usable
>   pci_bus 0000:00: root bus resource [mem 0x4000000000-0x7fffffffff window]
>   pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit] # intel-lpss
>   pci 0000:00:15.1: reg 0x10: [mem 0x00000000-0x00000fff 64bit] # intel-lpss
>   pci 0000:00:15.2: reg 0x10: [mem 0x00000000-0x00000fff 64bit] # intel-lpss
>   pci 0000:00:19.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit] # intel-lpss
>   pci 0000:00:1f.5: reg 0x10: [mem 0xfe010000-0xfe010fff]       # intel_spi_pci, no window
>   pci 0000:00:15.0: BAR 0: assigned [mem 0x420210f000-0x420210ffff 64bit]
>   pci 0000:00:15.1: BAR 0: assigned [mem 0x4202111000-0x4202111fff 64bit]
>   pci 0000:00:15.2: BAR 0: assigned [mem 0x4202112000-0x4202112fff 64bit]
>   pci 0000:00:19.0: BAR 0: assigned [mem 0x4202113000-0x4202113fff 64bit]
>   pci 0000:00:1f.5: BAR 0: no space for [mem size 0x00001000]
>   E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1868899     Lenovo IdeaPad 3 15IIL05
>   DMI: LENOVO 81WE/LNVNB161216, BIOS EMCN44WW 12/23/2020
>   BIOS-e820:                         [mem 0x4bc50000-0xcfffffff] reserved
>   pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
>   pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
>   pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
>   E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
>   With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
>   allocate).
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1871793     Lenovo IdeaPad 5 14IIL05
>   DMI: LENOVO 81YH/LNVNB161216, BIOS DSCN22WW(V1.08) 03/06/2020
>   BIOS-e820:                         [mem 0x5bc50000-0xcfffffff] reserved
>   pci_bus 0000:00: root bus resource [mem 0x6d400000-0xbfffffff window]
>   pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
>   pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
>   E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
>   With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
>   allocate).
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=2029207     Lenovo X1 Carbon (20A7)
>   DMI: LENOVO 20A7008CUK/20A7008CUK, BIOS GRET63WW (1.40 ) 03/27/2020
>   BIOS-e820:                         [mem 0xdceff000-0xdfa0ffff] reserved
>   pci_bus 0000:00: root bus resource [mem 0xdfa00000-0xfebfffff window]
>   pci 0000:00:1c.0: BAR 14: assigned [mem 0xdfa00000-0xdfbfffff]
>   E820 covers part of _CRS window, 4dc2287c1805 clips that out.
>   Without 4dc2287c1805, doesn't wake from suspend because
>   [mem 0xdfa00000-0xdfa0ffff] unusable.
> 
> https://bugs.launchpad.net/bugs/1878279 Lenovo IdeaPad 5 14IIL05
>   DMI: LENOVO 81YH/LNVNB161216, BIOS DSCN23WW(V1.09) 03/25/2020
>   BIOS-e820:                         [mem 0x5bc50000-0xcfffffff] reserved
>   pci_bus 0000:00: root bus resource [mem 0x6d400000-0xbfffffff window]
>   pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
>   pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
>   E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
>   With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
>   allocate).
> 
> https://bugs.launchpad.net/bugs/1880172 Lenovo IdeaPad 3 14IIL05 Core i3-1005G1 (81WD004MGE)
>   DMI: LENOVO 81WD/LNVNB161216, BIOS EMCN13WW 03/06/2020
>   BIOS-e820:                         [mem 0x6b000000-0xcfffffff] reserved
>   pci_bus 0000:00: root bus resource [mem 0x70400000-0xbfffffff window]
>   pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
>   pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
>   E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
>   With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
>   allocate).
> 
> https://bugs.launchpad.net/bugs/1884232 Acer Spin SP513-54N
>   DMI: Acer Spin SP513-54N/Caboom_IL, BIOS V1.00 02/21/2020
>   BIOS-e820:                         [mem 0x39e00000-0xcfffffff] reserved
>   pci_bus 0000:00: root bus resource [mem 0x3f800000-0xbfffffff window]
>   pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
>   pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
>   E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
>   With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
>   allocate).
> 
> https://bugs.launchpad.net/bugs/1921649 Lenovo IdeaPad S145 82DJ0000BR
>   DMI: LENOVO 82DJ/LNVNB161216, BIOS DKCN48WW 07/22/2020
>   BIOS-e820:                         [mem 0x6b000000-0xcfffffff] reserved
>   pci_bus 0000:00: root bus resource [mem 0x70400000-0xbfffffff window]
>   pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
>   pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
>   E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
>   With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
>   allocate).
> 
> https://bugs.launchpad.net/bugs/1931715 Lenovo IdeaPad S145 82DJ0001BR
>   DMI: LENOVO 82DJ/LNVNB161216, BIOS DKCN51WW 12/23/2020
>   BIOS-e820:                         [mem 0x6b000000-0xcfffffff] reserved
>   pci_bus 0000:00: root bus resource [mem 0x70400000-0xbfffffff window]
>   pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
>   E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
>   With 4dc2287c1805, touchpad broken, no dmesg from broken case.
> 
> https://bugs.launchpad.net/bugs/1932069 Lenovo BS145-15IIL
>   DMI: LENOVO 82HB/LNVNB161216, BIOS DKCN51WW 12/23/2020
>   BIOS-e820:                         [mem 0x4bc50000-0xcfffffff] reserved
>   pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
>   pci 0000:00:15.0: reg 0x10: [mem 0x00000000-0x00000fff 64bit]
>   pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit]
>   E820 covers entire _CRS window.  4dc2287c1805 clips it all out.
>   With 4dc2287c1805, touchpad broken (BIOS didn't assign, no space to
>   allocate).
> 
> https://lore.kernel.org/r/4e9fca2f-0af1-3684-6c97-4c35befd5019@redhat.com
> Google internal report b/148759816
>   Chromebook asus-C523NA-A20057-coral
>   BIOS-e820:                         [mem 0x7b000000-0x7fffffff] reserved
>   BIOS-e820:                         [mem 0xd0000000-0xd0ffffff]
>   acpi PNP0A08:00: ... [mem 0x7b800000-0x7fffffff window] ...
>   acpi PNP0A08:00: ... [mem 0x80000000-0xe0000000 window] ...
>   pci_bus 0000:00: root bus resource [mem 0x7b800000-0xe0000000 window]
>   E820 covers [mem 0x7b800000-0x7fffffff] of _CRS window.
>   E820 punches [mem 0xd0000000-0xd0ffffff] hole for hidden device.
>   4dc2287c1805 clips these out.  Without 4dc2287c1805, boot fails.
>   Hidden device is a landmine unless we trim it out.
> 
>   But 4c5e242d3e93 ("x86/PCI: Clip only host bridge windows for E820
>   regions") changes the time at which clipping is done.
> 
>   If clipping preserves resources *completely* covered by E820, and we
>   clip *before* coalescing, [mem 0x7b800000-0x7fffffff window] is
>   preserved and [mem 0xd0000000-0xd0ffffff] is clipped out, resulting
>   in [mem 0x7b800000-0xcfffffff window], where 0x7b800000-0x7fffffff is
>   not usable.
> 
>   If clipping preserves resources *completely* covered by E820, and we
>   clip *after* coalescing, we clip out [mem 0x7b000000-0x7fffffff]
>   and [mem 0xd0000000-0xd0ffffff], resulting in [mem
>   0x80000000-0xcfffffff window], which is usable.
> 
> Chromebook hp-x360-12b-n4000-octopus similar to asus-C523NA-A20057-coral
> 
> Summary:
>   These are broken by 4dc2287c1805:
>     https://bugzilla.kernel.org/show_bug.cgi?id=206459  01/09/2020 LENOVO
>     https://bugzilla.redhat.com/show_bug.cgi?id=1868899 12/23/2020 LENOVO
>     https://bugzilla.redhat.com/show_bug.cgi?id=1871793 03/06/2020 LENOVO
>     https://bugs.launchpad.net/bugs/1878279             03/25/2020 LENOVO
>     https://bugs.launchpad.net/bugs/1880172             03/06/2020 LENOVO
>     https://bugs.launchpad.net/bugs/1884232             02/21/2020 Acer
>     https://bugs.launchpad.net/bugs/1921649             07/22/2020 LENOVO
>     https://bugs.launchpad.net/bugs/1931715             12/23/2020 LENOVO
>     https://bugs.launchpad.net/bugs/1932069             12/23/2020 LENOVO
>       For all of the above, E820 covers entire _CRS window.  Trimming
>       the entire window means we can't assign any MMIO space.
> 
>   These require 4dc2287c1805:
>     https://bugzilla.kernel.org/show_bug.cgi?id=16228   03/09/2009 Dell
>     https://bugzilla.redhat.com/show_bug.cgi?id=2029207 03/27/2020 LENOVO
>     b/148759816 Chromebooks
>       For the above, E820 covers part of _CRS window.  Failure to trim
>       that part means we assign MMIO space that doesn't work.
> 

I believe that this summary is correct, except that the broken by
4dc2287c1805 list needs the TUXEDO / Clevo:

    https://bugzilla.kernel.org/show_bug.cgi?id=214259 02/01/2021 TUXEDO (Clevo)

Regards,

Hans
Bjorn Helgaas May 19, 2022, 2:14 p.m. UTC | #10
On Thu, May 19, 2022 at 04:01:48PM +0200, Hans de Goede wrote:

> Ok, I'll go and prepare a v9 and I will submit that later today.

Would it be practical to split into three patches?

  1) Add command-line args
  2) Add DMI quirks
  3) Add date check

It seems easier to assimilate and document in smaller pieces, if
that's possible.

> I believe that this summary is correct, except that the broken by
> 4dc2287c1805 list needs the TUXEDO / Clevo:
> 
>     https://bugzilla.kernel.org/show_bug.cgi?id=214259 02/01/2021 TUXEDO (Clevo)

Right, thanks!

Bjorn
Hans de Goede May 19, 2022, 2:29 p.m. UTC | #11
Hi,

On 5/19/22 16:14, Bjorn Helgaas wrote:
> On Thu, May 19, 2022 at 04:01:48PM +0200, Hans de Goede wrote:
> 
>> Ok, I'll go and prepare a v9 and I will submit that later today.
> 
> Would it be practical to split into three patches?
> 
>   1) Add command-line args
>   2) Add DMI quirks
>   3) Add date check
> 
> It seems easier to assimilate and document in smaller pieces, if
> that's possible.

Ack, will do. Note this will cause quite a bit of copy/paste
in the commit msg to explain why these changes are necessary.

Regards,

Hans
Bjorn Helgaas May 19, 2022, 2:49 p.m. UTC | #12
On Thu, May 19, 2022 at 04:29:43PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 5/19/22 16:14, Bjorn Helgaas wrote:
> > On Thu, May 19, 2022 at 04:01:48PM +0200, Hans de Goede wrote:
> > 
> >> Ok, I'll go and prepare a v9 and I will submit that later today.
> > 
> > Would it be practical to split into three patches?
> > 
> >   1) Add command-line args
> >   2) Add DMI quirks
> >   3) Add date check
> > 
> > It seems easier to assimilate and document in smaller pieces, if
> > that's possible.
> 
> Ack, will do. Note this will cause quite a bit of copy/paste
> in the commit msg to explain why these changes are necessary.

OK, if the repetition gets excessive I can squash them back
together.  Hopefully the main explanation can go in the first patch,
the second can just mention the fact that these machines need the
exception, and the third can focus on the plan for the future.
Hans de Goede May 19, 2022, 3:06 p.m. UTC | #13
Hi Bjorn,

On 5/19/22 16:49, Bjorn Helgaas wrote:
> On Thu, May 19, 2022 at 04:29:43PM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 5/19/22 16:14, Bjorn Helgaas wrote:
>>> On Thu, May 19, 2022 at 04:01:48PM +0200, Hans de Goede wrote:
>>>
>>>> Ok, I'll go and prepare a v9 and I will submit that later today.
>>>
>>> Would it be practical to split into three patches?
>>>
>>>   1) Add command-line args
>>>   2) Add DMI quirks
>>>   3) Add date check
>>>
>>> It seems easier to assimilate and document in smaller pieces, if
>>> that's possible.
>>
>> Ack, will do. Note this will cause quite a bit of copy/paste
>> in the commit msg to explain why these changes are necessary.
> 
> OK, if the repetition gets excessive I can squash them back
> together.  Hopefully the main explanation can go in the first patch,
> the second can just mention the fact that these machines need the
> exception, and the third can focus on the plan for the future.

I'm almost done with prepping v9 and atm there is a 17 line
introduction of the problem which is shared between all 3
patches in the commit msg.

I personally don't think this is too bad, but feel free to
shorten it a bit in patch 2 + 3 before merging these.

I think the split makes sense, so I would prefer you amending
the commit msg over squashing them back together again.

Regards,

Hans

Patch

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3f1cc5e317ed..2477b639d5c4 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4066,6 +4066,15 @@ 
 				please report a bug.
 		nocrs		[X86] Ignore PCI host bridge windows from ACPI.
 				If you need to use this, please report a bug.
+		use_e820	[X86] Use E820 reservations to exclude parts of
+				PCI host bridge windows. This is a workaround
+				for BIOS defects in host bridge _CRS methods.
+				If you need to use this, please report a bug to
+				<linux-pci@vger.kernel.org>.
+		no_e820		[X86] Ignore E820 reservations for PCI host
+				bridge windows. This is the default on modern
+				hardware. If you need to use this, please report
+				a bug to <linux-pci@vger.kernel.org>.
 		routeirq	Do IRQ routing for all PCI devices.
 				This is normally done in pci_enable_device(),
 				so this option is a temporary workaround
diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h
index a0627dfae541..ce3fd3311772 100644
--- a/arch/x86/include/asm/pci_x86.h
+++ b/arch/x86/include/asm/pci_x86.h
@@ -42,6 +42,8 @@  do {						\
 #define PCI_ROOT_NO_CRS		0x100000
 #define PCI_NOASSIGN_BARS	0x200000
 #define PCI_BIG_ROOT_WINDOW	0x400000
+#define PCI_USE_E820		0x800000
+#define PCI_NO_E820		0x1000000
 
 extern unsigned int pci_probe;
 extern unsigned long pirq_table_addr;
diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index 562c81a51ea0..749f175c0fb7 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -22,6 +22,7 @@  struct pci_root_info {
 
 static bool pci_use_crs = true;
 static bool pci_ignore_seg;
+static bool pci_use_e820;
 
 static int __init set_use_crs(const struct dmi_system_id *id)
 {
@@ -42,6 +43,12 @@  static int __init set_ignore_seg(const struct dmi_system_id *id)
 	return 0;
 }
 
+static int __init set_use_e820(const struct dmi_system_id *id)
+{
+	pci_use_e820 = true;
+	return 0;
+}
+
 static const struct dmi_system_id pci_crs_quirks[] __initconst = {
 	/* http://bugzilla.kernel.org/show_bug.cgi?id=14183 */
 	{
@@ -136,6 +143,42 @@  static const struct dmi_system_id pci_crs_quirks[] __initconst = {
 			DMI_MATCH(DMI_PRODUCT_NAME, "HP xw9300 Workstation"),
 		},
 	},
+
+	/*
+	 * Asus-C523NA / Google Coral Chromebook needs the following E820 clips:
+	 * clipped [mem 0x000a0000-0x000bffff window] to [mem 0x00100000-0x000bffff window]
+	 *  for e820 entry [mem 0x000a0000-0x000fffff]
+	 * clipped [mem 0x7b800000-0x7fffffff window] to [mem 0x80000000-0x7fffffff window]
+	 *  for e820 entry [mem 0x7b000000-0x7fffffff]
+	 * clipped [mem 0x80000000-0xe0000000 window] to [mem 0x80000000-0xcfffffff window]
+	 *  for e820 entry [mem 0xd0000000-0xd0ffffff]
+	 * Otherwise the system does not boot. Note the first 2 clips completely cover
+	 * the windows.
+	 */
+	{
+		.callback = set_use_e820,
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Google"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Coral"),
+		},
+	},
+
+	/*
+	 * Lenovo ThinkPad X1 Carbon 2nd needs the following E820 clips:
+	 * clipped [mem 0xdfa00000-0xfebfffff window] to [mem 0xdfa10000-0xfebfffff window]
+	 *  for e820 entry [mem 0xdceff000-0xdfa0ffff]
+	 * clipped [mem 0xdfa10000-0xfebfffff window] to [mem 0xdfa10000-0xf7ffffff window]
+	 *  for e820 entry [mem 0xf8000000-0xfbffffff]
+	 * Otherwise the system does not resume from suspend.
+	 * https://bugzilla.redhat.com/show_bug.cgi?id=2029207
+	 */
+	{
+		.callback = set_use_e820,
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+			DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad X1 Carbon 2nd"),
+		},
+	},
 	{}
 };
 
@@ -146,6 +189,22 @@  void __init pci_acpi_crs_quirks(void)
 	if (year >= 0 && year < 2008 && iomem_resource.end <= 0xffffffff)
 		pci_use_crs = false;
 
+	/*
+	 * Some BIOS-es contain bugs where they add addresses which are already
+	 * used in some other manner to the PCI host bridge window returned by
+	 * the ACPI _CRS method. To avoid this, Linux has by default excluded
+	 * E820 reservations when allocating addresses since 2010.
+	 * In 2019 some systems showed up with E820 reservations which cover
+	 * the entire _CRS returned PCI host bridge window, causing all attempts
+	 * to assign memory to PCI BARs to fail if Linux uses E820 reservations.
+	 *
+	 * Ideally Linux would fully stop using E820 reservations, but then
+	 * various old systems would regress. Instead keep the old behavior for
+	 * old systems and for known-broken newer systems in pci_crs_quirks.
+	 */
+	if (year >= 0 && year < 2018)
+		pci_use_e820 = true;
+
 	dmi_check_system(pci_crs_quirks);
 
 	/*
@@ -161,6 +220,15 @@  void __init pci_acpi_crs_quirks(void)
 	       "if necessary, use \"pci=%s\" and report a bug\n",
 	       pci_use_crs ? "Using" : "Ignoring",
 	       pci_use_crs ? "nocrs" : "use_crs");
+
+	/* "pci=use_e820"/"pci=no_e820" on the kernel cmdline take precedence */
+	if (pci_probe & PCI_NO_E820)
+		pci_use_e820 = false;
+	else if (pci_probe & PCI_USE_E820)
+		pci_use_e820 = true;
+
+	printk(KERN_INFO "PCI: %s E820 reservations for host bridge windows\n",
+	       pci_use_e820 ? "Using" : "Ignoring");
 }
 
 #ifdef	CONFIG_PCI_MMCONFIG
@@ -301,8 +369,10 @@  static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci)
 
 	status = acpi_pci_probe_root_resources(ci);
 
-	resource_list_for_each_entry(entry, &ci->resources)
-		remove_e820_regions(&device->dev, entry->res);
+	if (pci_use_e820) {
+		resource_list_for_each_entry(entry, &ci->resources)
+			remove_e820_regions(&device->dev, entry->res);
+	}
 
 	if (pci_use_crs) {
 		resource_list_for_each_entry_safe(entry, tmp, &ci->resources)
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 9e1e6b8d8876..7e6f79aab6a8 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -595,6 +595,12 @@  char *__init pcibios_setup(char *str)
 	} else if (!strcmp(str, "nocrs")) {
 		pci_probe |= PCI_ROOT_NO_CRS;
 		return NULL;
+	} else if (!strcmp(str, "use_e820")) {
+		pci_probe |= PCI_USE_E820;
+		return NULL;
+	} else if (!strcmp(str, "no_e820")) {
+		pci_probe |= PCI_NO_E820;
+		return NULL;
 #ifdef CONFIG_PHYS_ADDR_T_64BIT
 	} else if (!strcmp(str, "big_root_window")) {
 		pci_probe |= PCI_BIG_ROOT_WINDOW;