diff mbox

[6/6] x86/PCI: quirk Thunderbolt PCI-to-PCI bridges

Message ID 1372177330-28013-7-git-send-email-mika.westerberg@linux.intel.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Mika Westerberg June 25, 2013, 4:22 p.m. UTC
Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
This means that the BIOS will allocate bridge resources based on some
assumptions of a maximum Thunderbolt chain. It also disables native PCIe
hotplug of the root port where the Thunderbolt host router is connected.

In order to support this we must make sure that the kernel does not try to
be too smart and resize / open the bridge windows during PCI enumeration.
For example by default the kernel will add certain amount of space to the
bridge memory/io windows (this is configurable via pci=hp[mem|io]size=xxx
command line option). Eventually we run out of space that the BIOS has
allocated.

Also address space for expansion ROMs should not be allocated (BIOS does
not execute them for Thunderbolt endpoints). If we don't prevent this the
kernel might find expansion ROM associated with some endpoint and reopen
the bridge window which the BIOS already closed leading again resource
exhaustion.

Fix this by adding a quirk that matches known Thunderbolt PCI-to-PCI
bridges and in that case prevents allocation of expansion ROM resources and
makes sure that the PCI core does not increase size of bridge windows.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
---
 arch/x86/pci/fixup.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

Comments

Jesse Barnes June 25, 2013, 9:15 p.m. UTC | #1
On Tue, 25 Jun 2013 19:22:10 +0300
Mika Westerberg <mika.westerberg@linux.intel.com> wrote:

> +	if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> +		pr_info("Thunderbolt host router detected disabling ROMs\n");
> +		pci_probe |= PCI_NOASSIGN_ROMS;
> +	}

I wonder if this should just be the default on x86?  Or do we allocate
ROM space to address some other platform where we need it and the BIOS
doesn't do it for the devices we care about?
Mika Westerberg June 26, 2013, 12:17 p.m. UTC | #2
On Tue, Jun 25, 2013 at 02:15:56PM -0700, Jesse Barnes wrote:
> On Tue, 25 Jun 2013 19:22:10 +0300
> Mika Westerberg <mika.westerberg@linux.intel.com> wrote:
> 
> > +	if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> > +		pr_info("Thunderbolt host router detected disabling ROMs\n");
> > +		pci_probe |= PCI_NOASSIGN_ROMS;
> > +	}
> 
> I wonder if this should just be the default on x86?  Or do we allocate
> ROM space to address some other platform where we need it and the BIOS
> doesn't do it for the devices we care about?

Good question. In our case it definitely helps to have pci=norom the
default. Can't tell if it might break something that depends on the current
behaviour.

Bjorn, Greg, Rafael,

What do you think?
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Greg Kroah-Hartman June 26, 2013, 3:04 p.m. UTC | #3
On Wed, Jun 26, 2013 at 03:17:57PM +0300, Mika Westerberg wrote:
> On Tue, Jun 25, 2013 at 02:15:56PM -0700, Jesse Barnes wrote:
> > On Tue, 25 Jun 2013 19:22:10 +0300
> > Mika Westerberg <mika.westerberg@linux.intel.com> wrote:
> > 
> > > +	if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> > > +		pr_info("Thunderbolt host router detected disabling ROMs\n");
> > > +		pci_probe |= PCI_NOASSIGN_ROMS;
> > > +	}
> > 
> > I wonder if this should just be the default on x86?  Or do we allocate
> > ROM space to address some other platform where we need it and the BIOS
> > doesn't do it for the devices we care about?
> 
> Good question. In our case it definitely helps to have pci=norom the
> default. Can't tell if it might break something that depends on the current
> behaviour.
> 
> Bjorn, Greg, Rafael,
> 
> What do you think?

I can't recall any specific reason to not do this, so no objection from
me, but make it a nice and small patch that can easily be reverted if
problems show up in the wild :)

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 26, 2013, 8:59 p.m. UTC | #4
[+cc Alex]

On Wed, Jun 26, 2013 at 6:17 AM, Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
> On Tue, Jun 25, 2013 at 02:15:56PM -0700, Jesse Barnes wrote:
>> On Tue, 25 Jun 2013 19:22:10 +0300
>> Mika Westerberg <mika.westerberg@linux.intel.com> wrote:
>>
>> > +   if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
>> > +           pr_info("Thunderbolt host router detected disabling ROMs\n");
>> > +           pci_probe |= PCI_NOASSIGN_ROMS;
>> > +   }
>>
>> I wonder if this should just be the default on x86?  Or do we allocate
>> ROM space to address some other platform where we need it and the BIOS
>> doesn't do it for the devices we care about?
>
> Good question. In our case it definitely helps to have pci=norom the
> default. Can't tell if it might break something that depends on the current
> behaviour.

I think the current default behavior is that if the BIOS has assigned
the ROM BAR, we keep that assignment, and if it hasn't, we allocate
MEM space for it.  And "pci=norom" means that we don't allocate MEM
space for it, even if the BIOS hasn't assigned it.

"pci=norom" is only implemented on x86.  I think most other arches
allocate MEM space for ROMs, with no way to turn that off.  PA-RISC
seems to ignore ROMs (dino_fixup_bus()), but that looks like the
exception.

I'm slightly concerned that if we make the x86 default be "never
assign space for ROMs unless the BIOS has done it," we might break
virtualized guests that need access to ROMs.  Alex?

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Williamson June 26, 2013, 10:15 p.m. UTC | #5
On Wed, 2013-06-26 at 14:59 -0600, Bjorn Helgaas wrote:
> [+cc Alex]
> 
> On Wed, Jun 26, 2013 at 6:17 AM, Mika Westerberg
> <mika.westerberg@linux.intel.com> wrote:
> > On Tue, Jun 25, 2013 at 02:15:56PM -0700, Jesse Barnes wrote:
> >> On Tue, 25 Jun 2013 19:22:10 +0300
> >> Mika Westerberg <mika.westerberg@linux.intel.com> wrote:
> >>
> >> > +   if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> >> > +           pr_info("Thunderbolt host router detected disabling ROMs\n");
> >> > +           pci_probe |= PCI_NOASSIGN_ROMS;
> >> > +   }
> >>
> >> I wonder if this should just be the default on x86?  Or do we allocate
> >> ROM space to address some other platform where we need it and the BIOS
> >> doesn't do it for the devices we care about?
> >
> > Good question. In our case it definitely helps to have pci=norom the
> > default. Can't tell if it might break something that depends on the current
> > behaviour.
> 
> I think the current default behavior is that if the BIOS has assigned
> the ROM BAR, we keep that assignment, and if it hasn't, we allocate
> MEM space for it.  And "pci=norom" means that we don't allocate MEM
> space for it, even if the BIOS hasn't assigned it.
> 
> "pci=norom" is only implemented on x86.  I think most other arches
> allocate MEM space for ROMs, with no way to turn that off.  PA-RISC
> seems to ignore ROMs (dino_fixup_bus()), but that looks like the
> exception.
>
> I'm slightly concerned that if we make the x86 default be "never
> assign space for ROMs unless the BIOS has done it," we might break
> virtualized guests that need access to ROMs.  Alex?

Yep, I'm more than slightly concerned by that too.  Whenever possible we
want to pass the ROM to the guest since it may end up being a boot
device or drivers within the guest may require it.  We can pass the ROM
to the guest from an image file, but that requires someplace from which
to dump the image, which is usually PCI sysfs.  Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 26, 2013, 10:18 p.m. UTC | #6
On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
> This means that the BIOS will allocate bridge resources based on some
> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
> hotplug of the root port where the Thunderbolt host router is connected.

The BIOS often assigns PCI bridge windows, e.g., for root ports
leading to ExpressCard slots.  I assume BIOSes make similar
assumptions there about what ExpressCards are likely to be plugged in.
 Is your BIOS assistance the same sort of thing, or is there something
additional happening at hot-add time?  (I think there might be AML
that does Thunderbolt-specific stuff at hotplug-time, but I assume
that's not related to resource assignment, because the OS owns those
resources.)

> In order to support this we must make sure that the kernel does not try to
> be too smart and resize / open the bridge windows during PCI enumeration.
> For example by default the kernel will add certain amount of space to the
> bridge memory/io windows (this is configurable via pci=hp[mem|io]size=xxx
> command line option). Eventually we run out of space that the BIOS has
> allocated.

ROMs usually aren't very big compared to regular memory BARs.  Is the
problem just that adding space the BIOS didn't plan for causes the OS
to increase the window size and bump into space assigned to a peer of
the Thunderbolt controller?

> Also address space for expansion ROMs should not be allocated (BIOS does
> not execute them for Thunderbolt endpoints). If we don't prevent this the
> kernel might find expansion ROM associated with some endpoint and reopen
> the bridge window which the BIOS already closed leading again resource
> exhaustion.

I assume the only reason we should not allocate expansion ROM space is
to avoid resource exhaustion.  If we had enough resources, allocating
ROM space wouldn't cause anything bad to happen, would it?

> Fix this by adding a quirk that matches known Thunderbolt PCI-to-PCI
> bridges and in that case prevents allocation of expansion ROM resources and
> makes sure that the PCI core does not increase size of bridge windows.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> ---
>  arch/x86/pci/fixup.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
>
> diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
> index f5809fa..924822b 100644
> --- a/arch/x86/pci/fixup.c
> +++ b/arch/x86/pci/fixup.c
> @@ -7,6 +7,8 @@
>  #include <linux/pci.h>
>  #include <linux/init.h>
>  #include <linux/vgaarb.h>
> +#include <linux/acpi.h>
> +#include <linux/pci-acpi.h>
>  #include <asm/pci_x86.h>
>
>  static void pci_fixup_i450nx(struct pci_dev *d)
> @@ -539,3 +541,52 @@ static void twinhead_reserve_killing_zone(struct pci_dev *dev)
>          }
>  }
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x27B9, twinhead_reserve_killing_zone);
> +
> +#ifdef CONFIG_ACPI
> +/*
> + * BIOS assisted Thunderbolt PCI enumeration should handle all resource
> + * allocation on behalf of OS.
> + */
> +static void quirk_thunderbolt(struct pci_dev *dev)
> +{
> +       struct acpi_pci_root *root;
> +       acpi_handle handle;
> +
> +       handle = acpi_find_root_bridge_handle(dev);
> +       if (!handle)
> +               return;
> +
> +       root = acpi_pci_find_root(handle);
> +       if (!root)
> +               return;
> +
> +       /*
> +        * Native PCIe hotplug should be disabled when BIOS assisted
> +        * hotplug is in use.
> +        */
> +       if (root->osc_control_set & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL)
> +               return;

I'm not really sure why this test is here.  I dimly recall hearing
that Thunderbolt hotplug requires some work at hotplug-time, and this
is not all public, and hence is done by AML.  But that in itself
doesn't seem related to the question of allocating ROM space.

> +       /*
> +        * Make sure that we don't allocate resources for expansion ROMs.
> +        * This may accidentally increase the size of the bridge window
> +        * causing us to run out of resources.
> +        */
> +       if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> +               pr_info("Thunderbolt host router detected disabling ROMs\n");

We have a pci_dev, so this should use dev_info().

> +               pci_probe |= PCI_NOASSIGN_ROMS;

I think you really only care about the space for ROMs on devices
connected via Thunderbolt, so it's kind of a shame that the switch is
global and affects ROMs on *all* devices.  I guess there's nothing
simple to be done about that, though.

> +       }
> +
> +       /*
> +        * Don't add anything to the BIOS allocated bridge window size for
> +        * the same reason.
> +        */
> +       dev->is_hotplug_bridge = 0;

"is_hotplug_bridge" is also used in MPS management
(pcie_find_smpss()).  Did you investigate that and make sure this is
safe?  I think the Thunderbolt controller supports hotplug, and I'm
not sure what will happen if the MPS code assumes it doesn't.

> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1513, quirk_thunderbolt);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x151a, quirk_thunderbolt);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x151b, quirk_thunderbolt);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1547, quirk_thunderbolt);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1548, quirk_thunderbolt);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1549, quirk_thunderbolt);
> +#endif
> --
> 1.8.3.1
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu June 26, 2013, 10:26 p.m. UTC | #7
On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> <mika.westerberg@linux.intel.com> wrote:
>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>> This means that the BIOS will allocate bridge resources based on some
>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>> hotplug of the root port where the Thunderbolt host router is connected.

We should not need tricks in this patch after

https://patchwork.kernel.org/patch/2766521/

[2/3] PCI / ACPI: Use boot-time resource allocation rules during hotplug

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu June 26, 2013, 10:31 p.m. UTC | #8
On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>> <mika.westerberg@linux.intel.com> wrote:
>>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>>> This means that the BIOS will allocate bridge resources based on some
>>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>>> hotplug of the root port where the Thunderbolt host router is connected.
>
> We should not need tricks in this patch after
>
> https://patchwork.kernel.org/patch/2766521/
>
> [2/3] PCI / ACPI: Use boot-time resource allocation rules during hotplug
>

BTW, Rafael, looks like that "boot-time" is not accurate here.

During acpi hotplug, firmare could do extra help for us like assign
some resources to pci device bars, so it is NOT "boot-time".

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu June 26, 2013, 10:38 p.m. UTC | #9
On Wed, Jun 26, 2013 at 3:44 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Wednesday, June 26, 2013 03:31:18 PM Yinghai Lu wrote:
>> On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>> > On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> >> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>> >> <mika.westerberg@linux.intel.com> wrote:
>> >>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>> >>> This means that the BIOS will allocate bridge resources based on some
>> >>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>> >>> hotplug of the root port where the Thunderbolt host router is connected.
>> >
>> > We should not need tricks in this patch after
>> >
>> > https://patchwork.kernel.org/patch/2766521/
>> >
>> > [2/3] PCI / ACPI: Use boot-time resource allocation rules during hotplug
>> >
>>
>> BTW, Rafael, looks like that "boot-time" is not accurate here.
>>
>> During acpi hotplug, firmare could do extra help for us like assign
>> some resources to pci device bars, so it is NOT "boot-time".
>
> Well, Linus has merged it already and besides "boot-time rules" need not
> mean "boot-time resources", right?

ok.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael Wysocki June 26, 2013, 10:44 p.m. UTC | #10
On Wednesday, June 26, 2013 03:31:18 PM Yinghai Lu wrote:
> On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> > On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> >> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> >> <mika.westerberg@linux.intel.com> wrote:
> >>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
> >>> This means that the BIOS will allocate bridge resources based on some
> >>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
> >>> hotplug of the root port where the Thunderbolt host router is connected.
> >
> > We should not need tricks in this patch after
> >
> > https://patchwork.kernel.org/patch/2766521/
> >
> > [2/3] PCI / ACPI: Use boot-time resource allocation rules during hotplug
> >
> 
> BTW, Rafael, looks like that "boot-time" is not accurate here.
> 
> During acpi hotplug, firmare could do extra help for us like assign
> some resources to pci device bars, so it is NOT "boot-time".

Well, Linus has merged it already and besides "boot-time rules" need not
mean "boot-time resources", right?

Rafael
Bjorn Helgaas June 26, 2013, 10:55 p.m. UTC | #11
On Wed, Jun 26, 2013 at 4:31 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>> On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>>> <mika.westerberg@linux.intel.com> wrote:
>>>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>>>> This means that the BIOS will allocate bridge resources based on some
>>>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>>>> hotplug of the root port where the Thunderbolt host router is connected.
> ...
> During acpi hotplug, firmare could do extra help for us like assign
> some resources to pci device bars, so it is NOT "boot-time".

Really?  How can firmware assign BARs at hotplug-time?  I mean,
obviously firmware *can* write things to the BARs before giving the
device to the OS, but how would it know what to write?  I assume the
OS owns the address space, and it can change the upstream bridge
windows or the BARs of another device on the bus at any time, subject
to the OS's own issues as far as quiescing or unbinding drivers, etc.,
but without coordinating with the BIOS.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu June 26, 2013, 11:56 p.m. UTC | #12
On Wed, Jun 26, 2013 at 3:55 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Wed, Jun 26, 2013 at 4:31 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>> On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>>> On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>>> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>>>> <mika.westerberg@linux.intel.com> wrote:
>>>>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>>>>> This means that the BIOS will allocate bridge resources based on some
>>>>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>>>>> hotplug of the root port where the Thunderbolt host router is connected.
>> ...
>> During acpi hotplug, firmare could do extra help for us like assign
>> some resources to pci device bars, so it is NOT "boot-time".
>
> Really?  How can firmware assign BARs at hotplug-time?  I mean,
> obviously firmware *can* write things to the BARs before giving the
> device to the OS, but how would it know what to write?

should be acpi code, or SMI code or even BMC firmware via sideband.

>  I assume the
> OS owns the address space, and it can change the upstream bridge
> windows or the BARs of another device on the bus at any time, subject
> to the OS's own issues as far as quiescing or unbinding drivers, etc.,
> but without coordinating with the BIOS.

for thunderbolt or dock with acpiphp, then all children devices/bridges should
not have drivers loaded yet.

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mika Westerberg June 27, 2013, 1:09 p.m. UTC | #13
On Wed, Jun 26, 2013 at 04:15:01PM -0600, Alex Williamson wrote:
> On Wed, 2013-06-26 at 14:59 -0600, Bjorn Helgaas wrote:
> > [+cc Alex]
> > 
> > On Wed, Jun 26, 2013 at 6:17 AM, Mika Westerberg
> > <mika.westerberg@linux.intel.com> wrote:
> > > On Tue, Jun 25, 2013 at 02:15:56PM -0700, Jesse Barnes wrote:
> > >> On Tue, 25 Jun 2013 19:22:10 +0300
> > >> Mika Westerberg <mika.westerberg@linux.intel.com> wrote:
> > >>
> > >> > +   if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> > >> > +           pr_info("Thunderbolt host router detected disabling ROMs\n");
> > >> > +           pci_probe |= PCI_NOASSIGN_ROMS;
> > >> > +   }
> > >>
> > >> I wonder if this should just be the default on x86?  Or do we allocate
> > >> ROM space to address some other platform where we need it and the BIOS
> > >> doesn't do it for the devices we care about?
> > >
> > > Good question. In our case it definitely helps to have pci=norom the
> > > default. Can't tell if it might break something that depends on the current
> > > behaviour.
> > 
> > I think the current default behavior is that if the BIOS has assigned
> > the ROM BAR, we keep that assignment, and if it hasn't, we allocate
> > MEM space for it.  And "pci=norom" means that we don't allocate MEM
> > space for it, even if the BIOS hasn't assigned it.
> > 
> > "pci=norom" is only implemented on x86.  I think most other arches
> > allocate MEM space for ROMs, with no way to turn that off.  PA-RISC
> > seems to ignore ROMs (dino_fixup_bus()), but that looks like the
> > exception.
> >
> > I'm slightly concerned that if we make the x86 default be "never
> > assign space for ROMs unless the BIOS has done it," we might break
> > virtualized guests that need access to ROMs.  Alex?
> 
> Yep, I'm more than slightly concerned by that too.  Whenever possible we
> want to pass the ROM to the guest since it may end up being a boot
> device or drivers within the guest may require it.  We can pass the ROM
> to the guest from an image file, but that requires someplace from which
> to dump the image, which is usually PCI sysfs.  Thanks,

OK, then I guess changing the default is out of the question. We don't want
to break things.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mika Westerberg June 27, 2013, 1:54 p.m. UTC | #14
On Wed, Jun 26, 2013 at 04:18:53PM -0600, Bjorn Helgaas wrote:
> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> <mika.westerberg@linux.intel.com> wrote:
> > Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
> > This means that the BIOS will allocate bridge resources based on some
> > assumptions of a maximum Thunderbolt chain. It also disables native PCIe
> > hotplug of the root port where the Thunderbolt host router is connected.
> 
> The BIOS often assigns PCI bridge windows, e.g., for root ports
> leading to ExpressCard slots.  I assume BIOSes make similar
> assumptions there about what ExpressCards are likely to be plugged in.
>  Is your BIOS assistance the same sort of thing, or is there something
> additional happening at hot-add time?  (I think there might be AML
> that does Thunderbolt-specific stuff at hotplug-time, but I assume
> that's not related to resource assignment, because the OS owns those
> resources.)

Yes, if I understand it right the BIOS gets SCI on hotplug, then it
enumerates and configures the devices, and finally sends an ACPI bus check
event to the OS.

> > In order to support this we must make sure that the kernel does not try to
> > be too smart and resize / open the bridge windows during PCI enumeration.
> > For example by default the kernel will add certain amount of space to the
> > bridge memory/io windows (this is configurable via pci=hp[mem|io]size=xxx
> > command line option). Eventually we run out of space that the BIOS has
> > allocated.
> 
> ROMs usually aren't very big compared to regular memory BARs.  Is the
> problem just that adding space the BIOS didn't plan for causes the OS
> to increase the window size and bump into space assigned to a peer of
> the Thunderbolt controller?

You are right.

I did some more investigation on this and the BIOS seems to close the
bridge windows if there are no devices behind the bridge or the device is
not using certain type of resource (io/pmem). However, Linux then finds out
that the bridge supports these optional windows (in pci_bridge_check_ranges())
and because is_hotplug_bridge == 1, it adds the default sizes to the bridge
window resources causing this:

[   15.753262] pci 0000:07:06.0: scanning [bus 80-80] behind bridge, pass 0
[   15.754458] pci_bus 0000:80: scanning bus
[   15.755563] pci_bus 0000:80: fixups for bus
[   15.756668] pci 0000:07:06.0: PCI bridge to [bus 80]
...
[   15.873433] pci 0000:07:06.0: BAR 7: can't assign io (size 0x1000)
[   15.874542] pci 0000:07:06.0: BAR 8: can't assign mem (size 0x200000)
[   15.875632] pci 0000:07:06.0: BAR 9: can't assign mem pref (size 0x200000)

The above bridge has all the windows closed by the BIOS.

If I have "pci=hpmemsize=0,hpiosize=0" in the kernel command line this
works.

> > Also address space for expansion ROMs should not be allocated (BIOS does
> > not execute them for Thunderbolt endpoints). If we don't prevent this the
> > kernel might find expansion ROM associated with some endpoint and reopen
> > the bridge window which the BIOS already closed leading again resource
> > exhaustion.
> 
> I assume the only reason we should not allocate expansion ROM space is
> to avoid resource exhaustion.  If we had enough resources, allocating
> ROM space wouldn't cause anything bad to happen, would it?

Right.

> > Fix this by adding a quirk that matches known Thunderbolt PCI-to-PCI
> > bridges and in that case prevents allocation of expansion ROM resources and
> > makes sure that the PCI core does not increase size of bridge windows.
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > ---
> >  arch/x86/pci/fixup.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 51 insertions(+)
> >
> > diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
> > index f5809fa..924822b 100644
> > --- a/arch/x86/pci/fixup.c
> > +++ b/arch/x86/pci/fixup.c
> > @@ -7,6 +7,8 @@
> >  #include <linux/pci.h>
> >  #include <linux/init.h>
> >  #include <linux/vgaarb.h>
> > +#include <linux/acpi.h>
> > +#include <linux/pci-acpi.h>
> >  #include <asm/pci_x86.h>
> >
> >  static void pci_fixup_i450nx(struct pci_dev *d)
> > @@ -539,3 +541,52 @@ static void twinhead_reserve_killing_zone(struct pci_dev *dev)
> >          }
> >  }
> >  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x27B9, twinhead_reserve_killing_zone);
> > +
> > +#ifdef CONFIG_ACPI
> > +/*
> > + * BIOS assisted Thunderbolt PCI enumeration should handle all resource
> > + * allocation on behalf of OS.
> > + */
> > +static void quirk_thunderbolt(struct pci_dev *dev)
> > +{
> > +       struct acpi_pci_root *root;
> > +       acpi_handle handle;
> > +
> > +       handle = acpi_find_root_bridge_handle(dev);
> > +       if (!handle)
> > +               return;
> > +
> > +       root = acpi_pci_find_root(handle);
> > +       if (!root)
> > +               return;
> > +
> > +       /*
> > +        * Native PCIe hotplug should be disabled when BIOS assisted
> > +        * hotplug is in use.
> > +        */
> > +       if (root->osc_control_set & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL)
> > +               return;
> 
> I'm not really sure why this test is here.  I dimly recall hearing
> that Thunderbolt hotplug requires some work at hotplug-time, and this
> is not all public, and hence is done by AML.  But that in itself
> doesn't seem related to the question of allocating ROM space.

The check is here because we need to check that the native PCIe hotplug is
disabled by the BIOS. Thunderbolt supports standard PCIe hotplug but some
operating systems (not including Linux) can't handle that properly so BIOS
will do that on behalf of the OS.

It is possible that some systems use native PCIe hotplug (although we
haven't seen such yet). Hence we need to check it here and not apply the
quirk in that case.

> > +       /*
> > +        * Make sure that we don't allocate resources for expansion ROMs.
> > +        * This may accidentally increase the size of the bridge window
> > +        * causing us to run out of resources.
> > +        */
> > +       if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
> > +               pr_info("Thunderbolt host router detected disabling ROMs\n");
> 
> We have a pci_dev, so this should use dev_info().

OK.

> > +               pci_probe |= PCI_NOASSIGN_ROMS;
> 
> I think you really only care about the space for ROMs on devices
> connected via Thunderbolt, so it's kind of a shame that the switch is
> global and affects ROMs on *all* devices.  I guess there's nothing
> simple to be done about that, though.

Well, now that I understand this a bit better, I think we don't need to
set the PCI_NOASSIGN_ROMS anymore...

> > +       }
> > +
> > +       /*
> > +        * Don't add anything to the BIOS allocated bridge window size for
> > +        * the same reason.
> > +        */
> > +       dev->is_hotplug_bridge = 0;

...nor do this

I think that we can get this working so that we add a new flag to struct
pci_dev, something like 'no_additional_hotplug_bus_space' and in this quirk
set that.

Then in __pci_bus_size_bridges() we do:

	pci_bridge_check_ranges(bus);
	if (bus->self->is_hotplug_bridge &&
	    !bus->self->no_additional_hotplug_bus_space) {
		additional_io_size  = pci_hotplug_io_size;
		additional_mem_size = pci_hotplug_mem_size;
	}

This should prevent the problem this patch was trying to solve. Does that
work for you?

Of course, once we do that the user is not able to change the defaults for
Thunderbolt PCI bridges anymore by passing new values from the kernel
command line. Not sure if it is needed, though.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mika Westerberg June 27, 2013, 1:58 p.m. UTC | #15
On Wed, Jun 26, 2013 at 03:26:59PM -0700, Yinghai Lu wrote:
> On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> > On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
> > <mika.westerberg@linux.intel.com> wrote:
> >> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
> >> This means that the BIOS will allocate bridge resources based on some
> >> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
> >> hotplug of the root port where the Thunderbolt host router is connected.
> 
> We should not need tricks in this patch after
> 
> https://patchwork.kernel.org/patch/2766521/
> 
> [2/3] PCI / ACPI: Use boot-time resource allocation rules during hotplug

Unfortunately that patch is not enough :-( We still need to make sure that
we don't add anything to the bridge window sizes like __pci_bus_size_bridges()
is currently doing (e.g similar to pci=hpmemsize=0,hpiosize=0 for the
Thunderbolt bridges).
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 27, 2013, 4 p.m. UTC | #16
On Wed, Jun 26, 2013 at 5:56 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Wed, Jun 26, 2013 at 3:55 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> On Wed, Jun 26, 2013 at 4:31 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>>> On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>>>> On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>>>> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>>>>> <mika.westerberg@linux.intel.com> wrote:
>>>>>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>>>>>> This means that the BIOS will allocate bridge resources based on some
>>>>>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>>>>>> hotplug of the root port where the Thunderbolt host router is connected.
>>> ...
>>> During acpi hotplug, firmare could do extra help for us like assign
>>> some resources to pci device bars, so it is NOT "boot-time".
>>
>> Really?  How can firmware assign BARs at hotplug-time?  I mean,
>> obviously firmware *can* write things to the BARs before giving the
>> device to the OS, but how would it know what to write?
>
> should be acpi code, or SMI code or even BMC firmware via sideband.

How would that code (ACPI code (by which I assume you mean AML), SMI
code, BMC firmware) know what values to write to BARs?  Is it reading
the windows of upstream bridges from config space?  Is it assuming the
OS never changes the windows of bridges upstream from the Thunderbolt
controller?

>>  I assume the
>> OS owns the address space, and it can change the upstream bridge
>> windows or the BARs of another device on the bus at any time, subject
>> to the OS's own issues as far as quiescing or unbinding drivers, etc.,
>> but without coordinating with the BIOS.
>
> for thunderbolt or dock with acpiphp, then all children devices/bridges should
> not have drivers loaded yet.

I said "upstream bridge windows or ... another device on the bus,"
i.e., I'm referring to devices other than the Thunderbolt controller.
I assume the OS is free to change resource assignments for those
non-Thunderbolt devices, and obviously those assignments affect what
is available for Thunderbolt.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mika Westerberg June 27, 2013, 4:27 p.m. UTC | #17
On Thu, Jun 27, 2013 at 04:54:05PM +0300, Mika Westerberg wrote:
> I think that we can get this working so that we add a new flag to struct
> pci_dev, something like 'no_additional_hotplug_bus_space' and in this quirk
> set that.
> 
> Then in __pci_bus_size_bridges() we do:
> 
> 	pci_bridge_check_ranges(bus);
> 	if (bus->self->is_hotplug_bridge &&
> 	    !bus->self->no_additional_hotplug_bus_space) {
> 		additional_io_size  = pci_hotplug_io_size;
> 		additional_mem_size = pci_hotplug_mem_size;
> 	}
> 
> This should prevent the problem this patch was trying to solve. Does that
> work for you?

Forget about that -- It looks like these messages are harmless:

    pcieport 0000:0a:05.0: BAR 8: can't assign mem (size 0x200000)
    pcieport 0000:0a:05.0: BAR 9: can't assign mem pref (size 0x200000)

It just means that we tried to allocate the resource but failed because the
bridge has that window closed, if I got it right. If user doesn't want to
see those, he/she can always pass 'pci=hpmensize=0,hpiosize=0' in the
kernel command line.

With pcibios_resource_survey_bus() fix, looks like this quirk is not needed
at all.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu June 27, 2013, 5:18 p.m. UTC | #18
On Thu, Jun 27, 2013 at 9:27 AM, Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
> On Thu, Jun 27, 2013 at 04:54:05PM +0300, Mika Westerberg wrote:
>> I think that we can get this working so that we add a new flag to struct
>> pci_dev, something like 'no_additional_hotplug_bus_space' and in this quirk
>> set that.
>>
>> Then in __pci_bus_size_bridges() we do:
>>
>>       pci_bridge_check_ranges(bus);
>>       if (bus->self->is_hotplug_bridge &&
>>           !bus->self->no_additional_hotplug_bus_space) {
>>               additional_io_size  = pci_hotplug_io_size;
>>               additional_mem_size = pci_hotplug_mem_size;
>>       }
>>
>> This should prevent the problem this patch was trying to solve. Does that
>> work for you?
>
> Forget about that -- It looks like these messages are harmless:
>
>     pcieport 0000:0a:05.0: BAR 8: can't assign mem (size 0x200000)
>     pcieport 0000:0a:05.0: BAR 9: can't assign mem pref (size 0x200000)
>
> It just means that we tried to allocate the resource but failed because the
> bridge has that window closed, if I got it right. If user doesn't want to
> see those, he/she can always pass 'pci=hpmensize=0,hpiosize=0' in the
> kernel command line.

Yes, that extra range for hotplug is optional, so if must+optional
fails, second try
will be must only.

>
> With pcibios_resource_survey_bus() fix, looks like this quirk is not needed
> at all.

Good.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu June 27, 2013, 5:27 p.m. UTC | #19
On Thu, Jun 27, 2013 at 9:00 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Wed, Jun 26, 2013 at 5:56 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>> On Wed, Jun 26, 2013 at 3:55 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>> On Wed, Jun 26, 2013 at 4:31 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>>>> On Wed, Jun 26, 2013 at 3:26 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>>>>> On Wed, Jun 26, 2013 at 3:18 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>>>>> On Tue, Jun 25, 2013 at 10:22 AM, Mika Westerberg
>>>>>> <mika.westerberg@linux.intel.com> wrote:
>>>>>>> Thunderbolt PCI-to-PCI bridges typically use BIOS "assisted" enumeration.
>>>>>>> This means that the BIOS will allocate bridge resources based on some
>>>>>>> assumptions of a maximum Thunderbolt chain. It also disables native PCIe
>>>>>>> hotplug of the root port where the Thunderbolt host router is connected.
>>>> ...
>>>> During acpi hotplug, firmare could do extra help for us like assign
>>>> some resources to pci device bars, so it is NOT "boot-time".
>>>
>>> Really?  How can firmware assign BARs at hotplug-time?  I mean,
>>> obviously firmware *can* write things to the BARs before giving the
>>> device to the OS, but how would it know what to write?
>>
>> should be acpi code, or SMI code or even BMC firmware via sideband.
>
> How would that code (ACPI code (by which I assume you mean AML), SMI
> code, BMC firmware) know what values to write to BARs?  Is it reading
> the windows of upstream bridges from config space?  Is it assuming the
> OS never changes the windows of bridges upstream from the Thunderbolt
> controller?

Kernel only try to change upstream bridge for pciehp during hot-add.

for acpiphp, because could have other devices already there before hot-add,
kernel does not try to expand the upstream bridge. size and assign only
to the new added devices.

>
>>>  I assume the
>>> OS owns the address space, and it can change the upstream bridge
>>> windows or the BARs of another device on the bus at any time, subject
>>> to the OS's own issues as far as quiescing or unbinding drivers, etc.,
>>> but without coordinating with the BIOS.
>>
>> for thunderbolt or dock with acpiphp, then all children devices/bridges should
>> not have drivers loaded yet.
>
> I said "upstream bridge windows or ... another device on the bus,"
> i.e., I'm referring to devices other than the Thunderbolt controller.
> I assume the OS is free to change resource assignments for those
> non-Thunderbolt devices, and obviously those assignments affect what
> is available for Thunderbolt.

ok. that upstream one can not be touched from firmware.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index f5809fa..924822b 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -7,6 +7,8 @@ 
 #include <linux/pci.h>
 #include <linux/init.h>
 #include <linux/vgaarb.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
 #include <asm/pci_x86.h>
 
 static void pci_fixup_i450nx(struct pci_dev *d)
@@ -539,3 +541,52 @@  static void twinhead_reserve_killing_zone(struct pci_dev *dev)
         }
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x27B9, twinhead_reserve_killing_zone);
+
+#ifdef CONFIG_ACPI
+/*
+ * BIOS assisted Thunderbolt PCI enumeration should handle all resource
+ * allocation on behalf of OS.
+ */
+static void quirk_thunderbolt(struct pci_dev *dev)
+{
+	struct acpi_pci_root *root;
+	acpi_handle handle;
+
+	handle = acpi_find_root_bridge_handle(dev);
+	if (!handle)
+		return;
+
+	root = acpi_pci_find_root(handle);
+	if (!root)
+		return;
+
+	/*
+	 * Native PCIe hotplug should be disabled when BIOS assisted
+	 * hotplug is in use.
+	 */
+	if (root->osc_control_set & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL)
+		return;
+
+	/*
+	 * Make sure that we don't allocate resources for expansion ROMs.
+	 * This may accidentally increase the size of the bridge window
+	 * causing us to run out of resources.
+	 */
+	if (!(pci_probe & PCI_NOASSIGN_ROMS)) {
+		pr_info("Thunderbolt host router detected disabling ROMs\n");
+		pci_probe |= PCI_NOASSIGN_ROMS;
+	}
+
+	/*
+	 * Don't add anything to the BIOS allocated bridge window size for
+	 * the same reason.
+	 */
+	dev->is_hotplug_bridge = 0;
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1513, quirk_thunderbolt);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x151a, quirk_thunderbolt);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x151b, quirk_thunderbolt);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1547, quirk_thunderbolt);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1548, quirk_thunderbolt);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1549, quirk_thunderbolt);
+#endif