diff mbox

[RFC,2/2] PCI: Handle Broadcom Vulcan quirks

Message ID 1455528398-29311-1-git-send-email-jchandra@broadcom.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Jayachandran C. Feb. 15, 2016, 9:26 a.m. UTC
Handle two quirks of the PCI controller on Broadcom's Vulcan processor.
 - Mark internal bridges so that they are skipped during DMA alias
   search.
 - Skip BAR0 resource assignment for internal bridges. The BARs
   of internal bridges should not be assigned from the mem resource
   range.

Signed-off-by: Jayachandran C <jchandra@broadcom.com>
---

Resending, last patch was missing the Signed-off-by, also fixed the
comment a bit.

JC.

 drivers/pci/quirks.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Comments

Bjorn Helgaas Feb. 15, 2016, 6:30 p.m. UTC | #1
On Mon, Feb 15, 2016 at 02:56:38PM +0530, Jayachandran C wrote:
> Handle two quirks of the PCI controller on Broadcom's Vulcan processor.
>  - Mark internal bridges so that they are skipped during DMA alias
>    search.
>  - Skip BAR0 resource assignment for internal bridges. The BARs
>    of internal bridges should not be assigned from the mem resource
>    range.
> 
> Signed-off-by: Jayachandran C <jchandra@broadcom.com>
> ---
> 
> Resending, last patch was missing the Signed-off-by, also fixed the
> comment a bit.

If you resend a patch, please increment the version number and resend
the entire series, no matter how minor the change was.  Version
numbers are free, and it's a hassle for me to sort out multiple
versions labeled with the same number.

> JC.
> 
>  drivers/pci/quirks.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 0575a1e..afc186a 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3705,6 +3705,27 @@ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias);
>  DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
>  
>  /*
> + * Two levels of bridges in Broadcom Vulcan are not real PCI or PCIe bridges.
> + * These are internal bridges and should not be used for dma alias
> + * calculations. Additionally, the BAR0 of thes bridges should not be
> + * assigned with a mem resource from linux
> + */
> +static void quirk_bridge_brcm_vulcan_internal(struct pci_dev *pdev)
> +{
> +	struct resource *r = &pdev->resource[0];
> +
> +	/* skip from alias search */
> +	pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS;
> +
> +	/* clear BAR0, should not be used from Linux */
> +	memset(r, 0, sizeof(*r));

This definitely needs some explanation.  The whole point of the
architected PCI config space header is so that generic OS code can
manage the device without having to add device-specific code.

Are you saying the register at 0x10 in config space is not actually a
BAR at all?  Or it is a BAR, but you don't think anybody should need
to use it?  BARs do not have enable bits, so no matter what value is
in the BAR, that value defines address space to which the device will
respond.  Linux needs to know about that, even if no driver actually
uses it.

> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000,
> +				quirk_bridge_brcm_vulcan_internal);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9039,
> +				quirk_bridge_brcm_vulcan_internal);
> +
> +/*
>   * Intersil/Techwell TW686[4589]-based video capture cards have an empty (zero)
>   * class code.  Fix it.
>   */
> -- 
> 1.9.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jayachandran Chandrashekaran Nair Feb. 16, 2016, 4:17 p.m. UTC | #2
On Tue, Feb 16, 2016 at 12:00 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Mon, Feb 15, 2016 at 02:56:38PM +0530, Jayachandran C wrote:
>> Handle two quirks of the PCI controller on Broadcom's Vulcan processor.
>>  - Mark internal bridges so that they are skipped during DMA alias
>>    search.
>>  - Skip BAR0 resource assignment for internal bridges. The BARs
>>    of internal bridges should not be assigned from the mem resource
>>    range.
>>
>> Signed-off-by: Jayachandran C <jchandra@broadcom.com>
>> ---
>>
>> Resending, last patch was missing the Signed-off-by, also fixed the
>> comment a bit.
>
> If you resend a patch, please increment the version number and resend
> the entire series, no matter how minor the change was.  Version
> numbers are free, and it's a hassle for me to sort out multiple
> versions labeled with the same number.

Since it was an RFC, I thought setting the in-reply-to would be sufficient.
Looks like I was mistaken, sorry for the trouble.

>>
>>  drivers/pci/quirks.c | 21 +++++++++++++++++++++
>>  1 file changed, 21 insertions(+)
>>
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 0575a1e..afc186a 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -3705,6 +3705,27 @@ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias);
>>  DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
>>
>>  /*
>> + * Two levels of bridges in Broadcom Vulcan are not real PCI or PCIe bridges.
>> + * These are internal bridges and should not be used for dma alias
>> + * calculations. Additionally, the BAR0 of thes bridges should not be
>> + * assigned with a mem resource from linux
>> + */
>> +static void quirk_bridge_brcm_vulcan_internal(struct pci_dev *pdev)
>> +{
>> +     struct resource *r = &pdev->resource[0];
>> +
>> +     /* skip from alias search */
>> +     pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS;
>> +
>> +     /* clear BAR0, should not be used from Linux */
>> +     memset(r, 0, sizeof(*r));
>
> This definitely needs some explanation.  The whole point of the
> architected PCI config space header is so that generic OS code can
> manage the device without having to add device-specific code.
>
> Are you saying the register at 0x10 in config space is not actually a
> BAR at all?  Or it is a BAR, but you don't think anybody should need
> to use it?

These is a proper BAR (64 bit pref) in the bridge's Type 1 header, but:
1. Assigning a memory resource from the pci memory window to this
   BAR causes the PCIe system to fail (this is a bug). So we cannot
   expose this BAR to standard Linux code without even more quirks
   and hacks.
2. This BAR is used for accessing debugging and private registers
   of the PCI controller, which is not useful in Linux.

So I thought it is better to handle it with a quirk to hide the BAR. The
device still works as a bridge and standard linux bridge configuration
happens correctly.

> BARs do not have enable bits, so no matter what value is
> in the BAR, that value defines address space to which the device will
> respond.  Linux needs to know about that, even if no driver actually
> uses it.

This is a good point. We might need to read the firmware setting of this
BAR and mark the physical address range reserved - but this may be
to document the value.

>> +}
>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000,
>> +                             quirk_bridge_brcm_vulcan_internal);
>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9039,
>> +                             quirk_bridge_brcm_vulcan_internal);
>> +
>> +/*
>>   * Intersil/Techwell TW686[4589]-based video capture cards have an empty (zero)
>>   * class code.  Fix it.
>>   */

Thanks,
JC.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Feb. 16, 2016, 5:14 p.m. UTC | #3
On Tue, Feb 16, 2016 at 09:47:34PM +0530, Jayachandran Chandrashekaran Nair wrote:
> On Tue, Feb 16, 2016 at 12:00 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Mon, Feb 15, 2016 at 02:56:38PM +0530, Jayachandran C wrote:
> >> Handle two quirks of the PCI controller on Broadcom's Vulcan processor.
> >>  - Mark internal bridges so that they are skipped during DMA alias
> >>    search.
> >>  - Skip BAR0 resource assignment for internal bridges. The BARs
> >>    of internal bridges should not be assigned from the mem resource
> >>    range.
> >>
> >> Signed-off-by: Jayachandran C <jchandra@broadcom.com>
> >> ---
> >>
> >> Resending, last patch was missing the Signed-off-by, also fixed the
> >> comment a bit.
> >
> > If you resend a patch, please increment the version number and resend
> > the entire series, no matter how minor the change was.  Version
> > numbers are free, and it's a hassle for me to sort out multiple
> > versions labeled with the same number.
> 
> Since it was an RFC, I thought setting the in-reply-to would be sufficient.
> Looks like I was mistaken, sorry for the trouble.
> 
> >>
> >>  drivers/pci/quirks.c | 21 +++++++++++++++++++++
> >>  1 file changed, 21 insertions(+)
> >>
> >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> >> index 0575a1e..afc186a 100644
> >> --- a/drivers/pci/quirks.c
> >> +++ b/drivers/pci/quirks.c
> >> @@ -3705,6 +3705,27 @@ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias);
> >>  DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
> >>
> >>  /*
> >> + * Two levels of bridges in Broadcom Vulcan are not real PCI or PCIe bridges.
> >> + * These are internal bridges and should not be used for dma alias
> >> + * calculations. Additionally, the BAR0 of thes bridges should not be
> >> + * assigned with a mem resource from linux
> >> + */
> >> +static void quirk_bridge_brcm_vulcan_internal(struct pci_dev *pdev)
> >> +{
> >> +     struct resource *r = &pdev->resource[0];
> >> +
> >> +     /* skip from alias search */
> >> +     pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS;
> >> +
> >> +     /* clear BAR0, should not be used from Linux */
> >> +     memset(r, 0, sizeof(*r));
> >
> > This definitely needs some explanation.  The whole point of the
> > architected PCI config space header is so that generic OS code can
> > manage the device without having to add device-specific code.
> >
> > Are you saying the register at 0x10 in config space is not actually a
> > BAR at all?  Or it is a BAR, but you don't think anybody should need
> > to use it?
> 
> These is a proper BAR (64 bit pref) in the bridge's Type 1 header, but:
> 1. Assigning a memory resource from the pci memory window to this
>    BAR causes the PCIe system to fail (this is a bug). So we cannot
>    expose this BAR to standard Linux code without even more quirks
>    and hacks.

Are you saying assigning space for this BAR exposes a Linux PCI bug?
If so, I'd like to know more about that because we should fix it.

Or does it expose a hardware bug in the Broadcom device?

> 2. This BAR is used for accessing debugging and private registers
>    of the PCI controller, which is not useful in Linux.
> 
> So I thought it is better to handle it with a quirk to hide the BAR. The
> device still works as a bridge and standard linux bridge configuration
> happens correctly.

Unfortunately, there's no way in PCI for a device to communicate that
some of the resources it requests are more valuable than others.
There's no way for it to say "I'm asking for these resources (via the
BAR), but I didn't really mean it."

> > BARs do not have enable bits, so no matter what value is
> > in the BAR, that value defines address space to which the device will
> > respond.  Linux needs to know about that, even if no driver actually
> > uses it.
> 
> This is a good point. We might need to read the firmware setting of this
> BAR and mark the physical address range reserved - but this may be
> to document the value.
> 
> >> +}
> >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000,
> >> +                             quirk_bridge_brcm_vulcan_internal);
> >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9039,
> >> +                             quirk_bridge_brcm_vulcan_internal);
> >> +
> >> +/*
> >>   * Intersil/Techwell TW686[4589]-based video capture cards have an empty (zero)
> >>   * class code.  Fix it.
> >>   */
> 
> Thanks,
> JC.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jayachandran Chandrashekaran Nair Feb. 16, 2016, 6:09 p.m. UTC | #4
On Tue, Feb 16, 2016 at 10:44 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Tue, Feb 16, 2016 at 09:47:34PM +0530, Jayachandran Chandrashekaran Nair wrote:
>> On Tue, Feb 16, 2016 at 12:00 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> > On Mon, Feb 15, 2016 at 02:56:38PM +0530, Jayachandran C wrote:
>> >> Handle two quirks of the PCI controller on Broadcom's Vulcan processor.
>> >>  - Mark internal bridges so that they are skipped during DMA alias
>> >>    search.
>> >>  - Skip BAR0 resource assignment for internal bridges. The BARs
>> >>    of internal bridges should not be assigned from the mem resource
>> >>    range.
>> >>
>> >> Signed-off-by: Jayachandran C <jchandra@broadcom.com>
>> >> ---
>> >>
>> >> Resending, last patch was missing the Signed-off-by, also fixed the
>> >> comment a bit.
>> >
>> > If you resend a patch, please increment the version number and resend
>> > the entire series, no matter how minor the change was.  Version
>> > numbers are free, and it's a hassle for me to sort out multiple
>> > versions labeled with the same number.
>>
>> Since it was an RFC, I thought setting the in-reply-to would be sufficient.
>> Looks like I was mistaken, sorry for the trouble.
>>
>> >>
>> >>  drivers/pci/quirks.c | 21 +++++++++++++++++++++
>> >>  1 file changed, 21 insertions(+)
>> >>
>> >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> >> index 0575a1e..afc186a 100644
>> >> --- a/drivers/pci/quirks.c
>> >> +++ b/drivers/pci/quirks.c
>> >> @@ -3705,6 +3705,27 @@ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias);
>> >>  DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
>> >>
>> >>  /*
>> >> + * Two levels of bridges in Broadcom Vulcan are not real PCI or PCIe bridges.
>> >> + * These are internal bridges and should not be used for dma alias
>> >> + * calculations. Additionally, the BAR0 of thes bridges should not be
>> >> + * assigned with a mem resource from linux
>> >> + */
>> >> +static void quirk_bridge_brcm_vulcan_internal(struct pci_dev *pdev)
>> >> +{
>> >> +     struct resource *r = &pdev->resource[0];
>> >> +
>> >> +     /* skip from alias search */
>> >> +     pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS;
>> >> +
>> >> +     /* clear BAR0, should not be used from Linux */
>> >> +     memset(r, 0, sizeof(*r));
>> >
>> > This definitely needs some explanation.  The whole point of the
>> > architected PCI config space header is so that generic OS code can
>> > manage the device without having to add device-specific code.
>> >
>> > Are you saying the register at 0x10 in config space is not actually a
>> > BAR at all?  Or it is a BAR, but you don't think anybody should need
>> > to use it?
>>
>> These is a proper BAR (64 bit pref) in the bridge's Type 1 header, but:
>> 1. Assigning a memory resource from the pci memory window to this
>>    BAR causes the PCIe system to fail (this is a bug). So we cannot
>>    expose this BAR to standard Linux code without even more quirks
>>    and hacks.
>
> Are you saying assigning space for this BAR exposes a Linux PCI bug?
> If so, I'd like to know more about that because we should fix it.
>
> Or does it expose a hardware bug in the Broadcom device?

This is a hardware bug (or non-compliance). The BAR cannot be
assigned from the PCI MEM range. To use it we need to program
an address outside the areas assigned to PCI to the BAR. But like
I said it is better for us to ignore the BAR in linux and then the
device acts as a normal bridge.

>
>> 2. This BAR is used for accessing debugging and private registers
>>    of the PCI controller, which is not useful in Linux.
>>
>> So I thought it is better to handle it with a quirk to hide the BAR. The
>> device still works as a bridge and standard linux bridge configuration
>> happens correctly.
>
> Unfortunately, there's no way in PCI for a device to communicate that
> some of the resources it requests are more valuable than others.
> There's no way for it to say "I'm asking for these resources (via the
> BAR), but I didn't really mean it."
>

I think hiding the buggy BAR is the best option in my view.


>> > BARs do not have enable bits, so no matter what value is
>> > in the BAR, that value defines address space to which the device will
>> > respond.  Linux needs to know about that, even if no driver actually
>> > uses it.
>>
>> This is a good point. We might need to read the firmware setting of this
>> BAR and mark the physical address range reserved - but this may be
>> to document the value.
>>
>> >> +}
>> >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000,
>> >> +                             quirk_bridge_brcm_vulcan_internal);
>> >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9039,
>> >> +                             quirk_bridge_brcm_vulcan_internal);
>> >> +
>> >> +/*
>> >>   * Intersil/Techwell TW686[4589]-based video capture cards have an empty (zero)
>> >>   * class code.  Fix it.
>> >>   */
>>

JC.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Feb. 16, 2016, 9:03 p.m. UTC | #5
[+cc Arnd, Rob, DT host bridge description questions]

On Tue, Feb 16, 2016 at 11:39:41PM +0530, Jayachandran Chandrashekaran Nair wrote:
> On Tue, Feb 16, 2016 at 10:44 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Feb 16, 2016 at 09:47:34PM +0530, Jayachandran Chandrashekaran Nair wrote:
> >> On Tue, Feb 16, 2016 at 12:00 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >> > On Mon, Feb 15, 2016 at 02:56:38PM +0530, Jayachandran C wrote:
> >> >> Handle two quirks of the PCI controller on Broadcom's Vulcan processor.
> >> >>  - Mark internal bridges so that they are skipped during DMA alias
> >> >>    search.
> >> >>  - Skip BAR0 resource assignment for internal bridges. The BARs
> >> >>    of internal bridges should not be assigned from the mem resource
> >> >>    range.
> >> >> ...
> >> >>  drivers/pci/quirks.c | 21 +++++++++++++++++++++
> >> >>  1 file changed, 21 insertions(+)
> >> >>
> >> >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> >> >> index 0575a1e..afc186a 100644
> >> >> --- a/drivers/pci/quirks.c
> >> >> +++ b/drivers/pci/quirks.c
> >> >> @@ -3705,6 +3705,27 @@ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias);
> >> >>  DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
> >> >>
> >> >>  /*
> >> >> + * Two levels of bridges in Broadcom Vulcan are not real PCI or PCIe bridges.
> >> >> + * These are internal bridges and should not be used for dma alias
> >> >> + * calculations. Additionally, the BAR0 of thes bridges should not be
> >> >> + * assigned with a mem resource from linux
> >> >> + */
> >> >> +static void quirk_bridge_brcm_vulcan_internal(struct pci_dev *pdev)
> >> >> +{
> >> >> +     struct resource *r = &pdev->resource[0];
> >> >> +
> >> >> +     /* skip from alias search */
> >> >> +     pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS;
> >> >> +
> >> >> +     /* clear BAR0, should not be used from Linux */
> >> >> +     memset(r, 0, sizeof(*r));
> >> >
> >> > This definitely needs some explanation.  The whole point of the
> >> > architected PCI config space header is so that generic OS code can
> >> > manage the device without having to add device-specific code.
> >> >
> >> > Are you saying the register at 0x10 in config space is not actually a
> >> > BAR at all?  Or it is a BAR, but you don't think anybody should need
> >> > to use it?
> >>
> >> These is a proper BAR (64 bit pref) in the bridge's Type 1 header, but:
> >> 1. Assigning a memory resource from the pci memory window to this
> >>    BAR causes the PCIe system to fail (this is a bug). So we cannot
> >>    expose this BAR to standard Linux code without even more quirks
> >>    and hacks.
> >
> > Are you saying assigning space for this BAR exposes a Linux PCI bug?
> > If so, I'd like to know more about that because we should fix it.
> >
> > Or does it expose a hardware bug in the Broadcom device?
> 
> This is a hardware bug (or non-compliance). The BAR cannot be
> assigned from the PCI MEM range. To use it we need to program
> an address outside the areas assigned to PCI to the BAR. But like
> I said it is better for us to ignore the BAR in linux and then the
> device acts as a normal bridge.

Your PCI host bridge has a window of address space that it forwards
from the primary (CPU) side to the secondary (PCI) side.  I assume
that's what you mean by the "PCI MEM range."  Apparently if the BAR is
programmed inside that window, it causes some hardware error.  That
still seems strange; there's no driver for the device, and we know the
range is in use so it won't be assigned to any other device, so Linux
should never *access* the range.

Apparently if you program the BAR *outside* the window, the hardware
error does not happen, and the private registers are accessible.  But
Linux currently doesn't know where this space is.

If we ignore the BAR in Linux, apparently the hardware error does not
happen.  But the BAR still contains some value, so this is really the
same as having the BAR outside the window, presumably because it came
out of reset that way, or maybe firmware programmed it.

It sounds to me like you should do the following:

a) Have firmware program this BAR where you want it,

b) Describe these private registers as internal PCI bridge registers,
   using a DT "reg" property,

c) Describe the host bridge window (which doesn't include the above
   registers) using the normal "ranges" property, and

d) Arrange for config writes to BAR 0 to be dropped and for reads to
   return 0, maybe by checks in your config accessors.

> >> 2. This BAR is used for accessing debugging and private registers
> >>    of the PCI controller, which is not useful in Linux.
> >>
> >> So I thought it is better to handle it with a quirk to hide the BAR. The
> >> device still works as a bridge and standard linux bridge configuration
> >> happens correctly.
> >
> > Unfortunately, there's no way in PCI for a device to communicate that
> > some of the resources it requests are more valuable than others.
> > There's no way for it to say "I'm asking for these resources (via the
> > BAR), but I didn't really mean it."
> 
> I think hiding the buggy BAR is the best option in my view.
> 
> >> > BARs do not have enable bits, so no matter what value is
> >> > in the BAR, that value defines address space to which the device will
> >> > respond.  Linux needs to know about that, even if no driver actually
> >> > uses it.
> >>
> >> This is a good point. We might need to read the firmware setting of this
> >> BAR and mark the physical address range reserved - but this may be
> >> to document the value.
> >>
> >> >> +}
> >> >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000,
> >> >> +                             quirk_bridge_brcm_vulcan_internal);
> >> >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9039,
> >> >> +                             quirk_bridge_brcm_vulcan_internal);
> >> >> +
> >> >> +/*
> >> >>   * Intersil/Techwell TW686[4589]-based video capture cards have an empty (zero)
> >> >>   * class code.  Fix it.
> >> >>   */
> >>
> 
> JC.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann Feb. 16, 2016, 9:46 p.m. UTC | #6
On Tuesday 16 February 2016 15:03:37 Bjorn Helgaas wrote:
> 
> Your PCI host bridge has a window of address space that it forwards
> from the primary (CPU) side to the secondary (PCI) side.  I assume
> that's what you mean by the "PCI MEM range."  Apparently if the BAR is
> programmed inside that window, it causes some hardware error.  That
> still seems strange; there's no driver for the device, and we know the
> range is in use so it won't be assigned to any other device, so Linux
> should never *access* the range.
> 
> Apparently if you program the BAR *outside* the window, the hardware
> error does not happen, and the private registers are accessible.  But
> Linux currently doesn't know where this space is.
> 
> If we ignore the BAR in Linux, apparently the hardware error does not
> happen.  But the BAR still contains some value, so this is really the
> same as having the BAR outside the window, presumably because it came
> out of reset that way, or maybe firmware programmed it.
> 
> It sounds to me like you should do the following:
> 
> a) Have firmware program this BAR where you want it,
> 
> b) Describe these private registers as internal PCI bridge registers,
>    using a DT "reg" property,
> 
> c) Describe the host bridge window (which doesn't include the above
>    registers) using the normal "ranges" property, and
> 
> d) Arrange for config writes to BAR 0 to be dropped and for reads to
>    return 0, maybe by checks in your config accessors.

We should be able to express this in the ranges property as
a non-relocatable range, I'm pretty sure the PCI binding allows
this, but the Linux PCI code might not honor the flag at the
moment, which can be fixed.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jayachandran Chandrashekaran Nair Feb. 17, 2016, 5:06 p.m. UTC | #7
On Wed, Feb 17, 2016 at 2:33 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> [+cc Arnd, Rob, DT host bridge description questions]
>
> On Tue, Feb 16, 2016 at 11:39:41PM +0530, Jayachandran Chandrashekaran Nair wrote:
>> On Tue, Feb 16, 2016 at 10:44 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> > On Tue, Feb 16, 2016 at 09:47:34PM +0530, Jayachandran Chandrashekaran Nair wrote:
>> >> On Tue, Feb 16, 2016 at 12:00 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> >> > On Mon, Feb 15, 2016 at 02:56:38PM +0530, Jayachandran C wrote:
>> >> >> Handle two quirks of the PCI controller on Broadcom's Vulcan processor.
>> >> >>  - Mark internal bridges so that they are skipped during DMA alias
>> >> >>    search.
>> >> >>  - Skip BAR0 resource assignment for internal bridges. The BARs
>> >> >>    of internal bridges should not be assigned from the mem resource
>> >> >>    range.
>> >> >> ...
>> >> >>  drivers/pci/quirks.c | 21 +++++++++++++++++++++
>> >> >>  1 file changed, 21 insertions(+)
>> >> >>
>> >> >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> >> >> index 0575a1e..afc186a 100644
>> >> >> --- a/drivers/pci/quirks.c
>> >> >> +++ b/drivers/pci/quirks.c
>> >> >> @@ -3705,6 +3705,27 @@ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias);
>> >> >>  DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
>> >> >>
>> >> >>  /*
>> >> >> + * Two levels of bridges in Broadcom Vulcan are not real PCI or PCIe bridges.
>> >> >> + * These are internal bridges and should not be used for dma alias
>> >> >> + * calculations. Additionally, the BAR0 of thes bridges should not be
>> >> >> + * assigned with a mem resource from linux
>> >> >> + */
>> >> >> +static void quirk_bridge_brcm_vulcan_internal(struct pci_dev *pdev)
>> >> >> +{
>> >> >> +     struct resource *r = &pdev->resource[0];
>> >> >> +
>> >> >> +     /* skip from alias search */
>> >> >> +     pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS;
>> >> >> +
>> >> >> +     /* clear BAR0, should not be used from Linux */
>> >> >> +     memset(r, 0, sizeof(*r));
>> >> >
>> >> > This definitely needs some explanation.  The whole point of the
>> >> > architected PCI config space header is so that generic OS code can
>> >> > manage the device without having to add device-specific code.
>> >> >
>> >> > Are you saying the register at 0x10 in config space is not actually a
>> >> > BAR at all?  Or it is a BAR, but you don't think anybody should need
>> >> > to use it?
>> >>
>> >> These is a proper BAR (64 bit pref) in the bridge's Type 1 header, but:
>> >> 1. Assigning a memory resource from the pci memory window to this
>> >>    BAR causes the PCIe system to fail (this is a bug). So we cannot
>> >>    expose this BAR to standard Linux code without even more quirks
>> >>    and hacks.
>> >
>> > Are you saying assigning space for this BAR exposes a Linux PCI bug?
>> > If so, I'd like to know more about that because we should fix it.
>> >
>> > Or does it expose a hardware bug in the Broadcom device?
>>
>> This is a hardware bug (or non-compliance). The BAR cannot be
>> assigned from the PCI MEM range. To use it we need to program
>> an address outside the areas assigned to PCI to the BAR. But like
>> I said it is better for us to ignore the BAR in linux and then the
>> device acts as a normal bridge.
>
> Your PCI host bridge has a window of address space that it forwards
> from the primary (CPU) side to the secondary (PCI) side.  I assume
> that's what you mean by the "PCI MEM range."  Apparently if the BAR is
> programmed inside that window, it causes some hardware error.  That
> still seems strange; there's no driver for the device, and we know the
> range is in use so it won't be assigned to any other device, so Linux
> should never *access* the range.

This is correct. The write to this bridge BAR with a address from
the host bridge window will cause a hardware issue.

> Apparently if you program the BAR *outside* the window, the hardware
> error does not happen, and the private registers are accessible.  But
> Linux currently doesn't know where this space is.
>
> If we ignore the BAR in Linux, apparently the hardware error does not
> happen.  But the BAR still contains some value, so this is really the
> same as having the BAR outside the window, presumably because it came
> out of reset that way, or maybe firmware programmed it.

Yes.

> It sounds to me like you should do the following:
>
> a) Have firmware program this BAR where you want it,
>
> b) Describe these private registers as internal PCI bridge registers,
>    using a DT "reg" property,

Two ponts here:
- We have to support ACPI as well
- There is no need to use the private registers in linux, so the
   whole thing will be pointless.

> c) Describe the host bridge window (which doesn't include the above
>    registers) using the normal "ranges" property, and

This is standard code (and ACPI works as well)..

> d) Arrange for config writes to BAR 0 to be dropped and for reads to
>    return 0, maybe by checks in your config accessors.

Here we are hiding the BAR from linux using checks in config read
write (if i understand correctly). This will need custom config accessors
for Vulcan both on device tree as well as in ACPI.

The patch (above) is trying to do it in a much much simpler way by
clearing the bar0 resource in DECLARE_PCI_FIXUP_HEADER quirk.
I am not able to see why this is not valid.

Thanks,
JC.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Feb. 18, 2016, 3:49 p.m. UTC | #8
On Wed, Feb 17, 2016 at 10:36:55PM +0530, Jayachandran Chandrashekaran Nair wrote:
> On Wed, Feb 17, 2016 at 2:33 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > [+cc Arnd, Rob, DT host bridge description questions]
> >
> > On Tue, Feb 16, 2016 at 11:39:41PM +0530, Jayachandran Chandrashekaran Nair wrote:
> >> On Tue, Feb 16, 2016 at 10:44 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >> > On Tue, Feb 16, 2016 at 09:47:34PM +0530, Jayachandran Chandrashekaran Nair wrote:
> >> >> On Tue, Feb 16, 2016 at 12:00 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >> >> > On Mon, Feb 15, 2016 at 02:56:38PM +0530, Jayachandran C wrote:
> >> >> >> Handle two quirks of the PCI controller on Broadcom's Vulcan processor.
> >> >> >>  - Mark internal bridges so that they are skipped during DMA alias
> >> >> >>    search.
> >> >> >>  - Skip BAR0 resource assignment for internal bridges. The BARs
> >> >> >>    of internal bridges should not be assigned from the mem resource
> >> >> >>    range.
> >> >> >> ...
> >> >> >>  drivers/pci/quirks.c | 21 +++++++++++++++++++++
> >> >> >>  1 file changed, 21 insertions(+)
> >> >> >>
> >> >> >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> >> >> >> index 0575a1e..afc186a 100644
> >> >> >> --- a/drivers/pci/quirks.c
> >> >> >> +++ b/drivers/pci/quirks.c
> >> >> >> @@ -3705,6 +3705,27 @@ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias);
> >> >> >>  DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
> >> >> >>
> >> >> >>  /*
> >> >> >> + * Two levels of bridges in Broadcom Vulcan are not real PCI or PCIe bridges.
> >> >> >> + * These are internal bridges and should not be used for dma alias
> >> >> >> + * calculations. Additionally, the BAR0 of thes bridges should not be
> >> >> >> + * assigned with a mem resource from linux
> >> >> >> + */
> >> >> >> +static void quirk_bridge_brcm_vulcan_internal(struct pci_dev *pdev)
> >> >> >> +{
> >> >> >> +     struct resource *r = &pdev->resource[0];
> >> >> >> +
> >> >> >> +     /* skip from alias search */
> >> >> >> +     pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS;
> >> >> >> +
> >> >> >> +     /* clear BAR0, should not be used from Linux */
> >> >> >> +     memset(r, 0, sizeof(*r));
> >> >> >
> >> >> > This definitely needs some explanation.  The whole point of the
> >> >> > architected PCI config space header is so that generic OS code can
> >> >> > manage the device without having to add device-specific code.
> >> >> >
> >> >> > Are you saying the register at 0x10 in config space is not actually a
> >> >> > BAR at all?  Or it is a BAR, but you don't think anybody should need
> >> >> > to use it?
> >> >>
> >> >> These is a proper BAR (64 bit pref) in the bridge's Type 1 header, but:
> >> >> 1. Assigning a memory resource from the pci memory window to this
> >> >>    BAR causes the PCIe system to fail (this is a bug). So we cannot
> >> >>    expose this BAR to standard Linux code without even more quirks
> >> >>    and hacks.
> >> >
> >> > Are you saying assigning space for this BAR exposes a Linux PCI bug?
> >> > If so, I'd like to know more about that because we should fix it.
> >> >
> >> > Or does it expose a hardware bug in the Broadcom device?
> >>
> >> This is a hardware bug (or non-compliance). The BAR cannot be
> >> assigned from the PCI MEM range. To use it we need to program
> >> an address outside the areas assigned to PCI to the BAR. But like
> >> I said it is better for us to ignore the BAR in linux and then the
> >> device acts as a normal bridge.
> >
> > Your PCI host bridge has a window of address space that it forwards
> > from the primary (CPU) side to the secondary (PCI) side.  I assume
> > that's what you mean by the "PCI MEM range."  Apparently if the BAR is
> > programmed inside that window, it causes some hardware error.  That
> > still seems strange; there's no driver for the device, and we know the
> > range is in use so it won't be assigned to any other device, so Linux
> > should never *access* the range.
> 
> This is correct. The write to this bridge BAR with a address from
> the host bridge window will cause a hardware issue.
> 
> > Apparently if you program the BAR *outside* the window, the hardware
> > error does not happen, and the private registers are accessible.  But
> > Linux currently doesn't know where this space is.
> >
> > If we ignore the BAR in Linux, apparently the hardware error does not
> > happen.  But the BAR still contains some value, so this is really the
> > same as having the BAR outside the window, presumably because it came
> > out of reset that way, or maybe firmware programmed it.
> 
> Yes.
> 
> > It sounds to me like you should do the following:
> >
> > a) Have firmware program this BAR where you want it,
> >
> > b) Describe these private registers as internal PCI bridge registers,
> >    using a DT "reg" property,
> 
> Two ponts here:
> - We have to support ACPI as well

ACPI can do something similar.  This register space would be described
in a _CRS method.  IIRC there is actually a bit in the _CRS descriptor
that was intended to tell you whether the resource is (a) consumed
directly by the device or (b) forwarded to downstream devices.  But I
think BIOSes didn't use that bit correctly, so it's useless, so the
_CRS method might have to be on a different device.

> - There is no need to use the private registers in linux, so the
>    whole thing will be pointless.
> 
> > c) Describe the host bridge window (which doesn't include the above
> >    registers) using the normal "ranges" property, and
> 
> This is standard code (and ACPI works as well)..
> 
> > d) Arrange for config writes to BAR 0 to be dropped and for reads to
> >    return 0, maybe by checks in your config accessors.
> 
> Here we are hiding the BAR from linux using checks in config read
> write (if i understand correctly). This will need custom config accessors
> for Vulcan both on device tree as well as in ACPI.

Config accessors are usually not specific to DT or ACPI.

> The patch (above) is trying to do it in a much much simpler way by
> clearing the bar0 resource in DECLARE_PCI_FIXUP_HEADER quirk.
> I am not able to see why this is not valid.

I think there are two problems:

1) If the device responds to any address space, Linux needs to know
about it.  If we don't know about it, we might assign that space to
something else.

2) The PCI core still reads and writes the BAR to size it, and we
don't know whether this will trigger the hardware issue.

*You* might know that neither of these is an issue today, but special
incidental knowledge like that is a maintenance problem because I
can't tell what PCI core changes might make them an issue in the
future.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jayachandran Chandrashekaran Nair Feb. 23, 2016, 2:40 p.m. UTC | #9
On Thu, Feb 18, 2016 at 9:19 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Wed, Feb 17, 2016 at 10:36:55PM +0530, Jayachandran Chandrashekaran Nair wrote:
>> On Wed, Feb 17, 2016 at 2:33 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> > [+cc Arnd, Rob, DT host bridge description questions]
>> >
>> > On Tue, Feb 16, 2016 at 11:39:41PM +0530, Jayachandran Chandrashekaran Nair wrote:
>> >> On Tue, Feb 16, 2016 at 10:44 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> >> > On Tue, Feb 16, 2016 at 09:47:34PM +0530, Jayachandran Chandrashekaran Nair wrote:
>> >> >> On Tue, Feb 16, 2016 at 12:00 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> >> >> > On Mon, Feb 15, 2016 at 02:56:38PM +0530, Jayachandran C wrote:
>> >> >> >> Handle two quirks of the PCI controller on Broadcom's Vulcan processor.
>> >> >> >>  - Mark internal bridges so that they are skipped during DMA alias
>> >> >> >>    search.
>> >> >> >>  - Skip BAR0 resource assignment for internal bridges. The BARs
>> >> >> >>    of internal bridges should not be assigned from the mem resource
>> >> >> >>    range.
>> >> >> >> ...
>> >> >> >>  drivers/pci/quirks.c | 21 +++++++++++++++++++++
>> >> >> >>  1 file changed, 21 insertions(+)
>> >> >> >>
>> >> >> >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> >> >> >> index 0575a1e..afc186a 100644
>> >> >> >> --- a/drivers/pci/quirks.c
>> >> >> >> +++ b/drivers/pci/quirks.c
>> >> >> >> @@ -3705,6 +3705,27 @@ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias);
>> >> >> >>  DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
>> >> >> >>
>> >> >> >>  /*
>> >> >> >> + * Two levels of bridges in Broadcom Vulcan are not real PCI or PCIe bridges.
>> >> >> >> + * These are internal bridges and should not be used for dma alias
>> >> >> >> + * calculations. Additionally, the BAR0 of thes bridges should not be
>> >> >> >> + * assigned with a mem resource from linux
>> >> >> >> + */
>> >> >> >> +static void quirk_bridge_brcm_vulcan_internal(struct pci_dev *pdev)
>> >> >> >> +{
>> >> >> >> +     struct resource *r = &pdev->resource[0];
>> >> >> >> +
>> >> >> >> +     /* skip from alias search */
>> >> >> >> +     pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS;
>> >> >> >> +
>> >> >> >> +     /* clear BAR0, should not be used from Linux */
>> >> >> >> +     memset(r, 0, sizeof(*r));
>> >> >> >
>> >> >> > This definitely needs some explanation.  The whole point of the
>> >> >> > architected PCI config space header is so that generic OS code can
>> >> >> > manage the device without having to add device-specific code.
>> >> >> >
>> >> >> > Are you saying the register at 0x10 in config space is not actually a
>> >> >> > BAR at all?  Or it is a BAR, but you don't think anybody should need
>> >> >> > to use it?
>> >> >>
>> >> >> These is a proper BAR (64 bit pref) in the bridge's Type 1 header, but:
>> >> >> 1. Assigning a memory resource from the pci memory window to this
>> >> >>    BAR causes the PCIe system to fail (this is a bug). So we cannot
>> >> >>    expose this BAR to standard Linux code without even more quirks
>> >> >>    and hacks.
>> >> >
>> >> > Are you saying assigning space for this BAR exposes a Linux PCI bug?
>> >> > If so, I'd like to know more about that because we should fix it.
>> >> >
>> >> > Or does it expose a hardware bug in the Broadcom device?
>> >>
>> >> This is a hardware bug (or non-compliance). The BAR cannot be
>> >> assigned from the PCI MEM range. To use it we need to program
>> >> an address outside the areas assigned to PCI to the BAR. But like
>> >> I said it is better for us to ignore the BAR in linux and then the
>> >> device acts as a normal bridge.
>> >
>> > Your PCI host bridge has a window of address space that it forwards
>> > from the primary (CPU) side to the secondary (PCI) side.  I assume
>> > that's what you mean by the "PCI MEM range."  Apparently if the BAR is
>> > programmed inside that window, it causes some hardware error.  That
>> > still seems strange; there's no driver for the device, and we know the
>> > range is in use so it won't be assigned to any other device, so Linux
>> > should never *access* the range.
>>
>> This is correct. The write to this bridge BAR with a address from
>> the host bridge window will cause a hardware issue.
>>
>> > Apparently if you program the BAR *outside* the window, the hardware
>> > error does not happen, and the private registers are accessible.  But
>> > Linux currently doesn't know where this space is.
>> >
>> > If we ignore the BAR in Linux, apparently the hardware error does not
>> > happen.  But the BAR still contains some value, so this is really the
>> > same as having the BAR outside the window, presumably because it came
>> > out of reset that way, or maybe firmware programmed it.
>>
>> Yes.
>>
>> > It sounds to me like you should do the following:
>> >
>> > a) Have firmware program this BAR where you want it,
>> >
>> > b) Describe these private registers as internal PCI bridge registers,
>> >    using a DT "reg" property,
>>
>> Two ponts here:
>> - We have to support ACPI as well
>
> ACPI can do something similar.  This register space would be described
> in a _CRS method.  IIRC there is actually a bit in the _CRS descriptor
> that was intended to tell you whether the resource is (a) consumed
> directly by the device or (b) forwarded to downstream devices.  But I
> think BIOSes didn't use that bit correctly, so it's useless, so the
> _CRS method might have to be on a different device.
>
>> - There is no need to use the private registers in linux, so the
>>    whole thing will be pointless.
>>
>> > c) Describe the host bridge window (which doesn't include the above
>> >    registers) using the normal "ranges" property, and
>>
>> This is standard code (and ACPI works as well)..
>>
>> > d) Arrange for config writes to BAR 0 to be dropped and for reads to
>> >    return 0, maybe by checks in your config accessors.
>>
>> Here we are hiding the BAR from linux using checks in config read
>> write (if i understand correctly). This will need custom config accessors
>> for Vulcan both on device tree as well as in ACPI.
>
> Config accessors are usually not specific to DT or ACPI.
>
>> The patch (above) is trying to do it in a much much simpler way by
>> clearing the bar0 resource in DECLARE_PCI_FIXUP_HEADER quirk.
>> I am not able to see why this is not valid.
>
> I think there are two problems:
>
> 1) If the device responds to any address space, Linux needs to know
> about it.  If we don't know about it, we might assign that space to
> something else.

If it is really needed I can mark the address space as reserved. And
Linux generally should not use physical address space outside the
areas specified by firmware of device tree.

> 2) The PCI core still reads and writes the BAR to size it, and we
> don't know whether this will trigger the hardware issue.

The sizing read/write is done after disabling the BAR by writing
to CMD register - I don't see why you think this will be a problem.

> *You* might know that neither of these is an issue today, but special
> incidental knowledge like that is a maintenance problem because I
> can't tell what PCI core changes might make them an issue in the
> future.

The worst case maintenance problem is that a future PCI change
might break our chip. This is not unexpected in Linux, and such
breakages can be easily fixed.

Anyway, looks like you do not like the approach, so I will go back
and see if anything else can be done.

Thanks.
JC.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Feb. 23, 2016, 3:12 p.m. UTC | #10
[+cc Andi, linux-kernel, x86 for Intel Broadwell-EP non-BAR issue]

On Tue, Feb 23, 2016 at 08:10:03PM +0530, Jayachandran Chandrashekaran Nair wrote:
> On Thu, Feb 18, 2016 at 9:19 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Wed, Feb 17, 2016 at 10:36:55PM +0530, Jayachandran Chandrashekaran Nair wrote:
> >> On Wed, Feb 17, 2016 at 2:33 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >> > [+cc Arnd, Rob, DT host bridge description questions]
> >> >
> >> > On Tue, Feb 16, 2016 at 11:39:41PM +0530, Jayachandran Chandrashekaran Nair wrote:
> >> >> On Tue, Feb 16, 2016 at 10:44 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >> >> > On Tue, Feb 16, 2016 at 09:47:34PM +0530, Jayachandran Chandrashekaran Nair wrote:
> >> >> >> On Tue, Feb 16, 2016 at 12:00 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >> >> >> > On Mon, Feb 15, 2016 at 02:56:38PM +0530, Jayachandran C wrote:
> >> >> >> >> Handle two quirks of the PCI controller on Broadcom's Vulcan processor.
> >> >> >> >>  - Mark internal bridges so that they are skipped during DMA alias
> >> >> >> >>    search.
> >> >> >> >>  - Skip BAR0 resource assignment for internal bridges. The BARs
> >> >> >> >>    of internal bridges should not be assigned from the mem resource
> >> >> >> >>    range.
> >> >> >> >> ...
> >> >> >> >>  drivers/pci/quirks.c | 21 +++++++++++++++++++++
> >> >> >> >>  1 file changed, 21 insertions(+)
> >> >> >> >>
> >> >> >> >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> >> >> >> >> index 0575a1e..afc186a 100644
> >> >> >> >> --- a/drivers/pci/quirks.c
> >> >> >> >> +++ b/drivers/pci/quirks.c
> >> >> >> >> @@ -3705,6 +3705,27 @@ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias);
> >> >> >> >>  DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
> >> >> >> >>
> >> >> >> >>  /*
> >> >> >> >> + * Two levels of bridges in Broadcom Vulcan are not real PCI or PCIe bridges.
> >> >> >> >> + * These are internal bridges and should not be used for dma alias
> >> >> >> >> + * calculations. Additionally, the BAR0 of thes bridges should not be
> >> >> >> >> + * assigned with a mem resource from linux
> >> >> >> >> + */
> >> >> >> >> +static void quirk_bridge_brcm_vulcan_internal(struct pci_dev *pdev)
> >> >> >> >> +{
> >> >> >> >> +     struct resource *r = &pdev->resource[0];
> >> >> >> >> +
> >> >> >> >> +     /* skip from alias search */
> >> >> >> >> +     pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS;
> >> >> >> >> +
> >> >> >> >> +     /* clear BAR0, should not be used from Linux */
> >> >> >> >> +     memset(r, 0, sizeof(*r));
> >> >> >> >
> >> >> >> > This definitely needs some explanation.  The whole point of the
> >> >> >> > architected PCI config space header is so that generic OS code can
> >> >> >> > manage the device without having to add device-specific code.
> >> >> >> >
> >> >> >> > Are you saying the register at 0x10 in config space is not actually a
> >> >> >> > BAR at all?  Or it is a BAR, but you don't think anybody should need
> >> >> >> > to use it?
> >> >> >>
> >> >> >> These is a proper BAR (64 bit pref) in the bridge's Type 1 header, but:
> >> >> >> 1. Assigning a memory resource from the pci memory window to this
> >> >> >>    BAR causes the PCIe system to fail (this is a bug). So we cannot
> >> >> >>    expose this BAR to standard Linux code without even more quirks
> >> >> >>    and hacks.
> >> >> >
> >> >> > Are you saying assigning space for this BAR exposes a Linux PCI bug?
> >> >> > If so, I'd like to know more about that because we should fix it.
> >> >> >
> >> >> > Or does it expose a hardware bug in the Broadcom device?
> >> >>
> >> >> This is a hardware bug (or non-compliance). The BAR cannot be
> >> >> assigned from the PCI MEM range. To use it we need to program
> >> >> an address outside the areas assigned to PCI to the BAR. But like
> >> >> I said it is better for us to ignore the BAR in linux and then the
> >> >> device acts as a normal bridge.
> >> >
> >> > Your PCI host bridge has a window of address space that it forwards
> >> > from the primary (CPU) side to the secondary (PCI) side.  I assume
> >> > that's what you mean by the "PCI MEM range."  Apparently if the BAR is
> >> > programmed inside that window, it causes some hardware error.  That
> >> > still seems strange; there's no driver for the device, and we know the
> >> > range is in use so it won't be assigned to any other device, so Linux
> >> > should never *access* the range.
> >>
> >> This is correct. The write to this bridge BAR with a address from
> >> the host bridge window will cause a hardware issue.
> >>
> >> > Apparently if you program the BAR *outside* the window, the hardware
> >> > error does not happen, and the private registers are accessible.  But
> >> > Linux currently doesn't know where this space is.
> >> >
> >> > If we ignore the BAR in Linux, apparently the hardware error does not
> >> > happen.  But the BAR still contains some value, so this is really the
> >> > same as having the BAR outside the window, presumably because it came
> >> > out of reset that way, or maybe firmware programmed it.
> >>
> >> Yes.
> >>
> >> > It sounds to me like you should do the following:
> >> >
> >> > a) Have firmware program this BAR where you want it,
> >> >
> >> > b) Describe these private registers as internal PCI bridge registers,
> >> >    using a DT "reg" property,
> >>
> >> Two ponts here:
> >> - We have to support ACPI as well
> >
> > ACPI can do something similar.  This register space would be described
> > in a _CRS method.  IIRC there is actually a bit in the _CRS descriptor
> > that was intended to tell you whether the resource is (a) consumed
> > directly by the device or (b) forwarded to downstream devices.  But I
> > think BIOSes didn't use that bit correctly, so it's useless, so the
> > _CRS method might have to be on a different device.
> >
> >> - There is no need to use the private registers in linux, so the
> >>    whole thing will be pointless.
> >>
> >> > c) Describe the host bridge window (which doesn't include the above
> >> >    registers) using the normal "ranges" property, and
> >>
> >> This is standard code (and ACPI works as well)..
> >>
> >> > d) Arrange for config writes to BAR 0 to be dropped and for reads to
> >> >    return 0, maybe by checks in your config accessors.
> >>
> >> Here we are hiding the BAR from linux using checks in config read
> >> write (if i understand correctly). This will need custom config accessors
> >> for Vulcan both on device tree as well as in ACPI.
> >
> > Config accessors are usually not specific to DT or ACPI.
> >
> >> The patch (above) is trying to do it in a much much simpler way by
> >> clearing the bar0 resource in DECLARE_PCI_FIXUP_HEADER quirk.
> >> I am not able to see why this is not valid.
> >
> > I think there are two problems:
> >
> > 1) If the device responds to any address space, Linux needs to know
> > about it.  If we don't know about it, we might assign that space to
> > something else.
> 
> If it is really needed I can mark the address space as reserved. And
> Linux generally should not use physical address space outside the
> areas specified by firmware of device tree.

That's true, at least in theory.  Historically on x86, we only avoided
areas marked as reserved.  We *should* have only used areas marked as
available for MMIO, but Linux didn't pay much attention to the _CRS
methods that tell us that.  We just assumed anything that wasn't RAM
and wasn't marked as reserved in E820 was fair game for use by PCI.

> > 2) The PCI core still reads and writes the BAR to size it, and we
> > don't know whether this will trigger the hardware issue.
> 
> The sizing read/write is done after disabling the BAR by writing
> to CMD register - I don't see why you think this will be a problem.

You're right; I forgot about that CMD enable bit.  There are a few
cases where we don't disable decoding (see mmio_always_on), the most
important being for PCI_CLASS_BRIDGE_HOST devices.  But I think your
device is a PCI_CLASS_BRIDGE_PCI device, not a BRIDGE_HOST device, so
we should disable decoding on your device.

> > *You* might know that neither of these is an issue today, but special
> > incidental knowledge like that is a maintenance problem because I
> > can't tell what PCI core changes might make them an issue in the
> > future.
> 
> The worst case maintenance problem is that a future PCI change
> might break our chip. This is not unexpected in Linux, and such
> breakages can be easily fixed.

It's certainly unexpected by *me* :)  And the burden of fixing such
breakages doesn't scale well -- it usually involves triage by somebody
other than the chip folks, e.g., me.

Andi is doing something similar for some Broadwell-EP devices that
have BARs that aren't really BARs.  I suggested custom config
accessors, which is fairly clean from the core point of view, but it's
definitely clunky on the device side.

Andi's first proposals were to use a quirk to set a flag bit that told
us not to touch the BAR.  I'm starting to think that's a cleaner
approach.  I think it would be better to put the bit(s) in the
pci_dev, so generic struct resource users wouldn't see them, set them
via early quirks, then have __pci_read_base() check them and just
leave the struct resource zeroed out.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jayachandran Chandrashekaran Nair Feb. 27, 2016, 8:14 a.m. UTC | #11
Hi Bjorn,

On Tue, Feb 23, 2016 at 8:42 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Andi, linux-kernel, x86 for Intel Broadwell-EP non-BAR issue]
>
> On Tue, Feb 23, 2016 at 08:10:03PM +0530, Jayachandran Chandrashekaran Nair wrote:
> > On Thu, Feb 18, 2016 at 9:19 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Wed, Feb 17, 2016 at 10:36:55PM +0530, Jayachandran Chandrashekaran Nair wrote:
> > >> On Wed, Feb 17, 2016 at 2:33 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > >> > [+cc Arnd, Rob, DT host bridge description questions]
> > >> >
> > >> > On Tue, Feb 16, 2016 at 11:39:41PM +0530, Jayachandran Chandrashekaran Nair wrote:
> > >> >> On Tue, Feb 16, 2016 at 10:44 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > >> >> > On Tue, Feb 16, 2016 at 09:47:34PM +0530, Jayachandran Chandrashekaran Nair wrote:
> > >> >> >> On Tue, Feb 16, 2016 at 12:00 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > >> >> >> > On Mon, Feb 15, 2016 at 02:56:38PM +0530, Jayachandran C wrote:
> > >> >> >> >> Handle two quirks of the PCI controller on Broadcom's Vulcan processor.
> > >> >> >> >>  - Mark internal bridges so that they are skipped during DMA alias
> > >> >> >> >>    search.
> > >> >> >> >>  - Skip BAR0 resource assignment for internal bridges. The BARs
> > >> >> >> >>    of internal bridges should not be assigned from the mem resource
> > >> >> >> >>    range.
> > >> >> >> >> ...
> > >> >> >> >>  drivers/pci/quirks.c | 21 +++++++++++++++++++++
> > >> >> >> >>  1 file changed, 21 insertions(+)
> > >> >> >> >>
> > >> >> >> >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > >> >> >> >> index 0575a1e..afc186a 100644
> > >> >> >> >> --- a/drivers/pci/quirks.c
> > >> >> >> >> +++ b/drivers/pci/quirks.c
> > >> >> >> >> @@ -3705,6 +3705,27 @@ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias);
> > >> >> >> >>  DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
> > >> >> >> >>
> > >> >> >> >>  /*
> > >> >> >> >> + * Two levels of bridges in Broadcom Vulcan are not real PCI or PCIe bridges.
> > >> >> >> >> + * These are internal bridges and should not be used for dma alias
> > >> >> >> >> + * calculations. Additionally, the BAR0 of thes bridges should not be
> > >> >> >> >> + * assigned with a mem resource from linux
> > >> >> >> >> + */
> > >> >> >> >> +static void quirk_bridge_brcm_vulcan_internal(struct pci_dev *pdev)
> > >> >> >> >> +{
> > >> >> >> >> +     struct resource *r = &pdev->resource[0];
> > >> >> >> >> +
> > >> >> >> >> +     /* skip from alias search */
> > >> >> >> >> +     pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS;
> > >> >> >> >> +
> > >> >> >> >> +     /* clear BAR0, should not be used from Linux */
> > >> >> >> >> +     memset(r, 0, sizeof(*r));
> > >> >> >> >
> > >> >> >> > This definitely needs some explanation.  The whole point of the
> > >> >> >> > architected PCI config space header is so that generic OS code can
> > >> >> >> > manage the device without having to add device-specific code.
> > >> >> >> >
> > >> >> >> > Are you saying the register at 0x10 in config space is not actually a
> > >> >> >> > BAR at all?  Or it is a BAR, but you don't think anybody should need
> > >> >> >> > to use it?
> > >> >> >>
> > >> >> >> These is a proper BAR (64 bit pref) in the bridge's Type 1 header, but:
> > >> >> >> 1. Assigning a memory resource from the pci memory window to this
> > >> >> >>    BAR causes the PCIe system to fail (this is a bug). So we cannot
> > >> >> >>    expose this BAR to standard Linux code without even more quirks
> > >> >> >>    and hacks.
> > >> >> >
> > >> >> > Are you saying assigning space for this BAR exposes a Linux PCI bug?
> > >> >> > If so, I'd like to know more about that because we should fix it.
> > >> >> >
> > >> >> > Or does it expose a hardware bug in the Broadcom device?
> > >> >>
> > >> >> This is a hardware bug (or non-compliance). The BAR cannot be
> > >> >> assigned from the PCI MEM range. To use it we need to program
> > >> >> an address outside the areas assigned to PCI to the BAR. But like
> > >> >> I said it is better for us to ignore the BAR in linux and then the
> > >> >> device acts as a normal bridge.
> > >> >
> > >> > Your PCI host bridge has a window of address space that it forwards
> > >> > from the primary (CPU) side to the secondary (PCI) side.  I assume
> > >> > that's what you mean by the "PCI MEM range."  Apparently if the BAR is
> > >> > programmed inside that window, it causes some hardware error.  That
> > >> > still seems strange; there's no driver for the device, and we know the
> > >> > range is in use so it won't be assigned to any other device, so Linux
> > >> > should never *access* the range.
> > >>
> > >> This is correct. The write to this bridge BAR with a address from
> > >> the host bridge window will cause a hardware issue.
> > >>
> > >> > Apparently if you program the BAR *outside* the window, the hardware
> > >> > error does not happen, and the private registers are accessible.  But
> > >> > Linux currently doesn't know where this space is.
> > >> >
> > >> > If we ignore the BAR in Linux, apparently the hardware error does not
> > >> > happen.  But the BAR still contains some value, so this is really the
> > >> > same as having the BAR outside the window, presumably because it came
> > >> > out of reset that way, or maybe firmware programmed it.
> > >>
> > >> Yes.
> > >>
> > >> > It sounds to me like you should do the following:
> > >> >
> > >> > a) Have firmware program this BAR where you want it,
> > >> >
> > >> > b) Describe these private registers as internal PCI bridge registers,
> > >> >    using a DT "reg" property,
> > >>
> > >> Two ponts here:
> > >> - We have to support ACPI as well
> > >
> > > ACPI can do something similar.  This register space would be described
> > > in a _CRS method.  IIRC there is actually a bit in the _CRS descriptor
> > > that was intended to tell you whether the resource is (a) consumed
> > > directly by the device or (b) forwarded to downstream devices.  But I
> > > think BIOSes didn't use that bit correctly, so it's useless, so the
> > > _CRS method might have to be on a different device.
> > >
> > >> - There is no need to use the private registers in linux, so the
> > >>    whole thing will be pointless.
> > >>
> > >> > c) Describe the host bridge window (which doesn't include the above
> > >> >    registers) using the normal "ranges" property, and
> > >>
> > >> This is standard code (and ACPI works as well)..
> > >>
> > >> > d) Arrange for config writes to BAR 0 to be dropped and for reads to
> > >> >    return 0, maybe by checks in your config accessors.
> > >>
> > >> Here we are hiding the BAR from linux using checks in config read
> > >> write (if i understand correctly). This will need custom config accessors
> > >> for Vulcan both on device tree as well as in ACPI.
> > >
> > > Config accessors are usually not specific to DT or ACPI.
> > >
> > >> The patch (above) is trying to do it in a much much simpler way by
> > >> clearing the bar0 resource in DECLARE_PCI_FIXUP_HEADER quirk.
> > >> I am not able to see why this is not valid.
> > >
> > > I think there are two problems:
> > >
> > > 1) If the device responds to any address space, Linux needs to know
> > > about it.  If we don't know about it, we might assign that space to
> > > something else.
> >
> > If it is really needed I can mark the address space as reserved. And
> > Linux generally should not use physical address space outside the
> > areas specified by firmware of device tree.
>
> That's true, at least in theory.  Historically on x86, we only avoided
> areas marked as reserved.  We *should* have only used areas marked as
> available for MMIO, but Linux didn't pay much attention to the _CRS
> methods that tell us that.  We just assumed anything that wasn't RAM
> and wasn't marked as reserved in E820 was fair game for use by PCI.
>
> > > 2) The PCI core still reads and writes the BAR to size it, and we
> > > don't know whether this will trigger the hardware issue.
> >
> > The sizing read/write is done after disabling the BAR by writing
> > to CMD register - I don't see why you think this will be a problem.
>
> You're right; I forgot about that CMD enable bit.  There are a few
> cases where we don't disable decoding (see mmio_always_on), the most
> important being for PCI_CLASS_BRIDGE_HOST devices.  But I think your
> device is a PCI_CLASS_BRIDGE_PCI device, not a BRIDGE_HOST device, so
> we should disable decoding on your device.
>
> > > *You* might know that neither of these is an issue today, but special
> > > incidental knowledge like that is a maintenance problem because I
> > > can't tell what PCI core changes might make them an issue in the
> > > future.
> >
> > The worst case maintenance problem is that a future PCI change
> > might break our chip. This is not unexpected in Linux, and such
> > breakages can be easily fixed.
>
> It's certainly unexpected by *me* :)  And the burden of fixing such
> breakages doesn't scale well -- it usually involves triage by somebody
> other than the chip folks, e.g., me.
>
> Andi is doing something similar for some Broadwell-EP devices that
> have BARs that aren't really BARs.  I suggested custom config
> accessors, which is fairly clean from the core point of view, but it's
> definitely clunky on the device side.
>
> Andi's first proposals were to use a quirk to set a flag bit that told
> us not to touch the BAR.  I'm starting to think that's a cleaner
> approach.  I think it would be better to put the bit(s) in the
> pci_dev, so generic struct resource users wouldn't see them, set them
> via early quirks, then have __pci_read_base() check them and just
> leave the struct resource zeroed out.

I spent some more time working with our hardware team on the nature
of this issue. It looks like this is not a problem if we setup the PCI
configuration in firmware better (this has to do with the sequence for
configuring the PCI controller in firmware).

So I will drop the part in my patch that hides the BARs from linux, since
the standard Linux resource allocation code will work Sorry for taking up
your time on this.

Thanks,
JC.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Feb. 27, 2016, 2:36 p.m. UTC | #12
On Sat, Feb 27, 2016 at 01:44:49PM +0530, Jayachandran Chandrashekaran Nair wrote:
> On Tue, Feb 23, 2016 at 8:42 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Feb 23, 2016 at 08:10:03PM +0530, Jayachandran Chandrashekaran Nair wrote:
> > > On Thu, Feb 18, 2016 at 9:19 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Wed, Feb 17, 2016 at 10:36:55PM +0530, Jayachandran Chandrashekaran Nair wrote:
> > > >> On Wed, Feb 17, 2016 at 2:33 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > >> > On Tue, Feb 16, 2016 at 11:39:41PM +0530, Jayachandran Chandrashekaran Nair wrote:
> > > >> >> On Tue, Feb 16, 2016 at 10:44 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > >> >> > On Tue, Feb 16, 2016 at 09:47:34PM +0530, Jayachandran Chandrashekaran Nair wrote:
> > > >> >> >> On Tue, Feb 16, 2016 at 12:00 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > >> >> >> > On Mon, Feb 15, 2016 at 02:56:38PM +0530, Jayachandran C wrote:
> > > >> >> >> >> Handle two quirks of the PCI controller on Broadcom's Vulcan processor.
> > > >> >> >> >>  - Mark internal bridges so that they are skipped during DMA alias
> > > >> >> >> >>    search.
> > > >> >> >> >>  - Skip BAR0 resource assignment for internal bridges. The BARs
> > > >> >> >> >>    of internal bridges should not be assigned from the mem resource
> > > >> >> >> >>    range.
> > > >> >> >> >> ...
> > > >> >> >> >>  drivers/pci/quirks.c | 21 +++++++++++++++++++++
> > > >> >> >> >>  1 file changed, 21 insertions(+)
> > > >> >> >> >>
> > > >> >> >> >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > >> >> >> >> index 0575a1e..afc186a 100644
> > > >> >> >> >> --- a/drivers/pci/quirks.c
> > > >> >> >> >> +++ b/drivers/pci/quirks.c
> > > >> >> >> >> @@ -3705,6 +3705,27 @@ DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias);
> > > >> >> >> >>  DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
> > > >> >> >> >>
> > > >> >> >> >>  /*
> > > >> >> >> >> + * Two levels of bridges in Broadcom Vulcan are not real PCI or PCIe bridges.
> > > >> >> >> >> + * These are internal bridges and should not be used for dma alias
> > > >> >> >> >> + * calculations. Additionally, the BAR0 of thes bridges should not be
> > > >> >> >> >> + * assigned with a mem resource from linux
> > > >> >> >> >> + */
> > > >> >> >> >> +static void quirk_bridge_brcm_vulcan_internal(struct pci_dev *pdev)
> > > >> >> >> >> +{
> > > >> >> >> >> +     struct resource *r = &pdev->resource[0];
> > > >> >> >> >> +
> > > >> >> >> >> +     /* skip from alias search */
> > > >> >> >> >> +     pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS;
> > > >> >> >> >> +
> > > >> >> >> >> +     /* clear BAR0, should not be used from Linux */
> > > >> >> >> >> +     memset(r, 0, sizeof(*r));
> > > >> >> >> >
> > > >> >> >> > This definitely needs some explanation.  The whole point of the
> > > >> >> >> > architected PCI config space header is so that generic OS code can
> > > >> >> >> > manage the device without having to add device-specific code.
> > > >> >> >> >
> > > >> >> >> > Are you saying the register at 0x10 in config space is not actually a
> > > >> >> >> > BAR at all?  Or it is a BAR, but you don't think anybody should need
> > > >> >> >> > to use it?
> > > >> >> >>
> > > >> >> >> These is a proper BAR (64 bit pref) in the bridge's Type 1 header, but:
> > > >> >> >> 1. Assigning a memory resource from the pci memory window to this
> > > >> >> >>    BAR causes the PCIe system to fail (this is a bug). So we cannot
> > > >> >> >>    expose this BAR to standard Linux code without even more quirks
> > > >> >> >>    and hacks.
> > > >> >> >
> > > >> >> > Are you saying assigning space for this BAR exposes a Linux PCI bug?
> > > >> >> > If so, I'd like to know more about that because we should fix it.
> > > >> >> >
> > > >> >> > Or does it expose a hardware bug in the Broadcom device?
> > > >> >>
> > > >> >> This is a hardware bug (or non-compliance). The BAR cannot be
> > > >> >> assigned from the PCI MEM range. To use it we need to program
> > > >> >> an address outside the areas assigned to PCI to the BAR. But like
> > > >> >> I said it is better for us to ignore the BAR in linux and then the
> > > >> >> device acts as a normal bridge.
> > > >> >
> > > >> > Your PCI host bridge has a window of address space that it forwards
> > > >> > from the primary (CPU) side to the secondary (PCI) side.  I assume
> > > >> > that's what you mean by the "PCI MEM range."  Apparently if the BAR is
> > > >> > programmed inside that window, it causes some hardware error.  That
> > > >> > still seems strange; there's no driver for the device, and we know the
> > > >> > range is in use so it won't be assigned to any other device, so Linux
> > > >> > should never *access* the range.
> > > >>
> > > >> This is correct. The write to this bridge BAR with a address from
> > > >> the host bridge window will cause a hardware issue.
> > > >>
> > > >> > Apparently if you program the BAR *outside* the window, the hardware
> > > >> > error does not happen, and the private registers are accessible.  But
> > > >> > Linux currently doesn't know where this space is.
> > > >> >
> > > >> > If we ignore the BAR in Linux, apparently the hardware error does not
> > > >> > happen.  But the BAR still contains some value, so this is really the
> > > >> > same as having the BAR outside the window, presumably because it came
> > > >> > out of reset that way, or maybe firmware programmed it.
> > > >>
> > > >> Yes.
> > > >>
> > > >> > It sounds to me like you should do the following:
> > > >> >
> > > >> > a) Have firmware program this BAR where you want it,
> > > >> >
> > > >> > b) Describe these private registers as internal PCI bridge registers,
> > > >> >    using a DT "reg" property,
> > > >>
> > > >> Two ponts here:
> > > >> - We have to support ACPI as well
> > > >
> > > > ACPI can do something similar.  This register space would be described
> > > > in a _CRS method.  IIRC there is actually a bit in the _CRS descriptor
> > > > that was intended to tell you whether the resource is (a) consumed
> > > > directly by the device or (b) forwarded to downstream devices.  But I
> > > > think BIOSes didn't use that bit correctly, so it's useless, so the
> > > > _CRS method might have to be on a different device.
> > > >
> > > >> - There is no need to use the private registers in linux, so the
> > > >>    whole thing will be pointless.
> > > >>
> > > >> > c) Describe the host bridge window (which doesn't include the above
> > > >> >    registers) using the normal "ranges" property, and
> > > >>
> > > >> This is standard code (and ACPI works as well)..
> > > >>
> > > >> > d) Arrange for config writes to BAR 0 to be dropped and for reads to
> > > >> >    return 0, maybe by checks in your config accessors.
> > > >>
> > > >> Here we are hiding the BAR from linux using checks in config read
> > > >> write (if i understand correctly). This will need custom config accessors
> > > >> for Vulcan both on device tree as well as in ACPI.
> > > >
> > > > Config accessors are usually not specific to DT or ACPI.
> > > >
> > > >> The patch (above) is trying to do it in a much much simpler way by
> > > >> clearing the bar0 resource in DECLARE_PCI_FIXUP_HEADER quirk.
> > > >> I am not able to see why this is not valid.
> > > >
> > > > I think there are two problems:
> > > >
> > > > 1) If the device responds to any address space, Linux needs to know
> > > > about it.  If we don't know about it, we might assign that space to
> > > > something else.
> > >
> > > If it is really needed I can mark the address space as reserved. And
> > > Linux generally should not use physical address space outside the
> > > areas specified by firmware of device tree.
> >
> > That's true, at least in theory.  Historically on x86, we only avoided
> > areas marked as reserved.  We *should* have only used areas marked as
> > available for MMIO, but Linux didn't pay much attention to the _CRS
> > methods that tell us that.  We just assumed anything that wasn't RAM
> > and wasn't marked as reserved in E820 was fair game for use by PCI.
> >
> > > > 2) The PCI core still reads and writes the BAR to size it, and we
> > > > don't know whether this will trigger the hardware issue.
> > >
> > > The sizing read/write is done after disabling the BAR by writing
> > > to CMD register - I don't see why you think this will be a problem.
> >
> > You're right; I forgot about that CMD enable bit.  There are a few
> > cases where we don't disable decoding (see mmio_always_on), the most
> > important being for PCI_CLASS_BRIDGE_HOST devices.  But I think your
> > device is a PCI_CLASS_BRIDGE_PCI device, not a BRIDGE_HOST device, so
> > we should disable decoding on your device.
> >
> > > > *You* might know that neither of these is an issue today, but special
> > > > incidental knowledge like that is a maintenance problem because I
> > > > can't tell what PCI core changes might make them an issue in the
> > > > future.
> > >
> > > The worst case maintenance problem is that a future PCI change
> > > might break our chip. This is not unexpected in Linux, and such
> > > breakages can be easily fixed.
> >
> > It's certainly unexpected by *me* :)  And the burden of fixing such
> > breakages doesn't scale well -- it usually involves triage by somebody
> > other than the chip folks, e.g., me.
> >
> > Andi is doing something similar for some Broadwell-EP devices that
> > have BARs that aren't really BARs.  I suggested custom config
> > accessors, which is fairly clean from the core point of view, but it's
> > definitely clunky on the device side.
> >
> > Andi's first proposals were to use a quirk to set a flag bit that told
> > us not to touch the BAR.  I'm starting to think that's a cleaner
> > approach.  I think it would be better to put the bit(s) in the
> > pci_dev, so generic struct resource users wouldn't see them, set them
> > via early quirks, then have __pci_read_base() check them and just
> > leave the struct resource zeroed out.
> 
> I spent some more time working with our hardware team on the nature
> of this issue. It looks like this is not a problem if we setup the PCI
> configuration in firmware better (this has to do with the sequence for
> configuring the PCI controller in firmware).
> 
> So I will drop the part in my patch that hides the BARs from linux, since
> the standard Linux resource allocation code will work Sorry for taking up
> your time on this.

No problem, this is the best possible outcome: the hardware isn't
broken, Linux isn't broken (at least in this tiny regard), we avoided
putting an unnecessary workaround into Linux, and we retained the
ability to use those debug registers if that's ever necessary.

Thanks a lot for chasing this down!

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 0575a1e..afc186a 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3705,6 +3705,27 @@  DECLARE_PCI_FIXUP_HEADER(0x1283, 0x8892, quirk_use_pcie_bridge_dma_alias);
 DECLARE_PCI_FIXUP_HEADER(0x8086, 0x244e, quirk_use_pcie_bridge_dma_alias);
 
 /*
+ * Two levels of bridges in Broadcom Vulcan are not real PCI or PCIe bridges.
+ * These are internal bridges and should not be used for dma alias
+ * calculations. Additionally, the BAR0 of thes bridges should not be
+ * assigned with a mem resource from linux
+ */
+static void quirk_bridge_brcm_vulcan_internal(struct pci_dev *pdev)
+{
+	struct resource *r = &pdev->resource[0];
+
+	/* skip from alias search */
+	pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS;
+
+	/* clear BAR0, should not be used from Linux */
+	memset(r, 0, sizeof(*r));
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000,
+				quirk_bridge_brcm_vulcan_internal);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9039,
+				quirk_bridge_brcm_vulcan_internal);
+
+/*
  * Intersil/Techwell TW686[4589]-based video capture cards have an empty (zero)
  * class code.  Fix it.
  */