Message ID | 20130528184020.3318.7800.stgit@bling.home (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Tue, 28 May 2013 12:40:20 -0600 Alex Williamson <alex.williamson@redhat.com> wrote: > Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec > and do not include a PCIe capability. pci_is_pcie() is not useful on > these devices, making it difficult to determine where we transition > from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked > bridge test for fear of confusion that a PCIe capability would be > expected, so implement it in IOMMU code. > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> This won't work for me because I run without IOMMU. The overhead of IOMMU setup per-packet significantly impacts network performance. -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2013-05-28 at 12:38 -0700, Stephen Hemminger wrote: > On Tue, 28 May 2013 12:40:20 -0600 > Alex Williamson <alex.williamson@redhat.com> wrote: > > > Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec > > and do not include a PCIe capability. pci_is_pcie() is not useful on > > these devices, making it difficult to determine where we transition > > from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked > > bridge test for fear of confusion that a PCIe capability would be > > expected, so implement it in IOMMU code. > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > This won't work for me because I run without IOMMU. > The overhead of IOMMU setup per-packet significantly impacts network performance. You were one of the reporters of this issue: http://www.spinics.net/lists/linux-pci/msg19579.html Does this not work for you or are you saying this is not relevant to you because you no longer attempt to enable the IOMMU? Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 28 May 2013 13:53:10 -0600 Alex Williamson <alex.williamson@redhat.com> wrote: > On Tue, 2013-05-28 at 12:38 -0700, Stephen Hemminger wrote: > > On Tue, 28 May 2013 12:40:20 -0600 > > Alex Williamson <alex.williamson@redhat.com> wrote: > > > > > Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec > > > and do not include a PCIe capability. pci_is_pcie() is not useful on > > > these devices, making it difficult to determine where we transition > > > from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked > > > bridge test for fear of confusion that a PCIe capability would be > > > expected, so implement it in IOMMU code. > > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > > This won't work for me because I run without IOMMU. > > The overhead of IOMMU setup per-packet significantly impacts network performance. > > You were one of the reporters of this issue: > > http://www.spinics.net/lists/linux-pci/msg19579.html > > Does this not work for you or are you saying this is not relevant to you > because you no longer attempt to enable the IOMMU? Thanks, > > Alex > For most usage, I gave up on using IOMMU/SR-IOV. -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2013-05-28 at 12:56 -0700, Stephen Hemminger wrote: > On Tue, 28 May 2013 13:53:10 -0600 > Alex Williamson <alex.williamson@redhat.com> wrote: > > > On Tue, 2013-05-28 at 12:38 -0700, Stephen Hemminger wrote: > > > On Tue, 28 May 2013 12:40:20 -0600 > > > Alex Williamson <alex.williamson@redhat.com> wrote: > > > > > > > Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec > > > > and do not include a PCIe capability. pci_is_pcie() is not useful on > > > > these devices, making it difficult to determine where we transition > > > > from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked > > > > bridge test for fear of confusion that a PCIe capability would be > > > > expected, so implement it in IOMMU code. > > > > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > > > > This won't work for me because I run without IOMMU. > > > The overhead of IOMMU setup per-packet significantly impacts network performance. > > > > You were one of the reporters of this issue: > > > > http://www.spinics.net/lists/linux-pci/msg19579.html > > > > Does this not work for you or are you saying this is not relevant to you > > because you no longer attempt to enable the IOMMU? Thanks, > > > > For most usage, I gave up on using IOMMU/SR-IOV. Ok, let's call that a "don't care" rather than "won't work" for the purposes of these patches. I'll take you off the CC if there's another version. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 28 May 2013 14:15:24 -0600 Alex Williamson <alex.williamson@redhat.com> wrote: > On Tue, 2013-05-28 at 12:56 -0700, Stephen Hemminger wrote: > > On Tue, 28 May 2013 13:53:10 -0600 > > Alex Williamson <alex.williamson@redhat.com> wrote: > > > > > On Tue, 2013-05-28 at 12:38 -0700, Stephen Hemminger wrote: > > > > On Tue, 28 May 2013 12:40:20 -0600 > > > > Alex Williamson <alex.williamson@redhat.com> wrote: > > > > > > > > > Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec > > > > > and do not include a PCIe capability. pci_is_pcie() is not useful on > > > > > these devices, making it difficult to determine where we transition > > > > > from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked > > > > > bridge test for fear of confusion that a PCIe capability would be > > > > > expected, so implement it in IOMMU code. > > > > > > > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > > > > > > This won't work for me because I run without IOMMU. > > > > The overhead of IOMMU setup per-packet significantly impacts network performance. > > > > > > You were one of the reporters of this issue: > > > > > > http://www.spinics.net/lists/linux-pci/msg19579.html > > > > > > Does this not work for you or are you saying this is not relevant to you > > > because you no longer attempt to enable the IOMMU? Thanks, > > > > > > > For most usage, I gave up on using IOMMU/SR-IOV. > > Ok, let's call that a "don't care" rather than "won't work" for the > purposes of these patches. I'll take you off the CC if there's another > version. Thanks, > > Alex > Yes, it is a don't care. but leave me on cc, because I may still need it. -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: > +bool iommu_pci_is_pcie_bridge(struct pci_dev *pdev) > +{ > + if (!pdev->subordinate) > + return false; > + > + if (pci_is_pcie(pdev)) > + return true; > + > +#ifdef CONFIG_PCI_QUIRKS > + /* > + * If we're not on the root bus, look one device upstream of the > + * current device. If that device is PCIe and is not a PCIe-to-PCI > + * bridge, then the current device is effectively PCIe as it must > + * be the PCIe-to-PCI bridge. This handles several bridges that > + * violate the PCIe spec by not exposing a PCIe capability: > + * https://bugzilla.kernel.org/show_bug.cgi?id=44881 > + */ > + if (!pci_is_root_bus(pdev->bus)) { > + struct pci_dev *parent = pdev->bus->self; > + > + if (pci_is_pcie(parent) && > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) > + return true; > + } Hmm, that looks a bit dangerous. Do we have a list of PCI vendor/device-ids of these broken bridges to match against instead of some open-coded heuristics? That would probably also help to bring this into the PCI code. Joerg -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2013-06-20 at 15:59 +0200, Joerg Roedel wrote: > On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: > > +bool iommu_pci_is_pcie_bridge(struct pci_dev *pdev) > > +{ > > + if (!pdev->subordinate) > > + return false; > > + > > + if (pci_is_pcie(pdev)) > > + return true; > > + > > +#ifdef CONFIG_PCI_QUIRKS > > + /* > > + * If we're not on the root bus, look one device upstream of the > > + * current device. If that device is PCIe and is not a PCIe-to-PCI > > + * bridge, then the current device is effectively PCIe as it must > > + * be the PCIe-to-PCI bridge. This handles several bridges that > > + * violate the PCIe spec by not exposing a PCIe capability: > > + * https://bugzilla.kernel.org/show_bug.cgi?id=44881 > > + */ > > + if (!pci_is_root_bus(pdev->bus)) { > > + struct pci_dev *parent = pdev->bus->self; > > + > > + if (pci_is_pcie(parent) && > > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) > > + return true; > > + } > > Hmm, that looks a bit dangerous. How so? The algorithm seems pretty simple and logical. > Do we have a list of PCI > vendor/device-ids of these broken bridges to match against instead of > some open-coded heuristics? > > That would probably also help to bring this into the PCI code. Actually, I believe Bjorn rejected the idea of a fixed list because this problem is detectable. He also doesn't want me messing with quirks to pci_is_pcie() in PCI because he wants a 1:1 relation between that and having a PCIe capability. So, I'm stuck and this is where it's ended up. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jun 20, 2013 at 09:44:51AM -0600, Alex Williamson wrote: > On Thu, 2013-06-20 at 15:59 +0200, Joerg Roedel wrote: > > On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: > > > + if (!pci_is_root_bus(pdev->bus)) { > > > + struct pci_dev *parent = pdev->bus->self; > > > + > > > + if (pci_is_pcie(parent) && > > > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) > > > + return true; > > > + } > > > > Hmm, that looks a bit dangerous. > > How so? The algorithm seems pretty simple and logical. It is simple, but it is still a heuristic that may fail at some point, no? > Actually, I believe Bjorn rejected the idea of a fixed list because this > problem is detectable. He also doesn't want me messing with quirks to > pci_is_pcie() in PCI because he wants a 1:1 relation between that and > having a PCIe capability. So, I'm stuck and this is where it's ended > up. Thanks, I think implementing such a list is much safer. Bjorn, why didn't you like that idea? Joerg -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jun 20, 2013 at 10:15 AM, Joerg Roedel <joro@8bytes.org> wrote: > On Thu, Jun 20, 2013 at 09:44:51AM -0600, Alex Williamson wrote: >> On Thu, 2013-06-20 at 15:59 +0200, Joerg Roedel wrote: >> > On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: >> > > + if (!pci_is_root_bus(pdev->bus)) { >> > > + struct pci_dev *parent = pdev->bus->self; >> > > + >> > > + if (pci_is_pcie(parent) && >> > > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) >> > > + return true; >> > > + } >> > >> > Hmm, that looks a bit dangerous. >> >> How so? The algorithm seems pretty simple and logical. > > It is simple, but it is still a heuristic that may fail at some point, > no? > >> Actually, I believe Bjorn rejected the idea of a fixed list because this >> problem is detectable. He also doesn't want me messing with quirks to >> pci_is_pcie() in PCI because he wants a 1:1 relation between that and >> having a PCIe capability. So, I'm stuck and this is where it's ended >> up. Thanks, > > I think implementing such a list is much safer. > > Bjorn, why didn't you like that idea? Sorry, I can't remember, and I haven't been able to find the discussion where I said that. I think the current patches are all in drivers/iommu, and if a list makes sense there, it's fine with me. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2013-06-25 at 22:20 -0600, Bjorn Helgaas wrote: > On Thu, Jun 20, 2013 at 10:15 AM, Joerg Roedel <joro@8bytes.org> wrote: > > On Thu, Jun 20, 2013 at 09:44:51AM -0600, Alex Williamson wrote: > >> On Thu, 2013-06-20 at 15:59 +0200, Joerg Roedel wrote: > >> > On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: > >> > > + if (!pci_is_root_bus(pdev->bus)) { > >> > > + struct pci_dev *parent = pdev->bus->self; > >> > > + > >> > > + if (pci_is_pcie(parent) && > >> > > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) > >> > > + return true; > >> > > + } > >> > > >> > Hmm, that looks a bit dangerous. > >> > >> How so? The algorithm seems pretty simple and logical. > > > > It is simple, but it is still a heuristic that may fail at some point, > > no? > > > >> Actually, I believe Bjorn rejected the idea of a fixed list because this > >> problem is detectable. He also doesn't want me messing with quirks to > >> pci_is_pcie() in PCI because he wants a 1:1 relation between that and > >> having a PCIe capability. So, I'm stuck and this is where it's ended > >> up. Thanks, > > > > I think implementing such a list is much safer. > > > > Bjorn, why didn't you like that idea? > > Sorry, I can't remember, and I haven't been able to find the > discussion where I said that. I think the current patches are all in > drivers/iommu, and if a list makes sense there, it's fine with me. Here's the comment I remember https://bugzilla.kernel.org/show_bug.cgi?id=44881#c7 Comment #7 From Bjorn Helgaas 2012-08-23 15:58:39 [snip] I doubt the upstream device is at fault. More likely the downstream device is really a PCIe device (a PCIe-to-PCI bridge) but just fails to report a PCIe capability. I think this situation is likely too common to deal with via quirks, so we'll have to figure out a way to just make this work. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jun 26, 2013 at 12:45 PM, Alex Williamson <alex.williamson@redhat.com> wrote: > On Tue, 2013-06-25 at 22:20 -0600, Bjorn Helgaas wrote: >> On Thu, Jun 20, 2013 at 10:15 AM, Joerg Roedel <joro@8bytes.org> wrote: >> > On Thu, Jun 20, 2013 at 09:44:51AM -0600, Alex Williamson wrote: >> >> On Thu, 2013-06-20 at 15:59 +0200, Joerg Roedel wrote: >> >> > On Tue, May 28, 2013 at 12:40:20PM -0600, Alex Williamson wrote: >> >> > > + if (!pci_is_root_bus(pdev->bus)) { >> >> > > + struct pci_dev *parent = pdev->bus->self; >> >> > > + >> >> > > + if (pci_is_pcie(parent) && >> >> > > + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) >> >> > > + return true; >> >> > > + } >> >> > >> >> > Hmm, that looks a bit dangerous. >> >> >> >> How so? The algorithm seems pretty simple and logical. >> > >> > It is simple, but it is still a heuristic that may fail at some point, >> > no? >> > >> >> Actually, I believe Bjorn rejected the idea of a fixed list because this >> >> problem is detectable. He also doesn't want me messing with quirks to >> >> pci_is_pcie() in PCI because he wants a 1:1 relation between that and >> >> having a PCIe capability. So, I'm stuck and this is where it's ended >> >> up. Thanks, >> > >> > I think implementing such a list is much safer. >> > >> > Bjorn, why didn't you like that idea? >> >> Sorry, I can't remember, and I haven't been able to find the >> discussion where I said that. I think the current patches are all in >> drivers/iommu, and if a list makes sense there, it's fine with me. > > Here's the comment I remember > > https://bugzilla.kernel.org/show_bug.cgi?id=44881#c7 > > Comment #7 From Bjorn Helgaas 2012-08-23 15:58:39 > [snip] > I doubt the upstream device is at fault. More likely the > downstream device is really a PCIe device (a PCIe-to-PCI bridge) > but just fails to report a PCIe capability. I think this > situation is likely too common to deal with via quirks, so we'll > have to figure out a way to just make this work. OK, I remember that now. So the question is whether you want a list or a set of quirks that may be an ongoing maintenance burden, or whether you want an algorithm that may be risky but possibly less maintenance. I preferred the latter. I think a failure in the algorithm will most likely result in a device that just doesn't work (because we derived a DMA source ID that doesn't match what the IOMMU sees), so at least the impact is relatively minor, and no worse than a missing entry in the list of exception devices. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index c332fb9..a591794 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -2,6 +2,9 @@ config IOMMU_API bool +config IOMMU_PCI + bool + menuconfig IOMMU_SUPPORT bool "IOMMU Hardware Support" default y diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index ef0e520..f14d905 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -15,3 +15,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o obj-$(CONFIG_SHMOBILE_IOMMU) += shmobile-iommu.o obj-$(CONFIG_SHMOBILE_IPMMU) += shmobile-ipmmu.o +obj-$(CONFIG_IOMMU_PCI) += pci.o diff --git a/drivers/iommu/pci.c b/drivers/iommu/pci.c new file mode 100644 index 0000000..c3cae8d --- /dev/null +++ b/drivers/iommu/pci.c @@ -0,0 +1,69 @@ +/* + * Copyright (C) 2013 Red Hat, Inc. All rights reserved. + * Author: Alex Williamson <alex.williamson@redhat.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include <linux/pci.h> + +bool iommu_pci_is_pcie_bridge(struct pci_dev *pdev) +{ + if (!pdev->subordinate) + return false; + + if (pci_is_pcie(pdev)) + return true; + +#ifdef CONFIG_PCI_QUIRKS + /* + * If we're not on the root bus, look one device upstream of the + * current device. If that device is PCIe and is not a PCIe-to-PCI + * bridge, then the current device is effectively PCIe as it must + * be the PCIe-to-PCI bridge. This handles several bridges that + * violate the PCIe spec by not exposing a PCIe capability: + * https://bugzilla.kernel.org/show_bug.cgi?id=44881 + */ + if (!pci_is_root_bus(pdev->bus)) { + struct pci_dev *parent = pdev->bus->self; + + if (pci_is_pcie(parent) && + pci_pcie_type(parent) != PCI_EXP_TYPE_PCI_BRIDGE) + return true; + } +#endif + return false; +} + +struct pci_dev *iommu_pci_find_upstream(struct pci_dev *pdev, + bool (*match)(struct pci_dev *), + struct pci_dev **last) +{ + *last = NULL; + + if (match(pdev)) + return pdev; + + *last = pdev; + + while (!pci_is_root_bus(pdev->bus)) { + *last = pdev; + pdev = pdev->bus->self; + + if (match(pdev)) + return pdev; + } + + return NULL; +} diff --git a/drivers/iommu/pci.h b/drivers/iommu/pci.h index 352d80a..cc6d58a 100644 --- a/drivers/iommu/pci.h +++ b/drivers/iommu/pci.h @@ -26,4 +26,27 @@ static inline void swap_pci_ref(struct pci_dev **from, struct pci_dev *to) *from = to; } +/** + * iommu_pci_is_pcie_bridge - Match a PCIe bridge device + * @pdev: device to test + * + * Return true if the given device is a PCIe bridge, false otherwise. + */ +bool iommu_pci_is_pcie_bridge(struct pci_dev *pdev); + +/** + * iommu_pci_find_upstream - Generic upstream search function + * @pdev: starting PCI device to search + * @match: match function to call on each device (true = match) + * @last: last device examined prior to returned device + * + * Walk upstream from the given device, calling match() at each + * device, including self. Returns the first device matching match(). + * If the root bus is reached without finding a match, return NULL. + * last returns the N-1 step in the search path. + */ +struct pci_dev *iommu_pci_find_upstream(struct pci_dev *pdev, + bool (*match)(struct pci_dev *), + struct pci_dev **last); + #endif /* __IOMMU_PCI_H */
Some PCIe-to-PCI bridges are not fully compliant with the PCIe spec and do not include a PCIe capability. pci_is_pcie() is not useful on these devices, making it difficult to determine where we transition from a legacy PCI bus to a PCIe link. PCI-core doesn't want a quirked bridge test for fear of confusion that a PCIe capability would be expected, so implement it in IOMMU code. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> --- drivers/iommu/Kconfig | 3 ++ drivers/iommu/Makefile | 1 + drivers/iommu/pci.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++ drivers/iommu/pci.h | 23 ++++++++++++++++ 4 files changed, 96 insertions(+) create mode 100644 drivers/iommu/pci.c -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html