Message ID | 1452841574-2781-4-git-send-email-xyjxie@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
From: Yongji Xie > Sent: 15 January 2016 07:06 > > MSI-X tables are not allowed to be mmapped in vfio-pci > driver in case that user get to touch this directly. > This will cause some performance issues when when PCI > adapters have critical registers in the same page as > the MSI-X table. ... If the driver wants to generate an incorrect MSI-X interrupt it can do so by requesting the device do a normal memory transfer to the target address area that raises MSI-X interrupts. So disabling writes to the MSI-X table (and pending bit array) areas only raises the bar very slightly. A device may also give the driver write access to the MSI-X table through other addresses. This seems to make disallowing the mapping of the MSI-X table rather pointless. I've also dumped out the MSI-X table (during development) to check that the values are being written there correctly. David
On 2016/1/16 1:24, David Laight wrote: > From: Yongji Xie >> Sent: 15 January 2016 07:06 >> >> MSI-X tables are not allowed to be mmapped in vfio-pci >> driver in case that user get to touch this directly. >> This will cause some performance issues when when PCI >> adapters have critical registers in the same page as >> the MSI-X table. > ... > If the driver wants to generate an incorrect MSI-X interrupt > it can do so by requesting the device do a normal memory transfer > to the target address area that raises MSI-X interrupts. IOMMUs supporting interrupt remapping can prevent this case. > So disabling writes to the MSI-X table (and pending bit array) > areas only raises the bar very slightly. > A device may also give the driver write access to the MSI-X > table through other addresses. > > This seems to make disallowing the mapping of the MSI-X table > rather pointless. If we allow the mapping of the MSI-X table, it seems the guest kernels of some architectures can write invalid data to MSI-X table when device drivers initialize MSI-X interrupts. Regards, Yongji Xie > I've also dumped out the MSI-X table (during development) to > check that the values are being written there correctly. > > David > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2016-01-15 at 15:06 +0800, Yongji Xie wrote: > MSI-X tables are not allowed to be mmapped in vfio-pci > driver in case that user get to touch this directly. > This will cause some performance issues when when PCI > adapters have critical registers in the same page as > the MSI-X table. > > However, some kind of PCI host bridge such as IODA bridge > on Power support filtering of MSIs, which can ensure that a > given pci device can only shoot the MSIs assigned for it. > So we think it's safe to expose the MSI-X table to userspace > if filtering of MSIs is supported because the exposed MSI-X > table can't be used to do harm to other memory space. > > To support this case, this patch adds a pci_host_bridge > attribute to indicate if this PCI host bridge supports > filtering of MSIs. > > Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com> > --- > drivers/pci/host-bridge.c | 6 ++++++ > include/linux/pci.h | 3 +++ > 2 files changed, 9 insertions(+) > > diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c > index 5f4a2e0..c029267 100644 > --- a/drivers/pci/host-bridge.c > +++ b/drivers/pci/host-bridge.c > @@ -96,3 +96,9 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res, > res->end = region->end + offset; > } > EXPORT_SYMBOL(pcibios_bus_to_resource); > + > +bool pci_host_bridge_msi_filtered_enabled(struct pci_dev *pdev) > +{ > + return pci_find_host_bridge(pdev->bus)->msi_filtered; > +} > +EXPORT_SYMBOL_GPL(pci_host_bridge_msi_filtered_enabled); > diff --git a/include/linux/pci.h b/include/linux/pci.h > index b640d65..b952b78 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -412,6 +412,7 @@ struct pci_host_bridge { > void (*release_fn)(struct pci_host_bridge *); > void *release_data; > unsigned int ignore_reset_delay:1; /* for entire hierarchy */ > + unsigned int msi_filtered:1; /* support filtering of MSIs */ > /* Resource alignment requirements */ > resource_size_t (*align_resource)(struct pci_dev *dev, > const struct resource *res, > @@ -430,6 +431,8 @@ void pci_set_host_bridge_release(struct pci_host_bridge *bridge, > > int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge); > > +bool pci_host_bridge_msi_filtered_enabled(struct pci_dev *pdev); > + > /* > * The first PCI_BRIDGE_RESOURCE_NUM PCI bus resources (those that correspond > * to P2P or CardBus bridge windows) go in a table. Additional ones (for Don't we already have a flag for this in the IOMMU space? enum iommu_cap { IOMMU_CAP_CACHE_COHERENCY, /* IOMMU can enforce cache coherent DMA transactions */ ---> IOMMU_CAP_INTR_REMAP, /* IOMMU supports interrupt isolation */ IOMMU_CAP_NOEXEC, /* IOMMU_NOEXEC flag */ }; -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2016/1/29 6:46, Alex Williamson wrote: > On Fri, 2016-01-15 at 15:06 +0800, Yongji Xie wrote: >> MSI-X tables are not allowed to be mmapped in vfio-pci >> driver in case that user get to touch this directly. >> This will cause some performance issues when when PCI >> adapters have critical registers in the same page as >> the MSI-X table. >> >> However, some kind of PCI host bridge such as IODA bridge >> on Power support filtering of MSIs, which can ensure that a >> given pci device can only shoot the MSIs assigned for it. >> So we think it's safe to expose the MSI-X table to userspace >> if filtering of MSIs is supported because the exposed MSI-X >> table can't be used to do harm to other memory space. >> >> To support this case, this patch adds a pci_host_bridge >> attribute to indicate if this PCI host bridge supports >> filtering of MSIs. >> >> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com> >> --- >> drivers/pci/host-bridge.c | 6 ++++++ >> include/linux/pci.h | 3 +++ >> 2 files changed, 9 insertions(+) >> >> diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c >> index 5f4a2e0..c029267 100644 >> --- a/drivers/pci/host-bridge.c >> +++ b/drivers/pci/host-bridge.c >> @@ -96,3 +96,9 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res, >> res->end = region->end + offset; >> } >> EXPORT_SYMBOL(pcibios_bus_to_resource); >> + >> +bool pci_host_bridge_msi_filtered_enabled(struct pci_dev *pdev) >> +{ >> + return pci_find_host_bridge(pdev->bus)->msi_filtered; >> +} >> +EXPORT_SYMBOL_GPL(pci_host_bridge_msi_filtered_enabled); >> diff --git a/include/linux/pci.h b/include/linux/pci.h >> index b640d65..b952b78 100644 >> --- a/include/linux/pci.h >> +++ b/include/linux/pci.h >> @@ -412,6 +412,7 @@ struct pci_host_bridge { >> void (*release_fn)(struct pci_host_bridge *); >> void *release_data; >> unsigned int ignore_reset_delay:1; /* for entire hierarchy */ >> + unsigned int msi_filtered:1; /* support filtering of MSIs */ >> /* Resource alignment requirements */ >> resource_size_t (*align_resource)(struct pci_dev *dev, >> const struct resource *res, >> @@ -430,6 +431,8 @@ void pci_set_host_bridge_release(struct pci_host_bridge *bridge, >> >> int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge); >> >> +bool pci_host_bridge_msi_filtered_enabled(struct pci_dev *pdev); >> + >> /* >> * The first PCI_BRIDGE_RESOURCE_NUM PCI bus resources (those that correspond >> * to P2P or CardBus bridge windows) go in a table. Additional ones (for > Don't we already have a flag for this in the IOMMU space? > > enum iommu_cap { > IOMMU_CAP_CACHE_COHERENCY, /* IOMMU can enforce cache coherent DMA > transactions */ > ---> IOMMU_CAP_INTR_REMAP, /* IOMMU supports interrupt isolation */ > IOMMU_CAP_NOEXEC, /* IOMMU_NOEXEC flag */ > }; > I saw this flag had been enabled in x86 and ARM arch. I'm not sure whether we can mmap MSI-X table in those archs. I just verify it on PPC64 arch. Regards. Yongji Xie -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
----- Original Message ----- > On 2016/1/29 6:46, Alex Williamson wrote: > > On Fri, 2016-01-15 at 15:06 +0800, Yongji Xie wrote: > >> MSI-X tables are not allowed to be mmapped in vfio-pci > >> driver in case that user get to touch this directly. > >> This will cause some performance issues when when PCI > >> adapters have critical registers in the same page as > >> the MSI-X table. > >> > >> However, some kind of PCI host bridge such as IODA bridge > >> on Power support filtering of MSIs, which can ensure that a > >> given pci device can only shoot the MSIs assigned for it. > >> So we think it's safe to expose the MSI-X table to userspace > >> if filtering of MSIs is supported because the exposed MSI-X > >> table can't be used to do harm to other memory space. > >> > >> To support this case, this patch adds a pci_host_bridge > >> attribute to indicate if this PCI host bridge supports > >> filtering of MSIs. > >> > >> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com> > >> --- > >> drivers/pci/host-bridge.c | 6 ++++++ > >> include/linux/pci.h | 3 +++ > >> 2 files changed, 9 insertions(+) > >> > >> diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c > >> index 5f4a2e0..c029267 100644 > >> --- a/drivers/pci/host-bridge.c > >> +++ b/drivers/pci/host-bridge.c > >> @@ -96,3 +96,9 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct > >> resource *res, > >> res->end = region->end + offset; > >> } > >> EXPORT_SYMBOL(pcibios_bus_to_resource); > >> + > >> +bool pci_host_bridge_msi_filtered_enabled(struct pci_dev *pdev) > >> +{ > >> + return pci_find_host_bridge(pdev->bus)->msi_filtered; > >> +} > >> +EXPORT_SYMBOL_GPL(pci_host_bridge_msi_filtered_enabled); > >> diff --git a/include/linux/pci.h b/include/linux/pci.h > >> index b640d65..b952b78 100644 > >> --- a/include/linux/pci.h > >> +++ b/include/linux/pci.h > >> @@ -412,6 +412,7 @@ struct pci_host_bridge { > >> void (*release_fn)(struct pci_host_bridge *); > >> void *release_data; > >> unsigned int ignore_reset_delay:1; /* for entire hierarchy */ > >> + unsigned int msi_filtered:1; /* support filtering of MSIs */ > >> /* Resource alignment requirements */ > >> resource_size_t (*align_resource)(struct pci_dev *dev, > >> const struct resource *res, > >> @@ -430,6 +431,8 @@ void pci_set_host_bridge_release(struct > >> pci_host_bridge *bridge, > >> > >> int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge); > >> > >> +bool pci_host_bridge_msi_filtered_enabled(struct pci_dev *pdev); > >> + > >> /* > >> * The first PCI_BRIDGE_RESOURCE_NUM PCI bus resources (those that > >> correspond > >> * to P2P or CardBus bridge windows) go in a table. Additional ones > >> (for > > Don't we already have a flag for this in the IOMMU space? > > > > enum iommu_cap { > > IOMMU_CAP_CACHE_COHERENCY, /* IOMMU can enforce cache > > coherent DMA > > transactions */ > > ---> IOMMU_CAP_INTR_REMAP, /* IOMMU supports interrupt > > isolation */ > > IOMMU_CAP_NOEXEC, /* IOMMU_NOEXEC flag */ > > }; > > > > I saw this flag had been enabled in x86 and ARM arch. > > I'm not sure whether we can mmap MSI-X table in those archs. I just > verify it on PPC64 arch. Unfortunately that's not a very good excuse for creating an alternate implementation. When x86 implements interrupt remapping, we get fine grained isolation of MSI vectors and we've always taken this flag to mean that the system is isolated from devices that may perform DoS attacks with MSI writes. I'm not entirely sure whether ARM really provides that degree of isolation, but they would be incorrect is exposing the capability if they do not. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2016/1/30 3:05, Alex Williamson wrote: > > ----- Original Message ----- >> On 2016/1/29 6:46, Alex Williamson wrote: >>> On Fri, 2016-01-15 at 15:06 +0800, Yongji Xie wrote: >>>> MSI-X tables are not allowed to be mmapped in vfio-pci >>>> driver in case that user get to touch this directly. >>>> This will cause some performance issues when when PCI >>>> adapters have critical registers in the same page as >>>> the MSI-X table. >>>> >>>> However, some kind of PCI host bridge such as IODA bridge >>>> on Power support filtering of MSIs, which can ensure that a >>>> given pci device can only shoot the MSIs assigned for it. >>>> So we think it's safe to expose the MSI-X table to userspace >>>> if filtering of MSIs is supported because the exposed MSI-X >>>> table can't be used to do harm to other memory space. >>>> >>>> To support this case, this patch adds a pci_host_bridge >>>> attribute to indicate if this PCI host bridge supports >>>> filtering of MSIs. >>>> >>>> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com> >>>> --- >>>> drivers/pci/host-bridge.c | 6 ++++++ >>>> include/linux/pci.h | 3 +++ >>>> 2 files changed, 9 insertions(+) >>>> >>>> diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c >>>> index 5f4a2e0..c029267 100644 >>>> --- a/drivers/pci/host-bridge.c >>>> +++ b/drivers/pci/host-bridge.c >>>> @@ -96,3 +96,9 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct >>>> resource *res, >>>> res->end = region->end + offset; >>>> } >>>> EXPORT_SYMBOL(pcibios_bus_to_resource); >>>> + >>>> +bool pci_host_bridge_msi_filtered_enabled(struct pci_dev *pdev) >>>> +{ >>>> + return pci_find_host_bridge(pdev->bus)->msi_filtered; >>>> +} >>>> +EXPORT_SYMBOL_GPL(pci_host_bridge_msi_filtered_enabled); >>>> diff --git a/include/linux/pci.h b/include/linux/pci.h >>>> index b640d65..b952b78 100644 >>>> --- a/include/linux/pci.h >>>> +++ b/include/linux/pci.h >>>> @@ -412,6 +412,7 @@ struct pci_host_bridge { >>>> void (*release_fn)(struct pci_host_bridge *); >>>> void *release_data; >>>> unsigned int ignore_reset_delay:1; /* for entire hierarchy */ >>>> + unsigned int msi_filtered:1; /* support filtering of MSIs */ >>>> /* Resource alignment requirements */ >>>> resource_size_t (*align_resource)(struct pci_dev *dev, >>>> const struct resource *res, >>>> @@ -430,6 +431,8 @@ void pci_set_host_bridge_release(struct >>>> pci_host_bridge *bridge, >>>> >>>> int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge); >>>> >>>> +bool pci_host_bridge_msi_filtered_enabled(struct pci_dev *pdev); >>>> + >>>> /* >>>> * The first PCI_BRIDGE_RESOURCE_NUM PCI bus resources (those that >>>> correspond >>>> * to P2P or CardBus bridge windows) go in a table. Additional ones >>>> (for >>> Don't we already have a flag for this in the IOMMU space? >>> >>> enum iommu_cap { >>> IOMMU_CAP_CACHE_COHERENCY, /* IOMMU can enforce cache >>> coherent DMA >>> transactions */ >>> ---> IOMMU_CAP_INTR_REMAP, /* IOMMU supports interrupt >>> isolation */ >>> IOMMU_CAP_NOEXEC, /* IOMMU_NOEXEC flag */ >>> }; >>> >> I saw this flag had been enabled in x86 and ARM arch. >> >> I'm not sure whether we can mmap MSI-X table in those archs. I just >> verify it on PPC64 arch. > Unfortunately that's not a very good excuse for creating an alternate implementation. When x86 implements interrupt remapping, we get fine grained isolation of MSI vectors and we've always taken this flag to mean that the system is isolated from devices that may perform DoS attacks with MSI writes. I'm not entirely sure whether ARM really provides that degree of isolation, but they would be incorrect is exposing the capability if they do not. Thanks, > > Alex > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > OK, I will use this flag in next version. Thanks. Regards, Yongji Xie -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c index 5f4a2e0..c029267 100644 --- a/drivers/pci/host-bridge.c +++ b/drivers/pci/host-bridge.c @@ -96,3 +96,9 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res, res->end = region->end + offset; } EXPORT_SYMBOL(pcibios_bus_to_resource); + +bool pci_host_bridge_msi_filtered_enabled(struct pci_dev *pdev) +{ + return pci_find_host_bridge(pdev->bus)->msi_filtered; +} +EXPORT_SYMBOL_GPL(pci_host_bridge_msi_filtered_enabled); diff --git a/include/linux/pci.h b/include/linux/pci.h index b640d65..b952b78 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -412,6 +412,7 @@ struct pci_host_bridge { void (*release_fn)(struct pci_host_bridge *); void *release_data; unsigned int ignore_reset_delay:1; /* for entire hierarchy */ + unsigned int msi_filtered:1; /* support filtering of MSIs */ /* Resource alignment requirements */ resource_size_t (*align_resource)(struct pci_dev *dev, const struct resource *res, @@ -430,6 +431,8 @@ void pci_set_host_bridge_release(struct pci_host_bridge *bridge, int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge); +bool pci_host_bridge_msi_filtered_enabled(struct pci_dev *pdev); + /* * The first PCI_BRIDGE_RESOURCE_NUM PCI bus resources (those that correspond * to P2P or CardBus bridge windows) go in a table. Additional ones (for
MSI-X tables are not allowed to be mmapped in vfio-pci driver in case that user get to touch this directly. This will cause some performance issues when when PCI adapters have critical registers in the same page as the MSI-X table. However, some kind of PCI host bridge such as IODA bridge on Power support filtering of MSIs, which can ensure that a given pci device can only shoot the MSIs assigned for it. So we think it's safe to expose the MSI-X table to userspace if filtering of MSIs is supported because the exposed MSI-X table can't be used to do harm to other memory space. To support this case, this patch adds a pci_host_bridge attribute to indicate if this PCI host bridge supports filtering of MSIs. Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com> --- drivers/pci/host-bridge.c | 6 ++++++ include/linux/pci.h | 3 +++ 2 files changed, 9 insertions(+)