diff mbox

[RFC,v6,06/10] PCI: Add a new PCI_BUS_FLAGS_MSI_REMAP flag

Message ID 1460977120-29367-1-git-send-email-xyjxie@linux.vnet.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Yongji Xie April 18, 2016, 10:58 a.m. UTC
We introduce a new pci_bus_flags, PCI_BUS_FLAGS_MSI_REMAP
which indicates all devices on the bus are protected by the
hardware which supports IRQ remapping(intel naming).

This flag will be used to know whether it's safe to expose
MSI-X tables of PCI BARs to userspace. Because the capability
of IRQ remapping can guarantee the PCI device cannot trigger
MSIs that correspond to interrupt IDs of other devices.

There is a existing flag for this in the IOMMU space:

enum iommu_cap {
	IOMMU_CAP_CACHE_COHERENCY,
--->	IOMMU_CAP_INTR_REMAP,
	IOMMU_CAP_NOEXEC,
};

and Eric also posted a patchset [1] to abstract this
capability on MSI controller side for ARM. But it would
make sense to have a more common flag like
PCI_BUS_FLAGS_MSI_REMAP so that we can use a universal
flag to test this capability for different archs on
PCI side.

[1] http://www.spinics.net/lists/kvm/msg130256.html

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
---
 include/linux/pci.h |    1 +
 1 file changed, 1 insertion(+)

Comments

David Laight April 18, 2016, 11:30 a.m. UTC | #1
From: Yongji Xie

> Sent: 18 April 2016 11:59

> We introduce a new pci_bus_flags, PCI_BUS_FLAGS_MSI_REMAP

> which indicates all devices on the bus are protected by the

> hardware which supports IRQ remapping(intel naming).

> 

> This flag will be used to know whether it's safe to expose

> MSI-X tables of PCI BARs to userspace. Because the capability

> of IRQ remapping can guarantee the PCI device cannot trigger

> MSIs that correspond to interrupt IDs of other devices.


I'm worried that this entire series is going to break drivers
for existing hardware.

I understand some of the reasoning for 'vm pass through' configurations,
but there will be PCIe devices out there that have the MSI-X tables
in the same BAR as other device registers.
If you are lucky nothing else is in the same 4k area, but I wouldn't
assume it.

In any case, if the hardware can't police the card's master transfers
there is nothing to stop a different bus master block on the card
from raising MSI-X interrupts - they are just a PCIe write.
So all you are doing is raising the bar slightly and giving a very false
sense of security.

	David
Yongji Xie April 19, 2016, 11:13 a.m. UTC | #2
On 2016/4/18 19:30, David Laight wrote:
> From: Yongji Xie
>> Sent: 18 April 2016 11:59
>> We introduce a new pci_bus_flags, PCI_BUS_FLAGS_MSI_REMAP
>> which indicates all devices on the bus are protected by the
>> hardware which supports IRQ remapping(intel naming).
>>
>> This flag will be used to know whether it's safe to expose
>> MSI-X tables of PCI BARs to userspace. Because the capability
>> of IRQ remapping can guarantee the PCI device cannot trigger
>> MSIs that correspond to interrupt IDs of other devices.
> I'm worried that this entire series is going to break drivers
> for existing hardware.
>
> I understand some of the reasoning for 'vm pass through' configurations,
> but there will be PCIe devices out there that have the MSI-X tables
> in the same BAR as other device registers.
> If you are lucky nothing else is in the same 4k area, but I wouldn't
> assume it.

Thanks for your comments. But I didn't get your point here.
Why will exposing MSI-X table to userspace break the driver
for hardware which have the MSI-X tables in the same BAR as
other device registers? Could you give me more details?

The reason why we want to mmap MSI-X table is that there
may be some other critical device registers in the same page
as the MSI-X table. We prefer to handle the mmio access to
these registers in guest rather than in QEMU. So we would
like to see there is something else in the same 4k/64k area.

> In any case, if the hardware can't police the card's master transfers
> there is nothing to stop a different bus master block on the card
> from raising MSI-X interrupts - they are just a PCIe write.
> So all you are doing is raising the bar slightly and giving a very false
> sense of security.

Do you mean we can request a DMA to the target address
area that raises MSI-X interrupts? But for PPC64 with IODA
bridge, this invalid PCIe write will be prevented on PHB before
raising MSI-X interrupt. And I think the capability of interrupt
remapping or ITS can also do the same thing. If hardware didn't
support this, we would not expose MSI-X table in my patch.

Thanks,
Yongji

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 27df4a6..d619228 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -193,6 +193,7 @@  typedef unsigned short __bitwise pci_bus_flags_t;
 enum pci_bus_flags {
 	PCI_BUS_FLAGS_NO_MSI   = (__force pci_bus_flags_t) 1,
 	PCI_BUS_FLAGS_NO_MMRBC = (__force pci_bus_flags_t) 2,
+	PCI_BUS_FLAGS_MSI_REMAP = (__force pci_bus_flags_t) 4,
 };
 
 /* These values come from the PCI Express Spec */