diff mbox

[RFC,V2,02/10] Qemu/VFIO: Add new VFIO_GET_PCI_CAP_INFO ioctl cmd definition

Message ID 1448372127-28115-3-git-send-email-tianyu.lan@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

lan,Tianyu Nov. 24, 2015, 1:35 p.m. UTC
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 linux-headers/linux/vfio.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Alex Williamson Dec. 2, 2015, 10:25 p.m. UTC | #1
On Tue, 2015-11-24 at 21:35 +0800, Lan Tianyu wrote:
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  linux-headers/linux/vfio.h | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> index 0508d0b..732b0bd 100644
> --- a/linux-headers/linux/vfio.h
> +++ b/linux-headers/linux/vfio.h
> @@ -495,6 +495,22 @@ struct vfio_eeh_pe_op {
>  
>  #define VFIO_EEH_PE_OP			_IO(VFIO_TYPE, VFIO_BASE + 21)
>  
> +
> +#define VFIO_FIND_FREE_PCI_CONFIG_REG   _IO(VFIO_TYPE, VFIO_BASE + 22)
> +
> +#define VFIO_GET_PCI_CAP_INFO   _IO(VFIO_TYPE, VFIO_BASE + 22)
> +
> +struct vfio_pci_cap_info {
> +    __u32 argsz;
> +    __u32 flags;
> +#define VFIO_PCI_CAP_GET_SIZE (1 << 0)
> +#define VFIO_PCI_CAP_GET_FREE_REGION (1 << 1)
> +    __u32 index;
> +    __u32 offset;
> +    __u32 size;
> +    __u8 cap;
> +};
> +
>  /* ***************************************************************** */
>  
>  #endif /* VFIO_H */

I didn't seen a matching kernel patch series for this, but why is the
kernel more capable of doing this than userspace is already?  These seem
like pointless ioctls, we're creating a purely virtual PCI capability,
the kernel doesn't really need to participate in that.  Also, why are we
restricting ourselves to standard capabilities?  That's often a crowded
space and we can't always know whether an area is free or not based only
on it being covered by a capability.  Some capabilities can also appear
more than once, so there's context that isn't being passed to the kernel
here.  Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
lan,Tianyu Dec. 3, 2015, 8:40 a.m. UTC | #2
On 12/3/2015 6:25 AM, Alex Williamson wrote:
> I didn't seen a matching kernel patch series for this, but why is the
> kernel more capable of doing this than userspace is already?
The following link is the kernel patch.
http://marc.info/?l=kvm&m=144837328920989&w=2

> These seem
> like pointless ioctls, we're creating a purely virtual PCI capability,
> the kernel doesn't really need to participate in that.

VFIO kernel driver has pci_config_map which indicates the PCI capability 
position and length which helps to find free PCI config regs. Qemu side 
doesn't have such info and can't get the exact table size of PCI 
capability. If we want to add such support in the Qemu, needs duplicates 
a lot of code of vfio_pci_configs.c in the Qemu.

> Also, why are we
> restricting ourselves to standard capabilities?

This version is to check whether it's on the right way and We can extend
this to pci extended capability later.

> That's often a crowded
> space and we can't always know whether an area is free or not based only
> on it being covered by a capability.  Some capabilities can also appear
> more than once, so there's context that isn't being passed to the kernel
> here.  Thanks,

The region outside of PCI capability are not passed to kernel or used by 
Qemu for MSI/MSIX . It's possible to use these places for new 
capability. One concerns is that guest driver may abuse them and quirk 
of masking some special regs outside of capability maybe helpful.


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Williamson Dec. 3, 2015, 3:26 p.m. UTC | #3
On Thu, 2015-12-03 at 16:40 +0800, Lan, Tianyu wrote:
> On 12/3/2015 6:25 AM, Alex Williamson wrote:
> > I didn't seen a matching kernel patch series for this, but why is the
> > kernel more capable of doing this than userspace is already?
> The following link is the kernel patch.
> http://marc.info/?l=kvm&m=144837328920989&w=2
> 
> > These seem
> > like pointless ioctls, we're creating a purely virtual PCI capability,
> > the kernel doesn't really need to participate in that.
> 
> VFIO kernel driver has pci_config_map which indicates the PCI capability 
> position and length which helps to find free PCI config regs. Qemu side 
> doesn't have such info and can't get the exact table size of PCI 
> capability. If we want to add such support in the Qemu, needs duplicates 
> a lot of code of vfio_pci_configs.c in the Qemu.

That's an internal implementation detail of the kernel, not motivation
for creating a new userspace ABI.  QEMU can recreate this data on its
own.  The kernel is in no more of an authoritative position to determine
capability extents than userspace.

> > Also, why are we
> > restricting ourselves to standard capabilities?
> 
> This version is to check whether it's on the right way and We can extend
> this to pci extended capability later.
> 
> > That's often a crowded
> > space and we can't always know whether an area is free or not based only
> > on it being covered by a capability.  Some capabilities can also appear
> > more than once, so there's context that isn't being passed to the kernel
> > here.  Thanks,
> 
> The region outside of PCI capability are not passed to kernel or used by 
> Qemu for MSI/MSIX . It's possible to use these places for new 
> capability. One concerns is that guest driver may abuse them and quirk 
> of masking some special regs outside of capability maybe helpful.

That's not correct, see kernel commit
a7d1ea1c11b33bda2691f3294b4d735ed635535a.  Gaps between capabilities are
exposed with raw read-write access from the kernel and some drivers and
devices depend on this.  There's also no guarantee that there's a
sufficiently sized gap in conventional space.  Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 0508d0b..732b0bd 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -495,6 +495,22 @@  struct vfio_eeh_pe_op {
 
 #define VFIO_EEH_PE_OP			_IO(VFIO_TYPE, VFIO_BASE + 21)
 
+
+#define VFIO_FIND_FREE_PCI_CONFIG_REG   _IO(VFIO_TYPE, VFIO_BASE + 22)
+
+#define VFIO_GET_PCI_CAP_INFO   _IO(VFIO_TYPE, VFIO_BASE + 22)
+
+struct vfio_pci_cap_info {
+    __u32 argsz;
+    __u32 flags;
+#define VFIO_PCI_CAP_GET_SIZE (1 << 0)
+#define VFIO_PCI_CAP_GET_FREE_REGION (1 << 1)
+    __u32 index;
+    __u32 offset;
+    __u32 size;
+    __u8 cap;
+};
+
 /* ***************************************************************** */
 
 #endif /* VFIO_H */