Message ID | 20241016062827.2301004-1-Jiqian.Chen@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [QEMU,v8] xen/passthrough: use gsi to map pirq when dom0 is PVH | expand |
On 10/16/24 02:28, Jiqian Chen wrote: > In PVH dom0, when passthrough a device to domU, QEMU code > xen_pt_realize->xc_physdev_map_pirq wants to use gsi, but in current codes > the gsi number is got from file /sys/bus/pci/devices/<sbdf>/irq, that is > wrong, because irq is not equal with gsi, they are in different spaces, so > pirq mapping fails. > > To solve above problem, use new interface of Xen, xc_pcidev_get_gsi to get > gsi and use xc_physdev_map_pirq_gsi to map pirq when dom0 is PVH. > > Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> > Signed-off-by: Huang Rui <ray.huang@amd.com> > Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> > --- > Hi All, > This is v8 to support passthrough on Xen when dom0 is PVH. > v7->v8 change: > * Since xc_physdev_gsi_from_dev was renamed to xc_pcidev_get_gsi, changed it. > * Added xen_run_qemu_on_hvm to check if Qemu run on PV dom0, if not use xc_physdev_map_pirq_gsi to map pirq. > * Used CONFIG_XEN_CTRL_INTERFACE_VERSION to wrap the new part for compatibility. > * Added "#define DOMID_RUN_QEMU 0" to represent the id of domain that Qemu run on. > > > Best regards, > Jiqian Chen > > > > v6->v7 changes: > * Because the function of obtaining gsi was changed on the kernel and Xen side. Changed to use > xc_physdev_gsi_from_dev, that requires passing in sbdf instead of irq. > > v5->v6 changes: > * Because the function of obtaining gsi was changed on the kernel and Xen side. Changed to use > xc_physdev_gsi_from_irq, instead of gsi sysfs. > * Since function changed, removed the Review-by of Stefano. > > v4->v5 changes: > * Added Review-by Stefano. > > v3->v4 changes: > * Added gsi into struct XenHostPCIDevice and used gsi number that read from gsi sysfs > if it exists, if there is no gsi sysfs, still use irq. > > v2->v3 changes: > * Due to changes in the implementation of the second patch on kernel side(that adds > a new sysfs for gsi instead of a new syscall), so read gsi number from the sysfs of gsi. > > v1 and v2: > We can record the relation between gsi and irq, then when userspace(qemu) want > to use gsi, we can do a translation. The third patch of kernel(xen/privcmd: Add new syscall > to get gsi from irq) records all the relations in acpi_register_gsi_xen_pvh() when dom0 > initialize pci devices, and provide a syscall for userspace to get the gsi from irq. The > third patch of xen(tools: Add new function to get gsi from irq) add a new function > xc_physdev_gsi_from_irq() to call the new syscall added on kernel side. > And then userspace can use that function to get gsi. Then xc_physdev_map_pirq() will success. > > Issues we encountered: > 1. failed to map pirq for gsi > Problem: qemu will call xc_physdev_map_pirq() to map a passthrough device's gsi to pirq in > function xen_pt_realize(). But failed. > > Reason: According to the implement of xc_physdev_map_pirq(), it needs gsi instead of irq, > but qemu pass irq to it and treat irq as gsi, it is got from file > /sys/bus/pci/devices/xxxx:xx:xx.x/irq in function xen_host_pci_device_get(). But actually > the gsi number is not equal with irq. They are in different space. > --- > hw/xen/xen_pt.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ > hw/xen/xen_pt.h | 1 + > 2 files changed, 45 insertions(+) > > diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c > index 3635d1b39f79..7f8139d20915 100644 > --- a/hw/xen/xen_pt.c > +++ b/hw/xen/xen_pt.c > @@ -766,6 +766,41 @@ static void xen_pt_destroy(PCIDevice *d) { > } > /* init */ > > +#define PCI_SBDF(seg, bus, dev, func) \ > + ((((uint32_t)(seg)) << 16) | \ > + (PCI_BUILD_BDF(bus, PCI_DEVFN(dev, func)))) Nit: This macro looks generic and useful. Would it be better defined in include/hw/pci/pci.h? > + > +#if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 42000 > +static bool xen_run_qemu_on_hvm(void) This function name seems to imply "is qemu running on HVM?", but I think the question we're really trying to answer is whether the pcidev needs a GSI mapped. How about calling the function "xen_pt_needs_gsi" or similar? > +{ > + xc_domaininfo_t info; > + > + if (!xc_domain_getinfo_single(xen_xc, DOMID_RUN_QEMU, &info) && > + (info.flags & XEN_DOMINF_hvm_guest)) { I think reading /sys/hypervisor/guest_type would allow you to get the same information without another hypercall. > + return true; > + } > + > + return false; > +} > + > +static int xen_map_pirq_for_gsi(PCIDevice *d, int *pirq) Nit: s/xen_/xen_pt_/ > +{ > + int gsi; > + XenPCIPassthroughState *s = XEN_PT_DEVICE(d); > + > + gsi = xc_pcidev_get_gsi(xen_xc, > + PCI_SBDF(s->real_device.domain, > + s->real_device.bus, > + s->real_device.dev, > + s->real_device.func)); > + if (gsi >= 0) { > + return xc_physdev_map_pirq_gsi(xen_xc, xen_domid, gsi, pirq); > + } > + > + return gsi; > +} > +#endif > + > static void xen_pt_realize(PCIDevice *d, Error **errp) > { > ERRP_GUARD(); > @@ -847,7 +882,16 @@ static void xen_pt_realize(PCIDevice *d, Error **errp) > goto out; > } > > +#if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 42000 > + if (xen_run_qemu_on_hvm()) { > + rc = xen_map_pirq_for_gsi(d, &pirq); > + } else { > + rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, &pirq); > + } > +#else > rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, &pirq); > +#endif > + > if (rc < 0) { > XEN_PT_ERR(d, "Mapping machine irq %u to pirq %i failed, (err: %d)\n", > machine_irq, pirq, errno); > diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h > index 095a0f0365d4..a08b45b7fbae 100644 > --- a/hw/xen/xen_pt.h > +++ b/hw/xen/xen_pt.h > @@ -36,6 +36,7 @@ void xen_pt_log(const PCIDevice *d, const char *f, ...) G_GNUC_PRINTF(2, 3); > # define XEN_PT_LOG_CONFIG(d, addr, val, len) > #endif > > +#define DOMID_RUN_QEMU 0 > > /* Helper */ > #define XEN_PFN(x) ((x) >> XC_PAGE_SHIFT)
On Wed, Oct 16, 2024 at 02:28:27PM +0800, Jiqian Chen wrote: > --- a/hw/xen/xen_pt.h > +++ b/hw/xen/xen_pt.h > @@ -36,6 +36,7 @@ void xen_pt_log(const PCIDevice *d, const char *f, ...) G_GNUC_PRINTF(2, 3); > # define XEN_PT_LOG_CONFIG(d, addr, val, len) > #endif > > +#define DOMID_RUN_QEMU 0 Please, don't hardcode dom0 here, QEMU can be running in a stubdomain too (in which case, this will hilariously explode, as it will check what dom0 is, instead of where QEMU is running). Stewart already provided an alternative.
On 2024/10/22 04:55, Stewart Hildebrand wrote: > On 10/16/24 02:28, Jiqian Chen wrote: >> In PVH dom0, when passthrough a device to domU, QEMU code >> xen_pt_realize->xc_physdev_map_pirq wants to use gsi, but in current codes >> the gsi number is got from file /sys/bus/pci/devices/<sbdf>/irq, that is >> wrong, because irq is not equal with gsi, they are in different spaces, so >> pirq mapping fails. >> >> To solve above problem, use new interface of Xen, xc_pcidev_get_gsi to get >> gsi and use xc_physdev_map_pirq_gsi to map pirq when dom0 is PVH. >> >> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> >> Signed-off-by: Huang Rui <ray.huang@amd.com> >> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> >> --- >> Hi All, >> This is v8 to support passthrough on Xen when dom0 is PVH. >> v7->v8 change: >> * Since xc_physdev_gsi_from_dev was renamed to xc_pcidev_get_gsi, changed it. >> * Added xen_run_qemu_on_hvm to check if Qemu run on PV dom0, if not use xc_physdev_map_pirq_gsi to map pirq. >> * Used CONFIG_XEN_CTRL_INTERFACE_VERSION to wrap the new part for compatibility. >> * Added "#define DOMID_RUN_QEMU 0" to represent the id of domain that Qemu run on. >> >> >> Best regards, >> Jiqian Chen >> >> >> >> v6->v7 changes: >> * Because the function of obtaining gsi was changed on the kernel and Xen side. Changed to use >> xc_physdev_gsi_from_dev, that requires passing in sbdf instead of irq. >> >> v5->v6 changes: >> * Because the function of obtaining gsi was changed on the kernel and Xen side. Changed to use >> xc_physdev_gsi_from_irq, instead of gsi sysfs. >> * Since function changed, removed the Review-by of Stefano. >> >> v4->v5 changes: >> * Added Review-by Stefano. >> >> v3->v4 changes: >> * Added gsi into struct XenHostPCIDevice and used gsi number that read from gsi sysfs >> if it exists, if there is no gsi sysfs, still use irq. >> >> v2->v3 changes: >> * Due to changes in the implementation of the second patch on kernel side(that adds >> a new sysfs for gsi instead of a new syscall), so read gsi number from the sysfs of gsi. >> >> v1 and v2: >> We can record the relation between gsi and irq, then when userspace(qemu) want >> to use gsi, we can do a translation. The third patch of kernel(xen/privcmd: Add new syscall >> to get gsi from irq) records all the relations in acpi_register_gsi_xen_pvh() when dom0 >> initialize pci devices, and provide a syscall for userspace to get the gsi from irq. The >> third patch of xen(tools: Add new function to get gsi from irq) add a new function >> xc_physdev_gsi_from_irq() to call the new syscall added on kernel side. >> And then userspace can use that function to get gsi. Then xc_physdev_map_pirq() will success. >> >> Issues we encountered: >> 1. failed to map pirq for gsi >> Problem: qemu will call xc_physdev_map_pirq() to map a passthrough device's gsi to pirq in >> function xen_pt_realize(). But failed. >> >> Reason: According to the implement of xc_physdev_map_pirq(), it needs gsi instead of irq, >> but qemu pass irq to it and treat irq as gsi, it is got from file >> /sys/bus/pci/devices/xxxx:xx:xx.x/irq in function xen_host_pci_device_get(). But actually >> the gsi number is not equal with irq. They are in different space. >> --- >> hw/xen/xen_pt.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ >> hw/xen/xen_pt.h | 1 + >> 2 files changed, 45 insertions(+) >> >> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c >> index 3635d1b39f79..7f8139d20915 100644 >> --- a/hw/xen/xen_pt.c >> +++ b/hw/xen/xen_pt.c >> @@ -766,6 +766,41 @@ static void xen_pt_destroy(PCIDevice *d) { >> } >> /* init */ >> >> +#define PCI_SBDF(seg, bus, dev, func) \ >> + ((((uint32_t)(seg)) << 16) | \ >> + (PCI_BUILD_BDF(bus, PCI_DEVFN(dev, func)))) > > Nit: This macro looks generic and useful. Would it be better defined in > include/hw/pci/pci.h? > >> + >> +#if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 42000 >> +static bool xen_run_qemu_on_hvm(void) > > This function name seems to imply "is qemu running on HVM?", but I think > the question we're really trying to answer is whether the pcidev needs > a GSI mapped. How about calling the function "xen_pt_needs_gsi" or > similar? > >> +{ >> + xc_domaininfo_t info; >> + >> + if (!xc_domain_getinfo_single(xen_xc, DOMID_RUN_QEMU, &info) && >> + (info.flags & XEN_DOMINF_hvm_guest)) { > > I think reading /sys/hypervisor/guest_type would allow you to get the > same information without another hypercall. > >> + return true; >> + } >> + >> + return false; >> +} >> + >> +static int xen_map_pirq_for_gsi(PCIDevice *d, int *pirq) > > Nit: s/xen_/xen_pt_/ > >> +{ >> + int gsi; >> + XenPCIPassthroughState *s = XEN_PT_DEVICE(d); >> + >> + gsi = xc_pcidev_get_gsi(xen_xc, >> + PCI_SBDF(s->real_device.domain, >> + s->real_device.bus, >> + s->real_device.dev, >> + s->real_device.func)); >> + if (gsi >= 0) { >> + return xc_physdev_map_pirq_gsi(xen_xc, xen_domid, gsi, pirq); >> + } >> + >> + return gsi; >> +} >> +#endif >> + >> static void xen_pt_realize(PCIDevice *d, Error **errp) >> { >> ERRP_GUARD(); >> @@ -847,7 +882,16 @@ static void xen_pt_realize(PCIDevice *d, Error **errp) >> goto out; >> } >> >> +#if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 42000 >> + if (xen_run_qemu_on_hvm()) { >> + rc = xen_map_pirq_for_gsi(d, &pirq); >> + } else { >> + rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, &pirq); >> + } >> +#else >> rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, &pirq); >> +#endif >> + >> if (rc < 0) { >> XEN_PT_ERR(d, "Mapping machine irq %u to pirq %i failed, (err: %d)\n", >> machine_irq, pirq, errno); >> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h >> index 095a0f0365d4..a08b45b7fbae 100644 >> --- a/hw/xen/xen_pt.h >> +++ b/hw/xen/xen_pt.h >> @@ -36,6 +36,7 @@ void xen_pt_log(const PCIDevice *d, const char *f, ...) G_GNUC_PRINTF(2, 3); >> # define XEN_PT_LOG_CONFIG(d, addr, val, len) >> #endif >> >> +#define DOMID_RUN_QEMU 0 >> >> /* Helper */ >> #define XEN_PFN(x) ((x) >> XC_PAGE_SHIFT) > What you said makes more sense. I will change this patch according to your comments. Thank you very much!
On 2024/10/23 07:07, Marek Marczykowski-Górecki wrote: > On Wed, Oct 16, 2024 at 02:28:27PM +0800, Jiqian Chen wrote: >> --- a/hw/xen/xen_pt.h >> +++ b/hw/xen/xen_pt.h >> @@ -36,6 +36,7 @@ void xen_pt_log(const PCIDevice *d, const char *f, ...) G_GNUC_PRINTF(2, 3); >> # define XEN_PT_LOG_CONFIG(d, addr, val, len) >> #endif >> >> +#define DOMID_RUN_QEMU 0 > > Please, don't hardcode dom0 here, QEMU can be running in a stubdomain > too (in which case, this will hilariously explode, as it will check what > dom0 is, instead of where QEMU is running). > Stewart already provided an alternative. Got it, will use Stewart's suggestion, thanks. >
diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c index 3635d1b39f79..7f8139d20915 100644 --- a/hw/xen/xen_pt.c +++ b/hw/xen/xen_pt.c @@ -766,6 +766,41 @@ static void xen_pt_destroy(PCIDevice *d) { } /* init */ +#define PCI_SBDF(seg, bus, dev, func) \ + ((((uint32_t)(seg)) << 16) | \ + (PCI_BUILD_BDF(bus, PCI_DEVFN(dev, func)))) + +#if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 42000 +static bool xen_run_qemu_on_hvm(void) +{ + xc_domaininfo_t info; + + if (!xc_domain_getinfo_single(xen_xc, DOMID_RUN_QEMU, &info) && + (info.flags & XEN_DOMINF_hvm_guest)) { + return true; + } + + return false; +} + +static int xen_map_pirq_for_gsi(PCIDevice *d, int *pirq) +{ + int gsi; + XenPCIPassthroughState *s = XEN_PT_DEVICE(d); + + gsi = xc_pcidev_get_gsi(xen_xc, + PCI_SBDF(s->real_device.domain, + s->real_device.bus, + s->real_device.dev, + s->real_device.func)); + if (gsi >= 0) { + return xc_physdev_map_pirq_gsi(xen_xc, xen_domid, gsi, pirq); + } + + return gsi; +} +#endif + static void xen_pt_realize(PCIDevice *d, Error **errp) { ERRP_GUARD(); @@ -847,7 +882,16 @@ static void xen_pt_realize(PCIDevice *d, Error **errp) goto out; } +#if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 42000 + if (xen_run_qemu_on_hvm()) { + rc = xen_map_pirq_for_gsi(d, &pirq); + } else { + rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, &pirq); + } +#else rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, &pirq); +#endif + if (rc < 0) { XEN_PT_ERR(d, "Mapping machine irq %u to pirq %i failed, (err: %d)\n", machine_irq, pirq, errno); diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h index 095a0f0365d4..a08b45b7fbae 100644 --- a/hw/xen/xen_pt.h +++ b/hw/xen/xen_pt.h @@ -36,6 +36,7 @@ void xen_pt_log(const PCIDevice *d, const char *f, ...) G_GNUC_PRINTF(2, 3); # define XEN_PT_LOG_CONFIG(d, addr, val, len) #endif +#define DOMID_RUN_QEMU 0 /* Helper */ #define XEN_PFN(x) ((x) >> XC_PAGE_SHIFT)