Message ID | 1526648406-1746-1-git-send-email-paul.durrant@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
>>> On 18.05.18 at 15:00, <paul.durrant@citrix.com> wrote: > @@ -903,6 +926,80 @@ static void cpu_ioreq_move(ioreq_t *req) > } > } > > +static void rw_config_req_item(XenPciDevice *xendev, ioreq_t *req, It looks to me as if both parameters could be constified. > + uint32_t i, uint32_t *val) > +{ > + int32_t reg = req->addr; > + uint32_t offset = req->size * i; > + > + reg += (req->df ? -1 : 1) * offset; > + if (reg < 0 || reg > PCI_CONFIG_SPACE_SIZE) { Having fought a number of issues in this area in the hypervisor a couple of years back I wonder - why reg is of signed type, - whether overflow of the first multiplication really doesn't matter, - whether wrapping when adding in the offset is not an issue. I take it that the rather lax upper bound check (should imo really be reg + size > PCI_CONFIG_SPACE_SIZE [implying reg + size doesn't itself wrap], or at least reg >= PCI_CONFIG_SPACE_SIZE) is not a problem because ... > + if (req->dir == IOREQ_READ) { > + *val = ~0u; > + } > + return; > + } > + > + if (req->dir == IOREQ_READ) { > + *val = pci_host_config_read_common(xendev->pci_dev, reg, > + PCI_CONFIG_SPACE_SIZE, > + req->size); > + trace_cpu_ioreq_config_read(req, xendev->sbdf, reg, > + req->size, *val); > + } else { > + trace_cpu_ioreq_config_write(req, xendev->sbdf, reg, req->size, > + *val); > + pci_host_config_write_common(xendev->pci_dev, reg, > + PCI_CONFIG_SPACE_SIZE, *val, > + req->size); > + } ... these called functions do full checking anyway? > +static void cpu_ioreq_config(XenIOState *state, ioreq_t *req) > +{ > + uint32_t sbdf = req->addr >> 32; > + XenPciDevice *xendev; > + > + if (req->size > sizeof(uint32_t)) { > + hw_error("PCI config access: bad size (%u)", req->size); What about size 0 or 3? > + } > + > + QLIST_FOREACH(xendev, &state->dev_list, entry) { > + unsigned int i; > + uint32_t tmp; > + > + if (xendev->sbdf != sbdf) { > + continue; > + } > + > + if (!req->data_is_ptr) { > + if (req->dir == IOREQ_READ) { > + for (i = 0; i < req->count; i++) { > + rw_config_req_item(xendev, req, i, &tmp); > + req->data = tmp; > + } > + } else if (req->dir == IOREQ_WRITE) { > + for (i = 0; i < req->count; i++) { > + tmp = req->data; > + rw_config_req_item(xendev, req, i, &tmp); > + } > + } Wouldn't it be more sensible to fail req->count != 1 requests here? Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: 18 May 2018 14:34 > To: Paul Durrant <Paul.Durrant@citrix.com> > Cc: Anthony Perard <anthony.perard@citrix.com>; Roger Pau Monne > <roger.pau@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen- > devel <xen-devel@lists.xenproject.org>; qemu-devel@nongnu.org; > ehabkost@redhat.com; marcel@redhat.com; mst@redhat.com; Paolo > Bonzini <pbonzini@redhat.com>; Richard Henderson <rth@twiddle.net> > Subject: Re: [Xen-devel] [PATCH v2] xen-hvm: stop faking I/O to access PCI > config space > > >>> On 18.05.18 at 15:00, <paul.durrant@citrix.com> wrote: > > @@ -903,6 +926,80 @@ static void cpu_ioreq_move(ioreq_t *req) > > } > > } > > > > +static void rw_config_req_item(XenPciDevice *xendev, ioreq_t *req, > > It looks to me as if both parameters could be constified. > They could for this function, yes. > > + uint32_t i, uint32_t *val) > > +{ > > + int32_t reg = req->addr; > > + uint32_t offset = req->size * i; > > + > > + reg += (req->df ? -1 : 1) * offset; > > + if (reg < 0 || reg > PCI_CONFIG_SPACE_SIZE) { > > Having fought a number of issues in this area in the hypervisor a couple > of years back I wonder > - why reg is of signed type, I did that so I could do a < 0 check. > - whether overflow of the first multiplication really doesn't matter, It would be better to check it. > - whether wrapping when adding in the offset is not an issue. > I'll do limits check on offset then... should be able to make reg unsigned then I guess. > I take it that the rather lax upper bound check (should imo really be > reg + size > PCI_CONFIG_SPACE_SIZE [implying reg + size doesn't > itself wrap], or at least reg >= PCI_CONFIG_SPACE_SIZE) is not a > problem because ... > > > + if (req->dir == IOREQ_READ) { > > + *val = ~0u; > > + } > > + return; > > + } > > + > > + if (req->dir == IOREQ_READ) { > > + *val = pci_host_config_read_common(xendev->pci_dev, reg, > > + PCI_CONFIG_SPACE_SIZE, > > + req->size); > > + trace_cpu_ioreq_config_read(req, xendev->sbdf, reg, > > + req->size, *val); > > + } else { > > + trace_cpu_ioreq_config_write(req, xendev->sbdf, reg, req->size, > > + *val); > > + pci_host_config_write_common(xendev->pci_dev, reg, > > + PCI_CONFIG_SPACE_SIZE, *val, > > + req->size); > > + } > > ... these called functions do full checking anyway? Yes, I'm deferring further checking to these common functions. I'm only intending to avoid passing junk into them here. > > > +static void cpu_ioreq_config(XenIOState *state, ioreq_t *req) > > +{ > > + uint32_t sbdf = req->addr >> 32; > > + XenPciDevice *xendev; > > + > > + if (req->size > sizeof(uint32_t)) { > > + hw_error("PCI config access: bad size (%u)", req->size); > > What about size 0 or 3? > Yes, I can reject those here also. > > + } > > + > > + QLIST_FOREACH(xendev, &state->dev_list, entry) { > > + unsigned int i; > > + uint32_t tmp; > > + > > + if (xendev->sbdf != sbdf) { > > + continue; > > + } > > + > > + if (!req->data_is_ptr) { > > + if (req->dir == IOREQ_READ) { > > + for (i = 0; i < req->count; i++) { > > + rw_config_req_item(xendev, req, i, &tmp); > > + req->data = tmp; > > + } > > + } else if (req->dir == IOREQ_WRITE) { > > + for (i = 0; i < req->count; i++) { > > + tmp = req->data; > > + rw_config_req_item(xendev, req, i, &tmp); > > + } > > + } > > Wouldn't it be more sensible to fail req->count != 1 requests here? > I'm wondering whether we'd want to handle count > 1 once we allow MMCONFIG accesses though. I guess it would be easier just to defer that. Paul > Jan >
>>> On 18.05.18 at 15:51, <Paul.Durrant@citrix.com> wrote: >> Sent: 18 May 2018 14:34 >> To: Paul Durrant <Paul.Durrant@citrix.com> >> >>> On 18.05.18 at 15:00, <paul.durrant@citrix.com> wrote: >> > + QLIST_FOREACH(xendev, &state->dev_list, entry) { >> > + unsigned int i; >> > + uint32_t tmp; >> > + >> > + if (xendev->sbdf != sbdf) { >> > + continue; >> > + } >> > + >> > + if (!req->data_is_ptr) { >> > + if (req->dir == IOREQ_READ) { >> > + for (i = 0; i < req->count; i++) { >> > + rw_config_req_item(xendev, req, i, &tmp); >> > + req->data = tmp; >> > + } >> > + } else if (req->dir == IOREQ_WRITE) { >> > + for (i = 0; i < req->count; i++) { >> > + tmp = req->data; >> > + rw_config_req_item(xendev, req, i, &tmp); >> > + } >> > + } >> >> Wouldn't it be more sensible to fail req->count != 1 requests here? >> > > I'm wondering whether we'd want to handle count > 1 once we allow MMCONFIG > accesses though. I guess it would be easier just to defer that. For the data_is_ptr case - sure. But here? Or wait - are you thinking about REP STOS (and the relatively useless REP LODS)? Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: 18 May 2018 15:16 > To: Paul Durrant <Paul.Durrant@citrix.com> > Cc: Anthony Perard <anthony.perard@citrix.com>; Roger Pau Monne > <roger.pau@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen- > devel <xen-devel@lists.xenproject.org>; qemu-devel@nongnu.org; > ehabkost@redhat.com; marcel@redhat.com; mst@redhat.com; Paolo > Bonzini <pbonzini@redhat.com>; Richard Henderson <rth@twiddle.net> > Subject: RE: [Xen-devel] [PATCH v2] xen-hvm: stop faking I/O to access PCI > config space > > >>> On 18.05.18 at 15:51, <Paul.Durrant@citrix.com> wrote: > >> Sent: 18 May 2018 14:34 > >> To: Paul Durrant <Paul.Durrant@citrix.com> > >> >>> On 18.05.18 at 15:00, <paul.durrant@citrix.com> wrote: > >> > + QLIST_FOREACH(xendev, &state->dev_list, entry) { > >> > + unsigned int i; > >> > + uint32_t tmp; > >> > + > >> > + if (xendev->sbdf != sbdf) { > >> > + continue; > >> > + } > >> > + > >> > + if (!req->data_is_ptr) { > >> > + if (req->dir == IOREQ_READ) { > >> > + for (i = 0; i < req->count; i++) { > >> > + rw_config_req_item(xendev, req, i, &tmp); > >> > + req->data = tmp; > >> > + } > >> > + } else if (req->dir == IOREQ_WRITE) { > >> > + for (i = 0; i < req->count; i++) { > >> > + tmp = req->data; > >> > + rw_config_req_item(xendev, req, i, &tmp); > >> > + } > >> > + } > >> > >> Wouldn't it be more sensible to fail req->count != 1 requests here? > >> > > > > I'm wondering whether we'd want to handle count > 1 once we allow > MMCONFIG > > accesses though. I guess it would be easier just to defer that. > > For the data_is_ptr case - sure. But here? Or wait - are you thinking about > REP STOS (and the relatively useless REP LODS)? > Yes. We'd need to cope with a rep stos if we had memory mapped access, but we don't need to worry about it until then I think. In the meantime I doubt any well behaved OS is going to do rep ins or rep outs to cfc so just aborting on count > 1 is probably fine. Paul > Jan >
diff --git a/hw/i386/xen/trace-events b/hw/i386/xen/trace-events index 8dab7bc..f576f1b 100644 --- a/hw/i386/xen/trace-events +++ b/hw/i386/xen/trace-events @@ -15,6 +15,8 @@ cpu_ioreq_pio(void *req, uint32_t dir, uint32_t df, uint32_t data_is_ptr, uint64 cpu_ioreq_pio_read_reg(void *req, uint64_t data, uint64_t addr, uint32_t size) "I/O=%p pio read reg data=0x%"PRIx64" port=0x%"PRIx64" size=%d" cpu_ioreq_pio_write_reg(void *req, uint64_t data, uint64_t addr, uint32_t size) "I/O=%p pio write reg data=0x%"PRIx64" port=0x%"PRIx64" size=%d" cpu_ioreq_move(void *req, uint32_t dir, uint32_t df, uint32_t data_is_ptr, uint64_t addr, uint64_t data, uint32_t count, uint32_t size) "I/O=%p copy dir=%d df=%d ptr=%d port=0x%"PRIx64" data=0x%"PRIx64" count=%d size=%d" +cpu_ioreq_config_read(void *req, uint32_t sbdf, uint32_t reg, uint32_t size, uint32_t data) "I/O=%p sbdf=0x%x reg=%u size=%u data=0x%x" +cpu_ioreq_config_write(void *req, uint32_t sbdf, uint32_t reg, uint32_t size, uint32_t data) "I/O=%p sbdf=0x%x reg=%u size=%u data=0x%x" # xen-mapcache.c xen_map_cache(uint64_t phys_addr) "want 0x%"PRIx64 diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c index caa563b..d79b1d6 100644 --- a/hw/i386/xen/xen-hvm.c +++ b/hw/i386/xen/xen-hvm.c @@ -12,6 +12,7 @@ #include "cpu.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_host.h" #include "hw/i386/pc.h" #include "hw/i386/apic-msidef.h" #include "hw/xen/xen_common.h" @@ -86,6 +87,12 @@ typedef struct XenPhysmap { QLIST_ENTRY(XenPhysmap) list; } XenPhysmap; +typedef struct XenPciDevice { + PCIDevice *pci_dev; + uint32_t sbdf; + QLIST_ENTRY(XenPciDevice) entry; +} XenPciDevice; + typedef struct XenIOState { ioservid_t ioservid; shared_iopage_t *shared_page; @@ -105,6 +112,7 @@ typedef struct XenIOState { struct xs_handle *xenstore; MemoryListener memory_listener; MemoryListener io_listener; + QLIST_HEAD(, XenPciDevice) dev_list; DeviceListener device_listener; QLIST_HEAD(, XenPhysmap) physmap; hwaddr free_phys_offset; @@ -569,6 +577,12 @@ static void xen_device_realize(DeviceListener *listener, if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { PCIDevice *pci_dev = PCI_DEVICE(dev); + XenPciDevice *xendev = g_new(XenPciDevice, 1); + + xendev->pci_dev = pci_dev; + xendev->sbdf = PCI_BUILD_BDF(pci_dev_bus_num(pci_dev), + pci_dev->devfn); + QLIST_INSERT_HEAD(&state->dev_list, xendev, entry); xen_map_pcidev(xen_domid, state->ioservid, pci_dev); } @@ -581,8 +595,17 @@ static void xen_device_unrealize(DeviceListener *listener, if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { PCIDevice *pci_dev = PCI_DEVICE(dev); + XenPciDevice *xendev, *next; xen_unmap_pcidev(xen_domid, state->ioservid, pci_dev); + + QLIST_FOREACH_SAFE(xendev, &state->dev_list, entry, next) { + if (xendev->pci_dev == pci_dev) { + QLIST_REMOVE(xendev, entry); + g_free(xendev); + break; + } + } } } @@ -903,6 +926,80 @@ static void cpu_ioreq_move(ioreq_t *req) } } +static void rw_config_req_item(XenPciDevice *xendev, ioreq_t *req, + uint32_t i, uint32_t *val) +{ + int32_t reg = req->addr; + uint32_t offset = req->size * i; + + reg += (req->df ? -1 : 1) * offset; + if (reg < 0 || reg > PCI_CONFIG_SPACE_SIZE) { + if (req->dir == IOREQ_READ) { + *val = ~0u; + } + return; + } + + if (req->dir == IOREQ_READ) { + *val = pci_host_config_read_common(xendev->pci_dev, reg, + PCI_CONFIG_SPACE_SIZE, + req->size); + trace_cpu_ioreq_config_read(req, xendev->sbdf, reg, + req->size, *val); + } else { + trace_cpu_ioreq_config_write(req, xendev->sbdf, reg, req->size, + *val); + pci_host_config_write_common(xendev->pci_dev, reg, + PCI_CONFIG_SPACE_SIZE, *val, + req->size); + } +} + +static void cpu_ioreq_config(XenIOState *state, ioreq_t *req) +{ + uint32_t sbdf = req->addr >> 32; + XenPciDevice *xendev; + + if (req->size > sizeof(uint32_t)) { + hw_error("PCI config access: bad size (%u)", req->size); + } + + QLIST_FOREACH(xendev, &state->dev_list, entry) { + unsigned int i; + uint32_t tmp; + + if (xendev->sbdf != sbdf) { + continue; + } + + if (!req->data_is_ptr) { + if (req->dir == IOREQ_READ) { + for (i = 0; i < req->count; i++) { + rw_config_req_item(xendev, req, i, &tmp); + req->data = tmp; + } + } else if (req->dir == IOREQ_WRITE) { + for (i = 0; i < req->count; i++) { + tmp = req->data; + rw_config_req_item(xendev, req, i, &tmp); + } + } + } else { + if (req->dir == IOREQ_READ) { + for (i = 0; i < req->count; i++) { + rw_config_req_item(xendev, req, i, &tmp); + write_phys_req_item(req->data, req, i, &tmp); + } + } else if (req->dir == IOREQ_WRITE) { + for (i = 0; i < req->count; i++) { + read_phys_req_item(req->data, req, i, &tmp); + rw_config_req_item(xendev, req, i, &tmp); + } + } + } + } +} + static void regs_to_cpu(vmware_regs_t *vmport_regs, ioreq_t *req) { X86CPU *cpu; @@ -975,27 +1072,9 @@ static void handle_ioreq(XenIOState *state, ioreq_t *req) case IOREQ_TYPE_INVALIDATE: xen_invalidate_map_cache(); break; - case IOREQ_TYPE_PCI_CONFIG: { - uint32_t sbdf = req->addr >> 32; - uint32_t val; - - /* Fake a write to port 0xCF8 so that - * the config space access will target the - * correct device model. - */ - val = (1u << 31) | - ((req->addr & 0x0f00) << 16) | - ((sbdf & 0xffff) << 8) | - (req->addr & 0xfc); - do_outp(0xcf8, 4, val); - - /* Now issue the config space access via - * port 0xCFC - */ - req->addr = 0xcfc | (req->addr & 0x03); - cpu_ioreq_pio(req); + case IOREQ_TYPE_PCI_CONFIG: + cpu_ioreq_config(state, req); break; - } default: hw_error("Invalid ioreq type 0x%x\n", req->type); } @@ -1366,6 +1445,7 @@ void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory) memory_listener_register(&state->io_listener, &address_space_io); state->device_listener = xen_device_listener; + QLIST_INIT(&state->dev_list); device_listener_register(&state->device_listener); /* Initialize backend core & drivers */
This patch removes the current hackery where IOREQ_TYPE_PCI_CONFIG requests are handled by faking PIO to 0xcf8 and 0xcfc and replaces it with direct calls to pci_host_config_read/write_common(). Doing so necessitates mapping BDFs to PCIDevices but maintaining a simple QLIST in xen_device_realize/unrealize() will suffice. NOTE: whilst config space accesses are currently limited to PCI_CONFIG_SPACE_SIZE, this patch paves the way to increasing the limit to PCIE_CONFIG_SPACE_SIZE when Xen gains the ability to emulate MCFG table accesses. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> -- Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Roger Pau Monne <roger.pau@citrix.com> v2: - Introduce a helper function rw_config_req_item() to handle config register offset calculation - Handle req->count > 1 more like MMIO rather than PIO - Added Roger to cc list (not taking R-b because of significant change) --- hw/i386/xen/trace-events | 2 + hw/i386/xen/xen-hvm.c | 120 +++++++++++++++++++++++++++++++++++++++-------- 2 files changed, 102 insertions(+), 20 deletions(-)