From patchwork Tue Aug 1 16:04:14 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Punit Agrawal X-Patchwork-Id: 9874927 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 1CB916038F for ; Tue, 1 Aug 2017 16:05:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0ECD228068 for ; Tue, 1 Aug 2017 16:05:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0402C28680; Tue, 1 Aug 2017 16:05:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3CD2428068 for ; Tue, 1 Aug 2017 16:04:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751959AbdHAQEn (ORCPT ); Tue, 1 Aug 2017 12:04:43 -0400 Received: from foss.arm.com ([217.140.101.70]:42744 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751794AbdHAQER (ORCPT ); Tue, 1 Aug 2017 12:04:17 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0975A13D5; Tue, 1 Aug 2017 09:04:17 -0700 (PDT) Received: from localhost (e105922-lin.cambridge.arm.com [10.1.207.56]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 78F823F3E1; Tue, 1 Aug 2017 09:04:16 -0700 (PDT) From: Punit Agrawal To: Jean-Philippe Brucker Cc: kvm@vger.kernel.org, will.deacon@arm.com, robin.murphy@arm.com, lorenzo.pieralisi@arm.com, marc.zyngier@arm.com Subject: Re: [PATCH v2 kvmtool 07/10] vfio-pci: add MSI-X support References: <20170622170536.14319-1-jean-philippe.brucker@arm.com> <20170622170536.14319-8-jean-philippe.brucker@arm.com> <87a83kzebr.fsf@e105922-lin.cambridge.arm.com> Date: Tue, 01 Aug 2017 17:04:14 +0100 In-Reply-To: <87a83kzebr.fsf@e105922-lin.cambridge.arm.com> (Punit Agrawal's message of "Mon, 31 Jul 2017 18:49:44 +0100") Message-ID: <87vam7w9z5.fsf@e105922-lin.cambridge.arm.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Punit Agrawal writes: > Hi Jean-Philippe, > > I ran into an issue while testing these series on a Seattle based > system. More below... > > Jean-Philippe Brucker writes: > >> Add virtual MSI-X tables for PCI devices, and create IRQFD routes to let >> the kernel inject MSIs from a physical device directly into the guest. >> >> It would be tempting to create the MSI routes at init time before starting >> vCPUs, when we can afford to exit gracefully. But some of it must be >> initialized when the guest requests it. >> >> * On the KVM side, MSIs must be enabled after devices allocate their IRQ >> lines and irqchips are operational, which can happen until late_init. >> >> * On the VFIO side, hardware state of devices may be updated when setting >> up MSIs. For example, when passing a virtio-pci-legacy device to the >> guest: >> >> (1) The device-specific configuration layout (in BAR0) depends on >> whether MSIs are enabled or not in the device. If they are enabled, >> the device-specific configuration starts at offset 24, otherwise it >> starts at offset 20. >> (2) Linux guest assumes that MSIs are initially disabled (doesn't >> actually check the capability). So it reads the device config at >> offset 20. >> (3) Had we enabled MSIs early, host would have enabled the MSI-X >> capability and device would return the config at offset 24. >> (4) The guest would read junk and explode. >> >> Therefore we have to create MSI-X routes when the guest requests MSIs, and >> enable/disable them in VFIO when the guest pokes the MSI-X capability. We >> have to follow both physical and virtual state of the capability, which >> makes the state machine a bit complex, but I think it works. >> >> Signed-off-by: Jean-Philippe Brucker >> --- >> include/kvm/vfio.h | 54 +++++ >> vfio/pci.c | 619 ++++++++++++++++++++++++++++++++++++++++++++++++++++- >> 2 files changed, 664 insertions(+), 9 deletions(-) >> > > [...] > >> +static ssize_t vfio_pci_cap_size(struct pci_cap_hdr *cap_hdr) >> +{ >> + switch (cap_hdr->type) { >> + case PCI_CAP_ID_MSIX: >> + return PCI_CAP_MSIX_SIZEOF; >> + default: >> + pr_err("unknown PCI capability 0x%x", cap_hdr->type); >> + return 0; >> + } >> +} >> + >> +/* >> + * Copy capability from physical header into virtual header, and add it to the >> + * virtual capability list. >> + * >> + * @fd_offset: offset of pci header into vfio device fd >> + * @pos: offset of capability from start of header >> + */ >> +static int vfio_pci_add_cap(struct vfio_device *vdev, struct pci_cap_hdr *cap_hdr, >> + off_t fd_offset, off_t pos) >> +{ >> + int i; >> + ssize_t size = vfio_pci_cap_size(cap_hdr); >> + struct pci_device_header *hdr = &vdev->pci.hdr; >> + struct pci_cap_hdr *out = (void *)hdr + pos; >> + >> + if (pread(vdev->fd, out, size, fd_offset + pos) != size) >> + return -errno; >> + >> + out->next = 0; >> + >> + if (!hdr->capabilities) { >> + hdr->capabilities = pos; >> + hdr->status |= PCI_STATUS_CAP_LIST; >> + } else { >> + /* Add cap at end of list */ >> + struct pci_cap_hdr *last; >> + >> + pci_for_each_cap(i, last, hdr) >> + ; >> + last->next = pos; > > Setting the next capability is somehow overwriting the vendor id in the > virtual header. This leads to the device not being probed by the > appropriate driver. > > I did a quick hack to prevent the vendor_id override and was able to > proceed with probing the device. But there are still some issues with > setting up the capabilities. > > Can you take a look to see what's going wrong? I think I've figured out what's going wrong. pci_for_each_cap is defined as - #define pci_for_each_cap(pos, cap, hdr) \ for ((pos) = (hdr)->capabilities & ~3; \ (cap) = (void *)(hdr) + (pos), (pos) != 0; \ (pos) = ((struct pci_cap_hdr *)(cap))->next & ~3) Here cap is being assigned before the pos != 0 check. So in the last iteration through the loop, cap will point to the start of the PCI header - which is then used to set "last->next = pos". Below change got me the right vendor id and the driver was able to probe the device. But I don't seem to be getting any interrupts. I'll have a look to see if I can figure out what's going wrong. > > Thanks, > Punit > >> + } >> + >> + return 0; >> +} >> + >> static int vfio_pci_parse_caps(struct vfio_device *vdev) >> { >> + u8 pos; >> + int ret; >> + struct pci_cap_hdr cap; >> + ssize_t sz = sizeof(cap); >> + struct vfio_region_info *info; >> struct vfio_pci_device *pdev = &vdev->pci; >> >> if (!(pdev->hdr.status & PCI_STATUS_CAP_LIST)) >> return 0; >> >> + pos = pdev->hdr.capabilities & ~3; >> + info = &vdev->regions[VFIO_PCI_CONFIG_REGION_INDEX].info; >> + >> pdev->hdr.status &= ~PCI_STATUS_CAP_LIST; >> pdev->hdr.capabilities = 0; >> >> - /* TODO: install virtual capabilities */ >> + for (; pos; pos = cap.next) { >> + if (pos >= PCI_DEV_CFG_SIZE) { >> + dev_warn(vdev, "ignoring cap outside of config space"); >> + return -EINVAL; >> + } >> + >> + if (pread(vdev->fd, &cap, sz, info->offset + pos) != sz) { >> + dev_warn(vdev, "failed to read from capabilities pointer (0x%x)", >> + pos); >> + return -EINVAL; >> + } >> + >> + switch (cap.type) { >> + case PCI_CAP_ID_MSIX: >> + ret = vfio_pci_add_cap(vdev, &cap, info->offset, pos); >> + if (ret) { >> + dev_warn(vdev, "failed to read capability structure %x", >> + cap.type); >> + return ret; >> + } >> + >> + pdev->msi.pos = pos; >> + pdev->irq_type = VFIO_PCI_IRQ_MSIX; >> + break; >> + >> + /* Any other capability is hidden */ >> + } >> + } >> >> return 0; >> } > > [...] diff --git a/vfio/pci.c b/vfio/pci.c index 9e45d30..ad03325 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -489,7 +489,6 @@ static ssize_t vfio_pci_cap_size(struct pci_cap_hdr *cap_hdr) static int vfio_pci_add_cap(struct vfio_device *vdev, struct pci_cap_hdr *cap_hdr, off_t fd_offset, off_t pos) { - int i; ssize_t size = vfio_pci_cap_size(cap_hdr); struct pci_device_header *hdr = &vdev->pci.hdr; struct pci_cap_hdr *out = (void *)hdr + pos; @@ -505,9 +504,12 @@ static int vfio_pci_add_cap(struct vfio_device *vdev, struct pci_cap_hdr *cap_hd } else { /* Add cap at end of list */ struct pci_cap_hdr *last; + int i = hdr->capabilities & ~3; - pci_for_each_cap(i, last, hdr) - ; + do { + last = (void *)hdr + i; + i = last->next; + } while (i); last->next = pos; }