From patchwork Tue Aug 30 04:53:04 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 9304847 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id DD5E1607F0 for ; Tue, 30 Aug 2016 04:53:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C7122289B8 for ; Tue, 30 Aug 2016 04:53:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BBA1428AB5; Tue, 30 Aug 2016 04:53:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 12F6F289B8 for ; Tue, 30 Aug 2016 04:53:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751962AbcH3ExH (ORCPT ); Tue, 30 Aug 2016 00:53:07 -0400 Received: from mx1.redhat.com ([209.132.183.28]:36062 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750964AbcH3ExH (ORCPT ); Tue, 30 Aug 2016 00:53:07 -0400 Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E61AD883A4; Tue, 30 Aug 2016 04:53:05 +0000 (UTC) Received: from t450s.home (ovpn03.gateway.prod.ext.phx2.redhat.com [10.5.9.3]) by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u7U4r4Ws008620; Tue, 30 Aug 2016 00:53:04 -0400 Date: Mon, 29 Aug 2016 22:53:04 -0600 From: Alex Williamson To: "Michael S. Tsirkin" Cc: linux-kernel@vger.kernel.org, Jason Wang , Cornelia Huck , virtualization@lists.linux-foundation.org, qemu-devel@nongnu.org, kvm@vger.kernel.org, Julia Lawall , Yongji Xie , Dan Carpenter , Feng Wu , Paolo Bonzini Subject: Re: [PATCH v2 2/2] vfio: add virtio pci quirk Message-ID: <20160829225304.17238cfd@t450s.home> In-Reply-To: <20160829215220.0efb91b7@t450s.home> References: <1472523968-9540-1-git-send-email-mst@redhat.com> <1472523968-9540-3-git-send-email-mst@redhat.com> <20160829212325.32467efc@t450s.home> <20160829215220.0efb91b7@t450s.home> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.68 on 10.5.11.27 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Tue, 30 Aug 2016 04:53:06 +0000 (UTC) Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Mon, 29 Aug 2016 21:52:20 -0600 Alex Williamson wrote: > On Mon, 29 Aug 2016 21:23:25 -0600 > Alex Williamson wrote: > > > On Tue, 30 Aug 2016 05:27:17 +0300 > > "Michael S. Tsirkin" wrote: > > > > > Modern virtio pci devices can set VIRTIO_F_IOMMU_PLATFORM > > > to signal they are safe to use with an IOMMU. > > > > > > Without this bit, exposing the device to userspace is unsafe, so probe > > > and fail VFIO initialization unless noiommu is enabled. > > > > > > Signed-off-by: Michael S. Tsirkin > > > --- > > > drivers/vfio/pci/vfio_pci_private.h | 1 + > > > drivers/vfio/pci/vfio_pci.c | 14 ++++ > > > drivers/vfio/pci/vfio_pci_virtio.c | 140 ++++++++++++++++++++++++++++++++++++ > > > drivers/vfio/pci/Makefile | 1 + > > > 4 files changed, 156 insertions(+) > > > create mode 100644 drivers/vfio/pci/vfio_pci_virtio.c > > > > > > diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h > > > index 2128de8..2bd5616 100644 > > > --- a/drivers/vfio/pci/vfio_pci_private.h > > > +++ b/drivers/vfio/pci/vfio_pci_private.h > > > @@ -139,4 +139,5 @@ static inline int vfio_pci_igd_init(struct vfio_pci_device *vdev) > > > return -ENODEV; > > > } > > > #endif > > > +extern int vfio_pci_virtio_quirk(struct vfio_pci_device *vdev, bool noiommu); > > > #endif /* VFIO_PCI_PRIVATE_H */ > > > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c > > > index d624a52..e93bf0c 100644 > > > --- a/drivers/vfio/pci/vfio_pci.c > > > +++ b/drivers/vfio/pci/vfio_pci.c > > > @@ -1236,6 +1236,20 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) > > > return ret; > > > } > > > > > > + if (pdev->vendor == PCI_VENDOR_ID_REDHAT_QUMRANET) { > > > > Perhaps a vfio_pci_is_virtio() like vga below? Let's test the device > > ID range initially as well, this test raised a big red flag for me > > whether all devices within this vendor ID were virtio. > > > > > + bool noiommu = vfio_is_noiommu_group_dev(&pdev->dev); > > > > I think you can use iommu_present() for this and avoid patch 1of2. > > noiommu is mutually exclusive to an iommu being present. Seems like > > all of this logic should be in the quirk itself, I'm not sure what it > > buys to get the value here but wait until later to use it. Using > > iommu_present() could also move this test much earlier in > > vfio_pci_probe() making the exit path easier. > > Except then I'm reintroducing the bug fixed by 16ab8a5cbea4 since > iommu_present() assumes an IOMMU API based device. I'll try to think if > there's another way to avoid adding the is_noiommu function. Thanks, I think something like this would do it. Thanks, Alex > > > + > > > + ret = vfio_pci_virtio_quirk(vdev, noiommu); > > > + if (ret) { > > > + dev_warn(&vdev->pdev->dev, > > > + "Failed to setup Virtio for VFIO\n"); > > > + vfio_del_group_dev(&pdev->dev); > > > + vfio_iommu_group_put(group, &pdev->dev); > > > + kfree(vdev); > > > + return ret; > > > + } > > > + } > > > + > > > if (vfio_pci_is_vga(pdev)) { > > > vga_client_register(pdev, vdev, NULL, vfio_pci_set_vga_decode); > > > vga_set_legacy_decoding(pdev, > > > diff --git a/drivers/vfio/pci/vfio_pci_virtio.c b/drivers/vfio/pci/vfio_pci_virtio.c > > > new file mode 100644 > > > index 0000000..e1ecffd > > > --- /dev/null > > > +++ b/drivers/vfio/pci/vfio_pci_virtio.c > > > @@ -0,0 +1,140 @@ > > > +/* > > > + * VFIO PCI Intel Graphics support > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > + * > > > + * Copyright (C) 2016 Red Hat, Inc. All rights reserved. > > > + * Author: Alex Williamson > > > > Update. > > > > > + * > > > + * This program is free software; you can redistribute it and/or modify > > > + * it under the terms of the GNU General Public License version 2 as > > > + * published by the Free Software Foundation. > > > + * > > > + * Register a device specific region through which to provide read-only > > > + * access to the Intel IGD opregion. The register defining the opregion > > > + * address is also virtualized to prevent user modification. > > > + */ > > > + > > > +#include > > > +#include > > > +#include > > > > Are io.h and uaccess.h needed? > > > > > +#include > > > +#include > > > +#include > > > + > > > +#include "vfio_pci_private.h" > > > + > > > +/** > > > + * virtio_pci_find_capability - walk capabilities to find device info. > > > + * @dev: the pci device > > > + * @cfg_type: the VIRTIO_PCI_CAP_* value we seek > > > + * > > > + * Returns offset of the capability, or 0. > > > + */ > > > +static inline int virtio_pci_find_capability(struct pci_dev *dev, u8 cfg_type) > > > > Does inlining this really make sense? > > > > > +{ > > > + int pos; > > > + > > > + for (pos = pci_find_capability(dev, PCI_CAP_ID_VNDR); > > > + pos > 0; > > > + pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_VNDR)) { > > > + u8 type; > > > + pci_read_config_byte(dev, pos + offsetof(struct virtio_pci_cap, > > > + cfg_type), > > > + &type); > > > + > > > + if (type != cfg_type) > > > + continue; > > > + > > > + /* Ignore structures with reserved BAR values */ > > > + if (type != VIRTIO_PCI_CAP_PCI_CFG) { > > > + u8 bar; > > > + > > > + pci_read_config_byte(dev, pos + > > > + offsetof(struct virtio_pci_cap, > > > + bar), > > > + &bar); > > > + if (bar > 0x5) > > > > s/0x5/PCI_STD_RESOURCE_END/ > > > > > + continue; > > > + } > > > + > > > + return pos; > > > + } > > > + return 0; > > > +} > > > + > > > + > > > +int vfio_pci_virtio_quirk(struct vfio_pci_device *vdev, bool noiommu) > > > +{ > > > + struct pci_dev *dev = vdev->pdev; > > > > Please use *pdev for consistency with the rest of drivers/vfio/pci/* > > > > Also, is there any reason to pass the vfio_pci_device? There's nothing > > vfio here otherwise and we could remove more #includes. > > > > > + int common, cfg; > > > + u32 features; > > > + u32 offset; > > > + u8 bar; > > > + > > > + /* Without an IOMMU, we don't care */ > > > + if (noiommu) > > > + return 0; > > > + > > > + /* Virtio only owns devices >= 0x1000 and <= 0x107f: leave the rest. */ > > > + if (dev->device < 0x1000 || dev->device > 0x107f) > > > + return 0; > > > > Whitespace > > > > > + > > > + /* Check whether device enforces the IOMMU correctly */ > > > + > > > + /* > > > + * All modern devices must have common and cfg capabilities. We use cfg > > > + * capability for access so that we don't need to worry about resource > > > + * availability. Slow but sure. > > > + * Note that all vendor-specific fields we access are little-endian > > > + * which matches what pci config accessors expect, so they do byteswap > > > + * for us if appropriate. > > > + */ > > > + common = virtio_pci_find_capability(dev, VIRTIO_PCI_CAP_COMMON_CFG); > > > + cfg = virtio_pci_find_capability(dev, VIRTIO_PCI_CAP_PCI_CFG); > > > + if (!cfg || !common) { > > > + dev_warn(&dev->dev, > > > + "Virtio device lacks common or pci cfg.\n"); > > > > Whitespace > > > > > + return -ENODEV; > > > + } > > > + > > > + pci_read_config_byte(dev, common + offsetof(struct virtio_pci_cap, > > > + bar), > > > + &bar); > > > + pci_read_config_dword(dev, common + offsetof(struct virtio_pci_cap, > > > + offset), > > > + &offset); > > > + > > > + /* Program cfg capability for dword access into common cfg. */ > > > + pci_write_config_byte(dev, cfg + offsetof(struct virtio_pci_cfg_cap, > > > + cap.bar), > > > + bar); > > > + pci_write_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap, > > > + cap.length), > > > + 0x4); > > > + > > > + /* Select features dword that has VIRTIO_F_IOMMU_PLATFORM. */ > > > + pci_write_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap, > > > + cap.offset), > > > + offset + offsetof(struct virtio_pci_common_cfg, > > > + device_feature_select)); > > > + pci_write_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap, > > > + pci_cfg_data), > > > + VIRTIO_F_IOMMU_PLATFORM / 32); > > > + > > > + /* Get the features dword. */ > > > + pci_write_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap, > > > + cap.offset), > > > + offset + offsetof(struct virtio_pci_common_cfg, > > > + device_feature)); > > > + pci_read_config_dword(dev, cfg + offsetof(struct virtio_pci_cfg_cap, > > > + pci_cfg_data), > > > + &features); > > > + > > > + /* Does this device obey the platform's IOMMU? If not it's an error. */ > > > + if (!(features & (0x1 << (VIRTIO_F_IOMMU_PLATFORM % 32)))) { > > > + dev_warn(&dev->dev, > > > + "Virtio device lacks VIRTIO_F_IOMMU_PLATFORM.\n"); > > > > Whitespace > > > > > + return -ENODEV; > > > + } > > > + > > > + return 0; > > > +} > > > diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile > > > index 76d8ec0..e9b20e7 100644 > > > --- a/drivers/vfio/pci/Makefile > > > +++ b/drivers/vfio/pci/Makefile > > > @@ -1,5 +1,6 @@ > > > > > > vfio-pci-y := vfio_pci.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o > > > +vfio-pci-y += vfio_pci_virtio.o > > > vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o > > > > > > obj-$(CONFIG_VFIO_PCI) += vfio-pci.o > > > > Thanks, > > Alex --- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -1214,6 +1214,22 @@ static int vfio_pci_probe(struct pci_dev *pdev, const str if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL) return -EINVAL; + /* + * Filter out virtio devices that do not honor the iommu, + * but only for real iommu groups. + */ + if (vfio_pci_is_virtio(pdev)) { + struct iommu_group *tmp = iommu_group_get(&pdev->dev); + + if (tmp) { + iommu_group_put(tmp); + + ret = vfio_pci_virtio_quirk(pdev); + if (ret) + return ret; + } + } + group = vfio_iommu_group_get(&pdev->dev); if (!group) return -EINVAL;