From patchwork Tue Sep 20 12:53:48 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jike Song
X-Patchwork-Id: 9341627
Message-ID: <57E1315C.5070304@intel.com>
Date: Tue, 20 Sep 2016 20:53:48 +0800
From: Jike Song
To: Kirti Wankhede
References: <1472097235-6332-1-git-send-email-kwankhede@nvidia.com>
 <1472097235-6332-3-git-send-email-kwankhede@nvidia.com>
In-Reply-To: <1472097235-6332-3-git-send-email-kwankhede@nvidia.com>
Subject: Re: [Qemu-devel] [PATCH v7 2/4] vfio: VFIO driver for mediated devices
Cc: kevin.tian@intel.com, cjia@nvidia.com, kvm@vger.kernel.org,
 qemu-devel@nongnu.org, alex.williamson@redhat.com, kraxel@redhat.com,
 pbonzini@redhat.com, bjsdjshi@linux.vnet.ibm.com

On 08/25/2016 11:53 AM, Kirti Wankhede wrote:

/* {snip} */

This shows another possible implementation of vfio-mdev, one that provides
the thinnest possible framework and lets the vendor physical drivers do
whatever they want. Again, it is diffed against Kirti's version 7 and is
for demonstration only.

---
Thanks,
Jike

diff --git a/drivers/vfio/mdev/Kconfig b/drivers/vfio/mdev/Kconfig
index d25439f..b2fe0c6 100644
--- a/drivers/vfio/mdev/Kconfig
+++ b/drivers/vfio/mdev/Kconfig
@@ -8,3 +8,11 @@ config MDEV
 	  See Documentation/vfio-mediated-device.txt for more details.
 
 	  If you don't know what do here, say N.
+
+config VFIO_MDEV
+	tristate "VFIO Bus driver for Mediated devices"
+	depends on VFIO && MDEV
+	default n
+	help
+	  VFIO Bus driver for mediated devices.
+
diff --git a/drivers/vfio/mdev/Makefile b/drivers/vfio/mdev/Makefile
index 8bd78b5..ee9f89f 100644
--- a/drivers/vfio/mdev/Makefile
+++ b/drivers/vfio/mdev/Makefile
@@ -2,3 +2,4 @@
 mdev-y := mdev_core.o mdev_sysfs.o mdev_driver.o
 
 obj-$(CONFIG_MDEV) += mdev.o
+obj-$(CONFIG_VFIO_MDEV) += vfio_mdev.o
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 28f13ae..c22ebd8 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -1,10 +1,15 @@
 /*
- * VFIO based Mediated PCI device driver
+ * VFIO Bus driver for Mediated device
  *
  * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
  * Author: Neo Jia
  *         Kirti Wankhede
  *
+ * Copyright (c) 2016 Intel Corporation.
+ * Author:
+ *	Xiao Guangrong
+ *	Jike Song
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
@@ -22,24 +27,21 @@
 
 #include "mdev_private.h"
 
-#define DRIVER_VERSION "0.1"
+#define DRIVER_VERSION "0.2"
 #define DRIVER_AUTHOR "NVIDIA Corporation"
-#define DRIVER_DESC "VFIO based Mediated PCI device driver"
+#define DRIVER_DESC "VFIO Bus driver for Mediated device"
 
 struct vfio_mdev {
 	struct iommu_group *group;
 	struct mdev_device *mdev;
-	struct vfio_device_info dev_info;
 };
 
 static int vfio_mdev_open(void *device_data)
 {
-	int ret = 0;
-
 	if (!try_module_get(THIS_MODULE))
 		return -ENODEV;
 
-	return ret;
+	return 0;
 }
 
 static void vfio_mdev_close(void *device_data)
@@ -47,220 +49,17 @@ static void vfio_mdev_close(void *device_data)
 	module_put(THIS_MODULE);
 }
 
-static int sparse_mmap_cap(struct vfio_info_cap *caps, void *cap_type)
-{
-	struct vfio_info_cap_header *header;
-	struct vfio_region_info_cap_sparse_mmap *sparse_cap, *sparse = cap_type;
-	size_t size;
-
-	size = sizeof(*sparse) + sparse->nr_areas * sizeof(*sparse->areas);
-	header = vfio_info_cap_add(caps, size,
-				   VFIO_REGION_INFO_CAP_SPARSE_MMAP, 1);
-	if (IS_ERR(header))
-		return PTR_ERR(header);
-
-	sparse_cap = container_of(header,
-			struct vfio_region_info_cap_sparse_mmap, header);
-	sparse_cap->nr_areas = sparse->nr_areas;
-	memcpy(sparse_cap->areas, sparse->areas,
-	       sparse->nr_areas * sizeof(*sparse->areas));
-	return 0;
-}
-
-static int region_type_cap(struct vfio_info_cap *caps, void *cap_type)
-{
-	struct vfio_info_cap_header *header;
-	struct vfio_region_info_cap_type *type_cap, *cap = cap_type;
-
-	header = vfio_info_cap_add(caps, sizeof(*cap),
-				   VFIO_REGION_INFO_CAP_TYPE, 1);
-	if (IS_ERR(header))
-		return PTR_ERR(header);
-
-	type_cap = container_of(header, struct vfio_region_info_cap_type,
-				header);
-	type_cap->type = cap->type;
-	type_cap->subtype = cap->type;
-	return 0;
-}
-
 static long vfio_mdev_unlocked_ioctl(void *device_data,
 				     unsigned int cmd, unsigned long arg)
 {
-	int ret = 0;
 	struct vfio_mdev *vmdev = device_data;
-	struct parent_device *parent = vmdev->mdev->parent;
-	unsigned long minsz;
-
-	switch (cmd) {
-	case VFIO_DEVICE_GET_INFO:
-	{
-		struct vfio_device_info info;
-
-		minsz = offsetofend(struct vfio_device_info, num_irqs);
-
-		if (copy_from_user(&info, (void __user *)arg, minsz))
-			return -EFAULT;
-
-		if (info.argsz < minsz)
-			return -EINVAL;
-
-		if (parent->ops->get_device_info)
-			ret = parent->ops->get_device_info(vmdev->mdev, &info);
-		else
-			return -EINVAL;
-
-		if (ret)
-			return ret;
-
-		if (parent->ops->reset)
-			info.flags |= VFIO_DEVICE_FLAGS_RESET;
-
-		memcpy(&vmdev->dev_info, &info, sizeof(info));
-
-		return copy_to_user((void __user *)arg, &info, minsz);
-	}
-	case VFIO_DEVICE_GET_REGION_INFO:
-	{
-		struct vfio_region_info info;
-		struct vfio_info_cap caps = { .buf = NULL, .size = 0 };
-		u16 cap_type_id = 0;
-		void *cap_type = NULL;
-
-		minsz = offsetofend(struct vfio_region_info, offset);
-
-		if (copy_from_user(&info, (void __user *)arg, minsz))
-			return -EFAULT;
-
-		if (info.argsz < minsz)
-			return -EINVAL;
-
-		if (parent->ops->get_region_info)
-			ret = parent->ops->get_region_info(vmdev->mdev, &info,
-						&cap_type_id, &cap_type);
-		else
-			return -EINVAL;
-
-		if (ret)
-			return ret;
-
-		if ((info.flags & VFIO_REGION_INFO_FLAG_CAPS) && cap_type) {
-			switch (cap_type_id) {
-			case VFIO_REGION_INFO_CAP_SPARSE_MMAP:
-				ret = sparse_mmap_cap(&caps, cap_type);
-				if (ret)
-					return ret;
-				break;
-
-			case VFIO_REGION_INFO_CAP_TYPE:
-				ret = region_type_cap(&caps, cap_type);
-				if (ret)
-					return ret;
-				break;
-			default:
-				return -EINVAL;
-			}
-		}
-
-		if (caps.size) {
-			if (info.argsz < sizeof(info) + caps.size) {
-				info.argsz = sizeof(info) + caps.size;
-				info.cap_offset = 0;
-			} else {
-				vfio_info_cap_shift(&caps, sizeof(info));
-				if (copy_to_user((void __user *)arg +
-						 sizeof(info), caps.buf,
-						 caps.size)) {
-					kfree(caps.buf);
-					return -EFAULT;
-				}
-				info.cap_offset = sizeof(info);
-			}
-			kfree(caps.buf);
-		}
-
-		return copy_to_user((void __user *)arg, &info, minsz);
-	}
-	case VFIO_DEVICE_GET_IRQ_INFO:
-	{
-		struct vfio_irq_info info;
-
-		minsz = offsetofend(struct vfio_irq_info, count);
-
-		if (copy_from_user(&info, (void __user *)arg, minsz))
-			return -EFAULT;
-
-		if ((info.argsz < minsz) ||
-		    (info.index >= vmdev->dev_info.num_irqs))
-			return -EINVAL;
-
-		if (parent->ops->get_irq_info)
-			ret = parent->ops->get_irq_info(vmdev->mdev, &info);
-		else
-			return -EINVAL;
-
-		if (ret)
-			return ret;
-
-		if (info.count == -1)
-			return -EINVAL;
-
-		return copy_to_user((void __user *)arg, &info, minsz);
-	}
-	case VFIO_DEVICE_SET_IRQS:
-	{
-		struct vfio_irq_set hdr;
-		u8 *data = NULL, *ptr = NULL;
-
-		minsz = offsetofend(struct vfio_irq_set, count);
-
-		if (copy_from_user(&hdr, (void __user *)arg, minsz))
-			return -EFAULT;
-
-		if ((hdr.argsz < minsz) ||
-		    (hdr.index >= vmdev->dev_info.num_irqs) ||
-		    (hdr.flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
-				   VFIO_IRQ_SET_ACTION_TYPE_MASK)))
-			return -EINVAL;
-
-		if (!(hdr.flags & VFIO_IRQ_SET_DATA_NONE)) {
-			size_t size;
-
-			if (hdr.flags & VFIO_IRQ_SET_DATA_BOOL)
-				size = sizeof(uint8_t);
-			else if (hdr.flags & VFIO_IRQ_SET_DATA_EVENTFD)
-				size = sizeof(int32_t);
-			else
-				return -EINVAL;
-
-			if (hdr.argsz - minsz < hdr.count * size)
-				return -EINVAL;
-
-			ptr = data = memdup_user((void __user *)(arg + minsz),
-						 hdr.count * size);
-			if (IS_ERR(data))
-				return PTR_ERR(data);
-		}
-
-		if (parent->ops->set_irqs)
-			ret = parent->ops->set_irqs(vmdev->mdev, hdr.flags,
-						    hdr.index, hdr.start,
-						    hdr.count, data);
-		else
-			ret = -EINVAL;
-
-		kfree(ptr);
-		return ret;
-	}
-	case VFIO_DEVICE_RESET:
-	{
-		if (parent->ops->reset)
-			return parent->ops->reset(vmdev->mdev);
-
-		return -EINVAL;
-	}
-	}
-	return -ENOTTY;
+	struct mdev_device *mdev = vmdev->mdev;
+	struct mdev_host *host = dev_to_host(mdev->dev.parent);
+
+	if (host->ops->ioctl)
+		return host->ops->ioctl(mdev, cmd, arg);
+
+	return -ENODEV;
 }
 
 static ssize_t vfio_mdev_read(void *device_data, char __user *buf,
@@ -268,63 +67,12 @@ static ssize_t vfio_mdev_read(void *device_data, char __user *buf,
 {
 	struct vfio_mdev *vmdev = device_data;
 	struct mdev_device *mdev = vmdev->mdev;
-	struct parent_device *parent = mdev->parent;
-	unsigned int done = 0;
-	int ret;
-
-	if (!parent->ops->read)
-		return -EINVAL;
-
-	while (count) {
-		size_t filled;
-
-		if (count >= 4 && !(*ppos % 4)) {
-			u32 val;
-
-			ret = parent->ops->read(mdev, (char *)&val, sizeof(val),
-						*ppos);
-			if (ret <= 0)
-				goto read_err;
-
-			if (copy_to_user(buf, &val, sizeof(val)))
-				goto read_err;
-
-			filled = 4;
-		} else if (count >= 2 && !(*ppos % 2)) {
-			u16 val;
-
-			ret = parent->ops->read(mdev, (char *)&val, sizeof(val),
-						*ppos);
-			if (ret <= 0)
-				goto read_err;
-
-			if (copy_to_user(buf, &val, sizeof(val)))
-				goto read_err;
-
-			filled = 2;
-		} else {
-			u8 val;
-
-			ret = parent->ops->read(mdev, &val, sizeof(val), *ppos);
-			if (ret <= 0)
-				goto read_err;
+	struct mdev_host *host = dev_to_host(mdev->dev.parent);
 
-			if (copy_to_user(buf, &val, sizeof(val)))
-				goto read_err;
+	if (host->ops->read)
+		return host->ops->read(mdev, buf, count, ppos);
 
-			filled = 1;
-		}
-
-		count -= filled;
-		done += filled;
-		*ppos += filled;
-		buf += filled;
-	}
-
-	return done;
-
-read_err:
-	return -EFAULT;
+	return -ENODEV;
 }
 
 static ssize_t vfio_mdev_write(void *device_data, const char __user *buf,
@@ -332,75 +80,24 @@ static ssize_t vfio_mdev_write(void *device_data, const char __user *buf,
 {
 	struct vfio_mdev *vmdev = device_data;
 	struct mdev_device *mdev = vmdev->mdev;
-	struct parent_device *parent = mdev->parent;
-	unsigned int done = 0;
-	int ret;
-
-	if (!parent->ops->write)
-		return -EINVAL;
-
-	while (count) {
-		size_t filled;
-
-		if (count >= 4 && !(*ppos % 4)) {
-			u32 val;
-
-			if (copy_from_user(&val, buf, sizeof(val)))
-				goto write_err;
-
-			ret = parent->ops->write(mdev, (char *)&val,
-						 sizeof(val), *ppos);
-			if (ret <= 0)
-				goto write_err;
-
-			filled = 4;
-		} else if (count >= 2 && !(*ppos % 2)) {
-			u16 val;
-
-			if (copy_from_user(&val, buf, sizeof(val)))
-				goto write_err;
-
-			ret = parent->ops->write(mdev, (char *)&val,
-						 sizeof(val), *ppos);
-			if (ret <= 0)
-				goto write_err;
-
-			filled = 2;
-		} else {
-			u8 val;
+	struct mdev_host *host = dev_to_host(mdev->dev.parent);
 
-			if (copy_from_user(&val, buf, sizeof(val)))
-				goto write_err;
+	if (host->ops->write)
+		return host->ops->write(mdev, buf, count, ppos);
 
-			ret = parent->ops->write(mdev, &val, sizeof(val),
-						 *ppos);
-			if (ret <= 0)
-				goto write_err;
-
-			filled = 1;
-		}
-
-		count -= filled;
-		done += filled;
-		*ppos += filled;
-		buf += filled;
-	}
-
-	return done;
-write_err:
-	return -EFAULT;
+	return -ENODEV;
 }
 
 static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
 {
 	struct vfio_mdev *vmdev = device_data;
 	struct mdev_device *mdev = vmdev->mdev;
-	struct parent_device *parent = mdev->parent;
+	struct mdev_host *host = dev_to_host(mdev->dev.parent);
 
-	if (parent->ops->mmap)
-		return parent->ops->mmap(mdev, vma);
+	if (host->ops->mmap)
+		return host->ops->mmap(mdev, vma);
 
-	return -EINVAL;
+	return -ENODEV;
 }
 
 static const struct vfio_device_ops vfio_mdev_dev_ops = {
@@ -413,28 +110,27 @@ static const struct vfio_device_ops vfio_mdev_dev_ops = {
 	.mmap = vfio_mdev_mmap,
 };
 
-int vfio_mdev_probe(struct device *dev)
+static int vfio_mdev_probe(struct device *dev)
 {
 	struct vfio_mdev *vmdev;
-	struct mdev_device *mdev = to_mdev_device(dev);
+	struct mdev_device *mdev = dev_to_mdev(dev);
 	int ret;
 
 	vmdev = kzalloc(sizeof(*vmdev), GFP_KERNEL);
 	if (IS_ERR(vmdev))
 		return PTR_ERR(vmdev);
 
-	vmdev->mdev = mdev_get_device(mdev);
+	vmdev->mdev = mdev;
 	vmdev->group = mdev->group;
 
 	ret = vfio_add_group_dev(dev, &vfio_mdev_dev_ops, vmdev);
 	if (ret)
 		kfree(vmdev);
 
-	mdev_put_device(mdev);
 	return ret;
 }
 
-void vfio_mdev_remove(struct device *dev)
+static void vfio_mdev_remove(struct device *dev)
 {
 	struct vfio_mdev *vmdev;
 
@@ -442,10 +138,34 @@ void vfio_mdev_remove(struct device *dev)
 	kfree(vmdev);
 }
 
-struct mdev_driver vfio_mdev_driver = {
-	.name = "vfio_mdev",
-	.probe = vfio_mdev_probe,
-	.remove = vfio_mdev_remove,
+static int vfio_mdev_online(struct device *dev)
+{
+	struct mdev_device *mdev = dev_to_mdev(dev);
+	struct mdev_host *host = dev_to_host(mdev->dev.parent);
+
+	if (host->ops->start)
+		return host->ops->start(mdev);
+
+	return -ENOTSUPP;
+}
+
+static int vfio_mdev_offline(struct device *dev)
+{
+	struct mdev_device *mdev = dev_to_mdev(dev);
+	struct mdev_host *host = dev_to_host(mdev->dev.parent);
+
+	if (host->ops->stop)
+		return host->ops->stop(mdev);
+
+	return -ENOTSUPP;
+}
+
+static struct mdev_driver vfio_mdev_driver = {
+	.name = "vfio_mdev",
+	.probe = vfio_mdev_probe,
+	.remove = vfio_mdev_remove,
+	.online = vfio_mdev_online,
+	.offline = vfio_mdev_offline,
 };
 
 static int __init vfio_mdev_init(void)