From patchwork Fri Sep 2 08:16:12 2016
X-Patchwork-Submitter: Jike Song
X-Patchwork-Id: 9310579
From: Jike Song
To: alex.williamson@redhat.com, kwankhede@nvidia.com, cjia@nvidia.com
Cc: kevin.tian@intel.com, guangrong.xiao@linux.intel.com, kvm@vger.kernel.org,
 qemu-devel@nongnu.org, zhenyuw@linux.intel.com, jike.song@intel.com,
 zhiyuan.lv@intel.com, pbonzini@redhat.com, bjsdjshi@linux.vnet.ibm.com,
 kraxel@redhat.com
Date: Fri, 2 Sep 2016 16:16:12 +0800
Message-Id: <1472804172-25542-5-git-send-email-jike.song@intel.com>
In-Reply-To: <1472804172-25542-1-git-send-email-jike.song@intel.com>
References: <1472804172-25542-1-git-send-email-jike.song@intel.com>
Subject: [Qemu-devel] [RFC v2 4/4] docs: Add Documentation for Mediated devices

From: Kirti Wankhede

Add file Documentation/vfio-mediated-device.txt that includes details of the
mediated device framework.

Signed-off-by: Kirti Wankhede
Signed-off-by: Neo Jia
Signed-off-by: Jike Song
---
 Documentation/vfio-mediated-device.txt | 203 +++++++++++++++++++++++++++++++++
 1 file changed, 203 insertions(+)
 create mode 100644 Documentation/vfio-mediated-device.txt

diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
new file mode 100644
index 0000000..39bdcd9
--- /dev/null
+++ b/Documentation/vfio-mediated-device.txt
@@ -0,0 +1,203 @@
+VFIO Mediated devices [1]
+-------------------------------------------------------------------------------
+
+There are more and more use cases/demands to virtualize DMA devices that
+don't have SR-IOV capability built in. To do this, drivers of different
+devices had to develop their own management interface and set of APIs and then
+integrate them into user space software. We've identified common requirements
+and a unified management interface for such devices to make user space
+software integration easier.
+
+The VFIO driver framework provides unified APIs for direct device access. It is
+an IOMMU/device-agnostic framework for exposing direct device access to
+user space, in a secure, IOMMU-protected environment. This framework is
+used for multiple devices like GPUs, network adapters and compute accelerators.
+With direct device access, virtual machines or user space applications access
+the physical device directly. This framework is reused for mediated devices.
+
+The mediated core driver provides a common interface for mediated device
+management that drivers of different devices can use. It provides a generic
+interface to create/destroy a mediated device and add/remove it to/from a
+mediated bus driver or IOMMU group. It also provides an interface to register
+different types of bus drivers; for example, the Mediated VFIO PCI driver is
+designed for mediated PCI devices and supports VFIO APIs. Similarly, a driver
+can be designed to support any type of mediated device and added to this
+framework. Mediated bus drivers add/delete mediated devices to/from VFIO groups.
+
+Below is the high-level block diagram, with NVIDIA, Intel and IBM devices
+as examples, since these are the devices which are going to actively use
+this module as of now.
+
+
+     +---------------+
+     |               |
+     | +-----------+ |
+     | |           | |
+     | |           | |
+     | |   mdev    | |  mdev_register_driver()       +---------------+
+     | |   bus     | |<------------------------------+               |
+     | |  driver   | |                               |               |
+     | |           | +------------------------------>| vfio_mdev.ko  |<-> VFIO user
+     | |           | |       probe()/remove()        |               |    APIs
+     | |           | |                               +---------------+
+     | |           | |
+     | +-----------+ |
+     |               |
+     |   MDEV CORE   |
+     |    MODULE     |
+     |    mdev.ko    |
+     |               |
+     | +-----------+ |  mdev_register_host_device()  +---------------+
+     | |           | +<------------------------------+               |
+     | |           | |                               |   nvidia.ko   |<-> physical
+     | |           | +------------------------------>+               |    device
+     | |           | |           callbacks           +---------------+
+     | |           | |
+     | | Physical  | |  mdev_register_host_device()  +---------------+
+     | |  Device   | |<------------------------------+               |
+     | | Interface | |                               |    i915.ko    |<-> physical
+     | |           | +------------------------------>+               |    device
+     | |           | |           callbacks           +---------------+
+     | |           | |
+     | |           | |  mdev_register_host_device()  +---------------+
+     | |           | +<------------------------------+               |
+     | |           | |                               | ccw_device.ko |<-> physical
+     | |           | +------------------------------>+               |    device
+     | |           | |           callbacks           +---------------+
+     | +-----------+ |
+     +---------------+
+
+
+Registration Interfaces
+-------------------------------------------------------------------------------
+
+The mediated core driver provides two types of registration interfaces:
+
+1. Registration interface for mediated bus driver:
+-------------------------------------------------
+
+    /**
+     * struct mdev_driver [2] - Mediated device driver
+     * @name: driver name
+     * @probe: called when a new device is created
+     * @remove: called when a device is removed
+     * @driver: device driver structure
+     **/
+    struct mdev_driver {
+            const char *name;
+            int  (*probe)(struct device *dev);
+            void (*remove)(struct device *dev);
+            int  (*online)(struct device *dev);
+            int  (*offline)(struct device *dev);
+            struct device_driver driver;
+    };
+
+
+
+A mediated bus driver for mdev should use these functions to register with
+and unregister from the core driver, respectively:
+
+extern int mdev_register_driver(struct mdev_driver *drv, struct module *owner);
+extern void mdev_unregister_driver(struct mdev_driver *drv);
+
+The mediated bus driver is responsible for adding/deleting mediated devices
+to/from a VFIO group when devices are bound to and unbound from the driver.
+
+2. Physical device driver interface:
+-----------------------------------
+This interface [3] provides a set of APIs to manage physical-device-related work
+in its driver. The APIs are:
+
+* hdev_attr_groups: attribute groups of the mdev host device.
+* mdev_attr_groups: attribute groups of the mediated device.
+* supported_config: to provide the list of configurations supported by the driver.
+* create: to allocate basic resources in the driver for a mediated device.
+* destroy: to free resources in the driver when a mediated device is destroyed.
+* start: to start the mediated device.
+* stop: to stop the mediated device.
+* read: read emulation callback.
+* write: write emulation callback.
+* mmap: the mmap callback.
+* ioctl: the ioctl callback.
+
+Drivers should use these functions to register a device with and unregister it
+from the mdev core driver, respectively:
+
+    struct mdev_host *mdev_register_host_device(struct device *pdev,
+                                     const struct mdev_host_ops *ops);
+    void mdev_unregister_device(struct mdev_host *host);
+
+
+Mediated device management interface via sysfs
+-------------------------------------------------------------------------------
+This is the interface that allows user space software, like libvirt, to query
+and configure mediated devices in a HW-agnostic fashion. This management
+interface gives the underlying physical device's driver flexibility to support
+mediated device hotplug, multiple mediated devices per virtual machine, multiple
+mediated devices from different physical devices, etc.
+
+For each physical device, an mdev host device is created; it shows up in sysfs:
+/sys/bus/pci/devices/0000:05:00.0/mdev-host/
+---------------------------------
+
+* mdev_supported_types: (read only)
+        Lists the currently supported mediated device types and their details.
+
+* mdev_create: (write only)
+        Creates a mediated device on the target physical device.
+        Input syntax:
+        where,
+                UUID: mediated device's UUID
+                params: extra parameters required by driver
+        Example:
+        # echo "12345678-1234-1234-1234-123456789abc" > \
+                /sys/bus/pci/devices/0000\:05\:00.0/mdev-host/mdev_create
+
+* mdev_destroy: (write only)
+        Destroys a mediated device on the target physical device.
+        Input syntax: <UUID>
+        where,
+                UUID: mediated device's UUID
+        Example:
+        # echo "12345678-1234-1234-1234-123456789abc" > \
+                /sys/bus/pci/devices/0000\:05\:00.0/mdev-host/mdev_destroy
+
+Under the mdev sysfs directory:
+/sys/bus/pci/devices/0000:05:00.0/mdev-host/12345678-1234-1234-1234-123456789abc/
+----------------------------------
+
+* online: (read/write)
+        Shows and sets the online status of the mdev.
+        Input syntax: <0|1>
+        To bring the mdev online, echo "1" to the online file; echo "0" to
+        take it offline.
+        Example:
+        # echo 1 > /sys/bus/pci/devices/0000\:05\:00.0/mdev-host/12345678-1234-1234-1234-123456789abc/online
+        # echo 0 > /sys/bus/pci/devices/0000\:05\:00.0/mdev-host/12345678-1234-1234-1234-123456789abc/online
+
+
+Translation APIs for Mediated device
+------------------------------------------------------------------------------
+
+The following APIs translate user pfns to host pfns in the VFIO driver:
+
+extern long vfio_pin_pages(struct mdev_device *mdev, unsigned long *user_pfn,
+                           long npage, int prot, unsigned long *phys_pfn);
+
+extern long vfio_unpin_pages(struct mdev_device *mdev, unsigned long *pfn,
+                             long npage);
+
+These functions call back into the backend IOMMU module using two callbacks of
+struct vfio_iommu_driver_ops, pin_pages and unpin_pages [4]. Currently these are
+supported in the TYPE1 IOMMU module. Other IOMMU backend modules, such as the
+PPC64 sPAPR module, must provide these two callback functions to gain the same
+support.
+
+References
+-------------------------------------------------------------------------------
+
+[1] See Documentation/vfio.txt for more information on VFIO.
+[2] struct mdev_driver in include/linux/mdev.h
+[3] struct mdev_host_ops in include/linux/mdev.h
+[4] struct vfio_iommu_driver_ops in include/linux/vfio.h
+
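As a note outside the patch itself: the sysfs lifecycle documented above (write a UUID to mdev_create, toggle the per-device online file, write the UUID to mdev_destroy) might be wrapped in a few shell helpers along the following lines. This is only a sketch under assumptions. The function names and the MDEV_HOST variable are hypothetical and not part of the patch; the default path is simply the example path used in the documentation; and the input syntax for extra create parameters is not shown because the text above does not spell it out.

```shell
#!/bin/sh
# Minimal sketch of the mdev sysfs lifecycle described in the documentation.
# MDEV_HOST is an assumption: it defaults to the example path from the text and
# can be overridden to point at a scratch directory for dry runs.
MDEV_HOST="${MDEV_HOST:-/sys/bus/pci/devices/0000:05:00.0/mdev-host}"

# Hypothetical helper: create a mediated device by writing its UUID.
mdev_create() {
    printf '%s\n' "$1" > "$MDEV_HOST/mdev_create"
}

# Hypothetical helper: set the online state ("1" or "0") of an existing mdev.
mdev_set_online() {
    case "$2" in
        0|1) printf '%s\n' "$2" > "$MDEV_HOST/$1/online" ;;
        *)   echo "online state must be 0 or 1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: destroy a mediated device by writing its UUID.
mdev_destroy() {
    printf '%s\n' "$1" > "$MDEV_HOST/mdev_destroy"
}
```

A typical sequence would be `mdev_create "$uuid"`, then `mdev_set_online "$uuid" 1`, later `mdev_set_online "$uuid" 0` and finally `mdev_destroy "$uuid"`. Reading MDEV_HOST at call time rather than hard-coding the path keeps the helpers usable on a machine without mdev-capable hardware, where a temporary directory can stand in for sysfs.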