From patchwork Thu Jul 9 18:13:34 2009
X-Patchwork-Submitter: "Michael S. Tsirkin"
X-Patchwork-Id: 34848
Date: Thu, 9 Jul 2009 21:13:34 +0300
From: "Michael S. Tsirkin"
To: kvm@vger.kernel.org, avi@redhat.com
Subject: [PATCH corrected RFC] uio: add generic driver for PCI 2.3 devices
Message-ID: <20090709181334.GC4340@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

Resending with corrected addresses. Sorry about the churn.

-----------
I got annoyed by the fact that we don't support shared interrupts with PCI
in assigned devices, so here's a draft patch to add that support in kernel
through uio. I intend to send this to lkml, but meanwhile I'd appreciate
some early feedback/flames from people on the list.

Thanks!

----------->

This adds a generic uio driver that can bind to any PCI device. First user
will be virtualization where a qemu userspace process needs to give guest
OS access to the device.

Interrupts are handled using the Interrupt Disable bit in the PCI command
register and Interrupt Status bit in the PCI status register. All devices
compliant to PCI 2.3 (circa 2002) and all compliant PCI Express devices
should support these bits. Driver detects this support, and won't bind to
devices which do not support the Interrupt Disable Bit in the command
register.

It's expected that MSI/MSI-X support will be added to this driver in the
future, to interface with virtualization irqfd/eventfd infrastructure.
Another area to examine, and of interest to virtualization, is iommu.

Signed-off-by: Michael S. Tsirkin
---
 drivers/uio/Kconfig           |   10 ++
 drivers/uio/Makefile          |    1 +
 drivers/uio/uio_pci_generic.c |  202 +++++++++++++++++++++++++++++++++++++++++
 include/linux/pci_regs.h      |    1 +
 4 files changed, 214 insertions(+), 0 deletions(-)
 create mode 100644 drivers/uio/uio_pci_generic.c

diff --git a/drivers/uio/Kconfig b/drivers/uio/Kconfig
index 7f86534..0f14c8e 100644
--- a/drivers/uio/Kconfig
+++ b/drivers/uio/Kconfig
@@ -89,4 +89,14 @@ config UIO_SERCOS3

 	  If you compile this as a module, it will be called uio_sercos3.

+config UIO_PCI_GENERIC
+	tristate "Generic driver for PCI 2.3 and PCI Express cards"
+	depends on PCI
+	default n
+	help
+	  Generic driver that you can bind, dynamically, to any
+	  PCI 2.3 compliant and PCI Express card. It is useful,
+	  primarily, for virtualization scenarios.
+	  If you compile this as a module, it will be called uio_pci_generic.
+
 endif
diff --git a/drivers/uio/Makefile b/drivers/uio/Makefile
index 5c2586d..73b2e75 100644
--- a/drivers/uio/Makefile
+++ b/drivers/uio/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_UIO_PDRV_GENIRQ)	+= uio_pdrv_genirq.o
 obj-$(CONFIG_UIO_SMX)	+= uio_smx.o
 obj-$(CONFIG_UIO_AEC)	+= uio_aec.o
 obj-$(CONFIG_UIO_SERCOS3)	+= uio_sercos3.o
+obj-$(CONFIG_UIO_PCI_GENERIC)	+= uio_pci_generic.o
diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
new file mode 100644
index 0000000..dd0df44
--- /dev/null
+++ b/drivers/uio/uio_pci_generic.c
@@ -0,0 +1,202 @@
+/* uio_pci_generic - generic UIO driver for PCI 2.3 devices
+ *
+ * Copyright (C) 2009 Red Hat, Inc.
+ * Author: Michael S. Tsirkin
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ *
+ * Since the driver does not declare any device ids, you must allocate
+ * id and bind the device to the driver yourself. For example:
+ *
+ * # echo "8086 10f5" > /sys/bus/pci/drivers/uio_pci_generic/new_id
+ * # echo -n 0000:00:19.0 > /sys/bus/pci/drivers/e1000e/unbind
+ * # echo -n 0000:00:19.0 > /sys/bus/pci/drivers/uio_pci_generic/bind
+ * # ls -l /sys/bus/pci/devices/0000:00:19.0/driver
+ * .../0000:00:19.0/driver -> ../../../bus/pci/drivers/uio_pci_generic
+ *
+ * Driver won't bind to devices which do not support the Interrupt Disable Bit
+ * in the command register. All devices compliant to PCI 2.3 (circa 2002) and
+ * all compliant PCI Express devices should support this bit.
+ */
+
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/uio_driver.h>
+#include <linux/spinlock.h>
+
+struct generic_dev {
+	struct uio_info info;
+	struct pci_dev *pdev;
+	spinlock_t lock; /* guards command register accesses */
+};
+
+/* Read/modify/write command register to disable interrupts.
+ * Note: we could cache the value and optimize the read if there was a way to
+ * get notified of user changes to command register through sysfs.
+ */
+static void irqtoggle(struct generic_dev *dev, int irq_on)
+{
+	struct pci_dev *pdev = dev->pdev;
+	unsigned long flags;
+	u16 orig, new;
+
+	spin_lock_irqsave(&dev->lock, flags);
+	pci_block_user_cfg_access(pdev);
+	pci_read_config_word(pdev, PCI_COMMAND, &orig);
+	new = irq_on ? orig & ~PCI_COMMAND_INTX_DISABLE :
+		orig | PCI_COMMAND_INTX_DISABLE;
+	if (new != orig)
+		pci_write_config_word(pdev, PCI_COMMAND, new);
+	pci_unblock_user_cfg_access(pdev);
+	spin_unlock_irqrestore(&dev->lock, flags);
+}
+
+/* irqcontrol is used by userspace to enable/disable interrupts. */
+static int irqcontrol(struct uio_info *info, s32 irq_on)
+{
+	struct generic_dev *dev = container_of(info, struct generic_dev, info);
+	irqtoggle(dev, irq_on);
+	return 0;
+}
+
+static irqreturn_t irqhandler(int irq, struct uio_info *info)
+{
+	struct generic_dev *dev = container_of(info, struct generic_dev, info);
+	irqreturn_t ret = IRQ_NONE;
+	u16 status;
+
+	/* Check interrupt status register to see whether our device
+	 * triggered the interrupt. */
+	pci_read_config_word(dev->pdev, PCI_STATUS, &status);
+	if (!(status & PCI_STATUS_INTERRUPT))
+		goto done;
+
+	/* We triggered the interrupt, disable it. */
+	irqtoggle(dev, 0);
+	/* UIO core will signal the user process. */
+	ret = IRQ_HANDLED;
+done:
+	return ret;
+}
+
+/* Verify that the device supports Interrupt Disable bit in command register,
+ * per PCI 2.3, by flipping this bit and reading it back: this bit was readonly
+ * in PCI 2.2. */
+static int __devinit verify_pci_2_3(struct pci_dev *pdev)
+{
+	u16 orig, new;
+	int err = 0;
+
+	pci_block_user_cfg_access(pdev);
+	pci_read_config_word(pdev, PCI_COMMAND, &orig);
+	pci_write_config_word(pdev, PCI_COMMAND,
+			      orig ^ PCI_COMMAND_INTX_DISABLE);
+	pci_read_config_word(pdev, PCI_COMMAND, &new);
+	/* There's no way to protect against
+	 * hardware bugs or detect them reliably, but as long as we know
+	 * what the value should be, let's go ahead and check it. */
+	if ((new ^ orig) & ~PCI_COMMAND_INTX_DISABLE) {
+		err = -EBUSY;
+		dev_err(&pdev->dev, "Command changed from 0x%x to 0x%x: "
+			"driver or HW bug?\n", orig, new);
+		goto err;
+	}
+	if (!((new ^ orig) & PCI_COMMAND_INTX_DISABLE)) {
+		dev_warn(&pdev->dev, "Device does not support "
+			 "disabling interrupts: unable to bind.\n");
+		err = -ENODEV;
+		goto err;
+	}
+	/* Now restore the original value. */
+	pci_write_config_word(pdev, PCI_COMMAND, orig);
+err:
+	pci_unblock_user_cfg_access(pdev);
+	return err;
+}
+
+static int __devinit probe(struct pci_dev *pdev,
+			   const struct pci_device_id *id)
+{
+	struct generic_dev *dev;
+	int err;
+
+	err = pci_enable_device(pdev);
+	if (err) {
+		dev_err(&pdev->dev, "%s: pci_enable_device failed: %d\n",
+			__func__, err);
+		return err;
+	}
+
+	err = verify_pci_2_3(pdev);
+	if (err)
+		goto err_verify;
+
+	err = pci_request_regions(pdev, "uio_pci_generic");
+	if (err) {
+		dev_err(&pdev->dev, "%s: pci_request_regions failed: %d\n",
+			__func__, err);
+		goto err_verify;
+	}
+
+	dev = kzalloc(sizeof(struct generic_dev), GFP_KERNEL);
+	if (!dev) {
+		err = -ENOMEM;
+		goto err_alloc;
+	}
+
+	dev->info.name = "uio_pci_generic";
+	dev->info.version = "0.01";
+	dev->info.irq = pdev->irq;
+	dev->info.irq_flags = IRQF_SHARED;
+	dev->info.handler = irqhandler;
+	dev->info.irqcontrol = irqcontrol;
+	dev->pdev = pdev;
+	spin_lock_init(&dev->lock);
+
+	pci_reset_function(pdev);
+
+	err = uio_register_device(&pdev->dev, &dev->info);
+	if (err)
+		goto err_register;
+	pci_set_drvdata(pdev, dev);
+	return 0;
+err_register:
+	kfree(dev);
+err_alloc:
+	pci_release_regions(pdev);
+err_verify:
+	pci_disable_device(pdev);
+	return err;
+}
+
+static void remove(struct pci_dev *pdev)
+{
+	struct generic_dev *dev = pci_get_drvdata(pdev);
+
+	uio_unregister_device(&dev->info);
+	kfree(dev);
+}
+
+static struct pci_driver driver = {
+	.name = "uio_pci_generic",
+	.id_table = NULL, /* only dynamic id's */
+	.probe = probe,
+	.remove = remove,
+};
+
+static int __init init(void)
+{
+	return pci_register_driver(&driver);
+}
+
+static void __exit cleanup(void)
+{
+	pci_unregister_driver(&driver);
+}
+
+module_init(init);
+module_exit(cleanup);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Michael S. Tsirkin");
diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h
index 616bf8b..bfb9b31 100644
--- a/include/linux/pci_regs.h
+++ b/include/linux/pci_regs.h
@@ -42,6 +42,7 @@
 #define  PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */

 #define PCI_STATUS		0x06	/* 16 bits */
+#define  PCI_STATUS_INTERRUPT	0x08	/* Interrupt status */
 #define  PCI_STATUS_CAP_LIST	0x10	/* Support Capability List */
 #define  PCI_STATUS_66MHZ	0x20	/* Support 66 Mhz PCI 2.1 bus */
 #define  PCI_STATUS_UDF		0x40	/* Support User Definable Features [obsolete] */
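
For anyone who wants to sanity-check the bit logic without touching hardware,
here is a minimal userspace sketch of the three decisions the driver makes:
what irqtoggle() writes, how verify_pci_2_3() classifies the read-back value,
and when irqhandler() claims a shared interrupt. The helper names
(intx_toggle, intx_probe_result, intx_handle) are hypothetical and not part of
the patch, and register values are modelled as plain integers with no config
space access. The two constants are copied from pci_regs.h as modified above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same values as include/linux/pci_regs.h with this patch applied. */
#define PCI_COMMAND_INTX_DISABLE 0x400
#define PCI_STATUS_INTERRUPT     0x08

/* Hypothetical helper: the command register value irqtoggle() would write.
 * irq_on != 0 clears the Interrupt Disable bit, irq_on == 0 sets it. */
static uint16_t intx_toggle(uint16_t cmd, int irq_on)
{
	return irq_on ? (uint16_t)(cmd & ~PCI_COMMAND_INTX_DISABLE)
		      : (uint16_t)(cmd | PCI_COMMAND_INTX_DISABLE);
}

/* Hypothetical helper: the classification verify_pci_2_3() applies to the
 * value read before flipping the bit (orig) and the value read back after. */
static int intx_probe_result(uint16_t orig, uint16_t readback)
{
	uint16_t changed = orig ^ readback;

	if (changed & ~PCI_COMMAND_INTX_DISABLE)
		return -EBUSY;	/* some unrelated bit moved */
	if (!(changed & PCI_COMMAND_INTX_DISABLE))
		return -ENODEV;	/* bit is read-only: pre-2.3 device */
	return 0;		/* bit flipped as expected */
}

/* Hypothetical helper modelling irqhandler(): returns 1 and masks INTx in
 * *cmd when the status register shows a pending interrupt, otherwise
 * returns 0 and leaves *cmd untouched (the interrupt belongs to another
 * device sharing the line). */
static int intx_handle(uint16_t status, uint16_t *cmd)
{
	assert(cmd != NULL);
	if (!(status & PCI_STATUS_INTERRUPT))
		return 0;
	*cmd = intx_toggle(*cmd, 0);
	return 1;
}
```

With a command register of 0x0007, intx_toggle(0x0007, 0) gives 0x0407, and
a read-back of 0x0007 after the flip maps to -ENODEV, which is the case where
the driver refuses to bind.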