From: Neil Horman
To: linux-kernel@vger.kernel.org
Cc: Neil Horman, Jesse Barnes, linux-pci@vger.kernel.org
Subject: [PATCH] pci: Export pci device msi table via sysfs
Date: Thu, 21 Apr 2011 14:57:47 -0400
Message-Id: <1303412267-1948-1-git-send-email-nhorman@tuxdriver.com>
X-Patchwork-Id: 725361

I've been working on some improvements to how we balance irqs for high volume
interrupt sources. The consensus so far has been that what would be really
helpful is an irqbalance mechanism that operates in a one-shot mode in response
to the addition of high volume interrupt sources (i.e. mainly network devices).

In attempting to implement this, I've found that it would be really useful to
have two bits of information:

1) A clear correlation of which interrupts belong to which devices. Parsing
   /proc/interrupts is an exercise in guessing what naming pattern a given
   driver follows, and requires some amount of stateful information to be kept
   in user space, lest every device addition require a rebalancing of every
   interrupt in the system.

2) An indicator as to which kind of interrupts a given device is using. The irq
   attribute for a pci device is always accurate in that it simply reads what's
   in the appropriate pci config space register, but devices using msi
   interrupts have no use for that register, and never request that interrupt
   number.

This patch adds the requisite information. It creates two per-pci-device irq
attribute files:

a) irq_mode - identifies which kind of irqs the device in question is using,
   intx/msi/msix

b) msi_list - populated only if msi(x) is enabled, it lists the irqs allocated
   to the pci device

Using this information I can implement a stateless one-shot irq balancer that
reacts to various udev events quite well.

Signed-off-by: Neil Horman
CC: Jesse Barnes
CC: linux-pci@vger.kernel.org
---
 drivers/pci/pci-sysfs.c |   33 +++++++++++++++++++++++++++++++++
 1 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index f8deb3e..1397dfb 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include
 #include "pci.h"
 
 static int sysfs_initialized;	/* = 0 */
@@ -71,6 +72,34 @@ static ssize_t broken_parity_status_store(struct device *dev,
 	return count;
 }
 
+static ssize_t irq_mode_show(struct device *dev,
+			     struct device_attribute *attr, char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	return sprintf(buf, "%s\n", pdev->msix_enabled ? "msix" :
+		       (pdev->msi_enabled ? "msi" : "intx"));
+}
+
+#ifdef CONFIG_PCI_MSI
+static ssize_t msi_list_show(struct device *dev,
+			     struct device_attribute *attr, char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct msi_desc *entry;
+	int first, last;
+	ssize_t count = 0;
+
+	if (!(pdev->msi_enabled || pdev->msix_enabled))
+		return 0;
+
+	list_for_each_entry(entry, &pdev->msi_list, list)
+		count += sprintf(&buf[count], "%d ", entry->irq);
+
+	return count;
+}
+#endif
+
 static ssize_t local_cpus_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
@@ -328,6 +357,10 @@ struct device_attribute pci_dev_attrs[] = {
 	__ATTR_RO(subsystem_device),
 	__ATTR_RO(class),
 	__ATTR_RO(irq),
+	__ATTR_RO(irq_mode),
+#ifdef CONFIG_PCI_MSI
+	__ATTR_RO(msi_list),
+#endif
 	__ATTR_RO(local_cpus),
 	__ATTR_RO(local_cpulist),
 	__ATTR_RO(modalias),