From patchwork Tue Mar 28 21:53:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13191580 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0D684C6FD18 for ; Tue, 28 Mar 2023 21:54:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229999AbjC1VyQ (ORCPT ); Tue, 28 Mar 2023 17:54:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47474 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229987AbjC1VyN (ORCPT ); Tue, 28 Mar 2023 17:54:13 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 06B442D51; Tue, 28 Mar 2023 14:53:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680040433; x=1711576433; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=XbkcLI4n2teEQCC6ULXx558MPH5O1I8KLnT+xl6+hk8=; b=W4uAsz4jPYoKm+auyAmjhJYI4GgB9ZRjr43+zYdz8knzAWfxd4i/92Bg iPlS3kZ9+hYHqmBkwaL/WGk3BWsEdugAdb+S7Vg43v6sqygkhRHYUUo3A qSgLXhPOgLnNw+88DRWE87YalnWHoCoY/dJ9VJm/bQtTFW6ZnvjUYiZkY KHdTa/GNzYcoVxu2BA+Lf759tWW87N6KgOAJPXkjxeE1lmY8hrNOFuAQD KrUiszH7M8fzRhW7xLJav0ERLPk3Gu/30Qmy9BR0PpMo/TRLAnCaBRjhO MdUR1iIHi35TSxRmxq3+tDJz5KkkiiBWLE7dtziPKyAAmZMiNnOfUascz Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="403316913" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="403316913" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:49 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="748543765" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="748543765" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:49 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V2 1/8] vfio/pci: Consolidate irq cleanup on MSI/MSI-X disable Date: Tue, 28 Mar 2023 14:53:28 -0700 Message-Id: <4e1ac682c11a596a8b96ef5034674cda83eddbd2.1680038771.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org vfio_msi_disable() releases all previously allocated state associated with each interrupt before disabling MSI/MSI-X. vfio_msi_disable() iterates twice over the interrupt state: first directly with a for loop to do virqfd cleanup, followed by another for loop within vfio_msi_set_block() that releases the interrupt and its state using vfio_msi_set_vector_signal(). Simplify interrupt cleanup by iterating over allocated interrupts once. Signed-off-by: Reinette Chatre --- drivers/vfio/pci/vfio_pci_intrs.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index bffb0741518b..6a9c6a143cc3 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -426,10 +426,9 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) for (i = 0; i < vdev->num_ctx; i++) { vfio_virqfd_disable(&vdev->ctx[i].unmask); vfio_virqfd_disable(&vdev->ctx[i].mask); + vfio_msi_set_vector_signal(vdev, i, -1, msix); } - vfio_msi_set_block(vdev, 0, vdev->num_ctx, NULL, msix); - cmd = vfio_pci_memory_lock_and_enable(vdev); pci_free_irq_vectors(pdev); vfio_pci_memory_unlock_and_restore(vdev, cmd); From patchwork Tue Mar 28 21:53:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13191581 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 579C6C77B60 for ; Tue, 28 Mar 2023 21:54:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229987AbjC1VyS (ORCPT ); Tue, 28 Mar 2023 17:54:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229989AbjC1VyO (ORCPT ); Tue, 28 Mar 2023 17:54:14 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 875052D63; Tue, 28 Mar 2023 14:53:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680040434; x=1711576434; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Qbu9sl+61RQBLYcvTOuWSLXfqQGA99BUZPMWL8q3o7A=; b=JuYNZG1LQ49rh16oQvD70YtMltdZZ/xcY3DM4yGyqMkIhjOVOuLnHZqx Q0HJ8/khaoLzIDVrMonJPOOAbtRerqiosbAa2phJvDFmaSkvtfSMtq53o zsd3ByVeZ3QFfbheay3+d/Rjj40NppYNTvTMa0dYZoYKy4FaJiupVipjd Yf7Keij7+sXirZq1ceHDflXUeaQDFs1Dcy6hDn22W5BQo3pJbnS5TEtrF IEZsX1JRlqGaEMpl0LGqXp7dQP0d+6WYkhL8TCop5Hzbk+M3stBDIJyC2 pMj5B1w5dX0ZCIjXONQO6zRSuONoVHSEtZxrtaNe7zvwYgVk/BAT5LUoQ w==; X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="403316920" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="403316920" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:49 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="748543768" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="748543768" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:49 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V2 2/8] vfio/pci: Remove negative check on unsigned vector Date: Tue, 28 Mar 2023 14:53:29 -0700 Message-Id: <0dc2a0c8e25b13a3a41db75ab192f387a1548c80.1680038771.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org User space provides the vector as an unsigned int that is checked early for validity (vfio_set_irqs_validate_and_prepare()). A later negative check of the provided vector is not necessary. Remove the negative check and ensure the type used for the vector is consistent as an unsigned int. Signed-off-by: Reinette Chatre --- drivers/vfio/pci/vfio_pci_intrs.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 6a9c6a143cc3..3f64ccdce69f 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -317,14 +317,14 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi } static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, - int vector, int fd, bool msix) + unsigned int vector, int fd, bool msix) { struct pci_dev *pdev = vdev->pdev; struct eventfd_ctx *trigger; int irq, ret; u16 cmd; - if (vector < 0 || vector >= vdev->num_ctx) + if (vector >= vdev->num_ctx) return -EINVAL; irq = pci_irq_vector(pdev, vector); @@ -399,7 +399,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, unsigned count, int32_t *fds, bool msix) { - int i, j, ret = 0; + int i, ret = 0; + unsigned int j; if (start >= vdev->num_ctx || start + count > vdev->num_ctx) return -EINVAL; @@ -420,7 +421,7 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) { struct pci_dev *pdev = vdev->pdev; - int i; + unsigned int i; u16 cmd; for (i = 0; i < vdev->num_ctx; i++) { @@ -542,7 +543,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { - int i; + unsigned int i; bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX) ? true : false; if (irq_is(vdev, index) && !count && (flags & VFIO_IRQ_SET_DATA_NONE)) { From patchwork Tue Mar 28 21:53:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13191582 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 51026C6FD18 for ; Tue, 28 Mar 2023 21:54:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230017AbjC1Vy2 (ORCPT ); Tue, 28 Mar 2023 17:54:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47886 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230013AbjC1VyX (ORCPT ); Tue, 28 Mar 2023 17:54:23 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 34674270B; Tue, 28 Mar 2023 14:54:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680040445; x=1711576445; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=c6/0IvJxGbxNFxGBlKteRomX4zRUuc1+opLCThWE3hs=; b=Qmiv4BY62AHXuwkdRh9oKn40eHX0aKXupqPXmKKIQO4SUP2SnCPCudAH 2v/bYyoIZOJI6+UudsAr0xYs6THZvPZUEVLXT3aMowVDdYI4nrsdv4PP7 1mG56TWPIV8/n8vaXjG/Dve51S08vEfmj6iNd7qWFvIfLwYA95aiARgo2 AD8HI254yGVsYj8VS39vWSllKAFQJZUGPaXITb/QUN0CDCWGJ9M5tnmsn ixO5qo9svIMecAQ5jydbExcXasM5LAKf0q22lF5aKV7bp9aGYaJFWdULK RzneyPQT1wTCdOQ9sk5Z6q5JxrTCiFAuwCY/XKvg/9agtvg/7szyC/EuB g==; X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="403316925" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="403316925" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="748543772" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="748543772" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:49 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V2 3/8] vfio/pci: Prepare for dynamic interrupt context storage Date: Tue, 28 Mar 2023 14:53:30 -0700 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Interrupt context storage is statically allocated at the time interrupts are allocated. Following allocation, the interrupt context is managed by directly accessing the elements of the array using the vector as index. It is possible to allocate additional MSI-X vectors after MSI-X has been enabled. Dynamic storage of interrupt context is needed to support adding new MSI-X vectors after initial allocation. Replace direct access of array elements with pointers to the array elements. Doing so reduces impact of moving to a new data structure. Move interactions with the array to helpers to mostly contain changes needed to transition to a dynamic data structure. No functional change intended. Signed-off-by: Reinette Chatre --- Changes since RFC V1: - Improve accuracy of changelog. drivers/vfio/pci/vfio_pci_intrs.c | 206 ++++++++++++++++++++---------- 1 file changed, 140 insertions(+), 66 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 3f64ccdce69f..ece371ebea00 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -48,6 +48,31 @@ static bool is_irq_none(struct vfio_pci_core_device *vdev) vdev->irq_type == VFIO_PCI_MSIX_IRQ_INDEX); } +static +struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev, + unsigned long index) +{ + if (index >= vdev->num_ctx) + return NULL; + return &vdev->ctx[index]; +} + +static void vfio_irq_ctx_free_all(struct vfio_pci_core_device *vdev) +{ + kfree(vdev->ctx); +} + +static int vfio_irq_ctx_alloc_num(struct vfio_pci_core_device *vdev, + unsigned long num) +{ + vdev->ctx = kcalloc(num, sizeof(struct vfio_pci_irq_ctx), + GFP_KERNEL_ACCOUNT); + if (!vdev->ctx) + return -ENOMEM; + + return 0; +} + /* * INTx */ @@ -55,17 +80,28 @@ static void vfio_send_intx_eventfd(void *opaque, void *unused) { struct vfio_pci_core_device *vdev = opaque; - if (likely(is_intx(vdev) && !vdev->virq_disabled)) - eventfd_signal(vdev->ctx[0].trigger, 1); + if (likely(is_intx(vdev) && !vdev->virq_disabled)) { + struct vfio_pci_irq_ctx *ctx; + + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return; + eventfd_signal(ctx->trigger, 1); + } } /* Returns true if the INTx vfio_pci_irq_ctx.masked value is changed. */ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; unsigned long flags; bool masked_changed = false; + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return masked_changed; + spin_lock_irqsave(&vdev->irqlock, flags); /* @@ -77,7 +113,7 @@ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) if (unlikely(!is_intx(vdev))) { if (vdev->pci_2_3) pci_intx(pdev, 0); - } else if (!vdev->ctx[0].masked) { + } else if (!ctx->masked) { /* * Can't use check_and_mask here because we always want to * mask, not just when something is pending. @@ -87,7 +123,7 @@ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) else disable_irq_nosync(pdev->irq); - vdev->ctx[0].masked = true; + ctx->masked = true; masked_changed = true; } @@ -105,9 +141,14 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) { struct vfio_pci_core_device *vdev = opaque; struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; unsigned long flags; int ret = 0; + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return ret; + spin_lock_irqsave(&vdev->irqlock, flags); /* @@ -117,7 +158,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) if (unlikely(!is_intx(vdev))) { if (vdev->pci_2_3) pci_intx(pdev, 1); - } else if (vdev->ctx[0].masked && !vdev->virq_disabled) { + } else if (ctx->masked && !vdev->virq_disabled) { /* * A pending interrupt here would immediately trigger, * but we can avoid that overhead by just re-sending @@ -129,7 +170,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) } else enable_irq(pdev->irq); - vdev->ctx[0].masked = (ret > 0); + ctx->masked = (ret > 0); } spin_unlock_irqrestore(&vdev->irqlock, flags); @@ -146,18 +187,23 @@ void vfio_pci_intx_unmask(struct vfio_pci_core_device *vdev) static irqreturn_t vfio_intx_handler(int irq, void *dev_id) { struct vfio_pci_core_device *vdev = dev_id; + struct vfio_pci_irq_ctx *ctx; unsigned long flags; int ret = IRQ_NONE; + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return ret; + spin_lock_irqsave(&vdev->irqlock, flags); if (!vdev->pci_2_3) { disable_irq_nosync(vdev->pdev->irq); - vdev->ctx[0].masked = true; + ctx->masked = true; ret = IRQ_HANDLED; - } else if (!vdev->ctx[0].masked && /* may be shared */ + } else if (!ctx->masked && /* may be shared */ pci_check_and_mask_intx(vdev->pdev)) { - vdev->ctx[0].masked = true; + ctx->masked = true; ret = IRQ_HANDLED; } @@ -171,15 +217,24 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id) static int vfio_intx_enable(struct vfio_pci_core_device *vdev) { + struct vfio_pci_irq_ctx *ctx; + int ret; + if (!is_irq_none(vdev)) return -EINVAL; if (!vdev->pdev->irq) return -ENODEV; - vdev->ctx = kzalloc(sizeof(struct vfio_pci_irq_ctx), GFP_KERNEL_ACCOUNT); - if (!vdev->ctx) - return -ENOMEM; + ret = vfio_irq_ctx_alloc_num(vdev, 1); + if (ret) + return ret; + + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) { + vfio_irq_ctx_free_all(vdev); + return -EINVAL; + } vdev->num_ctx = 1; @@ -189,9 +244,9 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev) * here, non-PCI-2.3 devices will have to wait until the * interrupt is enabled. */ - vdev->ctx[0].masked = vdev->virq_disabled; + ctx->masked = vdev->virq_disabled; if (vdev->pci_2_3) - pci_intx(vdev->pdev, !vdev->ctx[0].masked); + pci_intx(vdev->pdev, !ctx->masked); vdev->irq_type = VFIO_PCI_INTX_IRQ_INDEX; @@ -202,41 +257,46 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) { struct pci_dev *pdev = vdev->pdev; unsigned long irqflags = IRQF_SHARED; + struct vfio_pci_irq_ctx *ctx; struct eventfd_ctx *trigger; unsigned long flags; int ret; - if (vdev->ctx[0].trigger) { + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return -EINVAL; + + if (ctx->trigger) { free_irq(pdev->irq, vdev); - kfree(vdev->ctx[0].name); - eventfd_ctx_put(vdev->ctx[0].trigger); - vdev->ctx[0].trigger = NULL; + kfree(ctx->name); + eventfd_ctx_put(ctx->trigger); + ctx->trigger = NULL; } if (fd < 0) /* Disable only */ return 0; - vdev->ctx[0].name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)", - pci_name(pdev)); - if (!vdev->ctx[0].name) + ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)", + pci_name(pdev)); + if (!ctx->name) return -ENOMEM; trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { - kfree(vdev->ctx[0].name); + kfree(ctx->name); return PTR_ERR(trigger); } - vdev->ctx[0].trigger = trigger; + ctx->trigger = trigger; if (!vdev->pci_2_3) irqflags = 0; ret = request_irq(pdev->irq, vfio_intx_handler, - irqflags, vdev->ctx[0].name, vdev); + irqflags, ctx->name, vdev); if (ret) { - vdev->ctx[0].trigger = NULL; - kfree(vdev->ctx[0].name); + ctx->trigger = NULL; + kfree(ctx->name); eventfd_ctx_put(trigger); return ret; } @@ -246,7 +306,7 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) * disable_irq won't. */ spin_lock_irqsave(&vdev->irqlock, flags); - if (!vdev->pci_2_3 && vdev->ctx[0].masked) + if (!vdev->pci_2_3 && ctx->masked) disable_irq_nosync(pdev->irq); spin_unlock_irqrestore(&vdev->irqlock, flags); @@ -255,12 +315,17 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) static void vfio_intx_disable(struct vfio_pci_core_device *vdev) { - vfio_virqfd_disable(&vdev->ctx[0].unmask); - vfio_virqfd_disable(&vdev->ctx[0].mask); + struct vfio_pci_irq_ctx *ctx; + + ctx = vfio_irq_ctx_get(vdev, 0); + if (ctx) { + vfio_virqfd_disable(&ctx->unmask); + vfio_virqfd_disable(&ctx->mask); + } vfio_intx_set_signal(vdev, -1); vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; - kfree(vdev->ctx); + vfio_irq_ctx_free_all(vdev); } /* @@ -284,10 +349,9 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi if (!is_irq_none(vdev)) return -EINVAL; - vdev->ctx = kcalloc(nvec, sizeof(struct vfio_pci_irq_ctx), - GFP_KERNEL_ACCOUNT); - if (!vdev->ctx) - return -ENOMEM; + ret = vfio_irq_ctx_alloc_num(vdev, nvec); + if (ret) + return ret; /* return the number of supported vectors if we can't get all: */ cmd = vfio_pci_memory_lock_and_enable(vdev); @@ -296,7 +360,7 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi if (ret > 0) pci_free_irq_vectors(pdev); vfio_pci_memory_unlock_and_restore(vdev, cmd); - kfree(vdev->ctx); + vfio_irq_ctx_free_all(vdev); return ret; } vfio_pci_memory_unlock_and_restore(vdev, cmd); @@ -320,6 +384,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, unsigned int vector, int fd, bool msix) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; struct eventfd_ctx *trigger; int irq, ret; u16 cmd; @@ -327,33 +392,33 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, if (vector >= vdev->num_ctx) return -EINVAL; + ctx = vfio_irq_ctx_get(vdev, vector); + if (!ctx) + return -EINVAL; irq = pci_irq_vector(pdev, vector); - if (vdev->ctx[vector].trigger) { - irq_bypass_unregister_producer(&vdev->ctx[vector].producer); + if (ctx->trigger) { + irq_bypass_unregister_producer(&ctx->producer); cmd = vfio_pci_memory_lock_and_enable(vdev); - free_irq(irq, vdev->ctx[vector].trigger); + free_irq(irq, ctx->trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); - - kfree(vdev->ctx[vector].name); - eventfd_ctx_put(vdev->ctx[vector].trigger); - vdev->ctx[vector].trigger = NULL; + kfree(ctx->name); + eventfd_ctx_put(ctx->trigger); + ctx->trigger = NULL; } if (fd < 0) return 0; - vdev->ctx[vector].name = kasprintf(GFP_KERNEL_ACCOUNT, - "vfio-msi%s[%d](%s)", - msix ? "x" : "", vector, - pci_name(pdev)); - if (!vdev->ctx[vector].name) + ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)", + msix ? "x" : "", vector, pci_name(pdev)); + if (!ctx->name) return -ENOMEM; trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { - kfree(vdev->ctx[vector].name); + kfree(ctx->name); return PTR_ERR(trigger); } @@ -372,26 +437,25 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, pci_write_msi_msg(irq, &msg); } - ret = request_irq(irq, vfio_msihandler, 0, - vdev->ctx[vector].name, trigger); + ret = request_irq(irq, vfio_msihandler, 0, ctx->name, trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); if (ret) { - kfree(vdev->ctx[vector].name); + kfree(ctx->name); eventfd_ctx_put(trigger); return ret; } - vdev->ctx[vector].producer.token = trigger; - vdev->ctx[vector].producer.irq = irq; - ret = irq_bypass_register_producer(&vdev->ctx[vector].producer); + ctx->producer.token = trigger; + ctx->producer.irq = irq; + ret = irq_bypass_register_producer(&ctx->producer); if (unlikely(ret)) { dev_info(&pdev->dev, "irq bypass producer (token %p) registration fails: %d\n", - vdev->ctx[vector].producer.token, ret); + ctx->producer.token, ret); - vdev->ctx[vector].producer.token = NULL; + ctx->producer.token = NULL; } - vdev->ctx[vector].trigger = trigger; + ctx->trigger = trigger; return 0; } @@ -421,13 +485,17 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; unsigned int i; u16 cmd; for (i = 0; i < vdev->num_ctx; i++) { - vfio_virqfd_disable(&vdev->ctx[i].unmask); - vfio_virqfd_disable(&vdev->ctx[i].mask); - vfio_msi_set_vector_signal(vdev, i, -1, msix); + ctx = vfio_irq_ctx_get(vdev, i); + if (ctx) { + vfio_virqfd_disable(&ctx->unmask); + vfio_virqfd_disable(&ctx->mask); + vfio_msi_set_vector_signal(vdev, i, -1, msix); + } } cmd = vfio_pci_memory_lock_and_enable(vdev); @@ -443,7 +511,7 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; - kfree(vdev->ctx); + vfio_irq_ctx_free_all(vdev); } /* @@ -463,14 +531,18 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_core_device *vdev, if (unmask) vfio_pci_intx_unmask(vdev); } else if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { + struct vfio_pci_irq_ctx *ctx = vfio_irq_ctx_get(vdev, 0); int32_t fd = *(int32_t *)data; + + if (!ctx) + return -EINVAL; if (fd >= 0) return vfio_virqfd_enable((void *) vdev, vfio_pci_intx_unmask_handler, vfio_send_intx_eventfd, NULL, - &vdev->ctx[0].unmask, fd); + &ctx->unmask, fd); - vfio_virqfd_disable(&vdev->ctx[0].unmask); + vfio_virqfd_disable(&ctx->unmask); } return 0; @@ -543,6 +615,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { + struct vfio_pci_irq_ctx *ctx; unsigned int i; bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX) ? true : false; @@ -577,14 +650,15 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, return -EINVAL; for (i = start; i < start + count; i++) { - if (!vdev->ctx[i].trigger) + ctx = vfio_irq_ctx_get(vdev, i); + if (!ctx || !ctx->trigger) continue; if (flags & VFIO_IRQ_SET_DATA_NONE) { - eventfd_signal(vdev->ctx[i].trigger, 1); + eventfd_signal(ctx->trigger, 1); } else if (flags & VFIO_IRQ_SET_DATA_BOOL) { uint8_t *bools = data; if (bools[i - start]) - eventfd_signal(vdev->ctx[i].trigger, 1); + eventfd_signal(ctx->trigger, 1); } } return 0; From patchwork Tue Mar 28 21:53:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13191584 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 05F0BC77B60 for ; Tue, 28 Mar 2023 21:54:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230046AbjC1Vyh (ORCPT ); Tue, 28 Mar 2023 17:54:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48040 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230039AbjC1Vy2 (ORCPT ); Tue, 28 Mar 2023 17:54:28 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 81F932721; Tue, 28 Mar 2023 14:54:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680040454; x=1711576454; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=clXlp6AznSWgVdTSj2E/3+K57ayIvVDYVuY4v/JSdjI=; b=chjTS6xIUQIO2WPKqI961QDPTTq4F1XiZQkq8Vu7eGP3B+ShYVkjFalw uR9uvT9gR0zHh0nnMrldUH+h5Qxa3Hid6BJEN+Kwz5dB6mOfESw5rtY1I fNoOhOwh5AB41wPNx0vpLEElxpQ7ZZP2qlxLQW3T/ArxNadhsETDAC9Gu plpFdphFuLYOE2Ua4soAcUgug2n/KHoDXXBlg2KMOScRjVpLPE0JoxdXE RK4a4xS+sqREe+OcCVXmKpyVVK18wqkKgZWBGbOE0AYurVyynR3FI5HO/ NuwKnswJ+5AUMxuioZ07pDgAe3XNF6i1VbuZgZZLvyc78qSpQ2NgDHjy9 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="403316937" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="403316937" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:52 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="748543776" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="748543776" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:50 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V2 4/8] vfio/pci: Use xarray for interrupt context storage Date: Tue, 28 Mar 2023 14:53:31 -0700 Message-Id: <8e9aaee36eac32c4b22f6a2c334721ca84132d61.1680038771.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Interrupt context is statically allocated at the time interrupts are allocated. Following allocation, the context is managed by directly accessing the elements of the array using the vector as index. The storage is released when interrupts are disabled. It is possible to dynamically allocate a single MSI-X index after MSI-X has been enabled. A dynamic storage for interrupt context is needed to support this. Replace the interrupt context array with an xarray (similar to what the core uses as store for MSI descriptors) that can support the dynamic expansion while maintaining the custom that uses the vector as index. Use the new data storage to allocate all elements once and free all elements together. Dynamic usage follows. Create helpers with understanding that it is only possible to (after initial MSI-X enabling) allocate a single MSI-X index at a time. The only time multiple MSI-X are allocated is during initial MSI-X enabling where failure results in no allocations. Signed-off-by: Reinette Chatre --- Changes since RFC V1: - Let vfio_irq_ctx_alloc_single() return pointer to allocated context. (Alex) - Transition INTx allocation to simplified vfio_irq_ctx_alloc_single(). - Improve accuracy of changelog. drivers/vfio/pci/vfio_pci_core.c | 1 + drivers/vfio/pci/vfio_pci_intrs.c | 77 ++++++++++++++++++++----------- include/linux/vfio_pci_core.h | 2 +- 3 files changed, 53 insertions(+), 27 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index a5ab416cf476..ae0e161c7fc9 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -2102,6 +2102,7 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev) INIT_LIST_HEAD(&vdev->vma_list); INIT_LIST_HEAD(&vdev->sriov_pfs_item); init_rwsem(&vdev->memory_lock); + xa_init(&vdev->ctx); return 0; } diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index ece371ebea00..00119641aa19 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -52,25 +52,60 @@ static struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev, unsigned long index) { - if (index >= vdev->num_ctx) - return NULL; - return &vdev->ctx[index]; + return xa_load(&vdev->ctx, index); } static void vfio_irq_ctx_free_all(struct vfio_pci_core_device *vdev) { - kfree(vdev->ctx); + struct vfio_pci_irq_ctx *ctx; + unsigned long index; + + xa_for_each(&vdev->ctx, index, ctx) { + xa_erase(&vdev->ctx, index); + kfree(ctx); + } +} + +static struct vfio_pci_irq_ctx * +vfio_irq_ctx_alloc_single(struct vfio_pci_core_device *vdev, + unsigned long index) +{ + struct vfio_pci_irq_ctx *ctx; + int ret; + + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT); + if (!ctx) + return NULL; + + ret = xa_insert(&vdev->ctx, index, ctx, GFP_KERNEL_ACCOUNT); + if (ret) { + kfree(ctx); + return NULL; + } + + return ctx; } +/* Only called during initial interrupt enabling. Never after. */ static int vfio_irq_ctx_alloc_num(struct vfio_pci_core_device *vdev, unsigned long num) { - vdev->ctx = kcalloc(num, sizeof(struct vfio_pci_irq_ctx), - GFP_KERNEL_ACCOUNT); - if (!vdev->ctx) - return -ENOMEM; + struct vfio_pci_irq_ctx *ctx; + unsigned long index; + + WARN_ON(!xa_empty(&vdev->ctx)); + + for (index = 0; index < num; index++) { + ctx = vfio_irq_ctx_alloc_single(vdev, index); + if (!ctx) + goto err; + } return 0; + +err: + vfio_irq_ctx_free_all(vdev); + return -ENOMEM; } /* @@ -218,7 +253,6 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id) static int vfio_intx_enable(struct vfio_pci_core_device *vdev) { struct vfio_pci_irq_ctx *ctx; - int ret; if (!is_irq_none(vdev)) return -EINVAL; @@ -226,15 +260,9 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev) if (!vdev->pdev->irq) return -ENODEV; - ret = vfio_irq_ctx_alloc_num(vdev, 1); - if (ret) - return ret; - - ctx = vfio_irq_ctx_get(vdev, 0); - if (!ctx) { - vfio_irq_ctx_free_all(vdev); - return -EINVAL; - } + ctx = vfio_irq_ctx_alloc_single(vdev, 0); + if (!ctx) + return -ENOMEM; vdev->num_ctx = 1; @@ -486,16 +514,13 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) { struct pci_dev *pdev = vdev->pdev; struct vfio_pci_irq_ctx *ctx; - unsigned int i; + unsigned long i; u16 cmd; - for (i = 0; i < vdev->num_ctx; i++) { - ctx = vfio_irq_ctx_get(vdev, i); - if (ctx) { - vfio_virqfd_disable(&ctx->unmask); - vfio_virqfd_disable(&ctx->mask); - vfio_msi_set_vector_signal(vdev, i, -1, msix); - } + xa_for_each(&vdev->ctx, i, ctx) { + vfio_virqfd_disable(&ctx->unmask); + vfio_virqfd_disable(&ctx->mask); + vfio_msi_set_vector_signal(vdev, i, -1, msix); } cmd = vfio_pci_memory_lock_and_enable(vdev); diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 367fd79226a3..61d7873a3973 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -59,7 +59,7 @@ struct vfio_pci_core_device { struct perm_bits *msi_perm; spinlock_t irqlock; struct mutex igate; - struct vfio_pci_irq_ctx *ctx; + struct xarray ctx; int num_ctx; int irq_type; int num_regions; From patchwork Tue Mar 28 21:53:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13191583 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16455C76196 for ; Tue, 28 Mar 2023 21:54:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230041AbjC1Vyf (ORCPT ); Tue, 28 Mar 2023 17:54:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48148 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230037AbjC1Vy2 (ORCPT ); Tue, 28 Mar 2023 17:54:28 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 228322701; Tue, 28 Mar 2023 14:54:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680040453; x=1711576453; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4Fjay3jnRQL5RqHD02xmo6UTR+EpsLej553IAjnbsts=; b=lbE+yCOHLh4b7Ej/VaggU7611+VMkXjmoPvC+t7YV1Hb3Um1TMYQPWGa gdEfHzZ2hV+qxoIV4xVGfi6oA4zRhfGZJqEsVk4yrQepPpEltgTx7W0hF ir6lbd+eBnrhHCAF2k6agl3wRBiGfHg5TPmGHNyFZ5y3h6xdXrXaeNJHW 8Fk4A9gJyKSefhEeTkPSIuKL2tB1BrQak0daXHq1/GRmzHBtnJxb2qFId 3aMnnLdZJy91cklIVpX7gA2H8zkhVvMWgKfm4YoLnqH4sTdXXhDVNgH/R S7uY7gwrJJM1V7Y/AP/11rjp5wDykmvNdhOAwRCp6C8qt9m9z3iKrTNm5 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="403316932" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="403316932" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:52 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="748543782" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="748543782" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:51 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V2 5/8] vfio/pci: Remove interrupt context counter Date: Tue, 28 Mar 2023 14:53:32 -0700 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org struct vfio_pci_core_device::num_ctx counts how many interrupt contexts have been allocated. When all interrupt contexts are allocated simultaneously num_ctx provides the upper bound of all vectors that can be used as indices into the interrupt context array. With the upcoming support for dynamic MSI-X the number of interrupt contexts does not necessarily span the range of allocated interrupts. Consequently, num_ctx is no longer a trusted upper bound for valid indices. Stop using num_ctx to determine if a provided vector is valid, use the existence of interrupt context directly. This changes behavior on the error path when user space provides an invalid vector range. Behavior changes from early exit without any modifications to possible modifications to valid vectors within the invalid range. This is acceptable considering that an invalid range is not a valid scenario, see link to discussion. The checks that ensure that user space provides a range of vectors that is valid for the device are untouched. Signed-off-by: Reinette Chatre Link: https://lore.kernel.org/lkml/20230316155646.07ae266f.alex.williamson@redhat.com/ --- Changes since RFC V1: - Remove vfio_irq_ctx_range_allocated(). (Alex and Kevin). drivers/vfio/pci/vfio_pci_intrs.c | 13 +------------ include/linux/vfio_pci_core.h | 1 - 2 files changed, 1 insertion(+), 13 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 00119641aa19..b86dce0f384f 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -264,8 +264,6 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev) if (!ctx) return -ENOMEM; - vdev->num_ctx = 1; - /* * If the virtual interrupt is masked, restore it. Devices * supporting DisINTx can be masked at the hardware level @@ -352,7 +350,6 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev) } vfio_intx_set_signal(vdev, -1); vdev->irq_type = VFIO_PCI_NUM_IRQS; - vdev->num_ctx = 0; vfio_irq_ctx_free_all(vdev); } @@ -393,7 +390,6 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi } vfio_pci_memory_unlock_and_restore(vdev, cmd); - vdev->num_ctx = nvec; vdev->irq_type = msix ? VFIO_PCI_MSIX_IRQ_INDEX : VFIO_PCI_MSI_IRQ_INDEX; @@ -417,9 +413,6 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, int irq, ret; u16 cmd; - if (vector >= vdev->num_ctx) - return -EINVAL; - ctx = vfio_irq_ctx_get(vdev, vector); if (!ctx) return -EINVAL; @@ -494,9 +487,6 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, int i, ret = 0; unsigned int j; - if (start >= vdev->num_ctx || start + count > vdev->num_ctx) - return -EINVAL; - for (i = 0, j = start; i < count && !ret; i++, j++) { int fd = fds ? fds[i] : -1; ret = vfio_msi_set_vector_signal(vdev, j, fd, msix); @@ -535,7 +525,6 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) pci_intx(pdev, 0); vdev->irq_type = VFIO_PCI_NUM_IRQS; - vdev->num_ctx = 0; vfio_irq_ctx_free_all(vdev); } @@ -671,7 +660,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, return ret; } - if (!irq_is(vdev, index) || start + count > vdev->num_ctx) + if (!irq_is(vdev, index)) return -EINVAL; for (i = start; i < start + count; i++) { diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 61d7873a3973..148fd1ae6c1c 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -60,7 +60,6 @@ struct vfio_pci_core_device { spinlock_t irqlock; struct mutex igate; struct xarray ctx; - int num_ctx; int irq_type; int num_regions; struct vfio_pci_region *region; From patchwork Tue Mar 28 21:53:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13191585 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E1316C6FD18 for ; Tue, 28 Mar 2023 21:54:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230084AbjC1Vy6 (ORCPT ); Tue, 28 Mar 2023 17:54:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48708 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230075AbjC1Vyn (ORCPT ); Tue, 28 Mar 2023 17:54:43 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 669F73591; Tue, 28 Mar 2023 14:54:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680040463; x=1711576463; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZFMnAYWkb7URHJwpoaNsszFowQGtPXU9f9P8ZuKBdU4=; b=Byp15McsPG8nMDQugS83bOLsuNJshXHpSeF8Ir+eTBT3uEx0EodIhJzk tUy9kPrgWFv2ktLXxHC5PwBuxEOfqk1g+CJzqW4upiQom3TnUBkUuC6bT fIXVIuUQ50BJhFgUtRdI19T26G3ZpiyHI1I8HhcFdXHPUmJniyu23VC4W dRAv6VI2qHTC9/W+Y4uKU55X9F8ZLICbSLFqkPgdFvmO6XD5qETuhcBCG vnRKyhJ1tXdSeMYbrikI8GigHL9MkSIFOCtyVw0wsultI5Sk8N4AWW2tl UI3CqHsL6J9wJg/BPuh0Zja1c8PaWpQqiurmxOsRZtHXPVODWs1Jffcli A==; X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="403316942" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="403316942" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:52 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="748543785" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="748543785" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:51 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V2 6/8] vfio/pci: Move to single error path Date: Tue, 28 Mar 2023 14:53:33 -0700 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Enabling and disabling of an interrupt involves several steps that can fail. Cleanup after failure is done when the error is encountered, resulting in some repetitive code. Support for dynamic MSI-X will introduce more steps during interrupt enabling and disabling. Transition to centralized exit path in preparation for dynamic MSI-X to eliminate duplicate error handling code. Ensure no remaining state refers to freed memory. Signed-off-by: Reinette Chatre --- Changes since RFC V1: - Improve changelog. drivers/vfio/pci/vfio_pci_intrs.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index b86dce0f384f..b3a258e58625 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -439,8 +439,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { - kfree(ctx->name); - return PTR_ERR(trigger); + ret = PTR_ERR(trigger); + goto out_free_name; } /* @@ -460,11 +460,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, ret = request_irq(irq, vfio_msihandler, 0, ctx->name, trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); - if (ret) { - kfree(ctx->name); - eventfd_ctx_put(trigger); - return ret; - } + if (ret) + goto out_put_eventfd_ctx; ctx->producer.token = trigger; ctx->producer.irq = irq; @@ -479,6 +476,13 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, ctx->trigger = trigger; return 0; + +out_put_eventfd_ctx: + eventfd_ctx_put(trigger); +out_free_name: + kfree(ctx->name); + ctx->name = NULL; + return ret; } static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, From patchwork Tue Mar 28 21:53:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13191586 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 72BAFC761A6 for ; Tue, 28 Mar 2023 21:55:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230106AbjC1VzB (ORCPT ); Tue, 28 Mar 2023 17:55:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48164 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230094AbjC1Vys (ORCPT ); Tue, 28 Mar 2023 17:54:48 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2789C30E8; Tue, 28 Mar 2023 14:54:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680040467; x=1711576467; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7zreZsByDSz6HOKWv+RGhtYj9t2usm/KvzrQLe5byiA=; b=AQ8AA71AzA4Q67as9Q5R88Ajhji6w3AAgd1jlgN3z/0unqCtlc12CsYZ WdPZIjMVCDMNAoQiS5Qp7ZfPCCfmZ462Ch9IHMGIPiqyua1v5YQd360fW UJhQp6ID/IvuEBinXeXITTdCiPrBwuLdtzAEshUkhlMPhHDJw9lMry9r/ Mr82dSzu7x2/a0aQhIU9HLLX9XJYG6X6HCNp4YCmVZhYARybVtulJjZdp f5PkKOWhNJrwBTgo08/Hr+i3JrXFKvFSMsGkgv7LBqd2IbZ/52Ufic0+d VkjZcjWHhx2lHh7sbewjqtHOM/2NZp+ZxjXPWF6zhwEYbT+gBZsdOUk2Y g==; X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="403316946" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="403316946" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:53 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="748543789" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="748543789" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:51 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V2 7/8] vfio/pci: Support dynamic MSI-x Date: Tue, 28 Mar 2023 14:53:34 -0700 Message-Id: <419f3ba2f732154d8ae079b3deb02d0fdbe3e258.1680038771.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Recently introduced pci_msix_alloc_irq_at() and pci_msix_free_irq() enables an individual MSI-X index to be allocated and freed after MSI-X enabling. Support dynamic MSI-X if supported by the device. Keep the association between allocated interrupt and vfio interrupt context. Allocate new context together with the new interrupt if no interrupt context exist for an MSI-X interrupt. Similarly, release an interrupt with its context. Signed-off-by: Reinette Chatre --- Changes since RFC V1: - Add pointer to interrupt context as function parameter to vfio_irq_ctx_free(). (Alex) - Initialize new_ctx to false. (Dan Carpenter) - Only support dynamic allocation if device supports it. (Alex) drivers/vfio/pci/vfio_pci_intrs.c | 93 +++++++++++++++++++++++++------ 1 file changed, 76 insertions(+), 17 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index b3a258e58625..755b752ca17e 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -55,6 +55,13 @@ struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev, return xa_load(&vdev->ctx, index); } +static void vfio_irq_ctx_free(struct vfio_pci_core_device *vdev, + struct vfio_pci_irq_ctx *ctx, unsigned long index) +{ + xa_erase(&vdev->ctx, index); + kfree(ctx); +} + static void vfio_irq_ctx_free_all(struct vfio_pci_core_device *vdev) { struct vfio_pci_irq_ctx *ctx; @@ -409,33 +416,62 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, { struct pci_dev *pdev = vdev->pdev; struct vfio_pci_irq_ctx *ctx; + struct msi_map msix_map = {}; + bool allow_dyn_alloc = false; struct eventfd_ctx *trigger; + bool new_ctx = false; int irq, ret; u16 cmd; + /* Only MSI-X allows dynamic allocation. */ + if (msix && pci_msix_can_alloc_dyn(vdev->pdev)) + allow_dyn_alloc = true; + ctx = vfio_irq_ctx_get(vdev, vector); - if (!ctx) + if (!ctx && !allow_dyn_alloc) return -EINVAL; + irq = pci_irq_vector(pdev, vector); + /* Context and interrupt are always allocated together. */ + WARN_ON((ctx && irq == -EINVAL) || (!ctx && irq != -EINVAL)); - if (ctx->trigger) { + if (ctx && ctx->trigger) { irq_bypass_unregister_producer(&ctx->producer); cmd = vfio_pci_memory_lock_and_enable(vdev); free_irq(irq, ctx->trigger); + if (allow_dyn_alloc) { + msix_map.index = vector; + msix_map.virq = irq; + pci_msix_free_irq(pdev, msix_map); + irq = -EINVAL; + } vfio_pci_memory_unlock_and_restore(vdev, cmd); kfree(ctx->name); eventfd_ctx_put(ctx->trigger); ctx->trigger = NULL; + if (allow_dyn_alloc) { + vfio_irq_ctx_free(vdev, ctx, vector); + ctx = NULL; + } } if (fd < 0) return 0; + if (!ctx) { + ctx = vfio_irq_ctx_alloc_single(vdev, vector); + if (!ctx) + return -ENOMEM; + new_ctx = true; + } + ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)", msix ? "x" : "", vector, pci_name(pdev)); - if (!ctx->name) - return -ENOMEM; + if (!ctx->name) { + ret = -ENOMEM; + goto out_free_ctx; + } trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { @@ -443,25 +479,38 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, goto out_free_name; } - /* - * The MSIx vector table resides in device memory which may be cleared - * via backdoor resets. We don't allow direct access to the vector - * table so even if a userspace driver attempts to save/restore around - * such a reset it would be unsuccessful. To avoid this, restore the - * cached value of the message prior to enabling. - */ cmd = vfio_pci_memory_lock_and_enable(vdev); if (msix) { - struct msi_msg msg; - - get_cached_msi_msg(irq, &msg); - pci_write_msi_msg(irq, &msg); + if (irq == -EINVAL) { + msix_map = pci_msix_alloc_irq_at(pdev, vector, NULL); + if (msix_map.index < 0) { + vfio_pci_memory_unlock_and_restore(vdev, cmd); + ret = msix_map.index; + goto out_put_eventfd_ctx; + } + irq = msix_map.virq; + } else { + /* + * The MSIx vector table resides in device memory which + * may be cleared via backdoor resets. We don't allow + * direct access to the vector table so even if a + * userspace driver attempts to save/restore around + * such a reset it would be unsuccessful. To avoid + * this, restore the cached value of the message prior + * to enabling. + */ + struct msi_msg msg; + + get_cached_msi_msg(irq, &msg); + pci_write_msi_msg(irq, &msg); + } } ret = request_irq(irq, vfio_msihandler, 0, ctx->name, trigger); - vfio_pci_memory_unlock_and_restore(vdev, cmd); if (ret) - goto out_put_eventfd_ctx; + goto out_free_irq_locked; + + vfio_pci_memory_unlock_and_restore(vdev, cmd); ctx->producer.token = trigger; ctx->producer.irq = irq; @@ -477,11 +526,21 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, return 0; +out_free_irq_locked: + if (allow_dyn_alloc && new_ctx) { + msix_map.index = vector; + msix_map.virq = irq; + pci_msix_free_irq(pdev, msix_map); + } + vfio_pci_memory_unlock_and_restore(vdev, cmd); out_put_eventfd_ctx: eventfd_ctx_put(trigger); out_free_name: kfree(ctx->name); ctx->name = NULL; +out_free_ctx: + if (allow_dyn_alloc && new_ctx) + vfio_irq_ctx_free(vdev, ctx, vector); return ret; } From patchwork Tue Mar 28 21:53:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13191587 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7854AC77B60 for ; Tue, 28 Mar 2023 21:55:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230143AbjC1VzC (ORCPT ); Tue, 28 Mar 2023 17:55:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48166 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230105AbjC1Vys (ORCPT ); Tue, 28 Mar 2023 17:54:48 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0417F101; Tue, 28 Mar 2023 14:54:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680040468; x=1711576468; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=u1o6u8j9j0bXGW/uy+rny+Z/wq7R1dGmviE3Tye7HEw=; b=Y6JgtkG/v8RHyLygeBSAvLt5/tRo3uXCqIKdkqqnofxicQyR3R85edA1 OMVWcWDxhZ0sgX4dEwR8MqKgsaJXlrbJhST6nJv7v835sw7+xEzrvN634 Q9HbJlvBo47ZUH8FdiDvZsW/f5QMPysz3RZ1s++mjfAV+nhxouP3UJLGV FbzUr5x3SdGjdYLaqlmnachR6T56pvwoHK6019h9ULHbSg5QisPo3W8hl pVffdzF9neglgjnTELevoEzcT3focgG3mnM/81D7Crl53ctqvV/FpRXK8 K5qexZNpgyioDae1cUnQdGn9IAujGODckFz2Ma4dsXqVi2I6pORsq+DMY Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="403316954" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="403316954" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:54 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10663"; a="748543792" X-IronPort-AV: E=Sophos;i="5.98,297,1673942400"; d="scan'208";a="748543792" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Mar 2023 14:53:52 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V2 8/8] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X Date: Tue, 28 Mar 2023 14:53:35 -0700 Message-Id: <81a6066c0f0d6dfa06f41c016abfb7152064e33e.1680038771.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Dynamic MSI-X is supported. Clear VFIO_IRQ_INFO_NORESIZE to provide guidance to user space. Signed-off-by: Reinette Chatre --- Changes since RFC V1: - Only advertise VFIO_IRQ_INFO_NORESIZE for MSI-X devices that can actually support dynamic allocation. (Alex) drivers/vfio/pci/vfio_pci_core.c | 4 +++- include/uapi/linux/vfio.h | 3 +++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index ae0e161c7fc9..c679d7b59f9b 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -1111,7 +1111,9 @@ static int vfio_pci_ioctl_get_irq_info(struct vfio_pci_core_device *vdev, if (info.index == VFIO_PCI_INTX_IRQ_INDEX) info.flags |= (VFIO_IRQ_INFO_MASKABLE | VFIO_IRQ_INFO_AUTOMASKED); - else + else if ((info.index != VFIO_PCI_MSIX_IRQ_INDEX) || + (info.index == VFIO_PCI_MSIX_IRQ_INDEX && + !pci_msix_can_alloc_dyn(vdev->pdev))) info.flags |= VFIO_IRQ_INFO_NORESIZE; return copy_to_user(arg, &info, minsz) ? -EFAULT : 0; diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 0552e8dcf0cb..1a36134cae5c 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -511,6 +511,9 @@ struct vfio_region_info_cap_nvlink2_lnkspd { * then add and unmask vectors, it's up to userspace to make the decision * whether to allocate the maximum supported number of vectors or tear * down setup and incrementally increase the vectors as each is enabled. + * Absence of the NORESIZE flag indicates that vectors can be enabled + * and disabled dynamically without impacting other vectors within the + * index. */ struct vfio_irq_info { __u32 argsz;