From patchwork Wed Mar 15 20:59:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13176644 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EA32BC7618A for ; Wed, 15 Mar 2023 21:00:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232569AbjCOVAA (ORCPT ); Wed, 15 Mar 2023 17:00:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59192 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232513AbjCOU7o (ORCPT ); Wed, 15 Mar 2023 16:59:44 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3290E80902; Wed, 15 Mar 2023 13:59:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678913983; x=1710449983; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=XbkcLI4n2teEQCC6ULXx558MPH5O1I8KLnT+xl6+hk8=; b=C02qq4gruf8QbbXjO8v1TxzvrvQdFoTdPu+2Jz5BiyP7nekRuH403AGt vpn6uk3Y/weqlDNIpwO1ui5lZm7BWwjB4bA46zFM0v2kpm7T9rFQEMa4a vnTxEvwiG3q29YnDJqGXxU4lrOHvVtjbolP5h8Ayo+ymiQSbGoAll/hP/ sCwDxThJZjqbvc6VMNSCDsXuNgaRiEzuxZH6d1LhGfXxDdbYFZn3bbZul DV8JJl52mEYDoSY83vAOC+haZgJU15dUjQQM0BH+HXYFzdIZ7zNncsG7Z HnVkF6yj5h0X/+HBIShvPaWhbHzIasgquJkFlhY0Rsd83He7HT6ul+eqD g==; X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="326176500" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="326176500" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="853747211" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="853747211" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:38 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [RFC PATCH 1/8] vfio/pci: Consolidate irq cleanup on MSI/MSI-X disable Date: Wed, 15 Mar 2023 13:59:21 -0700 Message-Id: <0da0db5b65804bf6ff2c8448e7b9dea9462e5c34.1678911529.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org vfio_msi_disable() releases all previously allocated state associated with each interrupt before disabling MSI/MSI-X. vfio_msi_disable() iterates twice over the interrupt state: first directly with a for loop to do virqfd cleanup, followed by another for loop within vfio_msi_set_block() that releases the interrupt and its state using vfio_msi_set_vector_signal(). Simplify interrupt cleanup by iterating over allocated interrupts once. Signed-off-by: Reinette Chatre --- drivers/vfio/pci/vfio_pci_intrs.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index bffb0741518b..6a9c6a143cc3 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -426,10 +426,9 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) for (i = 0; i < vdev->num_ctx; i++) { vfio_virqfd_disable(&vdev->ctx[i].unmask); vfio_virqfd_disable(&vdev->ctx[i].mask); + vfio_msi_set_vector_signal(vdev, i, -1, msix); } - vfio_msi_set_block(vdev, 0, vdev->num_ctx, NULL, msix); - cmd = vfio_pci_memory_lock_and_enable(vdev); pci_free_irq_vectors(pdev); vfio_pci_memory_unlock_and_restore(vdev, cmd); From patchwork Wed Mar 15 20:59:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13176641 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3E8F0C61DA4 for ; Wed, 15 Mar 2023 20:59:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232532AbjCOU7p (ORCPT ); Wed, 15 Mar 2023 16:59:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59028 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232383AbjCOU7l (ORCPT ); Wed, 15 Mar 2023 16:59:41 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BC21D85A4F; Wed, 15 Mar 2023 13:59:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678913979; x=1710449979; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Qbu9sl+61RQBLYcvTOuWSLXfqQGA99BUZPMWL8q3o7A=; b=lARoTfPD6xxYDl6LO3fEqXvXoTLr8sH7fGhOmtELD+/tr0Oy+GuVPvB6 dRqPrCtT4pvIdKxiS+v8wEvnfmK6KlW07Th/DDkvktzs9TVBAXB8a1PlI DtR6AmXk3iLs8ZKbpYhtqNfcQOu8fS+BbN97PCgBstdo9eNxRYwzHu4c8 gWCac5inQEANNnj4vEbnZh8DR5u1MLdnF+89BGfJaDoib+FUfJrHGARcV qWULhASca8gevSzpfkHjASrfWfLoagZzaqhSO6dSRlh61KXmPePxs+Mj6 7sKCHQ8p0XsdHYNbxol/Y1Yko6AH2cwHOh67MKH46FRVxUvfweyMPCwvm g==; X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="326176506" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="326176506" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="853747215" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="853747215" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:38 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [RFC PATCH 2/8] vfio/pci: Remove negative check on unsigned vector Date: Wed, 15 Mar 2023 13:59:22 -0700 Message-Id: <97b2809a7d786583233a07b56c42e220d1d6e5a7.1678911529.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org User space provides the vector as an unsigned int that is checked early for validity (vfio_set_irqs_validate_and_prepare()). A later negative check of the provided vector is not necessary. Remove the negative check and ensure the type used for the vector is consistent as an unsigned int. Signed-off-by: Reinette Chatre --- drivers/vfio/pci/vfio_pci_intrs.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 6a9c6a143cc3..3f64ccdce69f 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -317,14 +317,14 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi } static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, - int vector, int fd, bool msix) + unsigned int vector, int fd, bool msix) { struct pci_dev *pdev = vdev->pdev; struct eventfd_ctx *trigger; int irq, ret; u16 cmd; - if (vector < 0 || vector >= vdev->num_ctx) + if (vector >= vdev->num_ctx) return -EINVAL; irq = pci_irq_vector(pdev, vector); @@ -399,7 +399,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, unsigned count, int32_t *fds, bool msix) { - int i, j, ret = 0; + int i, ret = 0; + unsigned int j; if (start >= vdev->num_ctx || start + count > vdev->num_ctx) return -EINVAL; @@ -420,7 +421,7 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) { struct pci_dev *pdev = vdev->pdev; - int i; + unsigned int i; u16 cmd; for (i = 0; i < vdev->num_ctx; i++) { @@ -542,7 +543,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { - int i; + unsigned int i; bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX) ? true : false; if (irq_is(vdev, index) && !count && (flags & VFIO_IRQ_SET_DATA_NONE)) { From patchwork Wed Mar 15 20:59:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13176642 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F907C7618A for ; Wed, 15 Mar 2023 20:59:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232496AbjCOU7z (ORCPT ); Wed, 15 Mar 2023 16:59:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59142 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232412AbjCOU7n (ORCPT ); Wed, 15 Mar 2023 16:59:43 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4161E9DE09; Wed, 15 Mar 2023 13:59:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678913981; x=1710449981; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=A+4ZcIQsKSo9aRxa5BQ1AU5NMqG5zxOH2Y36C6NWVys=; b=EPK7nRnmib3Smukwtv5IHyJ6JhVRtEEi7GP0XTjsMGwcsro8vGu7269h njTWd72FxmmkM1Y7qQ4nnx5a1+KelrndF12JMxqEzLiXZavhs1Myd+LkJ LKDkFj4fAfq6Zd0iIY0wUVjIcK7MBzOUrhSm9aihnfG+n2/qspeYWuvpy kUN5PL15anqt//iMx7jE7jYqw7zTKyjJqbK93SL1t0WQ2U6rvbmeO2goa uQpGu7Ijj1FvdP6MPN7Na/MlB+mV7VrT9N3HM1BeM6g7ICvKznFWeBbjb 7pBXaGJwWDKGwzCzXOWMrYP3/mSm+DQq8HwLh7wdgNj2S7D17gbBqw/JK A==; X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="326176513" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="326176513" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="853747217" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="853747217" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:38 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [RFC PATCH 3/8] vfio/pci: Prepare for dynamic interrupt context storage Date: Wed, 15 Mar 2023 13:59:23 -0700 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Interrupt context storage is statically allocated at the time interrupts are enabled. Following allocation, the interrupt context is managed by directly accessing the elements of the array using the vector as index. It is possible to allocate additional MSI-X vectors after MSI-X has been enabled. Dynamic storage of interrupt context is needed to support adding new MSI-X vectors after initial enabling. Replace direct access of array elements with pointers to the array elements. Doing so reduces impact of moving to a new data structure. Move interactions with the array to helpers to mostly contain changes needed to transition to a dynamic data structure. No functional change intended. Signed-off-by: Reinette Chatre --- drivers/vfio/pci/vfio_pci_intrs.c | 206 ++++++++++++++++++++---------- 1 file changed, 140 insertions(+), 66 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 3f64ccdce69f..ece371ebea00 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -48,6 +48,31 @@ static bool is_irq_none(struct vfio_pci_core_device *vdev) vdev->irq_type == VFIO_PCI_MSIX_IRQ_INDEX); } +static +struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev, + unsigned long index) +{ + if (index >= vdev->num_ctx) + return NULL; + return &vdev->ctx[index]; +} + +static void vfio_irq_ctx_free_all(struct vfio_pci_core_device *vdev) +{ + kfree(vdev->ctx); +} + +static int vfio_irq_ctx_alloc_num(struct vfio_pci_core_device *vdev, + unsigned long num) +{ + vdev->ctx = kcalloc(num, sizeof(struct vfio_pci_irq_ctx), + GFP_KERNEL_ACCOUNT); + if (!vdev->ctx) + return -ENOMEM; + + return 0; +} + /* * INTx */ @@ -55,17 +80,28 @@ static void vfio_send_intx_eventfd(void *opaque, void *unused) { struct vfio_pci_core_device *vdev = opaque; - if (likely(is_intx(vdev) && !vdev->virq_disabled)) - eventfd_signal(vdev->ctx[0].trigger, 1); + if (likely(is_intx(vdev) && !vdev->virq_disabled)) { + struct vfio_pci_irq_ctx *ctx; + + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return; + eventfd_signal(ctx->trigger, 1); + } } /* Returns true if the INTx vfio_pci_irq_ctx.masked value is changed. */ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; unsigned long flags; bool masked_changed = false; + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return masked_changed; + spin_lock_irqsave(&vdev->irqlock, flags); /* @@ -77,7 +113,7 @@ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) if (unlikely(!is_intx(vdev))) { if (vdev->pci_2_3) pci_intx(pdev, 0); - } else if (!vdev->ctx[0].masked) { + } else if (!ctx->masked) { /* * Can't use check_and_mask here because we always want to * mask, not just when something is pending. @@ -87,7 +123,7 @@ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) else disable_irq_nosync(pdev->irq); - vdev->ctx[0].masked = true; + ctx->masked = true; masked_changed = true; } @@ -105,9 +141,14 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) { struct vfio_pci_core_device *vdev = opaque; struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; unsigned long flags; int ret = 0; + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return ret; + spin_lock_irqsave(&vdev->irqlock, flags); /* @@ -117,7 +158,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) if (unlikely(!is_intx(vdev))) { if (vdev->pci_2_3) pci_intx(pdev, 1); - } else if (vdev->ctx[0].masked && !vdev->virq_disabled) { + } else if (ctx->masked && !vdev->virq_disabled) { /* * A pending interrupt here would immediately trigger, * but we can avoid that overhead by just re-sending @@ -129,7 +170,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) } else enable_irq(pdev->irq); - vdev->ctx[0].masked = (ret > 0); + ctx->masked = (ret > 0); } spin_unlock_irqrestore(&vdev->irqlock, flags); @@ -146,18 +187,23 @@ void vfio_pci_intx_unmask(struct vfio_pci_core_device *vdev) static irqreturn_t vfio_intx_handler(int irq, void *dev_id) { struct vfio_pci_core_device *vdev = dev_id; + struct vfio_pci_irq_ctx *ctx; unsigned long flags; int ret = IRQ_NONE; + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return ret; + spin_lock_irqsave(&vdev->irqlock, flags); if (!vdev->pci_2_3) { disable_irq_nosync(vdev->pdev->irq); - vdev->ctx[0].masked = true; + ctx->masked = true; ret = IRQ_HANDLED; - } else if (!vdev->ctx[0].masked && /* may be shared */ + } else if (!ctx->masked && /* may be shared */ pci_check_and_mask_intx(vdev->pdev)) { - vdev->ctx[0].masked = true; + ctx->masked = true; ret = IRQ_HANDLED; } @@ -171,15 +217,24 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id) static int vfio_intx_enable(struct vfio_pci_core_device *vdev) { + struct vfio_pci_irq_ctx *ctx; + int ret; + if (!is_irq_none(vdev)) return -EINVAL; if (!vdev->pdev->irq) return -ENODEV; - vdev->ctx = kzalloc(sizeof(struct vfio_pci_irq_ctx), GFP_KERNEL_ACCOUNT); - if (!vdev->ctx) - return -ENOMEM; + ret = vfio_irq_ctx_alloc_num(vdev, 1); + if (ret) + return ret; + + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) { + vfio_irq_ctx_free_all(vdev); + return -EINVAL; + } vdev->num_ctx = 1; @@ -189,9 +244,9 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev) * here, non-PCI-2.3 devices will have to wait until the * interrupt is enabled. */ - vdev->ctx[0].masked = vdev->virq_disabled; + ctx->masked = vdev->virq_disabled; if (vdev->pci_2_3) - pci_intx(vdev->pdev, !vdev->ctx[0].masked); + pci_intx(vdev->pdev, !ctx->masked); vdev->irq_type = VFIO_PCI_INTX_IRQ_INDEX; @@ -202,41 +257,46 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) { struct pci_dev *pdev = vdev->pdev; unsigned long irqflags = IRQF_SHARED; + struct vfio_pci_irq_ctx *ctx; struct eventfd_ctx *trigger; unsigned long flags; int ret; - if (vdev->ctx[0].trigger) { + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return -EINVAL; + + if (ctx->trigger) { free_irq(pdev->irq, vdev); - kfree(vdev->ctx[0].name); - eventfd_ctx_put(vdev->ctx[0].trigger); - vdev->ctx[0].trigger = NULL; + kfree(ctx->name); + eventfd_ctx_put(ctx->trigger); + ctx->trigger = NULL; } if (fd < 0) /* Disable only */ return 0; - vdev->ctx[0].name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)", - pci_name(pdev)); - if (!vdev->ctx[0].name) + ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)", + pci_name(pdev)); + if (!ctx->name) return -ENOMEM; trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { - kfree(vdev->ctx[0].name); + kfree(ctx->name); return PTR_ERR(trigger); } - vdev->ctx[0].trigger = trigger; + ctx->trigger = trigger; if (!vdev->pci_2_3) irqflags = 0; ret = request_irq(pdev->irq, vfio_intx_handler, - irqflags, vdev->ctx[0].name, vdev); + irqflags, ctx->name, vdev); if (ret) { - vdev->ctx[0].trigger = NULL; - kfree(vdev->ctx[0].name); + ctx->trigger = NULL; + kfree(ctx->name); eventfd_ctx_put(trigger); return ret; } @@ -246,7 +306,7 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) * disable_irq won't. */ spin_lock_irqsave(&vdev->irqlock, flags); - if (!vdev->pci_2_3 && vdev->ctx[0].masked) + if (!vdev->pci_2_3 && ctx->masked) disable_irq_nosync(pdev->irq); spin_unlock_irqrestore(&vdev->irqlock, flags); @@ -255,12 +315,17 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) static void vfio_intx_disable(struct vfio_pci_core_device *vdev) { - vfio_virqfd_disable(&vdev->ctx[0].unmask); - vfio_virqfd_disable(&vdev->ctx[0].mask); + struct vfio_pci_irq_ctx *ctx; + + ctx = vfio_irq_ctx_get(vdev, 0); + if (ctx) { + vfio_virqfd_disable(&ctx->unmask); + vfio_virqfd_disable(&ctx->mask); + } vfio_intx_set_signal(vdev, -1); vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; - kfree(vdev->ctx); + vfio_irq_ctx_free_all(vdev); } /* @@ -284,10 +349,9 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi if (!is_irq_none(vdev)) return -EINVAL; - vdev->ctx = kcalloc(nvec, sizeof(struct vfio_pci_irq_ctx), - GFP_KERNEL_ACCOUNT); - if (!vdev->ctx) - return -ENOMEM; + ret = vfio_irq_ctx_alloc_num(vdev, nvec); + if (ret) + return ret; /* return the number of supported vectors if we can't get all: */ cmd = vfio_pci_memory_lock_and_enable(vdev); @@ -296,7 +360,7 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi if (ret > 0) pci_free_irq_vectors(pdev); vfio_pci_memory_unlock_and_restore(vdev, cmd); - kfree(vdev->ctx); + vfio_irq_ctx_free_all(vdev); return ret; } vfio_pci_memory_unlock_and_restore(vdev, cmd); @@ -320,6 +384,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, unsigned int vector, int fd, bool msix) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; struct eventfd_ctx *trigger; int irq, ret; u16 cmd; @@ -327,33 +392,33 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, if (vector >= vdev->num_ctx) return -EINVAL; + ctx = vfio_irq_ctx_get(vdev, vector); + if (!ctx) + return -EINVAL; irq = pci_irq_vector(pdev, vector); - if (vdev->ctx[vector].trigger) { - irq_bypass_unregister_producer(&vdev->ctx[vector].producer); + if (ctx->trigger) { + irq_bypass_unregister_producer(&ctx->producer); cmd = vfio_pci_memory_lock_and_enable(vdev); - free_irq(irq, vdev->ctx[vector].trigger); + free_irq(irq, ctx->trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); - - kfree(vdev->ctx[vector].name); - eventfd_ctx_put(vdev->ctx[vector].trigger); - vdev->ctx[vector].trigger = NULL; + kfree(ctx->name); + eventfd_ctx_put(ctx->trigger); + ctx->trigger = NULL; } if (fd < 0) return 0; - vdev->ctx[vector].name = kasprintf(GFP_KERNEL_ACCOUNT, - "vfio-msi%s[%d](%s)", - msix ? "x" : "", vector, - pci_name(pdev)); - if (!vdev->ctx[vector].name) + ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)", + msix ? "x" : "", vector, pci_name(pdev)); + if (!ctx->name) return -ENOMEM; trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { - kfree(vdev->ctx[vector].name); + kfree(ctx->name); return PTR_ERR(trigger); } @@ -372,26 +437,25 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, pci_write_msi_msg(irq, &msg); } - ret = request_irq(irq, vfio_msihandler, 0, - vdev->ctx[vector].name, trigger); + ret = request_irq(irq, vfio_msihandler, 0, ctx->name, trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); if (ret) { - kfree(vdev->ctx[vector].name); + kfree(ctx->name); eventfd_ctx_put(trigger); return ret; } - vdev->ctx[vector].producer.token = trigger; - vdev->ctx[vector].producer.irq = irq; - ret = irq_bypass_register_producer(&vdev->ctx[vector].producer); + ctx->producer.token = trigger; + ctx->producer.irq = irq; + ret = irq_bypass_register_producer(&ctx->producer); if (unlikely(ret)) { dev_info(&pdev->dev, "irq bypass producer (token %p) registration fails: %d\n", - vdev->ctx[vector].producer.token, ret); + ctx->producer.token, ret); - vdev->ctx[vector].producer.token = NULL; + ctx->producer.token = NULL; } - vdev->ctx[vector].trigger = trigger; + ctx->trigger = trigger; return 0; } @@ -421,13 +485,17 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; unsigned int i; u16 cmd; for (i = 0; i < vdev->num_ctx; i++) { - vfio_virqfd_disable(&vdev->ctx[i].unmask); - vfio_virqfd_disable(&vdev->ctx[i].mask); - vfio_msi_set_vector_signal(vdev, i, -1, msix); + ctx = vfio_irq_ctx_get(vdev, i); + if (ctx) { + vfio_virqfd_disable(&ctx->unmask); + vfio_virqfd_disable(&ctx->mask); + vfio_msi_set_vector_signal(vdev, i, -1, msix); + } } cmd = vfio_pci_memory_lock_and_enable(vdev); @@ -443,7 +511,7 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; - kfree(vdev->ctx); + vfio_irq_ctx_free_all(vdev); } /* @@ -463,14 +531,18 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_core_device *vdev, if (unmask) vfio_pci_intx_unmask(vdev); } else if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { + struct vfio_pci_irq_ctx *ctx = vfio_irq_ctx_get(vdev, 0); int32_t fd = *(int32_t *)data; + + if (!ctx) + return -EINVAL; if (fd >= 0) return vfio_virqfd_enable((void *) vdev, vfio_pci_intx_unmask_handler, vfio_send_intx_eventfd, NULL, - &vdev->ctx[0].unmask, fd); + &ctx->unmask, fd); - vfio_virqfd_disable(&vdev->ctx[0].unmask); + vfio_virqfd_disable(&ctx->unmask); } return 0; @@ -543,6 +615,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { + struct vfio_pci_irq_ctx *ctx; unsigned int i; bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX) ? true : false; @@ -577,14 +650,15 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, return -EINVAL; for (i = start; i < start + count; i++) { - if (!vdev->ctx[i].trigger) + ctx = vfio_irq_ctx_get(vdev, i); + if (!ctx || !ctx->trigger) continue; if (flags & VFIO_IRQ_SET_DATA_NONE) { - eventfd_signal(vdev->ctx[i].trigger, 1); + eventfd_signal(ctx->trigger, 1); } else if (flags & VFIO_IRQ_SET_DATA_BOOL) { uint8_t *bools = data; if (bools[i - start]) - eventfd_signal(vdev->ctx[i].trigger, 1); + eventfd_signal(ctx->trigger, 1); } } return 0; From patchwork Wed Mar 15 20:59:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13176643 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7548BC7618D for ; Wed, 15 Mar 2023 20:59:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232685AbjCOU75 (ORCPT ); Wed, 15 Mar 2023 16:59:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59130 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232409AbjCOU7n (ORCPT ); Wed, 15 Mar 2023 16:59:43 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5BDDF9DE22; Wed, 15 Mar 2023 13:59:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678913981; x=1710449981; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7dXez5k4GaMgDvxV5RxbsMB39Tz+QawY3iHxSZvUebU=; b=WyI/uzgXDZScG3ZMTUx/HD5VLUcpb+PM/eTAwji/GRP4ANeUfNEm44WZ A8ypQuQGIRj/SFp2gpFTxioqB4a6y+C3qVo2Z8+fEAqILAxBgmyqikhzW tNpSeqSuqPt5UOcHwXKjJRfjunUSts/wqZjI9VbFBXEHejPsgDsVtRpP0 c6WCriR1LQHbt8Mf+s7Ox9zc3ohSBr2gFHHLkNj5vhhXYv3bK+/ehIYZA /DK2J2ZmslXHzdMy6B/w5IRLGhDf6ZX7qJ89pqfIcQlJk2A/zLiGWs/u3 GWRagScrPgO/vw3LYJBrXl6PsDYB/bmTPM4w4b+mS1KiPGTeJmQdfa59H g==; X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="326176519" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="326176519" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="853747221" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="853747221" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:39 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [RFC PATCH 4/8] vfio/pci: Use xarray for interrupt context storage Date: Wed, 15 Mar 2023 13:59:24 -0700 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Interrupt context is statically allocated at the time interrupts are enabled. Following allocation, the context is managed by directly accessing the elements of the array using the vector as index. The storage is released when interrupts are disabled. It is possible to dynamically allocate a single MSI-X index after MSI-X has been enabled. A dynamic storage for interrupt context is needed to support this. Replace the interrupt context array with an xarray (similar to what the core uses as store for MSI descriptors) that can support the dynamic expansion while maintaining the custom that uses the vector as index. Use the new data storage to allocate all elements once and free all elements together. Dynamic usage follows. Create helpers with understanding that it is only possible to (after initial MSI-X enabling) allocate a single MSI-X index at a time. The only time multiple MSI-X are allocated is during initial MSI-X enabling where failure results in no allocations. Signed-off-by: Reinette Chatre --- drivers/vfio/pci/vfio_pci_core.c | 1 + drivers/vfio/pci/vfio_pci_intrs.c | 63 +++++++++++++++++++++++-------- include/linux/vfio_pci_core.h | 2 +- 3 files changed, 49 insertions(+), 17 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index a5ab416cf476..ae0e161c7fc9 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -2102,6 +2102,7 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev) INIT_LIST_HEAD(&vdev->vma_list); INIT_LIST_HEAD(&vdev->sriov_pfs_item); init_rwsem(&vdev->memory_lock); + xa_init(&vdev->ctx); return 0; } diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index ece371ebea00..bfcf5cb6b435 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -52,25 +52,59 @@ static struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev, unsigned long index) { - if (index >= vdev->num_ctx) - return NULL; - return &vdev->ctx[index]; + return xa_load(&vdev->ctx, index); } static void vfio_irq_ctx_free_all(struct vfio_pci_core_device *vdev) { - kfree(vdev->ctx); + struct vfio_pci_irq_ctx *ctx; + unsigned long index; + + xa_for_each(&vdev->ctx, index, ctx) { + xa_erase(&vdev->ctx, index); + kfree(ctx); + } } +static int vfio_irq_ctx_alloc_single(struct vfio_pci_core_device *vdev, + unsigned long index) +{ + struct vfio_pci_irq_ctx *ctx; + int ret; + + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT); + if (!ctx) + return -ENOMEM; + + ret = xa_insert(&vdev->ctx, index, ctx, GFP_KERNEL_ACCOUNT); + if (ret) { + kfree(ctx); + return ret; + } + + return 0; +} + +/* Only called during initial interrupt enabling. Never after. */ static int vfio_irq_ctx_alloc_num(struct vfio_pci_core_device *vdev, unsigned long num) { - vdev->ctx = kcalloc(num, sizeof(struct vfio_pci_irq_ctx), - GFP_KERNEL_ACCOUNT); - if (!vdev->ctx) - return -ENOMEM; + unsigned long index; + int ret; + + WARN_ON(!xa_empty(&vdev->ctx)); + + for (index = 0; index < num; index++) { + ret = vfio_irq_ctx_alloc_single(vdev, index); + if (ret) + goto err; + } return 0; + +err: + vfio_irq_ctx_free_all(vdev); + return ret; } /* @@ -486,16 +520,13 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) { struct pci_dev *pdev = vdev->pdev; struct vfio_pci_irq_ctx *ctx; - unsigned int i; + unsigned long i; u16 cmd; - for (i = 0; i < vdev->num_ctx; i++) { - ctx = vfio_irq_ctx_get(vdev, i); - if (ctx) { - vfio_virqfd_disable(&ctx->unmask); - vfio_virqfd_disable(&ctx->mask); - vfio_msi_set_vector_signal(vdev, i, -1, msix); - } + xa_for_each(&vdev->ctx, i, ctx) { + vfio_virqfd_disable(&ctx->unmask); + vfio_virqfd_disable(&ctx->mask); + vfio_msi_set_vector_signal(vdev, i, -1, msix); } cmd = vfio_pci_memory_lock_and_enable(vdev); diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 367fd79226a3..61d7873a3973 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -59,7 +59,7 @@ struct vfio_pci_core_device { struct perm_bits *msi_perm; spinlock_t irqlock; struct mutex igate; - struct vfio_pci_irq_ctx *ctx; + struct xarray ctx; int num_ctx; int irq_type; int num_regions; From patchwork Wed Mar 15 20:59:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13176646 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35B17C6FD1D for ; Wed, 15 Mar 2023 21:00:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232656AbjCOVAG (ORCPT ); Wed, 15 Mar 2023 17:00:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59190 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232498AbjCOU7o (ORCPT ); Wed, 15 Mar 2023 16:59:44 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E780900B3; Wed, 15 Mar 2023 13:59:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678913983; x=1710449983; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=j7plS9ITT8roz2joqbCS1lpsy17/bXANdR8t6kDnulg=; b=E3UsmQCiMcWdyW1wAxNvUQFxJ8gvF3cBRFkpFljnC3vbIyWaY9clqiiI /lTbNWGpZQSXopz0Is1lvrQz+Mzt8t1pnAO72R7oVDPKBB2GvlAptb+Sk 4haoMhBdjxDjsllTl/KpvYXgcRCGWs30xEapHgqb7MeykJCssHlBYAwqw TC8OTXbE7AWt7bF4F+tiLbHffOxSeDPIDnuVJtVuL+khjdIPOom8GaJ9u TmDsJGVlQ0iLtZbaps3AmAkTl0F/AKMcR2gdegucLzuqeZ1ynPG8hpbSu H1G00pXlKUicpC65vIDsNXph9Uwl8XIv9rveWG+pFVgI5ROaMiGdGHw8a A==; X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="326176524" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="326176524" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="853747225" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="853747225" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:39 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [RFC PATCH 5/8] vfio/pci: Remove interrupt context counter Date: Wed, 15 Mar 2023 13:59:25 -0700 Message-Id: <3154c63905481b5747a7457b275e2bce403b6f84.1678911529.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org struct vfio_pci_core_device::num_ctx counts how many interrupt contexts have been allocated. When all interrupt contexts are allocated simultaneously num_ctx provides the upper bound of all vectors that can be used as indices into the interrupt context array. With the upcoming support for dynamic MSI-X the number of interrupt contexts does not necessarily span the range of allocated interrupts. Consequently, num_ctx is no longer a trusted upper bound for valid indices. Stop using num_ctx to determine if a provided vector is valid, use the existence of interrupt context directly. Introduce a helper that ensures a provided interrupt range is allocated before any user requested action is taken. This maintains existing behavior (early exit without modifications) when user space attempts to modify a range of vectors that includes unallocated interrupts. The checks that ensure that user space provides a range of vectors that is valid for the device are untouched. Signed-off-by: Reinette Chatre --- Existing behavior on error paths is not maintained for MSI-X when adding support for dynamic MSI-X. Please see maintainer comments associated with "vfio/pci: Support dynamic MSI-x". drivers/vfio/pci/vfio_pci_intrs.c | 30 ++++++++++++++++++++---------- include/linux/vfio_pci_core.h | 1 - 2 files changed, 20 insertions(+), 11 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index bfcf5cb6b435..187a1ba34a16 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -107,6 +107,21 @@ static int vfio_irq_ctx_alloc_num(struct vfio_pci_core_device *vdev, return ret; } +static bool vfio_irq_ctx_range_allocated(struct vfio_pci_core_device *vdev, + unsigned int start, unsigned int count) +{ + struct vfio_pci_irq_ctx *ctx; + unsigned int i; + + for (i = start; i < start + count; i++) { + ctx = xa_load(&vdev->ctx, i); + if (!ctx) + return false; + } + + return true; +} + /* * INTx */ @@ -270,8 +285,6 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev) return -EINVAL; } - vdev->num_ctx = 1; - /* * If the virtual interrupt is masked, restore it. Devices * supporting DisINTx can be masked at the hardware level @@ -358,7 +371,6 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev) } vfio_intx_set_signal(vdev, -1); vdev->irq_type = VFIO_PCI_NUM_IRQS; - vdev->num_ctx = 0; vfio_irq_ctx_free_all(vdev); } @@ -399,7 +411,6 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi } vfio_pci_memory_unlock_and_restore(vdev, cmd); - vdev->num_ctx = nvec; vdev->irq_type = msix ? VFIO_PCI_MSIX_IRQ_INDEX : VFIO_PCI_MSI_IRQ_INDEX; @@ -423,9 +434,6 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, int irq, ret; u16 cmd; - if (vector >= vdev->num_ctx) - return -EINVAL; - ctx = vfio_irq_ctx_get(vdev, vector); if (!ctx) return -EINVAL; @@ -500,7 +508,7 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, int i, ret = 0; unsigned int j; - if (start >= vdev->num_ctx || start + count > vdev->num_ctx) + if (!vfio_irq_ctx_range_allocated(vdev, start, count)) return -EINVAL; for (i = 0, j = start; i < count && !ret; i++, j++) { @@ -541,7 +549,6 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) pci_intx(pdev, 0); vdev->irq_type = VFIO_PCI_NUM_IRQS; - vdev->num_ctx = 0; vfio_irq_ctx_free_all(vdev); } @@ -677,7 +684,10 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, return ret; } - if (!irq_is(vdev, index) || start + count > vdev->num_ctx) + if (!irq_is(vdev, index)) + return -EINVAL; + + if (!vfio_irq_ctx_range_allocated(vdev, start, count)) return -EINVAL; for (i = start; i < start + count; i++) { diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 61d7873a3973..148fd1ae6c1c 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -60,7 +60,6 @@ struct vfio_pci_core_device { spinlock_t irqlock; struct mutex igate; struct xarray ctx; - int num_ctx; int irq_type; int num_regions; struct vfio_pci_region *region; From patchwork Wed Mar 15 20:59:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13176645 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 542B9C6FD1D for ; Wed, 15 Mar 2023 21:00:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232757AbjCOVAC (ORCPT ); Wed, 15 Mar 2023 17:00:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232503AbjCOU7o (ORCPT ); Wed, 15 Mar 2023 16:59:44 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6C7F685690; Wed, 15 Mar 2023 13:59:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678913983; x=1710449983; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DMhH90UWDLh3tQDufZrmB4QPEnhvo2J9/K2HdYM+u74=; b=T2t6R0SgUdVaON7L7HUcwc+gdLWWVQEvw3K+//D1Omqh2PtV7U7bM2xV MUTs1j9wrOsXuNc117q0+dn34PEZFX+o+UwCR3KOe5ga+jrUMD+XALWyy tw4TpvKrvrKuBwK3lpOBP/z8AvX/0K4b4iwXHJuW6PnN6i5HA25HwLqr1 DvBO6YH7/hjBdosH9rg4vuOjYJXL+3Rn62Oj8yCfBAJCuAghqwxXumRaj 3OtxjnFypHxFeTAG26a5KE2bBYxk9YMWdNWvIF57zWV6SO+25zFRUTREk PhRsQ54ouPxmv3D4mwOMnUz19P2pG3swuqGYfPPskFfrPW2OkOC9fUHhW g==; X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="326176529" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="326176529" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="853747228" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="853747228" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:39 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [RFC PATCH 6/8] vfio/pci: Move to single error path Date: Wed, 15 Mar 2023 13:59:26 -0700 Message-Id: <2589209ff98360947bcbbbc000028bb6adc639ef.1678911529.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The creation and release of interrupt context involves several steps that can fail. Cleanup after failure is done when the error is encountered, resulting in some repetitive code. Support for dynamic MSI-X will introduce more steps during interrupt context creation and release. Transition to centralized exit path in preparation for dynamic MSI-X to eliminate duplicate error handling code. Ensure no remaining state refers to freed memory. Signed-off-by: Reinette Chatre --- drivers/vfio/pci/vfio_pci_intrs.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 187a1ba34a16..b375a12885ba 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -460,8 +460,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { - kfree(ctx->name); - return PTR_ERR(trigger); + ret = PTR_ERR(trigger); + goto out_free_name; } /* @@ -481,11 +481,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, ret = request_irq(irq, vfio_msihandler, 0, ctx->name, trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); - if (ret) { - kfree(ctx->name); - eventfd_ctx_put(trigger); - return ret; - } + if (ret) + goto out_put_eventfd_ctx; ctx->producer.token = trigger; ctx->producer.irq = irq; @@ -500,6 +497,13 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, ctx->trigger = trigger; return 0; + +out_put_eventfd_ctx: + eventfd_ctx_put(trigger); +out_free_name: + kfree(ctx->name); + ctx->name = NULL; + return ret; } static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, From patchwork Wed Mar 15 20:59:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13176648 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0E099C7618A for ; Wed, 15 Mar 2023 21:00:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230266AbjCOVAN (ORCPT ); Wed, 15 Mar 2023 17:00:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59182 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232588AbjCOU7q (ORCPT ); Wed, 15 Mar 2023 16:59:46 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EA55F80E0C; Wed, 15 Mar 2023 13:59:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678913985; x=1710449985; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=JSa8iDgjg3H/aceoIH3gESDgqHFGMkkzi2yi2ZVLSyY=; b=bIqebSfytjUc45WSje+w3ymKYJixik+vQ/zgMZAJqK5s5qB0kaXV4e5z 56sa0o/EIeAXJm7Et+p/YX/MjSTM4KMAJmobEAtqMUCJOzhowHpr/mYcG gDAFvWESdbOQq0x8VuWo3XjpAf8uEXyhQ/9eA9JDSGrwqYS2j0V2iLwou ZtpnXiHr7oVVroTZJFYGteuYNg3fK6sPKZFEQiZ3NQY6ClPVbZLW+Eavz gYQnd0th10nOBOVtNpvPDCWuB/rVdfx9HJdMaNPg7LqLiTOTtMsjXb3OV 087oph8XIZ1NT9+jUtuBSavcuiAkCydte6L0GZNzDudmaywNg7BYN5sS5 Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="326176534" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="326176534" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="853747232" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="853747232" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:39 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [RFC PATCH 7/8] vfio/pci: Support dynamic MSI-x Date: Wed, 15 Mar 2023 13:59:27 -0700 Message-Id: <591ce11f4a33e022fc9242324ebdc077202bedaf.1678911529.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Recently introduced pci_msix_alloc_irq_at() and pci_msix_free_irq() enables an individual MSI-X index to be allocated and freed after MSI-X enabling. Support dynamic MSI-X by keeping the association between allocated interrupt and vfio interrupt context. Allocate new context together with the new interrupt if no interrupt context exist for an MSI-X interrupt. Similarly, release an interrupt with its context. Signed-off-by: Reinette Chatre --- Guidance is appreciated on expectations regarding maintaining existing error behavior. Earlier patch introduced the vfio_irq_ctx_range_allocated() helper to maintain existing error behavior. Now, this helper needs to be disabled for MSI-X. User space not wanting to dynamically allocate MSI-X interrupts, but providing invalid range when providing a new ACTION will now obtain new interrupts or new failures (potentially including freeing of existing interrupts) if the allocation of the new interrupts fail. drivers/vfio/pci/vfio_pci_intrs.c | 101 ++++++++++++++++++++++++------ 1 file changed, 83 insertions(+), 18 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index b375a12885ba..954a70575802 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -55,6 +55,18 @@ struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev, return xa_load(&vdev->ctx, index); } +static void vfio_irq_ctx_free(struct vfio_pci_core_device *vdev, + unsigned long index) +{ + struct vfio_pci_irq_ctx *ctx; + + ctx = xa_load(&vdev->ctx, index); + if (ctx) { + xa_erase(&vdev->ctx, index); + kfree(ctx); + } +} + static void vfio_irq_ctx_free_all(struct vfio_pci_core_device *vdev) { struct vfio_pci_irq_ctx *ctx; @@ -430,33 +442,63 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, { struct pci_dev *pdev = vdev->pdev; struct vfio_pci_irq_ctx *ctx; + struct msi_map msix_map = {}; struct eventfd_ctx *trigger; + bool new_ctx; int irq, ret; u16 cmd; ctx = vfio_irq_ctx_get(vdev, vector); - if (!ctx) + /* Only MSI-X allows dynamic allocation. */ + if (!msix && !ctx) return -EINVAL; + irq = pci_irq_vector(pdev, vector); + /* Context and interrupt are always allocated together. */ + WARN_ON((ctx && irq == -EINVAL) || (!ctx && irq != -EINVAL)); - if (ctx->trigger) { + if (ctx && ctx->trigger) { irq_bypass_unregister_producer(&ctx->producer); cmd = vfio_pci_memory_lock_and_enable(vdev); free_irq(irq, ctx->trigger); + if (msix) { + msix_map.index = vector; + msix_map.virq = irq; + pci_msix_free_irq(pdev, msix_map); + irq = -EINVAL; + } vfio_pci_memory_unlock_and_restore(vdev, cmd); kfree(ctx->name); eventfd_ctx_put(ctx->trigger); ctx->trigger = NULL; + if (msix) { + vfio_irq_ctx_free(vdev, vector); + ctx = NULL; + } } if (fd < 0) return 0; + if (!ctx) { + ret = vfio_irq_ctx_alloc_single(vdev, vector); + if (ret) + return ret; + ctx = vfio_irq_ctx_get(vdev, vector); + if (!ctx) { + ret = -EINVAL; + goto out_free_ctx; + } + new_ctx = true; + } + ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)", msix ? "x" : "", vector, pci_name(pdev)); - if (!ctx->name) - return -ENOMEM; + if (!ctx->name) { + ret = -ENOMEM; + goto out_free_ctx; + } trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { @@ -464,25 +506,38 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, goto out_free_name; } - /* - * The MSIx vector table resides in device memory which may be cleared - * via backdoor resets. We don't allow direct access to the vector - * table so even if a userspace driver attempts to save/restore around - * such a reset it would be unsuccessful. To avoid this, restore the - * cached value of the message prior to enabling. - */ cmd = vfio_pci_memory_lock_and_enable(vdev); if (msix) { - struct msi_msg msg; - - get_cached_msi_msg(irq, &msg); - pci_write_msi_msg(irq, &msg); + if (irq == -EINVAL) { + msix_map = pci_msix_alloc_irq_at(pdev, vector, NULL); + if (msix_map.index < 0) { + vfio_pci_memory_unlock_and_restore(vdev, cmd); + ret = msix_map.index; + goto out_put_eventfd_ctx; + } + irq = msix_map.virq; + } else { + /* + * The MSIx vector table resides in device memory which + * may be cleared via backdoor resets. We don't allow + * direct access to the vector table so even if a + * userspace driver attempts to save/restore around + * such a reset it would be unsuccessful. To avoid + * this, restore the cached value of the message prior + * to enabling. + */ + struct msi_msg msg; + + get_cached_msi_msg(irq, &msg); + pci_write_msi_msg(irq, &msg); + } } ret = request_irq(irq, vfio_msihandler, 0, ctx->name, trigger); - vfio_pci_memory_unlock_and_restore(vdev, cmd); if (ret) - goto out_put_eventfd_ctx; + goto out_free_irq_locked; + + vfio_pci_memory_unlock_and_restore(vdev, cmd); ctx->producer.token = trigger; ctx->producer.irq = irq; @@ -498,11 +553,21 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, return 0; +out_free_irq_locked: + if (msix && new_ctx) { + msix_map.index = vector; + msix_map.virq = irq; + pci_msix_free_irq(pdev, msix_map); + } + vfio_pci_memory_unlock_and_restore(vdev, cmd); out_put_eventfd_ctx: eventfd_ctx_put(trigger); out_free_name: kfree(ctx->name); ctx->name = NULL; +out_free_ctx: + if (msix && new_ctx) + vfio_irq_ctx_free(vdev, vector); return ret; } @@ -512,7 +577,7 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, int i, ret = 0; unsigned int j; - if (!vfio_irq_ctx_range_allocated(vdev, start, count)) + if (!msix && !vfio_irq_ctx_range_allocated(vdev, start, count)) return -EINVAL; for (i = 0, j = start; i < count && !ret; i++, j++) { From patchwork Wed Mar 15 20:59:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13176647 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1A97AC7618A for ; Wed, 15 Mar 2023 21:00:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232791AbjCOVAI (ORCPT ); Wed, 15 Mar 2023 17:00:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59140 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232648AbjCOU7s (ORCPT ); Wed, 15 Mar 2023 16:59:48 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EAA5E8C97A; Wed, 15 Mar 2023 13:59:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678913985; x=1710449985; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7B1A/cTDYcTxTgWo5idyY91azCfGJIiJsYGrC8xzLuY=; b=B2Mj7uSSnLuerG/uJoRvgVyLqqQr8Nn27VGpFolzk18MaboTDG16YFDa dYvgIh/OUHfe66UJigazxjPumsipXCihsJVW8d5GJEMqK/M30i/LsEsjh aNZrWAx/z4KJuWH6qdycTf8juXCD+YJFB6g3OUba7F/BHu6vTzsJkqy8f euWWu8Izuo8FT2onugR5oyWQJuWkoPKAuF1a+T9UTCwhplDrICwD2XFmg 41oZp+mfSQNpMVhzoWntQpBHBh0CudsEjLh0eEK2ytT3SiLPpWayivrZy hnpSR7rwyXoOxMeTOwsP4SbPRnI11vRpvlPT20v/6Cv8xeck7egomVhzW g==; X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="326176540" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="326176540" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10650"; a="853747235" X-IronPort-AV: E=Sophos;i="5.98,262,1673942400"; d="scan'208";a="853747235" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Mar 2023 13:59:39 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [RFC PATCH 8/8] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X Date: Wed, 15 Mar 2023 13:59:28 -0700 Message-Id: <549e6300c0ea011cdce9a2712d49de4efd3a06b7.1678911529.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Dynamic MSI-X is supported. Clear VFIO_IRQ_INFO_NORESIZE to provide guidance to user space. Signed-off-by: Reinette Chatre --- drivers/vfio/pci/vfio_pci_core.c | 2 +- include/uapi/linux/vfio.h | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index ae0e161c7fc9..1d071ee212a7 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -1111,7 +1111,7 @@ static int vfio_pci_ioctl_get_irq_info(struct vfio_pci_core_device *vdev, if (info.index == VFIO_PCI_INTX_IRQ_INDEX) info.flags |= (VFIO_IRQ_INFO_MASKABLE | VFIO_IRQ_INFO_AUTOMASKED); - else + else if (info.index != VFIO_PCI_MSIX_IRQ_INDEX) info.flags |= VFIO_IRQ_INFO_NORESIZE; return copy_to_user(arg, &info, minsz) ? -EFAULT : 0; diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 0552e8dcf0cb..1a36134cae5c 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -511,6 +511,9 @@ struct vfio_region_info_cap_nvlink2_lnkspd { * then add and unmask vectors, it's up to userspace to make the decision * whether to allocate the maximum supported number of vectors or tear * down setup and incrementally increase the vectors as each is enabled. + * Absence of the NORESIZE flag indicates that vectors can be enabled + * and disabled dynamically without impacting other vectors within the + * index. */ struct vfio_irq_info { __u32 argsz;