From patchwork Thu May 11 15:44:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13238148 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E232C7EE23 for ; Thu, 11 May 2023 15:45:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238767AbjEKPo6 (ORCPT ); Thu, 11 May 2023 11:44:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36046 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238271AbjEKPo4 (ORCPT ); Thu, 11 May 2023 11:44:56 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5FF35FD; Thu, 11 May 2023 08:44:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683819895; x=1715355895; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Fq4IZslUIdBYDjhjcyUMZGxjA/O2iCEOTMmu+hT1ySk=; b=a0Vid4qNj2UKxwcYuMwMlpJI3Vrp3N/glvg4MaopCGalwWKVwdAUeChr nl3jJOF4/h6hCn838pKLkNxcFTaIFD68dfkySrt5At4XhzLOlZnhlJvOd mk6sl/zRpSbGy1L5LCNCfM7GKlWdY9pkVL9xeI3utwpmgXvh4V+qoXJ8/ HgPyFOqIdMRO2l8CN1+9IKt2spleVGGGlqoqQnSIpsSDUHvBaQyV+5n09 EU18v7Nb7yjhd6oZL6pT6HAAPe1vAOc9yGAPG3K6ca1t8KrXrrkl1eMJ4 XrIlJ65aab+l/klTwy+vMv8PCHOlCbgeUogzg1ZcOh5+WALKuS/IlJhCl g==; X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="335046649" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="335046649" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:49 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="699776220" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="699776220" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:48 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V5 01/11] vfio/pci: Consolidate irq cleanup on MSI/MSI-X disable Date: Thu, 11 May 2023 08:44:28 -0700 Message-Id: <837acb8cbe86a258a50da05e56a1f17c1a19abbe.1683740667.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org vfio_msi_disable() releases all previously allocated state associated with each interrupt before disabling MSI/MSI-X. vfio_msi_disable() iterates twice over the interrupt state: first directly with a for loop to do virqfd cleanup, followed by another for loop within vfio_msi_set_block() that removes the interrupt handler and its associated state using vfio_msi_set_vector_signal(). Simplify interrupt cleanup by iterating over allocated interrupts once. Signed-off-by: Reinette Chatre Reviewed-by: Kevin Tian --- Changes since V4: - Add Kevin's Reviewed-by tag. Changes since V2: - Improve accuracy of changelog. drivers/vfio/pci/vfio_pci_intrs.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index bffb0741518b..6a9c6a143cc3 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -426,10 +426,9 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) for (i = 0; i < vdev->num_ctx; i++) { vfio_virqfd_disable(&vdev->ctx[i].unmask); vfio_virqfd_disable(&vdev->ctx[i].mask); + vfio_msi_set_vector_signal(vdev, i, -1, msix); } - vfio_msi_set_block(vdev, 0, vdev->num_ctx, NULL, msix); - cmd = vfio_pci_memory_lock_and_enable(vdev); pci_free_irq_vectors(pdev); vfio_pci_memory_unlock_and_restore(vdev, cmd); From patchwork Thu May 11 15:44:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13238149 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B3D6C7EE22 for ; Thu, 11 May 2023 15:45:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238599AbjEKPpB (ORCPT ); Thu, 11 May 2023 11:45:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36062 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237526AbjEKPo5 (ORCPT ); Thu, 11 May 2023 11:44:57 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 77B271A4; Thu, 11 May 2023 08:44:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683819896; x=1715355896; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=g6JZYSL71CEjogYzaYYuX7rOxxJ3pXQABRSAZrbyMBo=; b=P2yirWpMbjdeimCXGZl13VVeB1SKXxwd7+Cs5CVe05oPQWilFlku70IA 8D8tGew/EqZHY3XYG19esGR2Ltmt6yX94mahtFuZ3ETobcqnnRIwl3MFp VdB0tYQJyImnhIk3fSib+UUWmr/GyT+wFsQC4ifuttLs52J0qauFSN8vl XsfDS49fegSkQG9eLu6Uq049KbSMZqMTM8hYAvqI9VKchWKKQkzgv0um4 VPHkY+Nww7OfFwbt9sE31N8Wbor4N+ev4ZIv+WUjh78ABTjqtgTL9qHUq vfBvdeHMxUdITWRLXnP3XGRlHfITEXbgvOlWU6h3gMJNU6fsOnmpJ917E w==; X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="335046654" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="335046654" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:49 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="699776224" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="699776224" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:48 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V5 02/11] vfio/pci: Remove negative check on unsigned vector Date: Thu, 11 May 2023 08:44:29 -0700 Message-Id: <28521e1b0b091849952b0ecb8c118729fc8cdc4f.1683740667.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org User space provides the vector as an unsigned int that is checked early for validity (vfio_set_irqs_validate_and_prepare()). A later negative check of the provided vector is not necessary. Remove the negative check and ensure the type used for the vector is consistent as an unsigned int. Signed-off-by: Reinette Chatre Reviewed-by: Kevin Tian --- Changes since V4: - Add Kevin's Reviewed-by tag. (Kevin) Changes since V2: - Rework unwind loop within vfio_msi_set_block() that required j to be an int. Rework results in both i and j used for vector, both moved to be unsigned int. (Alex) drivers/vfio/pci/vfio_pci_intrs.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 6a9c6a143cc3..258de57ef956 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -317,14 +317,14 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi } static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, - int vector, int fd, bool msix) + unsigned int vector, int fd, bool msix) { struct pci_dev *pdev = vdev->pdev; struct eventfd_ctx *trigger; int irq, ret; u16 cmd; - if (vector < 0 || vector >= vdev->num_ctx) + if (vector >= vdev->num_ctx) return -EINVAL; irq = pci_irq_vector(pdev, vector); @@ -399,7 +399,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, unsigned count, int32_t *fds, bool msix) { - int i, j, ret = 0; + unsigned int i, j; + int ret = 0; if (start >= vdev->num_ctx || start + count > vdev->num_ctx) return -EINVAL; @@ -410,8 +411,8 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, } if (ret) { - for (--j; j >= (int)start; j--) - vfio_msi_set_vector_signal(vdev, j, -1, msix); + for (i = start; i < j; i++) + vfio_msi_set_vector_signal(vdev, i, -1, msix); } return ret; @@ -420,7 +421,7 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) { struct pci_dev *pdev = vdev->pdev; - int i; + unsigned int i; u16 cmd; for (i = 0; i < vdev->num_ctx; i++) { @@ -542,7 +543,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { - int i; + unsigned int i; bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX) ? true : false; if (irq_is(vdev, index) && !count && (flags & VFIO_IRQ_SET_DATA_NONE)) { From patchwork Thu May 11 15:44:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13238150 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54758C7EE26 for ; Thu, 11 May 2023 15:45:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238805AbjEKPpC (ORCPT ); Thu, 11 May 2023 11:45:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36082 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238471AbjEKPo6 (ORCPT ); Thu, 11 May 2023 11:44:58 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6D78513A; Thu, 11 May 2023 08:44:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683819895; x=1715355895; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=CDzd8/I3ohP2MTirPKPWQUOEF8lcp6HDRQNYFJpSUMk=; b=a1cLEyED61QW7fIX83glwxpOaNr6oXUYVAELugHKP17VQmIQhi+rSVB6 H/UBP+7YPpcQ7Lf1gFfRzoZRlU2BUmChyicHxuizHyzMEOOyaFqyQjWgk gWy7WXA98+CsCmIjK1yrTY2XQg+dpEWIiCd5OxWfmL9rK1RHXlnV3QhzA 894i6tnxSxVqY6P5E3ZlE7QXIeU9d/PXk2BLkUpAvIkxsUwBYrLH983ot 3cbvgUoZhh+tCaZ63HzfthCd3v5Cn968XK3rpru4oet6M+34GB/0Syp6H mw0apO7/EKPhpHirkg0SPs6uqNdqiMVKUqR/05pjxI0zlrm0ARO7W41V8 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="335046659" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="335046659" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:49 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="699776227" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="699776227" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:48 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V5 03/11] vfio/pci: Prepare for dynamic interrupt context storage Date: Thu, 11 May 2023 08:44:30 -0700 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Interrupt context storage is statically allocated at the time interrupts are allocated. Following allocation, the interrupt context is managed by directly accessing the elements of the array using the vector as index. It is possible to allocate additional MSI-X vectors after MSI-X has been enabled. Dynamic storage of interrupt context is needed to support adding new MSI-X vectors after initial allocation. Replace direct access of array elements with pointers to the array elements. Doing so reduces impact of moving to a new data structure. Move interactions with the array to helpers to mostly contain changes needed to transition to a dynamic data structure. No functional change intended. Signed-off-by: Reinette Chatre Reviewed-by: Kevin Tian --- Changes since V4: - Non-existing interrupt context should be treated as kernel bug with WARN_ON_ONCE(). (Kevin) - Move interrupt context retrieval in vfio_pci_intx_mask() and vfio_pci_intx_unmask_handler() to support scenario of mask/unmask of INTx without INTx being enabled. Changes since RFC V1: - Improve accuracy of changelog. drivers/vfio/pci/vfio_pci_intrs.c | 215 +++++++++++++++++++++--------- 1 file changed, 149 insertions(+), 66 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 258de57ef956..6094679349d9 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -48,6 +48,31 @@ static bool is_irq_none(struct vfio_pci_core_device *vdev) vdev->irq_type == VFIO_PCI_MSIX_IRQ_INDEX); } +static +struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev, + unsigned long index) +{ + if (index >= vdev->num_ctx) + return NULL; + return &vdev->ctx[index]; +} + +static void vfio_irq_ctx_free_all(struct vfio_pci_core_device *vdev) +{ + kfree(vdev->ctx); +} + +static int vfio_irq_ctx_alloc_num(struct vfio_pci_core_device *vdev, + unsigned long num) +{ + vdev->ctx = kcalloc(num, sizeof(struct vfio_pci_irq_ctx), + GFP_KERNEL_ACCOUNT); + if (!vdev->ctx) + return -ENOMEM; + + return 0; +} + /* * INTx */ @@ -55,14 +80,21 @@ static void vfio_send_intx_eventfd(void *opaque, void *unused) { struct vfio_pci_core_device *vdev = opaque; - if (likely(is_intx(vdev) && !vdev->virq_disabled)) - eventfd_signal(vdev->ctx[0].trigger, 1); + if (likely(is_intx(vdev) && !vdev->virq_disabled)) { + struct vfio_pci_irq_ctx *ctx; + + ctx = vfio_irq_ctx_get(vdev, 0); + if (WARN_ON_ONCE(!ctx)) + return; + eventfd_signal(ctx->trigger, 1); + } } /* Returns true if the INTx vfio_pci_irq_ctx.masked value is changed. */ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; unsigned long flags; bool masked_changed = false; @@ -77,7 +109,14 @@ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) if (unlikely(!is_intx(vdev))) { if (vdev->pci_2_3) pci_intx(pdev, 0); - } else if (!vdev->ctx[0].masked) { + goto out_unlock; + } + + ctx = vfio_irq_ctx_get(vdev, 0); + if (WARN_ON_ONCE(!ctx)) + goto out_unlock; + + if (!ctx->masked) { /* * Can't use check_and_mask here because we always want to * mask, not just when something is pending. @@ -87,10 +126,11 @@ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) else disable_irq_nosync(pdev->irq); - vdev->ctx[0].masked = true; + ctx->masked = true; masked_changed = true; } +out_unlock: spin_unlock_irqrestore(&vdev->irqlock, flags); return masked_changed; } @@ -105,6 +145,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) { struct vfio_pci_core_device *vdev = opaque; struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; unsigned long flags; int ret = 0; @@ -117,7 +158,14 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) if (unlikely(!is_intx(vdev))) { if (vdev->pci_2_3) pci_intx(pdev, 1); - } else if (vdev->ctx[0].masked && !vdev->virq_disabled) { + goto out_unlock; + } + + ctx = vfio_irq_ctx_get(vdev, 0); + if (WARN_ON_ONCE(!ctx)) + goto out_unlock; + + if (ctx->masked && !vdev->virq_disabled) { /* * A pending interrupt here would immediately trigger, * but we can avoid that overhead by just re-sending @@ -129,9 +177,10 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) } else enable_irq(pdev->irq); - vdev->ctx[0].masked = (ret > 0); + ctx->masked = (ret > 0); } +out_unlock: spin_unlock_irqrestore(&vdev->irqlock, flags); return ret; @@ -146,18 +195,23 @@ void vfio_pci_intx_unmask(struct vfio_pci_core_device *vdev) static irqreturn_t vfio_intx_handler(int irq, void *dev_id) { struct vfio_pci_core_device *vdev = dev_id; + struct vfio_pci_irq_ctx *ctx; unsigned long flags; int ret = IRQ_NONE; + ctx = vfio_irq_ctx_get(vdev, 0); + if (WARN_ON_ONCE(!ctx)) + return ret; + spin_lock_irqsave(&vdev->irqlock, flags); if (!vdev->pci_2_3) { disable_irq_nosync(vdev->pdev->irq); - vdev->ctx[0].masked = true; + ctx->masked = true; ret = IRQ_HANDLED; - } else if (!vdev->ctx[0].masked && /* may be shared */ + } else if (!ctx->masked && /* may be shared */ pci_check_and_mask_intx(vdev->pdev)) { - vdev->ctx[0].masked = true; + ctx->masked = true; ret = IRQ_HANDLED; } @@ -171,15 +225,24 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id) static int vfio_intx_enable(struct vfio_pci_core_device *vdev) { + struct vfio_pci_irq_ctx *ctx; + int ret; + if (!is_irq_none(vdev)) return -EINVAL; if (!vdev->pdev->irq) return -ENODEV; - vdev->ctx = kzalloc(sizeof(struct vfio_pci_irq_ctx), GFP_KERNEL_ACCOUNT); - if (!vdev->ctx) - return -ENOMEM; + ret = vfio_irq_ctx_alloc_num(vdev, 1); + if (ret) + return ret; + + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) { + vfio_irq_ctx_free_all(vdev); + return -EINVAL; + } vdev->num_ctx = 1; @@ -189,9 +252,9 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev) * here, non-PCI-2.3 devices will have to wait until the * interrupt is enabled. */ - vdev->ctx[0].masked = vdev->virq_disabled; + ctx->masked = vdev->virq_disabled; if (vdev->pci_2_3) - pci_intx(vdev->pdev, !vdev->ctx[0].masked); + pci_intx(vdev->pdev, !ctx->masked); vdev->irq_type = VFIO_PCI_INTX_IRQ_INDEX; @@ -202,41 +265,46 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) { struct pci_dev *pdev = vdev->pdev; unsigned long irqflags = IRQF_SHARED; + struct vfio_pci_irq_ctx *ctx; struct eventfd_ctx *trigger; unsigned long flags; int ret; - if (vdev->ctx[0].trigger) { + ctx = vfio_irq_ctx_get(vdev, 0); + if (WARN_ON_ONCE(!ctx)) + return -EINVAL; + + if (ctx->trigger) { free_irq(pdev->irq, vdev); - kfree(vdev->ctx[0].name); - eventfd_ctx_put(vdev->ctx[0].trigger); - vdev->ctx[0].trigger = NULL; + kfree(ctx->name); + eventfd_ctx_put(ctx->trigger); + ctx->trigger = NULL; } if (fd < 0) /* Disable only */ return 0; - vdev->ctx[0].name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)", - pci_name(pdev)); - if (!vdev->ctx[0].name) + ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)", + pci_name(pdev)); + if (!ctx->name) return -ENOMEM; trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { - kfree(vdev->ctx[0].name); + kfree(ctx->name); return PTR_ERR(trigger); } - vdev->ctx[0].trigger = trigger; + ctx->trigger = trigger; if (!vdev->pci_2_3) irqflags = 0; ret = request_irq(pdev->irq, vfio_intx_handler, - irqflags, vdev->ctx[0].name, vdev); + irqflags, ctx->name, vdev); if (ret) { - vdev->ctx[0].trigger = NULL; - kfree(vdev->ctx[0].name); + ctx->trigger = NULL; + kfree(ctx->name); eventfd_ctx_put(trigger); return ret; } @@ -246,7 +314,7 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) * disable_irq won't. */ spin_lock_irqsave(&vdev->irqlock, flags); - if (!vdev->pci_2_3 && vdev->ctx[0].masked) + if (!vdev->pci_2_3 && ctx->masked) disable_irq_nosync(pdev->irq); spin_unlock_irqrestore(&vdev->irqlock, flags); @@ -255,12 +323,18 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) static void vfio_intx_disable(struct vfio_pci_core_device *vdev) { - vfio_virqfd_disable(&vdev->ctx[0].unmask); - vfio_virqfd_disable(&vdev->ctx[0].mask); + struct vfio_pci_irq_ctx *ctx; + + ctx = vfio_irq_ctx_get(vdev, 0); + WARN_ON_ONCE(!ctx); + if (ctx) { + vfio_virqfd_disable(&ctx->unmask); + vfio_virqfd_disable(&ctx->mask); + } vfio_intx_set_signal(vdev, -1); vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; - kfree(vdev->ctx); + vfio_irq_ctx_free_all(vdev); } /* @@ -284,10 +358,9 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi if (!is_irq_none(vdev)) return -EINVAL; - vdev->ctx = kcalloc(nvec, sizeof(struct vfio_pci_irq_ctx), - GFP_KERNEL_ACCOUNT); - if (!vdev->ctx) - return -ENOMEM; + ret = vfio_irq_ctx_alloc_num(vdev, nvec); + if (ret) + return ret; /* return the number of supported vectors if we can't get all: */ cmd = vfio_pci_memory_lock_and_enable(vdev); @@ -296,7 +369,7 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi if (ret > 0) pci_free_irq_vectors(pdev); vfio_pci_memory_unlock_and_restore(vdev, cmd); - kfree(vdev->ctx); + vfio_irq_ctx_free_all(vdev); return ret; } vfio_pci_memory_unlock_and_restore(vdev, cmd); @@ -320,6 +393,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, unsigned int vector, int fd, bool msix) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; struct eventfd_ctx *trigger; int irq, ret; u16 cmd; @@ -327,33 +401,33 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, if (vector >= vdev->num_ctx) return -EINVAL; + ctx = vfio_irq_ctx_get(vdev, vector); + if (!ctx) + return -EINVAL; irq = pci_irq_vector(pdev, vector); - if (vdev->ctx[vector].trigger) { - irq_bypass_unregister_producer(&vdev->ctx[vector].producer); + if (ctx->trigger) { + irq_bypass_unregister_producer(&ctx->producer); cmd = vfio_pci_memory_lock_and_enable(vdev); - free_irq(irq, vdev->ctx[vector].trigger); + free_irq(irq, ctx->trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); - - kfree(vdev->ctx[vector].name); - eventfd_ctx_put(vdev->ctx[vector].trigger); - vdev->ctx[vector].trigger = NULL; + kfree(ctx->name); + eventfd_ctx_put(ctx->trigger); + ctx->trigger = NULL; } if (fd < 0) return 0; - vdev->ctx[vector].name = kasprintf(GFP_KERNEL_ACCOUNT, - "vfio-msi%s[%d](%s)", - msix ? "x" : "", vector, - pci_name(pdev)); - if (!vdev->ctx[vector].name) + ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)", + msix ? "x" : "", vector, pci_name(pdev)); + if (!ctx->name) return -ENOMEM; trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { - kfree(vdev->ctx[vector].name); + kfree(ctx->name); return PTR_ERR(trigger); } @@ -372,26 +446,25 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, pci_write_msi_msg(irq, &msg); } - ret = request_irq(irq, vfio_msihandler, 0, - vdev->ctx[vector].name, trigger); + ret = request_irq(irq, vfio_msihandler, 0, ctx->name, trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); if (ret) { - kfree(vdev->ctx[vector].name); + kfree(ctx->name); eventfd_ctx_put(trigger); return ret; } - vdev->ctx[vector].producer.token = trigger; - vdev->ctx[vector].producer.irq = irq; - ret = irq_bypass_register_producer(&vdev->ctx[vector].producer); + ctx->producer.token = trigger; + ctx->producer.irq = irq; + ret = irq_bypass_register_producer(&ctx->producer); if (unlikely(ret)) { dev_info(&pdev->dev, "irq bypass producer (token %p) registration fails: %d\n", - vdev->ctx[vector].producer.token, ret); + ctx->producer.token, ret); - vdev->ctx[vector].producer.token = NULL; + ctx->producer.token = NULL; } - vdev->ctx[vector].trigger = trigger; + ctx->trigger = trigger; return 0; } @@ -421,13 +494,17 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; unsigned int i; u16 cmd; for (i = 0; i < vdev->num_ctx; i++) { - vfio_virqfd_disable(&vdev->ctx[i].unmask); - vfio_virqfd_disable(&vdev->ctx[i].mask); - vfio_msi_set_vector_signal(vdev, i, -1, msix); + ctx = vfio_irq_ctx_get(vdev, i); + if (ctx) { + vfio_virqfd_disable(&ctx->unmask); + vfio_virqfd_disable(&ctx->mask); + vfio_msi_set_vector_signal(vdev, i, -1, msix); + } } cmd = vfio_pci_memory_lock_and_enable(vdev); @@ -443,7 +520,7 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; - kfree(vdev->ctx); + vfio_irq_ctx_free_all(vdev); } /* @@ -463,14 +540,18 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_core_device *vdev, if (unmask) vfio_pci_intx_unmask(vdev); } else if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { + struct vfio_pci_irq_ctx *ctx = vfio_irq_ctx_get(vdev, 0); int32_t fd = *(int32_t *)data; + + if (WARN_ON_ONCE(!ctx)) + return -EINVAL; if (fd >= 0) return vfio_virqfd_enable((void *) vdev, vfio_pci_intx_unmask_handler, vfio_send_intx_eventfd, NULL, - &vdev->ctx[0].unmask, fd); + &ctx->unmask, fd); - vfio_virqfd_disable(&vdev->ctx[0].unmask); + vfio_virqfd_disable(&ctx->unmask); } return 0; @@ -543,6 +624,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { + struct vfio_pci_irq_ctx *ctx; unsigned int i; bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX) ? true : false; @@ -577,14 +659,15 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, return -EINVAL; for (i = start; i < start + count; i++) { - if (!vdev->ctx[i].trigger) + ctx = vfio_irq_ctx_get(vdev, i); + if (!ctx || !ctx->trigger) continue; if (flags & VFIO_IRQ_SET_DATA_NONE) { - eventfd_signal(vdev->ctx[i].trigger, 1); + eventfd_signal(ctx->trigger, 1); } else if (flags & VFIO_IRQ_SET_DATA_BOOL) { uint8_t *bools = data; if (bools[i - start]) - eventfd_signal(vdev->ctx[i].trigger, 1); + eventfd_signal(ctx->trigger, 1); } } return 0; From patchwork Thu May 11 15:44:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13238151 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1800CC7EE23 for ; Thu, 11 May 2023 15:45:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238817AbjEKPpE (ORCPT ); Thu, 11 May 2023 11:45:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36102 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238660AbjEKPo6 (ORCPT ); Thu, 11 May 2023 11:44:58 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8E826FD; Thu, 11 May 2023 08:44:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683819896; x=1715355896; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8S7nmNRsfN3F7l+vsdYbyvmv0P60T0hUfIF0wQ0EMhs=; b=BShJ9Querevh4ei01lDrbXOa+Q1HsrtF9qhkdyxoOB7OvqhAMq+sai7i Lqg20VxcJetVkXD3YNxqB9ZDaPc7JdvcraXNAfpkasbCoHJRneIrzKBrY zTwDjkfSSnBzDbbcMN+rxAZbiyYTkkEvUCYo6FepfgNLgC8G8mfRu2Cy9 LZ3XB+cNQ47+ct1MRDQNdMBUp8xFalpW0wZbsnoA86wjBlG6T10X+68Ue WEqQTanWyJgcjCSEAunNKL/cCYUvwtrpjx+Ao3FWsVAuXlLr219fq3S/N cSB24lz+Vp0TPOEAA4OatbG5Lre2j7EnOYycKKIxQTdu1Y8V7A5wkq95r Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="335046665" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="335046665" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="699776230" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="699776230" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:48 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V5 04/11] vfio/pci: Move to single error path Date: Thu, 11 May 2023 08:44:31 -0700 Message-Id: <72dddae8aa710ce522a74130120733af61cffe4d.1683740667.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Enabling and disabling of an interrupt involves several steps that can fail. Cleanup after failure is done when the error is encountered, resulting in some repetitive code. Support for dynamic contexts will introduce more steps during interrupt enabling and disabling. Transition to centralized exit path in preparation for dynamic contexts to eliminate duplicate error handling code. Signed-off-by: Reinette Chatre Reviewed-by: Kevin Tian --- Changes since V4: - Add Kevin's Reviewed-by tag. (Kevin) Changes since V2: - Move patch to earlier in series in support of the change to dynamic context management. - Do not add the "ctx->name = NULL" in error path. It is not done in baseline and will not be needed when transitioning to dynamic context management. - Update changelog to not make this change specific to dynamic MSI-X. Changes since RFC V1: - Improve changelog. drivers/vfio/pci/vfio_pci_intrs.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 6094679349d9..96396e1ad085 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -427,8 +427,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { - kfree(ctx->name); - return PTR_ERR(trigger); + ret = PTR_ERR(trigger); + goto out_free_name; } /* @@ -448,11 +448,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, ret = request_irq(irq, vfio_msihandler, 0, ctx->name, trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); - if (ret) { - kfree(ctx->name); - eventfd_ctx_put(trigger); - return ret; - } + if (ret) + goto out_put_eventfd_ctx; ctx->producer.token = trigger; ctx->producer.irq = irq; @@ -467,6 +464,12 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, ctx->trigger = trigger; return 0; + +out_put_eventfd_ctx: + eventfd_ctx_put(trigger); +out_free_name: + kfree(ctx->name); + return ret; } static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, From patchwork Thu May 11 15:44:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13238155 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 38798C77B7C for ; Thu, 11 May 2023 15:45:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238836AbjEKPpM (ORCPT ); Thu, 11 May 2023 11:45:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36254 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238802AbjEKPpC (ORCPT ); Thu, 11 May 2023 11:45:02 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8FF6559D4; Thu, 11 May 2023 08:44:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683819900; x=1715355900; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=itBGbh9mjbjKnt3SikZkfXZGLbqY/yNzMQ669OJKtoY=; b=MoW5oUw54E56XNew1OEG28ESPEjwKXucw0TFQF2kDcHjRb1maTradpe8 0s4pvkTAU99v/9WboDBaW1hAZ5vEiAb1rOdoTAnMtXlwtzYhV+gQJMEqb rnN531O7J6zKPOsEI+6nYpYSNQ6LaT7YBHgvacfm32p7bMWwQNX/iA9j3 kXiNiyhVnKQEEi/Q+kFZ9wVge/G9bPqVrIbZbaT/W4e2SAP6GhS/wHeSr XMHduL/xlMp3xJIMu9FPAk0dsapt2HcOo6buzJ67u7buj5zxiGkiopUC2 xHTaVdOouuDH4rQPc+mDZEZYmsXA+zJhoIK2VRaTb5/C2H7v7XHTbl85C w==; X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="335046670" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="335046670" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="699776233" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="699776233" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:48 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V5 05/11] vfio/pci: Use xarray for interrupt context storage Date: Thu, 11 May 2023 08:44:32 -0700 Message-Id: <40e235f38d427aff79ae35eda0ced42502aa0937.1683740667.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Interrupt context is statically allocated at the time interrupts are allocated. Following allocation, the context is managed by directly accessing the elements of the array using the vector as index. The storage is released when interrupts are disabled. It is possible to dynamically allocate a single MSI-X interrupt after MSI-X is enabled. A dynamic storage for interrupt context is needed to support this. Replace the interrupt context array with an xarray (similar to what the core uses as store for MSI descriptors) that can support the dynamic expansion while maintaining the custom that uses the vector as index. With a dynamic storage it is no longer required to pre-allocate interrupt contexts at the time the interrupts are allocated. MSI and MSI-X interrupt contexts are only used when interrupts are enabled. Their allocation can thus be delayed until interrupt enabling. Only enabled interrupts will have associated interrupt contexts. Whether an interrupt has been allocated (a Linux irq number exists for it) becomes the criteria for whether an interrupt can be enabled. Signed-off-by: Reinette Chatre Link: https://lore.kernel.org/lkml/20230404122444.59e36a99.alex.williamson@redhat.com/ Reviewed-by: Kevin Tian --- Changes since V4: - Add Kevin's Reviewed-by tag. (Kevin) Changes since V2: - Only allocate contexts as they are used, or "active". (Alex) - Move vfio_irq_ctx_free() from later patch to prevent open-coding the same within vfio_irq_ctx_free_all(). This evolved into vfio_irq_ctx_free() used for dynamic context allocation and vfio_irq_ctx_free_all() removed because of it. (Alex) - With vfio_irq_ctx_alloc_num() removed, rename vfio_irq_ctx_alloc_single() to vfio_irq_ctx_alloc(). Changes since RFC V1: - Let vfio_irq_ctx_alloc_single() return pointer to allocated context. (Alex) - Transition INTx allocation to simplified vfio_irq_ctx_alloc_single(). - Improve accuracy of changelog. drivers/vfio/pci/vfio_pci_core.c | 1 + drivers/vfio/pci/vfio_pci_intrs.c | 91 ++++++++++++++++--------------- include/linux/vfio_pci_core.h | 2 +- 3 files changed, 48 insertions(+), 46 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index a5ab416cf476..ae0e161c7fc9 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -2102,6 +2102,7 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev) INIT_LIST_HEAD(&vdev->vma_list); INIT_LIST_HEAD(&vdev->sriov_pfs_item); init_rwsem(&vdev->memory_lock); + xa_init(&vdev->ctx); return 0; } diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 96396e1ad085..77957274027c 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -52,25 +52,33 @@ static struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev, unsigned long index) { - if (index >= vdev->num_ctx) - return NULL; - return &vdev->ctx[index]; + return xa_load(&vdev->ctx, index); } -static void vfio_irq_ctx_free_all(struct vfio_pci_core_device *vdev) +static void vfio_irq_ctx_free(struct vfio_pci_core_device *vdev, + struct vfio_pci_irq_ctx *ctx, unsigned long index) { - kfree(vdev->ctx); + xa_erase(&vdev->ctx, index); + kfree(ctx); } -static int vfio_irq_ctx_alloc_num(struct vfio_pci_core_device *vdev, - unsigned long num) +static struct vfio_pci_irq_ctx * +vfio_irq_ctx_alloc(struct vfio_pci_core_device *vdev, unsigned long index) { - vdev->ctx = kcalloc(num, sizeof(struct vfio_pci_irq_ctx), - GFP_KERNEL_ACCOUNT); - if (!vdev->ctx) - return -ENOMEM; + struct vfio_pci_irq_ctx *ctx; + int ret; - return 0; + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT); + if (!ctx) + return NULL; + + ret = xa_insert(&vdev->ctx, index, ctx, GFP_KERNEL_ACCOUNT); + if (ret) { + kfree(ctx); + return NULL; + } + + return ctx; } /* @@ -226,7 +234,6 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id) static int vfio_intx_enable(struct vfio_pci_core_device *vdev) { struct vfio_pci_irq_ctx *ctx; - int ret; if (!is_irq_none(vdev)) return -EINVAL; @@ -234,15 +241,9 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev) if (!vdev->pdev->irq) return -ENODEV; - ret = vfio_irq_ctx_alloc_num(vdev, 1); - if (ret) - return ret; - - ctx = vfio_irq_ctx_get(vdev, 0); - if (!ctx) { - vfio_irq_ctx_free_all(vdev); - return -EINVAL; - } + ctx = vfio_irq_ctx_alloc(vdev, 0); + if (!ctx) + return -ENOMEM; vdev->num_ctx = 1; @@ -334,7 +335,7 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev) vfio_intx_set_signal(vdev, -1); vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; - vfio_irq_ctx_free_all(vdev); + vfio_irq_ctx_free(vdev, ctx, 0); } /* @@ -358,10 +359,6 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi if (!is_irq_none(vdev)) return -EINVAL; - ret = vfio_irq_ctx_alloc_num(vdev, nvec); - if (ret) - return ret; - /* return the number of supported vectors if we can't get all: */ cmd = vfio_pci_memory_lock_and_enable(vdev); ret = pci_alloc_irq_vectors(pdev, 1, nvec, flag); @@ -369,7 +366,6 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi if (ret > 0) pci_free_irq_vectors(pdev); vfio_pci_memory_unlock_and_restore(vdev, cmd); - vfio_irq_ctx_free_all(vdev); return ret; } vfio_pci_memory_unlock_and_restore(vdev, cmd); @@ -401,12 +397,13 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, if (vector >= vdev->num_ctx) return -EINVAL; - ctx = vfio_irq_ctx_get(vdev, vector); - if (!ctx) - return -EINVAL; irq = pci_irq_vector(pdev, vector); + if (irq < 0) + return -EINVAL; - if (ctx->trigger) { + ctx = vfio_irq_ctx_get(vdev, vector); + + if (ctx) { irq_bypass_unregister_producer(&ctx->producer); cmd = vfio_pci_memory_lock_and_enable(vdev); @@ -414,16 +411,22 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, vfio_pci_memory_unlock_and_restore(vdev, cmd); kfree(ctx->name); eventfd_ctx_put(ctx->trigger); - ctx->trigger = NULL; + vfio_irq_ctx_free(vdev, ctx, vector); } if (fd < 0) return 0; + ctx = vfio_irq_ctx_alloc(vdev, vector); + if (!ctx) + return -ENOMEM; + ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)", msix ? "x" : "", vector, pci_name(pdev)); - if (!ctx->name) - return -ENOMEM; + if (!ctx->name) { + ret = -ENOMEM; + goto out_free_ctx; + } trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { @@ -469,6 +472,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, eventfd_ctx_put(trigger); out_free_name: kfree(ctx->name); +out_free_ctx: + vfio_irq_ctx_free(vdev, ctx, vector); return ret; } @@ -498,16 +503,13 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) { struct pci_dev *pdev = vdev->pdev; struct vfio_pci_irq_ctx *ctx; - unsigned int i; + unsigned long i; u16 cmd; - for (i = 0; i < vdev->num_ctx; i++) { - ctx = vfio_irq_ctx_get(vdev, i); - if (ctx) { - vfio_virqfd_disable(&ctx->unmask); - vfio_virqfd_disable(&ctx->mask); - vfio_msi_set_vector_signal(vdev, i, -1, msix); - } + xa_for_each(&vdev->ctx, i, ctx) { + vfio_virqfd_disable(&ctx->unmask); + vfio_virqfd_disable(&ctx->mask); + vfio_msi_set_vector_signal(vdev, i, -1, msix); } cmd = vfio_pci_memory_lock_and_enable(vdev); @@ -523,7 +525,6 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; - vfio_irq_ctx_free_all(vdev); } /* @@ -663,7 +664,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, for (i = start; i < start + count; i++) { ctx = vfio_irq_ctx_get(vdev, i); - if (!ctx || !ctx->trigger) + if (!ctx) continue; if (flags & VFIO_IRQ_SET_DATA_NONE) { eventfd_signal(ctx->trigger, 1); diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 367fd79226a3..61d7873a3973 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -59,7 +59,7 @@ struct vfio_pci_core_device { struct perm_bits *msi_perm; spinlock_t irqlock; struct mutex igate; - struct vfio_pci_irq_ctx *ctx; + struct xarray ctx; int num_ctx; int irq_type; int num_regions; From patchwork Thu May 11 15:44:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13238152 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3B1ACC7EE2A for ; Thu, 11 May 2023 15:45:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238819AbjEKPpH (ORCPT ); Thu, 11 May 2023 11:45:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36118 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238772AbjEKPo7 (ORCPT ); Thu, 11 May 2023 11:44:59 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CBEF459CD; Thu, 11 May 2023 08:44:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683819898; x=1715355898; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZuWQTxy8AqfpQ+bvDIhc2X8Civ9eEAHcNiFXD/cVlS8=; b=oI4C0P7ihN4AckTPVbTjpnienxOGHcEEUErcMOiGt7q+rACaWWk5ZhCT 1qw0XiTvCen+QQx8wDt2YI2m3QfBoI5vrn2za9TVysDh0pi3NqSGFp4UD LpiBl0pg57sePys1wi9f38AfelIt8UH7jjsVxbcr70JJmhjb+ZDxX4/Yk ShYlSGbU3cJ6WJweX4OKW6xT2UwsougovW1nRaGI23C01+NugL1bb4+nP U6ww0PLfAacwdLSQe8Y63ZYjfgk5hyHSVXPzHuLTArSNFflJpYjTki8c9 H1SOXp6TYi3WXXvxhShIEX3OC3F54NLS0FB6qoThlI0Yjy+OdmlQb+SwT w==; X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="335046675" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="335046675" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="699776235" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="699776235" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:48 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V5 06/11] vfio/pci: Remove interrupt context counter Date: Thu, 11 May 2023 08:44:33 -0700 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org struct vfio_pci_core_device::num_ctx counts how many interrupt contexts have been allocated. When all interrupt contexts are allocated simultaneously num_ctx provides the upper bound of all vectors that can be used as indices into the interrupt context array. With the upcoming support for dynamic MSI-X the number of interrupt contexts does not necessarily span the range of allocated interrupts. Consequently, num_ctx is no longer a trusted upper bound for valid indices. Stop using num_ctx to determine if a provided vector is valid. Use the existence of allocated interrupt. This changes behavior on the error path when user space provides an invalid vector range. Behavior changes from early exit without any modifications to possible modifications to valid vectors within the invalid range. This is acceptable considering that an invalid range is not a valid scenario, see link to discussion. The checks that ensure that user space provides a range of vectors that is valid for the device are untouched. Signed-off-by: Reinette Chatre Link: https://lore.kernel.org/lkml/20230316155646.07ae266f.alex.williamson@redhat.com/ Reviewed-by: Kevin Tian --- Changes since V4: - Add Kevin's Reviewed-by tag. (Kevin) Changes since V2: - Update changelog to reflect change in policy that existence of allocated interrupt is validity check, not existence of context (which is now dynamically allocated). Changes since RFC V1: - Remove vfio_irq_ctx_range_allocated(). (Alex and Kevin). drivers/vfio/pci/vfio_pci_intrs.c | 13 +------------ include/linux/vfio_pci_core.h | 1 - 2 files changed, 1 insertion(+), 13 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 77957274027c..e40eca69a293 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -245,8 +245,6 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev) if (!ctx) return -ENOMEM; - vdev->num_ctx = 1; - /* * If the virtual interrupt is masked, restore it. Devices * supporting DisINTx can be masked at the hardware level @@ -334,7 +332,6 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev) } vfio_intx_set_signal(vdev, -1); vdev->irq_type = VFIO_PCI_NUM_IRQS; - vdev->num_ctx = 0; vfio_irq_ctx_free(vdev, ctx, 0); } @@ -370,7 +367,6 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi } vfio_pci_memory_unlock_and_restore(vdev, cmd); - vdev->num_ctx = nvec; vdev->irq_type = msix ? VFIO_PCI_MSIX_IRQ_INDEX : VFIO_PCI_MSI_IRQ_INDEX; @@ -394,9 +390,6 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, int irq, ret; u16 cmd; - if (vector >= vdev->num_ctx) - return -EINVAL; - irq = pci_irq_vector(pdev, vector); if (irq < 0) return -EINVAL; @@ -483,9 +476,6 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, unsigned int i, j; int ret = 0; - if (start >= vdev->num_ctx || start + count > vdev->num_ctx) - return -EINVAL; - for (i = 0, j = start; i < count && !ret; i++, j++) { int fd = fds ? fds[i] : -1; ret = vfio_msi_set_vector_signal(vdev, j, fd, msix); @@ -524,7 +514,6 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) pci_intx(pdev, 0); vdev->irq_type = VFIO_PCI_NUM_IRQS; - vdev->num_ctx = 0; } /* @@ -659,7 +648,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, return ret; } - if (!irq_is(vdev, index) || start + count > vdev->num_ctx) + if (!irq_is(vdev, index)) return -EINVAL; for (i = start; i < start + count; i++) { diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 61d7873a3973..148fd1ae6c1c 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -60,7 +60,6 @@ struct vfio_pci_core_device { spinlock_t irqlock; struct mutex igate; struct xarray ctx; - int num_ctx; int irq_type; int num_regions; struct vfio_pci_region *region; From patchwork Thu May 11 15:44:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13238153 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6BFC9C7EE22 for ; Thu, 11 May 2023 15:45:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238772AbjEKPpQ (ORCPT ); Thu, 11 May 2023 11:45:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36304 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238811AbjEKPpD (ORCPT ); Thu, 11 May 2023 11:45:03 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D431B5B93; Thu, 11 May 2023 08:45:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683819901; x=1715355901; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+C6B7H7ve/lwfgUWQi6R33paNDdITx62G7bxEy6HEqA=; b=IwGfVQLezTasKcUFeB3X6ofNyhWDMwl7Mwen1aTIBDAPI03VfL8Zn3ps nRPOfVCwfLPGwERnZhqjwNDEEJHrcSLMQNxpM1Xdzox6aZsMMzCsI4iDY zbmQ/FGtxY1dgsUWJUtjXEmCmYKDch/4lOgSjxErDosLDv/BHuam2hddV kON5Bwlnq6TffAohBBdLd2uj4Jse2Lk3q/JJPvZ3j3jO62bIvwigwcrdx cz9TE0O9NfVC32Gd8pqxEhpYYfGHpvk6CGjeMHvT3JCjhboUL2hBDGAYU htDyA7SSsBErJeDyIO8yxnaPT5cEnJF3NLpNUVYj1WKPRJydHQtBeSxbn w==; X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="335046681" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="335046681" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="699776239" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="699776239" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:48 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V5 07/11] vfio/pci: Update stale comment Date: Thu, 11 May 2023 08:44:34 -0700 Message-Id: <5b605ce7dcdab5a5dfef19cec4d73ae2fdad3ae1.1683740667.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In preparation for surrounding code change it is helpful to ensure that existing comments are accurate. Remove inaccurate comment about direct access and update the rest of the comment to reflect the purpose of writing the cached MSI message to the device. Suggested-by: Alex Williamson Link: https://lore.kernel.org/lkml/20230330164050.0069e2a5.alex.williamson@redhat.com/ Signed-off-by: Reinette Chatre Reviewed-by: Kevin Tian --- Changes since V4: - Restore text about backdoor reset as example to indicate that the restore may not be a severe problem but instead done in collaboration with user space. (Kevin) Changes since V2: - New patch. drivers/vfio/pci/vfio_pci_intrs.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index e40eca69a293..867327e159c1 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -428,11 +428,9 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, } /* - * The MSIx vector table resides in device memory which may be cleared - * via backdoor resets. We don't allow direct access to the vector - * table so even if a userspace driver attempts to save/restore around - * such a reset it would be unsuccessful. To avoid this, restore the - * cached value of the message prior to enabling. + * If the vector was previously allocated, refresh the on-device + * message data before enabling in case it had been cleared or + * corrupted (e.g. due to backdoor resets) since writing. */ cmd = vfio_pci_memory_lock_and_enable(vdev); if (msix) { From patchwork Thu May 11 15:44:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13238154 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58568C7EE23 for ; Thu, 11 May 2023 15:45:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238848AbjEKPpR (ORCPT ); Thu, 11 May 2023 11:45:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36272 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238807AbjEKPpC (ORCPT ); Thu, 11 May 2023 11:45:02 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CABA459E6; Thu, 11 May 2023 08:45:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683819901; x=1715355901; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8QfQokLJef1Q3tFm7nj3VQwR9R1wsoZXkMCX5UZhwws=; b=BPNojPFC3PGuF7jA9DOxYrp/IK0Im2Yc7OlGDSoE8pXAj5cr6Q+j/7Ak Xfezzqdb5jBnbjDMJuC3C5pM1lXk1mtXPeBdJhBoMXBCcbZPpy/ROkyFg ruaCJYS4utdagSZg2zMQZeV6agfUMHoG8hd3l4ejiuqditARbQ6lArfWL EXHCHALL+D9+c7vNUumzfxuTIyGDbsCcIm/N6HdEs1HYbS7lQe/UHwXh0 r5Ia3PcxiZr1ID5QNQQAqF8mw3+5I+BdhURa00JulqXHrCqQ6Shdkr0Ch /TMHDDyujC/KrCCEFpdFSA4AqYRioowUmRmoq2n59K5uuJqENAMpXG97W Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="335046685" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="335046685" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="699776242" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="699776242" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:48 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V5 08/11] vfio/pci: Use bitfield for struct vfio_pci_core_device flags Date: Thu, 11 May 2023 08:44:35 -0700 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org struct vfio_pci_core_device contains eleven boolean flags. Boolean flags clearly indicate their usage but space usage starts to be a concern when there are many. An upcoming change adds another boolean flag to struct vfio_pci_core_device, thereby increasing the concern that the boolean flags are consuming unnecessary space. Transition the boolean flags to use bitfields. On a system that uses one byte per boolean this reduces the space consumed by existing flags from 11 bytes to 2 bytes with room for a few more flags without increasing the structure's size. Suggested-by: Jason Gunthorpe Signed-off-by: Reinette Chatre Reviewed-by: Kevin Tian --- Changes since V4: - Add Kevin's Reviewed-by tag. (Kevin) Changes since V3: - New patch. (Jason) include/linux/vfio_pci_core.h | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 148fd1ae6c1c..adb47e2914d7 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -68,17 +68,17 @@ struct vfio_pci_core_device { u16 msix_size; u32 msix_offset; u32 rbar[7]; - bool pci_2_3; - bool virq_disabled; - bool reset_works; - bool extended_caps; - bool bardirty; - bool has_vga; - bool needs_reset; - bool nointx; - bool needs_pm_restore; - bool pm_intx_masked; - bool pm_runtime_engaged; + bool pci_2_3:1; + bool virq_disabled:1; + bool reset_works:1; + bool extended_caps:1; + bool bardirty:1; + bool has_vga:1; + bool needs_reset:1; + bool nointx:1; + bool needs_pm_restore:1; + bool pm_intx_masked:1; + bool pm_runtime_engaged:1; struct pci_saved_state *pci_saved_state; struct pci_saved_state *pm_save; int ioeventfds_nr; From patchwork Thu May 11 15:44:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13238156 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A3A5C7EE23 for ; Thu, 11 May 2023 15:45:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238812AbjEKPp0 (ORCPT ); Thu, 11 May 2023 11:45:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36360 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238816AbjEKPpE (ORCPT ); Thu, 11 May 2023 11:45:04 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9702B65A8; Thu, 11 May 2023 08:45:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683819902; x=1715355902; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+F4nj4ZYLrEfbJc9cJJHRTYr1d6jx6lUO68SKftJit4=; b=Zf2FXGsgRZWYNo6o6hqOxbQts1v1iUNwcwErrlmL9vBwnMDdrkh8lw3z yOcJ4cpzCTcvc6EjM5Dgh32sUQHnmYdNh1GMek90mblFwj2D0q67u2Pxe /+A2L+RCL7pvfuXzygl/dD0uBaeDlBtxBlW/IETea7y4Etgt4omqloHr+ oxASFJa4zqs8l9EadpN9xxg0uwiDGJOpGI3Uu1sCaAO7ebM/m7PjAwy5U g4gtlxYogFap0Bf2+1fHbfkvwgv44Xry4m+oW05BPa+PE34HmmO669xKk EOpa4nkUw1bgH4zRpxXS5A5VM/0EMtebvhhFJLUt8j7Xgrn9uWc9NphYz Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="335046692" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="335046692" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="699776245" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="699776245" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:48 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V5 09/11] vfio/pci: Probe and store ability to support dynamic MSI-X Date: Thu, 11 May 2023 08:44:36 -0700 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Not all MSI-X devices support dynamic MSI-X allocation. Whether a device supports dynamic MSI-X should be queried using pci_msix_can_alloc_dyn(). Instead of scattering code with pci_msix_can_alloc_dyn(), probe this ability once and store it as a property of the virtual device. Suggested-by: Alex Williamson Signed-off-by: Reinette Chatre Reviewed-by: Kevin Tian --- Changes since V4: - Add Kevin's Reviewed-by tag. (Kevin) Changes since V3: - Move field to improve structure layout. (Alex) - Use bitfield. (Jason) Changes since V2: - New patch. (Alex) drivers/vfio/pci/vfio_pci_core.c | 5 ++++- include/linux/vfio_pci_core.h | 1 + 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index ae0e161c7fc9..a3635a8e54c8 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -530,8 +530,11 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev) vdev->msix_bar = table & PCI_MSIX_TABLE_BIR; vdev->msix_offset = table & PCI_MSIX_TABLE_OFFSET; vdev->msix_size = ((flags & PCI_MSIX_FLAGS_QSIZE) + 1) * 16; - } else + vdev->has_dyn_msix = pci_msix_can_alloc_dyn(pdev); + } else { vdev->msix_bar = 0xFF; + vdev->has_dyn_msix = false; + } if (!vfio_vga_disabled() && vfio_pci_is_vga(pdev)) vdev->has_vga = true; diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index adb47e2914d7..562e8754869d 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -68,6 +68,7 @@ struct vfio_pci_core_device { u16 msix_size; u32 msix_offset; u32 rbar[7]; + bool has_dyn_msix:1; bool pci_2_3:1; bool virq_disabled:1; bool reset_works:1; From patchwork Thu May 11 15:44:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13238158 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA589C7EE22 for ; Thu, 11 May 2023 15:45:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238827AbjEKPph (ORCPT ); Thu, 11 May 2023 11:45:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36290 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238809AbjEKPpL (ORCPT ); Thu, 11 May 2023 11:45:11 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 92EAC65A7; Thu, 11 May 2023 08:45:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683819903; x=1715355903; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1cm72Cy6xlAgprevh8lMD1M+IJolhBAjk36Mv4o6NE4=; b=kVTHr/ASWwMxS8GyvHXlzs9RHhWGkBumFwNn03EDfyub/tX2bVdLr9IY DTo+q35et5u0DY52oy1SIVjap8X/783PWmEHRosBhnFWjYciCBinqJOOH 5QZKDY2c+LQG+LOICfogMfp9aiVFxFHpLVmGa38+xJ3iC4JoFucmzEFtn XiDFSGx5sK9hEa3YBdkDxY8F061rHaAr8ZPgMCDXblrQFKurHD7En6ls+ WfwaxbOCCYpomVL+mCw1VMKGDqLeOFrJy+1kau9Vd2zn1ECUdYzmCWt5s ROgZQY6dXpeHmMEygxjDx5MFqrWpDtKsiyB+NTwuMBRTULnLanMWGA615 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="335046696" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="335046696" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="699776247" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="699776247" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:48 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V5 10/11] vfio/pci: Support dynamic MSI-X Date: Thu, 11 May 2023 08:44:37 -0700 Message-Id: <956c47057ae9fd45591feaa82e9ae20929889249.1683740667.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org pci_msix_alloc_irq_at() enables an individual MSI-X interrupt to be allocated after MSI-X enabling. Use dynamic MSI-X (if supported by the device) to allocate an interrupt after MSI-X is enabled. An MSI-X interrupt is dynamically allocated at the time a valid eventfd is assigned. This is different behavior from a range provided during MSI-X enabling where interrupts are allocated for the entire range whether a valid eventfd is provided for each interrupt or not. The PCI-MSIX API requires that some number of irqs are allocated for an initial set of vectors when enabling MSI-X on the device. When dynamic MSIX allocation is not supported, the vector table, and thus the allocated irq set can only be resized by disabling and re-enabling MSI-X with a different range. In that case the irq allocation is essentially a cache for configuring vectors within the previously allocated vector range. When dynamic MSI-X allocation is supported, the API still requires some initial set of irqs to be allocated, but also supports allocating and freeing specific irq vectors both within and beyond the initially allocated range. For consistency between modes, as well as to reduce latency and improve reliability of allocations, and also simplicity, this implementation only releases irqs via pci_free_irq_vectors() when either the interrupt mode changes or the device is released. Signed-off-by: Reinette Chatre Link: https://lore.kernel.org/lkml/20230403211841.0e206b67.alex.williamson@redhat.com/ Reviewed-by: Kevin Tian --- Changes since V4: - Treat pci_irq_vector() returning '0' for Linux IRQ number as kernel bug with a WARN. (Kevin) - Move comment about missing vfio_msi_free_irq() to description of vfio_msi_alloc_irq(). (Kevin) - Add comment at vfio_msi_alloc_irq() call that indicates the allocation is maintained until MSI-X disable to explain the absence of its release in error path. (Kevin) Changes since V3: - Remove vfio_msi_free_irq(). (Alex) - Rework changelog. (Alex) Changes since V2: - Move vfio_irq_ctx_free() to earlier in series to support earlier usage. (Alex) - Use consistent terms in changelog: MSI-x changed to MSI-X. - Make dynamic interrupt context creation generic across all MSI/MSI-X interrupts. This resulted in code moving to earlier in series as part of xarray introduction patch. (Alex) - Remove the local allow_dyn_alloc and direct calling of pci_msix_can_alloc_dyn(), use the new vdev->has_dyn_msix introduced earlier instead. (Alex) - Stop tracking new allocations (remove "new_ctx"). (Alex) - Introduce new wrapper that returns Linux interrupt number or dynamically allocate a new interrupt. Wrapper can be used for all interrupt cases. (Alex) - Only free dynamic MSI-X interrupts on MSI-X teardown. (Alex) Changes since RFC V1: - Add pointer to interrupt context as function parameter to vfio_irq_ctx_free(). (Alex) - Initialize new_ctx to false. (Dan Carpenter) - Only support dynamic allocation if device supports it. (Alex) drivers/vfio/pci/vfio_pci_intrs.c | 47 +++++++++++++++++++++++++++---- 1 file changed, 41 insertions(+), 6 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 867327e159c1..cbb4bcbfbf83 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -381,27 +381,55 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi return 0; } +/* + * vfio_msi_alloc_irq() returns the Linux IRQ number of an MSI or MSI-X device + * interrupt vector. If a Linux IRQ number is not available then a new + * interrupt is allocated if dynamic MSI-X is supported. + * + * Where is vfio_msi_free_irq()? Allocated interrupts are maintained, + * essentially forming a cache that subsequent allocations can draw from. + * Interrupts are freed using pci_free_irq_vectors() when MSI/MSI-X is + * disabled. + */ +static int vfio_msi_alloc_irq(struct vfio_pci_core_device *vdev, + unsigned int vector, bool msix) +{ + struct pci_dev *pdev = vdev->pdev; + struct msi_map map; + int irq; + u16 cmd; + + irq = pci_irq_vector(pdev, vector); + if (WARN_ON_ONCE(irq == 0)) + return -EINVAL; + if (irq > 0 || !msix || !vdev->has_dyn_msix) + return irq; + + cmd = vfio_pci_memory_lock_and_enable(vdev); + map = pci_msix_alloc_irq_at(pdev, vector, NULL); + vfio_pci_memory_unlock_and_restore(vdev, cmd); + + return map.index < 0 ? map.index : map.virq; +} + static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, unsigned int vector, int fd, bool msix) { struct pci_dev *pdev = vdev->pdev; struct vfio_pci_irq_ctx *ctx; struct eventfd_ctx *trigger; - int irq, ret; + int irq = -EINVAL, ret; u16 cmd; - irq = pci_irq_vector(pdev, vector); - if (irq < 0) - return -EINVAL; - ctx = vfio_irq_ctx_get(vdev, vector); if (ctx) { irq_bypass_unregister_producer(&ctx->producer); - + irq = pci_irq_vector(pdev, vector); cmd = vfio_pci_memory_lock_and_enable(vdev); free_irq(irq, ctx->trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); + /* Interrupt stays allocated, will be freed at MSI-X disable. */ kfree(ctx->name); eventfd_ctx_put(ctx->trigger); vfio_irq_ctx_free(vdev, ctx, vector); @@ -410,6 +438,13 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, if (fd < 0) return 0; + if (irq == -EINVAL) { + /* Interrupt stays allocated, will be freed at MSI-X disable. */ + irq = vfio_msi_alloc_irq(vdev, vector, msix); + if (irq < 0) + return irq; + } + ctx = vfio_irq_ctx_alloc(vdev, vector); if (!ctx) return -ENOMEM; From patchwork Thu May 11 15:44:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13238157 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 27162C77B7C for ; Thu, 11 May 2023 15:45:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238872AbjEKPpf (ORCPT ); Thu, 11 May 2023 11:45:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36376 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237526AbjEKPpF (ORCPT ); Thu, 11 May 2023 11:45:05 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C06FA59CD; Thu, 11 May 2023 08:45:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683819903; x=1715355903; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1OCB2Pnk9ac/skfbPGWj0mlIkor+JuqIsSYNnVVKu/A=; b=ggm7XSvaGSGj32Wkj22ABcH/UUH8If4mQi9VOHSxIAAWyaZ6tfbP5oSb BAC8rD/8rauN19KawY4zf+fy3kDhxjT3iRiBaWIYkco0lNY974kCqvU0V 3h+gyT66wgmNpYV4mmUFkRorK1UpK5I9RDtyqsBZiO4dI4H+u/pnTxg+D xMDfebyd/0AjwICzxV5PxUFVoT+20cjNiSHOgTXiIDWW9TPUUAqmUGzUU BPYaA3ZvZz+0AnSqz78i0XK1aeP0tRDOzcG7w39KzBPN1KjiQY1ZvQmoo S/DdsGqVbAye1Q1mKux+jPVvB0UIoVC2C3A98OynSx8JXaKfLje602TMY w==; X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="335046701" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="335046701" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10707"; a="699776251" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="699776251" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 08:44:49 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V5 11/11] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X Date: Thu, 11 May 2023 08:44:38 -0700 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Dynamic MSI-X is supported. Clear VFIO_IRQ_INFO_NORESIZE to provide guidance to user space. Signed-off-by: Reinette Chatre Reviewed-by: Kevin Tian --- Changes since V4: - Add Kevin's Reviewed-by tag. (Kevin) Changes since V3: - Remove unnecessary test from condition. (Alex) Changes since V2: - Use new vdev->has_dyn_msix property instead of calling pci_msix_can_alloc_dyn() directly. (Alex) Changes since RFC V1: - Only advertise VFIO_IRQ_INFO_NORESIZE for MSI-X devices that can actually support dynamic allocation. (Alex) drivers/vfio/pci/vfio_pci_core.c | 2 +- include/uapi/linux/vfio.h | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index a3635a8e54c8..ec7e662de033 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -1114,7 +1114,7 @@ static int vfio_pci_ioctl_get_irq_info(struct vfio_pci_core_device *vdev, if (info.index == VFIO_PCI_INTX_IRQ_INDEX) info.flags |= (VFIO_IRQ_INFO_MASKABLE | VFIO_IRQ_INFO_AUTOMASKED); - else + else if (info.index != VFIO_PCI_MSIX_IRQ_INDEX || !vdev->has_dyn_msix) info.flags |= VFIO_IRQ_INFO_NORESIZE; return copy_to_user(arg, &info, minsz) ? -EFAULT : 0; diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 0552e8dcf0cb..1a36134cae5c 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -511,6 +511,9 @@ struct vfio_region_info_cap_nvlink2_lnkspd { * then add and unmask vectors, it's up to userspace to make the decision * whether to allocate the maximum supported number of vectors or tear * down setup and incrementally increase the vectors as each is enabled. + * Absence of the NORESIZE flag indicates that vectors can be enabled + * and disabled dynamically without impacting other vectors within the + * index. */ struct vfio_irq_info { __u32 argsz;