From patchwork Tue Apr 18 17:29:12 2023
X-Patchwork-Submitter: Reinette Chatre
X-Patchwork-Id: 13215984
From: Reinette Chatre
To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com,
    kevin.tian@intel.com, alex.williamson@redhat.com
Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org,
    dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com,
    fenghua.yu@intel.com, tom.zanussi@linux.intel.com,
    reinette.chatre@intel.com, linux-kernel@vger.kernel.org
Subject: [PATCH V3 01/10] vfio/pci: Consolidate irq cleanup on MSI/MSI-X disable
Date: Tue, 18 Apr 2023 10:29:12 -0700

vfio_msi_disable() releases all previously allocated state associated with
each interrupt before disabling MSI/MSI-X.

vfio_msi_disable() iterates twice over the interrupt state: first directly
with a for loop to do virqfd cleanup, followed by another for loop within
vfio_msi_set_block() that removes the interrupt handler and its associated
state using vfio_msi_set_vector_signal().

Simplify interrupt cleanup by iterating over allocated interrupts once.

Signed-off-by: Reinette Chatre
---
Changes since V2:
- Improve accuracy of changelog.
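A minimal sketch of the cleanup path before and after this patch, paraphrased
from the diff below (not a verbatim copy of vfio_pci_intrs.c):

	/* Before: two passes over the interrupt state. */
	for (i = 0; i < vdev->num_ctx; i++) {
		vfio_virqfd_disable(&vdev->ctx[i].unmask);
		vfio_virqfd_disable(&vdev->ctx[i].mask);
	}
	/* ...the second pass happens inside vfio_msi_set_block()... */
	vfio_msi_set_block(vdev, 0, vdev->num_ctx, NULL, msix);

	/* After: one pass releases the virqfds and the vector signal together. */
	for (i = 0; i < vdev->num_ctx; i++) {
		vfio_virqfd_disable(&vdev->ctx[i].unmask);
		vfio_virqfd_disable(&vdev->ctx[i].mask);
		vfio_msi_set_vector_signal(vdev, i, -1, msix);
	}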
 drivers/vfio/pci/vfio_pci_intrs.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index bffb0741518b..6a9c6a143cc3 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -426,10 +426,9 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix)
 	for (i = 0; i < vdev->num_ctx; i++) {
 		vfio_virqfd_disable(&vdev->ctx[i].unmask);
 		vfio_virqfd_disable(&vdev->ctx[i].mask);
+		vfio_msi_set_vector_signal(vdev, i, -1, msix);
 	}
 
-	vfio_msi_set_block(vdev, 0, vdev->num_ctx, NULL, msix);
-
 	cmd = vfio_pci_memory_lock_and_enable(vdev);
 	pci_free_irq_vectors(pdev);
 	vfio_pci_memory_unlock_and_restore(vdev, cmd);

From patchwork Tue Apr 18 17:29:13 2023
X-Patchwork-Submitter: Reinette Chatre
X-Patchwork-Id: 13215983
From: Reinette Chatre
To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com,
    kevin.tian@intel.com, alex.williamson@redhat.com
Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org,
    dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com,
    fenghua.yu@intel.com, tom.zanussi@linux.intel.com,
    reinette.chatre@intel.com, linux-kernel@vger.kernel.org
Subject: [PATCH V3 02/10] vfio/pci: Remove negative check on unsigned vector
Date: Tue, 18 Apr 2023 10:29:13 -0700
Message-Id:
<5add301d11d4a566c29c487a78b4227ae383f11d.1681837892.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org User space provides the vector as an unsigned int that is checked early for validity (vfio_set_irqs_validate_and_prepare()). A later negative check of the provided vector is not necessary. Remove the negative check and ensure the type used for the vector is consistent as an unsigned int. Signed-off-by: Reinette Chatre --- Changes since V2: - Rework unwind loop within vfio_msi_set_block() that required j to be an int. Rework results in both i and j used for vector, both moved to be unsigned int. (Alex) drivers/vfio/pci/vfio_pci_intrs.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 6a9c6a143cc3..258de57ef956 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -317,14 +317,14 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi } static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, - int vector, int fd, bool msix) + unsigned int vector, int fd, bool msix) { struct pci_dev *pdev = vdev->pdev; struct eventfd_ctx *trigger; int irq, ret; u16 cmd; - if (vector < 0 || vector >= vdev->num_ctx) + if (vector >= vdev->num_ctx) return -EINVAL; irq = pci_irq_vector(pdev, vector); @@ -399,7 +399,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, unsigned count, int32_t *fds, bool msix) { - int i, j, ret = 0; + unsigned int i, j; + int ret = 0; if (start >= vdev->num_ctx || start + count > vdev->num_ctx) return -EINVAL; @@ -410,8 +411,8 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, } if (ret) { - for (--j; j >= (int)start; j--) - vfio_msi_set_vector_signal(vdev, j, -1, msix); + for (i = start; i < j; i++) + vfio_msi_set_vector_signal(vdev, i, -1, msix); } return ret; @@ -420,7 +421,7 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) { struct pci_dev *pdev = vdev->pdev; - int i; + unsigned int i; u16 cmd; for (i = 0; i < vdev->num_ctx; i++) { @@ -542,7 +543,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { - int i; + unsigned int i; bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX) ? 
true : false; if (irq_is(vdev, index) && !count && (flags & VFIO_IRQ_SET_DATA_NONE)) { From patchwork Tue Apr 18 17:29:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13215982 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21BF8C77B75 for ; Tue, 18 Apr 2023 17:29:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232095AbjDRR3t (ORCPT ); Tue, 18 Apr 2023 13:29:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39056 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232371AbjDRR3q (ORCPT ); Tue, 18 Apr 2023 13:29:46 -0400 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0297AA5D3; Tue, 18 Apr 2023 10:29:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1681838983; x=1713374983; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=md25LnaVmBGc4U+FZeYEKgswv71ook8yDdmIU9PpL6c=; b=WEHEgpwH2dGcwssnGlvC9DTJ1mBJGbXnRXSbCQLNFkl1M5dsWEfNMRJD o1N/viKiwlbBsF+A26mvWePHhpnAA1krw38hk0TQJWcKd2gEirMlsWI8H q6A8rWt0doDaP2+KHHH+wdOy3548XRrpZTLoyW8Iax3NKXKwxf6QryHlm O5JBZRsEZPM+xH19CDEVfP1VOXrMPePJ9dxziX8fv+NW3r8C30SXRdHLi S9delo71gjy21a+EVikDGxIezbAyuBS/AxEmpUNsAAeXsM3ihMg8Aq0xI mgjrmOKp0qhwhM9FObEhVOfX5REj7EMus3cjeIBfB+KvtYBXwDUEP1Ove A==; X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="410466457" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="410466457" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="865503476" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="865503476" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V3 03/10] vfio/pci: Prepare for dynamic interrupt context storage Date: Tue, 18 Apr 2023 10:29:14 -0700 Message-Id: <6fcd4019e22931a97d962b6e657e74d6fd1049ba.1681837892.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Interrupt context storage is statically allocated at the time interrupts are allocated. Following allocation, the interrupt context is managed by directly accessing the elements of the array using the vector as index. It is possible to allocate additional MSI-X vectors after MSI-X has been enabled. Dynamic storage of interrupt context is needed to support adding new MSI-X vectors after initial allocation. Replace direct access of array elements with pointers to the array elements. 
Doing so reduces impact of moving to a new data structure. Move interactions with the array to helpers to mostly contain changes needed to transition to a dynamic data structure. No functional change intended. Signed-off-by: Reinette Chatre --- No changes since V2. Changes since RFC V1: - Improve accuracy of changelog. drivers/vfio/pci/vfio_pci_intrs.c | 206 ++++++++++++++++++++---------- 1 file changed, 140 insertions(+), 66 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 258de57ef956..b664fbb6d2f2 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -48,6 +48,31 @@ static bool is_irq_none(struct vfio_pci_core_device *vdev) vdev->irq_type == VFIO_PCI_MSIX_IRQ_INDEX); } +static +struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev, + unsigned long index) +{ + if (index >= vdev->num_ctx) + return NULL; + return &vdev->ctx[index]; +} + +static void vfio_irq_ctx_free_all(struct vfio_pci_core_device *vdev) +{ + kfree(vdev->ctx); +} + +static int vfio_irq_ctx_alloc_num(struct vfio_pci_core_device *vdev, + unsigned long num) +{ + vdev->ctx = kcalloc(num, sizeof(struct vfio_pci_irq_ctx), + GFP_KERNEL_ACCOUNT); + if (!vdev->ctx) + return -ENOMEM; + + return 0; +} + /* * INTx */ @@ -55,17 +80,28 @@ static void vfio_send_intx_eventfd(void *opaque, void *unused) { struct vfio_pci_core_device *vdev = opaque; - if (likely(is_intx(vdev) && !vdev->virq_disabled)) - eventfd_signal(vdev->ctx[0].trigger, 1); + if (likely(is_intx(vdev) && !vdev->virq_disabled)) { + struct vfio_pci_irq_ctx *ctx; + + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return; + eventfd_signal(ctx->trigger, 1); + } } /* Returns true if the INTx vfio_pci_irq_ctx.masked value is changed. */ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; unsigned long flags; bool masked_changed = false; + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return masked_changed; + spin_lock_irqsave(&vdev->irqlock, flags); /* @@ -77,7 +113,7 @@ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) if (unlikely(!is_intx(vdev))) { if (vdev->pci_2_3) pci_intx(pdev, 0); - } else if (!vdev->ctx[0].masked) { + } else if (!ctx->masked) { /* * Can't use check_and_mask here because we always want to * mask, not just when something is pending. 
@@ -87,7 +123,7 @@ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) else disable_irq_nosync(pdev->irq); - vdev->ctx[0].masked = true; + ctx->masked = true; masked_changed = true; } @@ -105,9 +141,14 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) { struct vfio_pci_core_device *vdev = opaque; struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; unsigned long flags; int ret = 0; + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return ret; + spin_lock_irqsave(&vdev->irqlock, flags); /* @@ -117,7 +158,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) if (unlikely(!is_intx(vdev))) { if (vdev->pci_2_3) pci_intx(pdev, 1); - } else if (vdev->ctx[0].masked && !vdev->virq_disabled) { + } else if (ctx->masked && !vdev->virq_disabled) { /* * A pending interrupt here would immediately trigger, * but we can avoid that overhead by just re-sending @@ -129,7 +170,7 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) } else enable_irq(pdev->irq); - vdev->ctx[0].masked = (ret > 0); + ctx->masked = (ret > 0); } spin_unlock_irqrestore(&vdev->irqlock, flags); @@ -146,18 +187,23 @@ void vfio_pci_intx_unmask(struct vfio_pci_core_device *vdev) static irqreturn_t vfio_intx_handler(int irq, void *dev_id) { struct vfio_pci_core_device *vdev = dev_id; + struct vfio_pci_irq_ctx *ctx; unsigned long flags; int ret = IRQ_NONE; + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return ret; + spin_lock_irqsave(&vdev->irqlock, flags); if (!vdev->pci_2_3) { disable_irq_nosync(vdev->pdev->irq); - vdev->ctx[0].masked = true; + ctx->masked = true; ret = IRQ_HANDLED; - } else if (!vdev->ctx[0].masked && /* may be shared */ + } else if (!ctx->masked && /* may be shared */ pci_check_and_mask_intx(vdev->pdev)) { - vdev->ctx[0].masked = true; + ctx->masked = true; ret = IRQ_HANDLED; } @@ -171,15 +217,24 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id) static int vfio_intx_enable(struct vfio_pci_core_device *vdev) { + struct vfio_pci_irq_ctx *ctx; + int ret; + if (!is_irq_none(vdev)) return -EINVAL; if (!vdev->pdev->irq) return -ENODEV; - vdev->ctx = kzalloc(sizeof(struct vfio_pci_irq_ctx), GFP_KERNEL_ACCOUNT); - if (!vdev->ctx) - return -ENOMEM; + ret = vfio_irq_ctx_alloc_num(vdev, 1); + if (ret) + return ret; + + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) { + vfio_irq_ctx_free_all(vdev); + return -EINVAL; + } vdev->num_ctx = 1; @@ -189,9 +244,9 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev) * here, non-PCI-2.3 devices will have to wait until the * interrupt is enabled. 
*/ - vdev->ctx[0].masked = vdev->virq_disabled; + ctx->masked = vdev->virq_disabled; if (vdev->pci_2_3) - pci_intx(vdev->pdev, !vdev->ctx[0].masked); + pci_intx(vdev->pdev, !ctx->masked); vdev->irq_type = VFIO_PCI_INTX_IRQ_INDEX; @@ -202,41 +257,46 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) { struct pci_dev *pdev = vdev->pdev; unsigned long irqflags = IRQF_SHARED; + struct vfio_pci_irq_ctx *ctx; struct eventfd_ctx *trigger; unsigned long flags; int ret; - if (vdev->ctx[0].trigger) { + ctx = vfio_irq_ctx_get(vdev, 0); + if (!ctx) + return -EINVAL; + + if (ctx->trigger) { free_irq(pdev->irq, vdev); - kfree(vdev->ctx[0].name); - eventfd_ctx_put(vdev->ctx[0].trigger); - vdev->ctx[0].trigger = NULL; + kfree(ctx->name); + eventfd_ctx_put(ctx->trigger); + ctx->trigger = NULL; } if (fd < 0) /* Disable only */ return 0; - vdev->ctx[0].name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)", - pci_name(pdev)); - if (!vdev->ctx[0].name) + ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)", + pci_name(pdev)); + if (!ctx->name) return -ENOMEM; trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { - kfree(vdev->ctx[0].name); + kfree(ctx->name); return PTR_ERR(trigger); } - vdev->ctx[0].trigger = trigger; + ctx->trigger = trigger; if (!vdev->pci_2_3) irqflags = 0; ret = request_irq(pdev->irq, vfio_intx_handler, - irqflags, vdev->ctx[0].name, vdev); + irqflags, ctx->name, vdev); if (ret) { - vdev->ctx[0].trigger = NULL; - kfree(vdev->ctx[0].name); + ctx->trigger = NULL; + kfree(ctx->name); eventfd_ctx_put(trigger); return ret; } @@ -246,7 +306,7 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) * disable_irq won't. */ spin_lock_irqsave(&vdev->irqlock, flags); - if (!vdev->pci_2_3 && vdev->ctx[0].masked) + if (!vdev->pci_2_3 && ctx->masked) disable_irq_nosync(pdev->irq); spin_unlock_irqrestore(&vdev->irqlock, flags); @@ -255,12 +315,17 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) static void vfio_intx_disable(struct vfio_pci_core_device *vdev) { - vfio_virqfd_disable(&vdev->ctx[0].unmask); - vfio_virqfd_disable(&vdev->ctx[0].mask); + struct vfio_pci_irq_ctx *ctx; + + ctx = vfio_irq_ctx_get(vdev, 0); + if (ctx) { + vfio_virqfd_disable(&ctx->unmask); + vfio_virqfd_disable(&ctx->mask); + } vfio_intx_set_signal(vdev, -1); vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; - kfree(vdev->ctx); + vfio_irq_ctx_free_all(vdev); } /* @@ -284,10 +349,9 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi if (!is_irq_none(vdev)) return -EINVAL; - vdev->ctx = kcalloc(nvec, sizeof(struct vfio_pci_irq_ctx), - GFP_KERNEL_ACCOUNT); - if (!vdev->ctx) - return -ENOMEM; + ret = vfio_irq_ctx_alloc_num(vdev, nvec); + if (ret) + return ret; /* return the number of supported vectors if we can't get all: */ cmd = vfio_pci_memory_lock_and_enable(vdev); @@ -296,7 +360,7 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi if (ret > 0) pci_free_irq_vectors(pdev); vfio_pci_memory_unlock_and_restore(vdev, cmd); - kfree(vdev->ctx); + vfio_irq_ctx_free_all(vdev); return ret; } vfio_pci_memory_unlock_and_restore(vdev, cmd); @@ -320,6 +384,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, unsigned int vector, int fd, bool msix) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; struct eventfd_ctx *trigger; int irq, ret; u16 cmd; @@ -327,33 +392,33 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, if 
(vector >= vdev->num_ctx) return -EINVAL; + ctx = vfio_irq_ctx_get(vdev, vector); + if (!ctx) + return -EINVAL; irq = pci_irq_vector(pdev, vector); - if (vdev->ctx[vector].trigger) { - irq_bypass_unregister_producer(&vdev->ctx[vector].producer); + if (ctx->trigger) { + irq_bypass_unregister_producer(&ctx->producer); cmd = vfio_pci_memory_lock_and_enable(vdev); - free_irq(irq, vdev->ctx[vector].trigger); + free_irq(irq, ctx->trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); - - kfree(vdev->ctx[vector].name); - eventfd_ctx_put(vdev->ctx[vector].trigger); - vdev->ctx[vector].trigger = NULL; + kfree(ctx->name); + eventfd_ctx_put(ctx->trigger); + ctx->trigger = NULL; } if (fd < 0) return 0; - vdev->ctx[vector].name = kasprintf(GFP_KERNEL_ACCOUNT, - "vfio-msi%s[%d](%s)", - msix ? "x" : "", vector, - pci_name(pdev)); - if (!vdev->ctx[vector].name) + ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)", + msix ? "x" : "", vector, pci_name(pdev)); + if (!ctx->name) return -ENOMEM; trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { - kfree(vdev->ctx[vector].name); + kfree(ctx->name); return PTR_ERR(trigger); } @@ -372,26 +437,25 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, pci_write_msi_msg(irq, &msg); } - ret = request_irq(irq, vfio_msihandler, 0, - vdev->ctx[vector].name, trigger); + ret = request_irq(irq, vfio_msihandler, 0, ctx->name, trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); if (ret) { - kfree(vdev->ctx[vector].name); + kfree(ctx->name); eventfd_ctx_put(trigger); return ret; } - vdev->ctx[vector].producer.token = trigger; - vdev->ctx[vector].producer.irq = irq; - ret = irq_bypass_register_producer(&vdev->ctx[vector].producer); + ctx->producer.token = trigger; + ctx->producer.irq = irq; + ret = irq_bypass_register_producer(&ctx->producer); if (unlikely(ret)) { dev_info(&pdev->dev, "irq bypass producer (token %p) registration fails: %d\n", - vdev->ctx[vector].producer.token, ret); + ctx->producer.token, ret); - vdev->ctx[vector].producer.token = NULL; + ctx->producer.token = NULL; } - vdev->ctx[vector].trigger = trigger; + ctx->trigger = trigger; return 0; } @@ -421,13 +485,17 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_irq_ctx *ctx; unsigned int i; u16 cmd; for (i = 0; i < vdev->num_ctx; i++) { - vfio_virqfd_disable(&vdev->ctx[i].unmask); - vfio_virqfd_disable(&vdev->ctx[i].mask); - vfio_msi_set_vector_signal(vdev, i, -1, msix); + ctx = vfio_irq_ctx_get(vdev, i); + if (ctx) { + vfio_virqfd_disable(&ctx->unmask); + vfio_virqfd_disable(&ctx->mask); + vfio_msi_set_vector_signal(vdev, i, -1, msix); + } } cmd = vfio_pci_memory_lock_and_enable(vdev); @@ -443,7 +511,7 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; - kfree(vdev->ctx); + vfio_irq_ctx_free_all(vdev); } /* @@ -463,14 +531,18 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_core_device *vdev, if (unmask) vfio_pci_intx_unmask(vdev); } else if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { + struct vfio_pci_irq_ctx *ctx = vfio_irq_ctx_get(vdev, 0); int32_t fd = *(int32_t *)data; + + if (!ctx) + return -EINVAL; if (fd >= 0) return vfio_virqfd_enable((void *) vdev, vfio_pci_intx_unmask_handler, vfio_send_intx_eventfd, NULL, - &vdev->ctx[0].unmask, fd); + &ctx->unmask, fd); - vfio_virqfd_disable(&vdev->ctx[0].unmask); + 
vfio_virqfd_disable(&ctx->unmask); } return 0; @@ -543,6 +615,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { + struct vfio_pci_irq_ctx *ctx; unsigned int i; bool msix = (index == VFIO_PCI_MSIX_IRQ_INDEX) ? true : false; @@ -577,14 +650,15 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, return -EINVAL; for (i = start; i < start + count; i++) { - if (!vdev->ctx[i].trigger) + ctx = vfio_irq_ctx_get(vdev, i); + if (!ctx || !ctx->trigger) continue; if (flags & VFIO_IRQ_SET_DATA_NONE) { - eventfd_signal(vdev->ctx[i].trigger, 1); + eventfd_signal(ctx->trigger, 1); } else if (flags & VFIO_IRQ_SET_DATA_BOOL) { uint8_t *bools = data; if (bools[i - start]) - eventfd_signal(vdev->ctx[i].trigger, 1); + eventfd_signal(ctx->trigger, 1); } } return 0; From patchwork Tue Apr 18 17:29:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13215987 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4399C6FD18 for ; Tue, 18 Apr 2023 17:29:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232625AbjDRR34 (ORCPT ); Tue, 18 Apr 2023 13:29:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39056 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232559AbjDRR3u (ORCPT ); Tue, 18 Apr 2023 13:29:50 -0400 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BA61BAF12; Tue, 18 Apr 2023 10:29:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1681838988; x=1713374988; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=w7/Wvfls9T+9v41w24j+JaGq3vtI51OxqopK2676JN0=; b=Yjc98/q0+xRq1g1PvlEMN6gTQlZY3ERZogZKgIzVErAwtT/p6C0ke0L2 vyOh/IEHMoU+a385UCM5xxYPE0UBiVyjNFYZL6H0fztHbAZmfQUzL3Ttj 099qURM/2+hmgHc0DfBwPI/hwTvW0TQfE2py0paA6mbZabBRmc3B0vRC2 RHKT96kNQbsUkZ+wgC/mvKY6L8y7V8KzUVBfn5vJD0beG1SYGrzbBSLpo E/LSOcLOOUa9V/OxYMg7HzI/7pbM/INLioVCdBHZCogFwjAffFD9Yuzqr 5fPV8cE+t8KPAJna2jzFem40pq9niINAg+rlChEnUFjYMJNiHstolhCN/ g==; X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="410466462" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="410466462" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="865503479" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="865503479" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V3 04/10] vfio/pci: Move to single error path 
Date: Tue, 18 Apr 2023 10:29:15 -0700 Message-Id: <521fd5184ef052ff768a90bbe670cfc4e375eff9.1681837892.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Enabling and disabling of an interrupt involves several steps that can fail. Cleanup after failure is done when the error is encountered, resulting in some repetitive code. Support for dynamic contexts will introduce more steps during interrupt enabling and disabling. Transition to centralized exit path in preparation for dynamic contexts to eliminate duplicate error handling code. Signed-off-by: Reinette Chatre --- Changes since V2: - Move patch to earlier in series in support of the change to dynamic context management. - Do not add the "ctx->name = NULL" in error path. It is not done in baseline and will not be needed when transitioning to dynamic context management. - Update changelog to not make this change specific to dynamic MSI-X. Changes since RFC V1: - Improve changelog. drivers/vfio/pci/vfio_pci_intrs.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index b664fbb6d2f2..9e17e59a4d60 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -418,8 +418,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { - kfree(ctx->name); - return PTR_ERR(trigger); + ret = PTR_ERR(trigger); + goto out_free_name; } /* @@ -439,11 +439,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, ret = request_irq(irq, vfio_msihandler, 0, ctx->name, trigger); vfio_pci_memory_unlock_and_restore(vdev, cmd); - if (ret) { - kfree(ctx->name); - eventfd_ctx_put(trigger); - return ret; - } + if (ret) + goto out_put_eventfd_ctx; ctx->producer.token = trigger; ctx->producer.irq = irq; @@ -458,6 +455,12 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, ctx->trigger = trigger; return 0; + +out_put_eventfd_ctx: + eventfd_ctx_put(trigger); +out_free_name: + kfree(ctx->name); + return ret; } static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, From patchwork Tue Apr 18 17:29:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13215991 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2447C6FD18 for ; Tue, 18 Apr 2023 17:30:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232627AbjDRRaE (ORCPT ); Tue, 18 Apr 2023 13:30:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39196 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232592AbjDRR3w (ORCPT ); Tue, 18 Apr 2023 13:29:52 -0400 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F09A5B467; Tue, 18 Apr 2023 10:29:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1681838991; x=1713374991; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; 
bh=+LQ4KknkPd5it6WKDWxtv2E8zkNti3jsn4E8sdDJHMY=; b=SBqgC2LI98rOxaIFEPWVBmZOTNTqAYFxrnKvaVkInINm3jfsvMb3iKcp NX1sc0Dcuruj43lpOqHxW7eYx0T6dgwWOqDMkM9TV2p7E6j3u5khUh7Zj cT85kJrw+0ySH3ztIdzPAzNxD4OIdtekUqRWYvDEpQdOowjtI9QZ6EUvl /074hXgVvpnTER9Tr7d1u3pXHRwAPNId21WVKFIto0yUp5RDurLSUJ5oB /N88F/HZg+XQCHKoTxn5I3LpPVg2lU7MzpWQlgqUVVqSO/vOHgEVom0bF vC83rT+9uMDrF/ZRkjEA5nEGxg1daflX/Y7KWv4uzbR7S4XHUdSgjwisN w==; X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="410466468" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="410466468" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="865503482" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="865503482" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V3 05/10] vfio/pci: Use xarray for interrupt context storage Date: Tue, 18 Apr 2023 10:29:16 -0700 Message-Id: <78182c9cd770885b6d354f114ba157c7024c8b39.1681837892.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Interrupt context is statically allocated at the time interrupts are allocated. Following allocation, the context is managed by directly accessing the elements of the array using the vector as index. The storage is released when interrupts are disabled. It is possible to dynamically allocate a single MSI-X interrupt after MSI-X is enabled. A dynamic storage for interrupt context is needed to support this. Replace the interrupt context array with an xarray (similar to what the core uses as store for MSI descriptors) that can support the dynamic expansion while maintaining the custom that uses the vector as index. With a dynamic storage it is no longer required to pre-allocate interrupt contexts at the time the interrupts are allocated. MSI and MSI-X interrupt contexts are only used when interrupts are enabled. Their allocation can thus be delayed until interrupt enabling. Only enabled interrupts will have associated interrupt contexts. Whether an interrupt has been allocated (a Linux irq number exists for it) becomes the criteria for whether an interrupt can be enabled. Signed-off-by: Reinette Chatre Link: https://lore.kernel.org/lkml/20230404122444.59e36a99.alex.williamson@redhat.com/ --- Changes since V2: - Only allocate contexts as they are used, or "active". (Alex) - Move vfio_irq_ctx_free() from later patch to prevent open-coding the same within vfio_irq_ctx_free_all(). This evolved into vfio_irq_ctx_free() used for dynamic context allocation and vfio_irq_ctx_free_all() removed because of it. (Alex) - With vfio_irq_ctx_alloc_num() removed, rename vfio_irq_ctx_alloc_single() to vfio_irq_ctx_alloc(). Changes since RFC V1: - Let vfio_irq_ctx_alloc_single() return pointer to allocated context. (Alex) - Transition INTx allocation to simplified vfio_irq_ctx_alloc_single(). 
- Improve accuracy of changelog. drivers/vfio/pci/vfio_pci_core.c | 1 + drivers/vfio/pci/vfio_pci_intrs.c | 91 ++++++++++++++++--------------- include/linux/vfio_pci_core.h | 2 +- 3 files changed, 48 insertions(+), 46 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index a5ab416cf476..ae0e161c7fc9 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -2102,6 +2102,7 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev) INIT_LIST_HEAD(&vdev->vma_list); INIT_LIST_HEAD(&vdev->sriov_pfs_item); init_rwsem(&vdev->memory_lock); + xa_init(&vdev->ctx); return 0; } diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 9e17e59a4d60..117cd384b3ad 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -52,25 +52,33 @@ static struct vfio_pci_irq_ctx *vfio_irq_ctx_get(struct vfio_pci_core_device *vdev, unsigned long index) { - if (index >= vdev->num_ctx) - return NULL; - return &vdev->ctx[index]; + return xa_load(&vdev->ctx, index); } -static void vfio_irq_ctx_free_all(struct vfio_pci_core_device *vdev) +static void vfio_irq_ctx_free(struct vfio_pci_core_device *vdev, + struct vfio_pci_irq_ctx *ctx, unsigned long index) { - kfree(vdev->ctx); + xa_erase(&vdev->ctx, index); + kfree(ctx); } -static int vfio_irq_ctx_alloc_num(struct vfio_pci_core_device *vdev, - unsigned long num) +static struct vfio_pci_irq_ctx * +vfio_irq_ctx_alloc(struct vfio_pci_core_device *vdev, unsigned long index) { - vdev->ctx = kcalloc(num, sizeof(struct vfio_pci_irq_ctx), - GFP_KERNEL_ACCOUNT); - if (!vdev->ctx) - return -ENOMEM; + struct vfio_pci_irq_ctx *ctx; + int ret; - return 0; + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT); + if (!ctx) + return NULL; + + ret = xa_insert(&vdev->ctx, index, ctx, GFP_KERNEL_ACCOUNT); + if (ret) { + kfree(ctx); + return NULL; + } + + return ctx; } /* @@ -218,7 +226,6 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id) static int vfio_intx_enable(struct vfio_pci_core_device *vdev) { struct vfio_pci_irq_ctx *ctx; - int ret; if (!is_irq_none(vdev)) return -EINVAL; @@ -226,15 +233,9 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev) if (!vdev->pdev->irq) return -ENODEV; - ret = vfio_irq_ctx_alloc_num(vdev, 1); - if (ret) - return ret; - - ctx = vfio_irq_ctx_get(vdev, 0); - if (!ctx) { - vfio_irq_ctx_free_all(vdev); - return -EINVAL; - } + ctx = vfio_irq_ctx_alloc(vdev, 0); + if (!ctx) + return -ENOMEM; vdev->num_ctx = 1; @@ -325,7 +326,7 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev) vfio_intx_set_signal(vdev, -1); vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; - vfio_irq_ctx_free_all(vdev); + vfio_irq_ctx_free(vdev, ctx, 0); } /* @@ -349,10 +350,6 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi if (!is_irq_none(vdev)) return -EINVAL; - ret = vfio_irq_ctx_alloc_num(vdev, nvec); - if (ret) - return ret; - /* return the number of supported vectors if we can't get all: */ cmd = vfio_pci_memory_lock_and_enable(vdev); ret = pci_alloc_irq_vectors(pdev, 1, nvec, flag); @@ -360,7 +357,6 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi if (ret > 0) pci_free_irq_vectors(pdev); vfio_pci_memory_unlock_and_restore(vdev, cmd); - vfio_irq_ctx_free_all(vdev); return ret; } vfio_pci_memory_unlock_and_restore(vdev, cmd); @@ -392,12 +388,13 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, if (vector >= 
vdev->num_ctx) return -EINVAL; - ctx = vfio_irq_ctx_get(vdev, vector); - if (!ctx) - return -EINVAL; irq = pci_irq_vector(pdev, vector); + if (irq < 0) + return -EINVAL; - if (ctx->trigger) { + ctx = vfio_irq_ctx_get(vdev, vector); + + if (ctx) { irq_bypass_unregister_producer(&ctx->producer); cmd = vfio_pci_memory_lock_and_enable(vdev); @@ -405,16 +402,22 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, vfio_pci_memory_unlock_and_restore(vdev, cmd); kfree(ctx->name); eventfd_ctx_put(ctx->trigger); - ctx->trigger = NULL; + vfio_irq_ctx_free(vdev, ctx, vector); } if (fd < 0) return 0; + ctx = vfio_irq_ctx_alloc(vdev, vector); + if (!ctx) + return -ENOMEM; + ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)", msix ? "x" : "", vector, pci_name(pdev)); - if (!ctx->name) - return -ENOMEM; + if (!ctx->name) { + ret = -ENOMEM; + goto out_free_ctx; + } trigger = eventfd_ctx_fdget(fd); if (IS_ERR(trigger)) { @@ -460,6 +463,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, eventfd_ctx_put(trigger); out_free_name: kfree(ctx->name); +out_free_ctx: + vfio_irq_ctx_free(vdev, ctx, vector); return ret; } @@ -489,16 +494,13 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) { struct pci_dev *pdev = vdev->pdev; struct vfio_pci_irq_ctx *ctx; - unsigned int i; + unsigned long i; u16 cmd; - for (i = 0; i < vdev->num_ctx; i++) { - ctx = vfio_irq_ctx_get(vdev, i); - if (ctx) { - vfio_virqfd_disable(&ctx->unmask); - vfio_virqfd_disable(&ctx->mask); - vfio_msi_set_vector_signal(vdev, i, -1, msix); - } + xa_for_each(&vdev->ctx, i, ctx) { + vfio_virqfd_disable(&ctx->unmask); + vfio_virqfd_disable(&ctx->mask); + vfio_msi_set_vector_signal(vdev, i, -1, msix); } cmd = vfio_pci_memory_lock_and_enable(vdev); @@ -514,7 +516,6 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; - vfio_irq_ctx_free_all(vdev); } /* @@ -654,7 +655,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, for (i = start; i < start + count; i++) { ctx = vfio_irq_ctx_get(vdev, i); - if (!ctx || !ctx->trigger) + if (!ctx) continue; if (flags & VFIO_IRQ_SET_DATA_NONE) { eventfd_signal(ctx->trigger, 1); diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 367fd79226a3..61d7873a3973 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -59,7 +59,7 @@ struct vfio_pci_core_device { struct perm_bits *msi_perm; spinlock_t irqlock; struct mutex igate; - struct vfio_pci_irq_ctx *ctx; + struct xarray ctx; int num_ctx; int irq_type; int num_regions; From patchwork Tue Apr 18 17:29:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13215986 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A5058C77B7D for ; Tue, 18 Apr 2023 17:29:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232599AbjDRR3x (ORCPT ); Tue, 18 Apr 2023 13:29:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39052 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231877AbjDRR3t (ORCPT ); Tue, 18 Apr 2023 13:29:49 -0400 Received: from mga07.intel.com (mga07.intel.com 
[134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4FEB1AF06; Tue, 18 Apr 2023 10:29:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1681838988; x=1713374988; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4kEJ197nr6yM9IPTGo2RuhmxMJUO//xhEu591g3IFAE=; b=ZVtwveBfnSrDwSMVErY7Rq3XFWqB2oZD+Y8aPDmDKmBYjYp2Qo4K1LeQ C2U6GtvfDs13qXPHJskdrG4aFx79uM9HJkXI4gfWNKjOwJDte3I9/P0Sh 47F2QK+C1jDJXBjKMIsw+ad7NHYiIuMTk2FGfRhIxzazFDr/LtpAN0vbk dSXK5LbAONI1PK5CzvNNfSm7alzVQZ8HcbYH0JeJqdJpph/kC7ungMx9I xQnmXiocPSwSzTaHsnMgAp0bAXPJZMQP49vjBTiRQ7ZDlAooS5Y9xQvO2 BC2iVT4jJPzxuuE7YiyXawc770iq/M+uP8rArd8Z+1vtRZms8gNjU41JE A==; X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="410466475" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="410466475" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="865503484" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="865503484" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V3 06/10] vfio/pci: Remove interrupt context counter Date: Tue, 18 Apr 2023 10:29:17 -0700 Message-Id: <056fbd6c7c5161fb912d60b3f75e379ab3255d75.1681837892.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org struct vfio_pci_core_device::num_ctx counts how many interrupt contexts have been allocated. When all interrupt contexts are allocated simultaneously num_ctx provides the upper bound of all vectors that can be used as indices into the interrupt context array. With the upcoming support for dynamic MSI-X the number of interrupt contexts does not necessarily span the range of allocated interrupts. Consequently, num_ctx is no longer a trusted upper bound for valid indices. Stop using num_ctx to determine if a provided vector is valid. Use the existence of allocated interrupt. This changes behavior on the error path when user space provides an invalid vector range. Behavior changes from early exit without any modifications to possible modifications to valid vectors within the invalid range. This is acceptable considering that an invalid range is not a valid scenario, see link to discussion. The checks that ensure that user space provides a range of vectors that is valid for the device are untouched. Signed-off-by: Reinette Chatre Link: https://lore.kernel.org/lkml/20230316155646.07ae266f.alex.williamson@redhat.com/ --- Changes since V2: - Update changelog to reflect change in policy that existence of allocated interrupt is validity check, not existence of context (which is now dynamically allocated). Changes since RFC V1: - Remove vfio_irq_ctx_range_allocated(). (Alex and Kevin). 
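A rough sketch of the resulting vector-validity policy, condensed from the
diffs in this series (not a verbatim copy of the final code):

	/* Before: a vector is valid if it falls within the allocated-context count. */
	if (vector >= vdev->num_ctx)
		return -EINVAL;

	/*
	 * After: a vector is usable only if an interrupt was allocated for it,
	 * i.e. pci_irq_vector() returns a Linux IRQ number; per-vector state is
	 * looked up in the xarray rather than indexed into a fixed array.
	 */
	irq = pci_irq_vector(pdev, vector);
	if (irq < 0)
		return -EINVAL;
	ctx = vfio_irq_ctx_get(vdev, vector);	/* xa_load(&vdev->ctx, vector) */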
drivers/vfio/pci/vfio_pci_intrs.c | 13 +------------ include/linux/vfio_pci_core.h | 1 - 2 files changed, 1 insertion(+), 13 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 117cd384b3ad..5e3de004f4cb 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -237,8 +237,6 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev) if (!ctx) return -ENOMEM; - vdev->num_ctx = 1; - /* * If the virtual interrupt is masked, restore it. Devices * supporting DisINTx can be masked at the hardware level @@ -325,7 +323,6 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev) } vfio_intx_set_signal(vdev, -1); vdev->irq_type = VFIO_PCI_NUM_IRQS; - vdev->num_ctx = 0; vfio_irq_ctx_free(vdev, ctx, 0); } @@ -361,7 +358,6 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi } vfio_pci_memory_unlock_and_restore(vdev, cmd); - vdev->num_ctx = nvec; vdev->irq_type = msix ? VFIO_PCI_MSIX_IRQ_INDEX : VFIO_PCI_MSI_IRQ_INDEX; @@ -385,9 +381,6 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, int irq, ret; u16 cmd; - if (vector >= vdev->num_ctx) - return -EINVAL; - irq = pci_irq_vector(pdev, vector); if (irq < 0) return -EINVAL; @@ -474,9 +467,6 @@ static int vfio_msi_set_block(struct vfio_pci_core_device *vdev, unsigned start, unsigned int i, j; int ret = 0; - if (start >= vdev->num_ctx || start + count > vdev->num_ctx) - return -EINVAL; - for (i = 0, j = start; i < count && !ret; i++, j++) { int fd = fds ? fds[i] : -1; ret = vfio_msi_set_vector_signal(vdev, j, fd, msix); @@ -515,7 +505,6 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev, bool msix) pci_intx(pdev, 0); vdev->irq_type = VFIO_PCI_NUM_IRQS; - vdev->num_ctx = 0; } /* @@ -650,7 +639,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev, return ret; } - if (!irq_is(vdev, index) || start + count > vdev->num_ctx) + if (!irq_is(vdev, index)) return -EINVAL; for (i = start; i < start + count; i++) { diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 61d7873a3973..148fd1ae6c1c 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -60,7 +60,6 @@ struct vfio_pci_core_device { spinlock_t irqlock; struct mutex igate; struct xarray ctx; - int num_ctx; int irq_type; int num_regions; struct vfio_pci_region *region; From patchwork Tue Apr 18 17:29:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13215989 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7AFC0C77B7D for ; Tue, 18 Apr 2023 17:30:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232649AbjDRRaA (ORCPT ); Tue, 18 Apr 2023 13:30:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39190 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232558AbjDRR3v (ORCPT ); Tue, 18 Apr 2023 13:29:51 -0400 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E802F26BC; Tue, 18 Apr 2023 10:29:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1681838990; 
x=1713374990; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QtN+Gn8m6GoE1K9sHr26L3K3cEPPiJQ24QAc56BarA0=; b=Xotw5asohSB3xO2dxfHqcfTxSI0pBPb1cB1WVDbVz2HCY50egUGZznEq jZGaX3M3T9i9dPyAe8xGsjc25fNp/COexxWIpDX9SMLvG9Q5vIXElxpqj IeZlnlaQAGE2wbucNpJm2Yy8DvLAj7H4NfG9+PbELCxRmmcgRAEXcg8ar Lr94DNt9E6C2Lueeu3HckEiGQjJ7/zVCk9bkoTE3TERxUTcLcwex7BHP1 jUul28UcFFaj8mDvfIPibxnYErW5da+7Pvx4mAeXyeiXMUHqR3d/8h0LO Y+T/XqJ4DNQhAwQriMB4t4WuiP8JjEmHgRPJaj85Z8uYcJPawEtm6unrJ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="410466481" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="410466481" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="865503487" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="865503487" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V3 07/10] vfio/pci: Update stale comment Date: Tue, 18 Apr 2023 10:29:18 -0700 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In preparation for surrounding code change it is helpful to ensure that existing comments are accurate. Remove inaccurate comment about direct access and update the rest of the comment to reflect the purpose of writing the cached MSI message to the device. Suggested-by: Alex Williamson Link: https://lore.kernel.org/lkml/20230330164050.0069e2a5.alex.williamson@redhat.com/ Signed-off-by: Reinette Chatre --- Changes since V2: - New patch. drivers/vfio/pci/vfio_pci_intrs.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 5e3de004f4cb..bdda7f46c2be 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -419,11 +419,9 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, } /* - * The MSIx vector table resides in device memory which may be cleared - * via backdoor resets. We don't allow direct access to the vector - * table so even if a userspace driver attempts to save/restore around - * such a reset it would be unsuccessful. To avoid this, restore the - * cached value of the message prior to enabling. + * If the vector was previously allocated, refresh the on-device + * message data before enabling in case it had been cleared or + * corrupted since writing. 
*/ cmd = vfio_pci_memory_lock_and_enable(vdev); if (msix) { From patchwork Tue Apr 18 17:29:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13215992 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C44A0C77B75 for ; Tue, 18 Apr 2023 17:30:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232674AbjDRRaG (ORCPT ); Tue, 18 Apr 2023 13:30:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39106 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232573AbjDRR3u (ORCPT ); Tue, 18 Apr 2023 13:29:50 -0400 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 82AC172B8; Tue, 18 Apr 2023 10:29:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1681838989; x=1713374989; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=86AA1WYOs53EOz4wfSGK1F5XZP9WYHmLDplH3O/7veQ=; b=IKzcWnEexqI5sRUEV9cwZVLpIQ1JPpRXUG6wbCUQ80JBwz9JJDX5B3yg QnCCrD9Vjfg53d4C1NELrJaJOGsXh847Q3Ofe0//UVYl1lh54gPKwnOGf Eojfo/zU2s5CGd7omlu30Fas9jb5xFaKFT+JC7rlmZohVl5dVayfxCTs7 gF7a03r0X1yfYPmhXD6i/LYxiq/n/BdqVNKWVbBB5wdNDw92wMb1HsKLM 7iz7uYCGgDtahvMplt889SflaXPFZVmR+T5regAnwZ6sQ1tWsLALbP7ex Nec+R+/lnCGQZNxGZTkdYl+7EoNRzPZhmqavWgxNGPgQGAeNkFen4SZln w==; X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="410466486" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="410466486" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="865503490" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="865503490" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V3 08/10] vfio/pci: Probe and store ability to support dynamic MSI-X Date: Tue, 18 Apr 2023 10:29:19 -0700 Message-Id: <0da4830176e9c4a7877aac0611869f341dda831c.1681837892.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Not all MSI-X devices support dynamic MSI-X allocation. Whether a device supports dynamic MSI-X should be queried using pci_msix_can_alloc_dyn(). Instead of scattering code with pci_msix_can_alloc_dyn(), probe this ability once and store it as a property of the virtual device. Suggested-by: Alex Williamson Signed-off-by: Reinette Chatre --- Changes since V2: - New patch. 
(Alex) drivers/vfio/pci/vfio_pci_core.c | 5 ++++- include/linux/vfio_pci_core.h | 1 + 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index ae0e161c7fc9..a3635a8e54c8 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -530,8 +530,11 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev) vdev->msix_bar = table & PCI_MSIX_TABLE_BIR; vdev->msix_offset = table & PCI_MSIX_TABLE_OFFSET; vdev->msix_size = ((flags & PCI_MSIX_FLAGS_QSIZE) + 1) * 16; - } else + vdev->has_dyn_msix = pci_msix_can_alloc_dyn(pdev); + } else { vdev->msix_bar = 0xFF; + vdev->has_dyn_msix = false; + } if (!vfio_vga_disabled() && vfio_pci_is_vga(pdev)) vdev->has_vga = true; diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index 148fd1ae6c1c..4f070f2d6fde 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -67,6 +67,7 @@ struct vfio_pci_core_device { u8 msix_bar; u16 msix_size; u32 msix_offset; + bool has_dyn_msix; u32 rbar[7]; bool pci_2_3; bool virq_disabled; From patchwork Tue Apr 18 17:29:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13215988 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BE1CBC77B75 for ; Tue, 18 Apr 2023 17:30:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232563AbjDRR36 (ORCPT ); Tue, 18 Apr 2023 13:29:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39052 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232578AbjDRR3u (ORCPT ); Tue, 18 Apr 2023 13:29:50 -0400 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 82A3844BC; Tue, 18 Apr 2023 10:29:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1681838989; x=1713374989; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=oMfakM0i/B5Q/gryK7ofBQXHR1G72EZy+CSdZ2o9fNM=; b=YEuByj/DMj3ZucwxUtZ6luDlPfNjLh/sdQpcehKBest7/o6cArsswcXB 4EPSNWcIHrAqoOU7l4vQvajKXu6vT6t9cOMKKndKgcfRVicCNBL0lNcak flPXaAG0CPInV2FloMeRzIfIjTZa/Y5uXZxhIoBdmgu1NVaDRFy5aYK6U ykjJ3+hgYKKxGO5aRtt4kNo7ornnhJUPfU2L4W4AXfcz75uQa9vgB0HrG MOcVsD2qccGOewwFf8/jnYF+BEBjxWjqmQG2YvtV2bkbnBNoD0K8dTqaz e1TXUcPRMmV8ozioKIxG48kbM0Ak5vaKSGxhFU6Cr5hJnxZxu/tn2/bY3 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="410466492" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="410466492" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="865503494" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="865503494" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, 

From patchwork Tue Apr 18 17:29:20 2023
X-Patchwork-Id: 13215988
From: Reinette Chatre
To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com,
    kevin.tian@intel.com, alex.williamson@redhat.com
Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org,
    dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com,
    fenghua.yu@intel.com, tom.zanussi@linux.intel.com,
    reinette.chatre@intel.com, linux-kernel@vger.kernel.org
Subject: [PATCH V3 09/10] vfio/pci: Support dynamic MSI-X
Date: Tue, 18 Apr 2023 10:29:20 -0700
Message-Id: <86cda5cf2742feff3b14954284fb509863355050.1681837892.git.reinette.chatre@intel.com>

The recently introduced pci_msix_alloc_irq_at() and pci_msix_free_irq()
enable an individual MSI-X interrupt to be allocated and freed after
MSI-X has been enabled.

Use dynamic MSI-X (if supported by the device) to allocate an interrupt
after MSI-X is enabled. An MSI-X interrupt is dynamically allocated at
the time a valid eventfd is assigned. This differs from a range
provided during MSI-X enabling, where interrupts are allocated for the
entire range whether or not a valid eventfd is provided for each
interrupt.

Do not dynamically free interrupts; leave that to when MSI-X is
disabled.

Signed-off-by: Reinette Chatre
Link: https://lore.kernel.org/lkml/20230403211841.0e206b67.alex.williamson@redhat.com/
---
The get_cached_msi_msg()/pci_write_msi_msg() behavior is kept the same
as when MSI-X is enabled with triggers provided up front: interrupts
newly allocated with pci_msix_alloc_irq_at() receive the same
get_cached_msi_msg()/pci_write_msi_msg() treatment as interrupts
allocated with pci_alloc_irq_vectors().

Changes since V2:
- Move vfio_irq_ctx_free() to earlier in series to support earlier
  usage. (Alex)
- Use consistent terms in changelog: MSI-x changed to MSI-X.
- Make dynamic interrupt context creation generic across all MSI/MSI-X
  interrupts. This resulted in code moving to earlier in series as part
  of xarray introduction patch. (Alex)
- Remove the local allow_dyn_alloc and direct calling of
  pci_msix_can_alloc_dyn(), use the new vdev->has_dyn_msix introduced
  earlier instead. (Alex)
- Stop tracking new allocations (remove "new_ctx"). (Alex)
- Introduce new wrapper that returns the Linux interrupt number or
  dynamically allocates a new interrupt. Wrapper can be used for all
  interrupt cases. (Alex)
- Only free dynamic MSI-X interrupts on MSI-X teardown. (Alex)

Changes since RFC V1:
- Add pointer to interrupt context as function parameter to
  vfio_irq_ctx_free(). (Alex)
- Initialize new_ctx to false. (Dan Carpenter)
- Only support dynamic allocation if device supports it. (Alex)

 drivers/vfio/pci/vfio_pci_intrs.c | 73 +++++++++++++++++++++++++++----
 1 file changed, 65 insertions(+), 8 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index bdda7f46c2be..c1a3e224c867 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -372,27 +372,74 @@ static int vfio_msi_enable(struct vfio_pci_core_device *vdev, int nvec, bool msi
 	return 0;
 }
 
+/*
+ * Return Linux IRQ number of an MSI or MSI-X device interrupt vector.
+ * If a Linux IRQ number is not available then a new interrupt will be
+ * allocated if dynamic MSI-X is supported.
+ */
+static int vfio_msi_alloc_irq(struct vfio_pci_core_device *vdev,
+			      unsigned int vector, bool msix)
+{
+	struct pci_dev *pdev = vdev->pdev;
+	struct msi_map map;
+	int irq;
+	u16 cmd;
+
+	irq = pci_irq_vector(pdev, vector);
+	if (irq > 0 || !msix || !vdev->has_dyn_msix)
+		return irq;
+
+	cmd = vfio_pci_memory_lock_and_enable(vdev);
+	map = pci_msix_alloc_irq_at(pdev, vector, NULL);
+	vfio_pci_memory_unlock_and_restore(vdev, cmd);
+
+	return map.index < 0 ? map.index : map.virq;
+}
+
+/*
+ * Free interrupt if it can be re-allocated dynamically (while MSI-X
+ * is enabled).
+ */
+static void vfio_msi_free_irq(struct vfio_pci_core_device *vdev,
+			      unsigned int vector, bool msix)
+{
+	struct pci_dev *pdev = vdev->pdev;
+	struct msi_map map;
+	int irq;
+	u16 cmd;
+
+	if (!msix || !vdev->has_dyn_msix)
+		return;
+
+	irq = pci_irq_vector(pdev, vector);
+	map = (struct msi_map) { .index = vector, .virq = irq };
+
+	if (WARN_ON(irq < 0))
+		return;
+
+	cmd = vfio_pci_memory_lock_and_enable(vdev);
+	pci_msix_free_irq(pdev, map);
+	vfio_pci_memory_unlock_and_restore(vdev, cmd);
+}
+
 static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 				      unsigned int vector, int fd, bool msix)
 {
 	struct pci_dev *pdev = vdev->pdev;
 	struct vfio_pci_irq_ctx *ctx;
 	struct eventfd_ctx *trigger;
-	int irq, ret;
+	int irq = -EINVAL, ret;
 	u16 cmd;
 
-	irq = pci_irq_vector(pdev, vector);
-	if (irq < 0)
-		return -EINVAL;
-
 	ctx = vfio_irq_ctx_get(vdev, vector);
 	if (ctx) {
 		irq_bypass_unregister_producer(&ctx->producer);
-
+		irq = pci_irq_vector(pdev, vector);
 		cmd = vfio_pci_memory_lock_and_enable(vdev);
 		free_irq(irq, ctx->trigger);
 		vfio_pci_memory_unlock_and_restore(vdev, cmd);
+		/* Interrupt stays allocated, will be freed at MSI-X disable. */
 		kfree(ctx->name);
 		eventfd_ctx_put(ctx->trigger);
 		vfio_irq_ctx_free(vdev, ctx, vector);
@@ -401,9 +448,17 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 	if (fd < 0)
 		return 0;
 
+	if (irq == -EINVAL) {
+		irq = vfio_msi_alloc_irq(vdev, vector, msix);
+		if (irq < 0)
+			return irq;
+	}
+
 	ctx = vfio_irq_ctx_alloc(vdev, vector);
-	if (!ctx)
-		return -ENOMEM;
+	if (!ctx) {
+		ret = -ENOMEM;
+		goto out_free_irq;
+	}
 
 	ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)", msix ?
 			      "x" : "", vector, pci_name(pdev));
@@ -456,6 +511,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
 	kfree(ctx->name);
 out_free_ctx:
 	vfio_irq_ctx_free(vdev, ctx, vector);
+out_free_irq:
+	vfio_msi_free_irq(vdev, vector, msix);
 	return ret;
 }
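
From user space, the dynamic allocation path above is reached by
assigning an eventfd to an individual MSI-X vector after MSI-X is
already enabled. A minimal sketch of that call follows; it is not part
of the patch, device_fd is assumed to be an already-open VFIO device
file descriptor, and the helper name is made up.

#include <linux/vfio.h>
#include <stdint.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>

/*
 * Assign an eventfd to a single MSI-X vector while MSI-X is already
 * enabled.  On a device with dynamic MSI-X the kernel allocates only
 * this vector on demand; on other devices the vector must fall within
 * the range that was allocated when MSI-X was enabled.
 */
static int vfio_msix_trigger_one(int device_fd, unsigned int vector)
{
	char buf[sizeof(struct vfio_irq_set) + sizeof(int32_t)];
	struct vfio_irq_set *set = (struct vfio_irq_set *)buf;
	int32_t efd = eventfd(0, 0);

	if (efd < 0)
		return -1;

	memset(buf, 0, sizeof(buf));
	set->argsz = sizeof(buf);
	set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
	set->index = VFIO_PCI_MSIX_IRQ_INDEX;
	set->start = vector;	/* a single vector, signalled via efd */
	set->count = 1;
	memcpy(set->data, &efd, sizeof(efd));

	return ioctl(device_fd, VFIO_DEVICE_SET_IRQS, set);
}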
"x" : "", vector, pci_name(pdev)); @@ -456,6 +511,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev, kfree(ctx->name); out_free_ctx: vfio_irq_ctx_free(vdev, ctx, vector); +out_free_irq: + vfio_msi_free_irq(vdev, vector, msix); return ret; } From patchwork Tue Apr 18 17:29:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 13215990 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D40EDC77B78 for ; Tue, 18 Apr 2023 17:30:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232656AbjDRRaC (ORCPT ); Tue, 18 Apr 2023 13:30:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39184 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232590AbjDRR3v (ORCPT ); Tue, 18 Apr 2023 13:29:51 -0400 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 18AA2A5D3; Tue, 18 Apr 2023 10:29:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1681838990; x=1713374990; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TKr20cbHjaq+vXR0di2P0v1WIRXrN40KCy9UBNZV2/c=; b=OZ84VT2lUOR+LTM85xRWR5y+fIllEWPZHPI/1qEOWRxYONn/ZpSB0uG3 XnN7t6AyvDlfu8IYCylEKk01aCf8bhdTRNh6stOwfFJk1YptoSTFONHdJ nQPHuGLHouD4TavghUeEsfwzfkyN8WlGGiB3uum4mwPVflPQqqZ3PBz9W k4bxXS1K/M983RfNOo0QQdxKdy9EHxTIzwLxOx0PoVdgRn4izAmzhVF6O LHZi3X7oGWS7QC2CfU7vZDBFlwZu0vM1YyB7c7e5P2NUtcfEqGArCHVrk WHpt1LQu65rhLWBNlwp32WVNNsm2ukmIbx5mtf0zXAS8MYdNA3HmUDxZQ w==; X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="410466496" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="410466496" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10684"; a="865503497" X-IronPort-AV: E=Sophos;i="5.99,207,1677571200"; d="scan'208";a="865503497" Received: from rchatre-ws.ostc.intel.com ([10.54.69.144]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2023 10:29:42 -0700 From: Reinette Chatre To: jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, alex.williamson@redhat.com Cc: tglx@linutronix.de, darwi@linutronix.de, kvm@vger.kernel.org, dave.jiang@intel.com, jing2.liu@intel.com, ashok.raj@intel.com, fenghua.yu@intel.com, tom.zanussi@linux.intel.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH V3 10/10] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X Date: Tue, 18 Apr 2023 10:29:21 -0700 Message-Id: <6c057618833a180da2147bffadb98e07cb73e045.1681837892.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Dynamic MSI-X is supported. Clear VFIO_IRQ_INFO_NORESIZE to provide guidance to user space. Signed-off-by: Reinette Chatre --- Changes since V2: - Use new vdev->has_dyn_msix property instead of calling pci_msix_can_alloc_dyn() directly. 

Changes since RFC V1:
- Only clear VFIO_IRQ_INFO_NORESIZE for MSI-X devices that can actually
  support dynamic allocation. (Alex)

 drivers/vfio/pci/vfio_pci_core.c | 4 +++-
 include/uapi/linux/vfio.h        | 3 +++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index a3635a8e54c8..4050ad3388c2 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1114,7 +1114,9 @@ static int vfio_pci_ioctl_get_irq_info(struct vfio_pci_core_device *vdev,
 	if (info.index == VFIO_PCI_INTX_IRQ_INDEX)
 		info.flags |= (VFIO_IRQ_INFO_MASKABLE |
 			       VFIO_IRQ_INFO_AUTOMASKED);
-	else
+	else if ((info.index != VFIO_PCI_MSIX_IRQ_INDEX) ||
+		 (info.index == VFIO_PCI_MSIX_IRQ_INDEX &&
+		  !vdev->has_dyn_msix))
 		info.flags |= VFIO_IRQ_INFO_NORESIZE;
 
 	return copy_to_user(arg, &info, minsz) ? -EFAULT : 0;

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 0552e8dcf0cb..1a36134cae5c 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -511,6 +511,9 @@ struct vfio_region_info_cap_nvlink2_lnkspd {
  * then add and unmask vectors, it's up to userspace to make the decision
  * whether to allocate the maximum supported number of vectors or tear
  * down setup and incrementally increase the vectors as each is enabled.
+ * Absence of the NORESIZE flag indicates that vectors can be enabled
+ * and disabled dynamically without impacting other vectors within the
+ * index.
  */
 struct vfio_irq_info {
 	__u32	argsz;
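
User space can discover whether a device offers this flexibility by
checking that NORESIZE is clear on the MSI-X index. A minimal sketch
follows; it is not part of the patch, device_fd is assumed to be an
already-open VFIO device file descriptor, and the helper name is made
up.

#include <linux/vfio.h>
#include <sys/ioctl.h>

/* Return non-zero when MSI-X vectors can be resized without teardown. */
static int vfio_msix_is_resizable(int device_fd)
{
	struct vfio_irq_info info = {
		.argsz = sizeof(info),
		.index = VFIO_PCI_MSIX_IRQ_INDEX,
	};

	if (ioctl(device_fd, VFIO_DEVICE_GET_IRQ_INFO, &info))
		return 0;

	/* NORESIZE clear: vectors can be enabled and disabled individually. */
	return !(info.flags & VFIO_IRQ_INFO_NORESIZE);
}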