From patchwork Fri Nov 4 19:57:25 2022
X-Patchwork-Submitter: Anthony DeRossi
X-Patchwork-Id: 13032544
From: Anthony DeRossi
To: kvm@vger.kernel.org
Cc: alex.williamson@redhat.com, cohuck@redhat.com, jgg@ziepe.ca, kevin.tian@intel.com, abhsahu@nvidia.com, yishaih@nvidia.com
Subject: [PATCH v4 1/3] vfio: Fix container device registration life cycle
Date: Fri, 4 Nov 2022 12:57:25 -0700
Message-Id: <20221104195727.4629-2-ajderossi@gmail.com>
X-Mailer: git-send-email 2.37.4
In-Reply-To: <20221104195727.4629-1-ajderossi@gmail.com>
References: <20221104195727.4629-1-ajderossi@gmail.com>

In vfio_device_open(), vfio_device_container_register() is always called
when open_count == 1. On error, vfio_device_container_unregister() is
only called when open_count == 1 and close_device is set. This leaks a
registration for devices without a close_device implementation.

In vfio_device_fops_release(), vfio_device_container_unregister() is
called unconditionally. This can cause a device to be unregistered
multiple times.

Treating container device registration/unregistration uniformly (always
when open_count == 1) fixes both issues.

Fixes: ce4b4657ff18 ("vfio: Replace the DMA unmapping notifier with a callback")
Signed-off-by: Anthony DeRossi
---
 drivers/vfio/vfio_main.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 2d168793d4e1..9a4af880e941 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -801,8 +801,9 @@ static struct file *vfio_device_open(struct vfio_device *device)
 err_close_device:
 	mutex_lock(&device->dev_set->lock);
 	mutex_lock(&device->group->group_lock);
-	if (device->open_count == 1 && device->ops->close_device) {
-		device->ops->close_device(device);
+	if (device->open_count == 1) {
+		if (device->ops->close_device)
+			device->ops->close_device(device);
 
 		vfio_device_container_unregister(device);
 	}
@@ -1017,10 +1018,12 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 	mutex_lock(&device->dev_set->lock);
 	vfio_assert_device_open(device);
 	mutex_lock(&device->group->group_lock);
-	if (device->open_count == 1 && device->ops->close_device)
-		device->ops->close_device(device);
+	if (device->open_count == 1) {
+		if (device->ops->close_device)
+			device->ops->close_device(device);
 
-	vfio_device_container_unregister(device);
+		vfio_device_container_unregister(device);
+	}
 	mutex_unlock(&device->group->group_lock);
 	device->open_count--;
 	if (device->open_count == 0)

From patchwork Fri Nov 4 19:57:26 2022
X-Patchwork-Submitter: Anthony DeRossi
X-Patchwork-Id: 13032545
From: Anthony DeRossi
To: kvm@vger.kernel.org
Cc: alex.williamson@redhat.com, cohuck@redhat.com, jgg@ziepe.ca, kevin.tian@intel.com, abhsahu@nvidia.com, yishaih@nvidia.com
Subject: [PATCH v4 2/3] vfio: Add an open counter to vfio_device_set
Date: Fri, 4 Nov 2022 12:57:26 -0700
Message-Id: <20221104195727.4629-3-ajderossi@gmail.com>
X-Mailer: git-send-email 2.37.4
In-Reply-To: <20221104195727.4629-1-ajderossi@gmail.com>
References: <20221104195727.4629-1-ajderossi@gmail.com>

open_count is incremented before open_device() and decremented after
close_device() for each device in the set. This allows devices to
determine whether shared resources are in use without tracking them
manually or accessing the private open_count in vfio_device.

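The counting rule described above can be modeled outside the kernel. The sketch below is a hypothetical user-space illustration, not the kernel code: the model_* names, the simplified structs, the seen_in_open field, and the absence of locking are all assumptions made for the example. It shows the set-wide counter being raised before a driver's open_device() callback runs and lowered again after close_device(), or after a failed open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for vfio_device_set and vfio_device. */
struct model_dev_set {
	unsigned int open_count;	/* devices in the set that are open */
};

struct model_device;

struct model_ops {
	int (*open_device)(struct model_device *dev);	/* may be NULL */
	void (*close_device)(struct model_device *dev);	/* may be NULL */
};

struct model_device {
	struct model_dev_set *set;
	const struct model_ops *ops;
	unsigned int open_count;	/* per-device counter */
	unsigned int seen_in_open;	/* set count observed by open_device */
};

/* First open of a device: count it in the set before calling the driver. */
static int model_open(struct model_device *dev)
{
	int ret = 0;

	dev->open_count++;
	if (dev->open_count == 1) {
		dev->set->open_count++;
		if (dev->ops->open_device)
			ret = dev->ops->open_device(dev);
		if (ret) {
			/* Undo both counters when the driver refuses the open. */
			dev->set->open_count--;
			dev->open_count--;
		}
	}
	return ret;
}

/* Last close of a device: call the driver first, then drop the set count. */
static void model_release(struct model_device *dev)
{
	assert(dev->open_count > 0);
	if (dev->open_count == 1) {
		if (dev->ops->close_device)
			dev->ops->close_device(dev);
		assert(dev->set->open_count > 0);
		dev->set->open_count--;
	}
	dev->open_count--;
}

/* Example callback that records the set count and succeeds. */
static int model_open_ok(struct model_device *dev)
{
	dev->seen_in_open = dev->set->open_count;
	return 0;
}

/* Example callback that records the set count and fails. */
static int model_open_fail(struct model_device *dev)
{
	dev->seen_in_open = dev->set->open_count;
	return -1;
}
```

In this model, a callback that wants to know whether a sibling in the set is open can compare the set counter against 1 rather than walk the device list.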
Signed-off-by: Anthony DeRossi
---
 drivers/vfio/vfio_main.c | 3 +++
 include/linux/vfio.h     | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 9a4af880e941..6c65418fc7e3 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -761,6 +761,7 @@ static struct file *vfio_device_open(struct vfio_device *device)
 		mutex_lock(&device->group->group_lock);
 		device->kvm = device->group->kvm;
 
+		device->dev_set->open_count++;
 		if (device->ops->open_device) {
 			ret = device->ops->open_device(device);
 			if (ret)
@@ -809,6 +810,7 @@ static struct file *vfio_device_open(struct vfio_device *device)
 	}
 err_undo_count:
 	mutex_unlock(&device->group->group_lock);
+	device->dev_set->open_count--;
 	device->open_count--;
 	if (device->open_count == 0 && device->kvm)
 		device->kvm = NULL;
@@ -1023,6 +1025,7 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 			device->ops->close_device(device);
 
 		vfio_device_container_unregister(device);
+		device->dev_set->open_count--;
 	}
 	mutex_unlock(&device->group->group_lock);
 	device->open_count--;
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index e7cebeb875dd..5becdcdf4ba2 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -28,6 +28,7 @@ struct vfio_device_set {
 	struct mutex lock;
 	struct list_head device_list;
 	unsigned int device_count;
+	unsigned int open_count;
 };
 
 struct vfio_device {

From patchwork Fri Nov 4 19:57:27 2022
X-Patchwork-Submitter: Anthony DeRossi
X-Patchwork-Id: 13032546
From: Anthony DeRossi
To: kvm@vger.kernel.org
Cc: alex.williamson@redhat.com, cohuck@redhat.com, jgg@ziepe.ca, kevin.tian@intel.com, abhsahu@nvidia.com, yishaih@nvidia.com
Subject: [PATCH v4 3/3] vfio/pci: Check the device set open_count on reset
Date: Fri, 4 Nov 2022 12:57:27 -0700
Message-Id: <20221104195727.4629-4-ajderossi@gmail.com>
X-Mailer: git-send-email 2.37.4
In-Reply-To: <20221104195727.4629-1-ajderossi@gmail.com>
References: <20221104195727.4629-1-ajderossi@gmail.com>

vfio_pci_dev_set_needs_reset() inspects the open_count of every device
in the set to determine whether a reset is needed. The current device
always has open_count == 1 within vfio_pci_core_disable(), effectively
disabling the reset logic. This field is also documented as private in
vfio_device, so it should not be used to determine whether other devices
in the set are open.

Checking for open_count > 1 on the device set fixes both issues.

After commit 2cd8b14aaa66 ("vfio/pci: Move to the device set
infrastructure"), failure to create a new file for a device would cause
the reset to be skipped due to open_count being decremented after
calling close_device() in the error path.

After commit eadd86f835c6 ("vfio: Remove calls to
vfio_group_add_container_user()"), releasing a device would always skip
the reset due to an ordering change in vfio_device_fops_release().

Failing to reset the device leaves it in an unknown state, potentially
causing errors when it is bound to a different driver.

This issue was observed with a Radeon RX Vega 56 [1002:687f] (rev c3)
assigned to a Windows guest. After shutting down the guest, unbinding
the device from vfio-pci, and binding the device to amdgpu:

[ 548.007102] [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
[ 548.027174] [drm:psp_hw_init [amdgpu]] *ERROR* PSP firmware loading failed
[ 548.027242] [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* hw_init of IP block failed -22
[ 548.027306] amdgpu 0000:0a:00.0: amdgpu: amdgpu_device_ip_init failed
[ 548.027308] amdgpu 0000:0a:00.0: amdgpu: Fatal error during GPU init

Fixes: 2cd8b14aaa66 ("vfio/pci: Move to the device set infrastructure")
Fixes: eadd86f835c6 ("vfio: Remove calls to vfio_group_add_container_user()")
Signed-off-by: Anthony DeRossi
---
 drivers/vfio/pci/vfio_pci_core.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index badc9d828cac..e65c70781fe2 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -2488,12 +2488,12 @@ static bool vfio_pci_dev_set_needs_reset(struct vfio_device_set *dev_set)
 	struct vfio_pci_core_device *cur;
 	bool needs_reset = false;
 
-	list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list) {
-		/* No VFIO device in the set can have an open device FD */
-		if (cur->vdev.open_count)
-			return false;
+	/* No other VFIO device in the set can be open. */
+	if (dev_set->open_count > 1)
+		return false;
+
+	list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list)
 		needs_reset |= cur->needs_reset;
-	}
 	return needs_reset;
 }
 
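To see why the per-device check never allowed a reset, the decision can be reproduced in a small user-space model. This is only a sketch under stated assumptions: the model_* types and the fixed-size array are hypothetical stand-ins for the kernel's device set list, and locking is left out. old_needs_reset() mirrors the removed loop and new_needs_reset() mirrors its replacement; both are meant to be evaluated at the point where the closing device still has open_count == 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_DEVS 4

/* Hypothetical stand-in for vfio_pci_core_device. */
struct model_pci_dev {
	unsigned int open_count;	/* per-device counter */
	bool needs_reset;
};

/* Hypothetical stand-in for vfio_device_set. */
struct model_dev_set {
	struct model_pci_dev devs[MODEL_MAX_DEVS];
	size_t device_count;
	unsigned int open_count;	/* number of open devices in the set */
};

/* Old logic: bail out if any device, including the caller, is open. */
static bool old_needs_reset(const struct model_dev_set *set)
{
	bool needs_reset = false;
	size_t i;

	assert(set->device_count <= MODEL_MAX_DEVS);
	for (i = 0; i < set->device_count; i++) {
		if (set->devs[i].open_count)
			return false;
		needs_reset |= set->devs[i].needs_reset;
	}
	return needs_reset;
}

/* New logic: bail out only if a device other than the caller is open. */
static bool new_needs_reset(const struct model_dev_set *set)
{
	bool needs_reset = false;
	size_t i;

	assert(set->device_count <= MODEL_MAX_DEVS);
	if (set->open_count > 1)
		return false;

	for (i = 0; i < set->device_count; i++)
		needs_reset |= set->devs[i].needs_reset;
	return needs_reset;
}
```

With one open device that needs a reset, the old function returns false because it sees the caller's own open_count, while the new function returns true. Once a second device in the set is open, the new function also returns false.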