From patchwork Sat Jan 4 07:52:29 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Akihiko Odaki X-Patchwork-Id: 13926143 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 12CBCE77197 for ; Sat, 4 Jan 2025 07:53:58 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tTyyk-0002RR-VM; Sat, 04 Jan 2025 02:53:51 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tTyyT-0001yW-UY for qemu-devel@nongnu.org; Sat, 04 Jan 2025 02:53:35 -0500 Received: from mail-pl1-x62c.google.com ([2607:f8b0:4864:20::62c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1tTyyR-00037N-Rm for qemu-devel@nongnu.org; Sat, 04 Jan 2025 02:53:33 -0500 Received: by mail-pl1-x62c.google.com with SMTP id d9443c01a7336-21649a7bcdcso176409405ad.1 for ; Fri, 03 Jan 2025 23:53:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=daynix-com.20230601.gappssmtp.com; s=20230601; t=1735977210; x=1736582010; darn=nongnu.org; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:from:to:cc:subject:date:message-id :reply-to; bh=AKxtFhYsEIBBoyDWKOc88hhZYX1qMEsEPG5KZLaNUDY=; b=BEfZjulp09svIV5VGUeVu1pkyL+fKeTHrNhPghx7IYU4DrO5Ztf/VSRZbgeOE/9Krc IieXgoQ/t7kg9H6G0DnGNJyX9n6Vg99HRk17RL77UtMNmVQgjcl55pU1Aa44DnXcYalH Z+L1d32BzvEFKZuDUh1q5duGzjusb58/6uNeED4jzpoXdfvM5XZvLOadEzhc2XXrBQwO 84JulGEk1PgWYTCOWGGO1jWQ7zznh3xu4LPBfcM5UQTY65eI1hVD1d3cerEkl0grKaXV fWVZUqcRMalmAts01zO6fMF/pbQ1WygkAzALVOMwfElZbj9q5qIcQZrlUZbkyKvtzOwS 2BlA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1735977210; x=1736582010; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=AKxtFhYsEIBBoyDWKOc88hhZYX1qMEsEPG5KZLaNUDY=; b=A7l3zQeerAAkLWcDSx8OXglyv9w68xpKKRRkj52RNx0J6Kyhbx9qWwB3yp0Ui2HdnW CbAoLBSlT2kLs3QVy++UUqhIDu94+iZCkdkZwx7X+aLKbJu2kGixgRIQY1Z7axdw4JnO c5gsm7XvwUr6i8enI1qXiUqHfkaUNL8ARiLu/gzIVSxP4N10iMhQHaJwYY8mx9TWWE+s c4NoR091Qz+sE9g4ClaYVXzi/d69cKdGfsEy67e/o42J7y5q72qgKHzdwO5o2bXZ63VI /Grltq8FnOU46qccMvrFbzOqzswgL+oQmVPeCp0gPp5NY2j2Dg/PaHcDmbSq3xr36GqS 5Vlg== X-Gm-Message-State: AOJu0YxsiuAh1zBETBrW1mw3gzdXWckawAkP7MY40GbTH/1mgGT3IMqL Q0PcKva3COfjXzPQGa/JW1Ozrj7fyKsLMRx7HbQtw/6Ma31OvoexTjSmMDnsFlQ= X-Gm-Gg: ASbGnctmQBpGfJoUqobu3eBCLWqLHXFu1Y+2Suu1cuSTU0sv5kibK74A8I6oaWB1J4U lnuWscv0Nzy6b/iFAskqd++u4mFOwU5PkVudcR4+he0M9/GEGDdVzGnB1IJQcjPDtIrYFDHKpxz T3aDDnzRkR0KXN/NCzLT1bEeFKVEWCMItCu+4tQS/z3Pl31XP2OyItg+51vSpgvhT5yyPd0kN3R gq+atIyKSUqIDI01/F3aSWS1+SuBikCMOvIEv4T4lA+2/FNjrlBzseSdZU6 X-Google-Smtp-Source: AGHT+IEo1l4R6874vOWAqyvpFEn0xIhHhvjxqpdOoDQyvYcxiHNvQ84qTs5w+6/I2ckVo/A30fcnXA== X-Received: by 2002:a05:6a21:100f:b0:1e1:a693:d5fd with SMTP id adf61e73a8af0-1e5e04a3432mr82647793637.25.1735977210594; Fri, 03 Jan 2025 23:53:30 -0800 (PST) Received: from localhost ([157.82.207.107]) by smtp.gmail.com with UTF8SMTPSA id d2e1a72fcca58-72aad815796sm28340967b3a.28.2025.01.03.23.53.26 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 03 Jan 2025 23:53:30 -0800 (PST) From: Akihiko Odaki Date: Sat, 04 Jan 2025 16:52:29 +0900 Subject: [PATCH v18 09/14] pcie_sriov: Reuse SR-IOV VF device instances MIME-Version: 1.0 Message-Id: <20250104-reuse-v18-9-c349eafd8673@daynix.com> References: <20250104-reuse-v18-0-c349eafd8673@daynix.com> In-Reply-To: <20250104-reuse-v18-0-c349eafd8673@daynix.com> To: =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , "Michael S. Tsirkin" , Marcel Apfelbaum , Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_Goa?= =?utf-8?q?ter?= , Paolo Bonzini , =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= , Eduardo Habkost , Sriram Yagnaraman , Jason Wang , Keith Busch , Klaus Jensen , Markus Armbruster , Matthew Rosato , Eric Farman , Harsh Prateek Bora , Shivaprasad G Bhat Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, devel@daynix.com, Akihiko Odaki X-Mailer: b4 0.14-dev-fd6e3 Received-SPF: pass client-ip=2607:f8b0:4864:20::62c; envelope-from=akihiko.odaki@daynix.com; helo=mail-pl1-x62c.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Disable SR-IOV VF devices by reusing code to power down PCI devices instead of removing them when the guest requests to disable VFs. This allows to realize devices and report VF realization errors at PF realization time. Signed-off-by: Akihiko Odaki --- include/hw/pci/pci.h | 5 --- include/hw/pci/pci_device.h | 15 ++++++++ include/hw/pci/pcie_sriov.h | 1 - hw/pci/pci.c | 2 +- hw/pci/pcie_sriov.c | 94 +++++++++++++++++++-------------------------- 5 files changed, 55 insertions(+), 62 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index c1e897f44143..e40235d9ab07 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -676,9 +676,4 @@ static inline void pci_irq_deassert(PCIDevice *pci_dev) MSIMessage pci_get_msi_message(PCIDevice *dev, int vector); void pci_set_enabled(PCIDevice *pci_dev, bool state); -static inline void pci_set_power(PCIDevice *pci_dev, bool state) -{ - pci_set_enabled(pci_dev, state); -} - #endif diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h index 66b6c08b0118..37d88f0cd05b 100644 --- a/include/hw/pci/pci_device.h +++ b/include/hw/pci/pci_device.h @@ -219,6 +219,21 @@ static inline uint16_t pci_get_bdf(PCIDevice *dev) return PCI_BUILD_BDF(pci_bus_num(pci_get_bus(dev)), dev->devfn); } +static inline void pci_set_power(PCIDevice *pci_dev, bool state) +{ + /* + * Don't change the enabled state of VFs when powering on/off the device. + * + * When powering on, VFs must not be enabled immediately but they must + * wait until the guest configures SR-IOV. + * When powering off, their corresponding PFs will be reset and disable + * VFs. + */ + if (!pci_is_vf(pci_dev)) { + pci_set_enabled(pci_dev, state); + } +} + uint16_t pci_requester_id(PCIDevice *dev); /* DMA access functions */ diff --git a/include/hw/pci/pcie_sriov.h b/include/hw/pci/pcie_sriov.h index aa704e8f9d9f..70649236c18a 100644 --- a/include/hw/pci/pcie_sriov.h +++ b/include/hw/pci/pcie_sriov.h @@ -18,7 +18,6 @@ typedef struct PCIESriovPF { uint16_t num_vfs; /* Number of virtual functions created */ uint8_t vf_bar_type[PCI_NUM_REGIONS]; /* Store type for each VF bar */ - const char *vfname; /* Reference to the device type used for the VFs */ PCIDevice **vf; /* Pointer to an array of num_vfs VF devices */ } PCIESriovPF; diff --git a/hw/pci/pci.c b/hw/pci/pci.c index e1563d738ce2..d769952e89ba 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -2972,7 +2972,7 @@ void pci_set_enabled(PCIDevice *d, bool state) memory_region_set_enabled(&d->bus_master_enable_region, (pci_get_word(d->config + PCI_COMMAND) & PCI_COMMAND_MASTER) && d->enabled); - if (!d->enabled) { + if (d->qdev.realized) { pci_device_reset(d); } } diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c index 91c64c988eb4..f1993bc553c0 100644 --- a/hw/pci/pcie_sriov.c +++ b/hw/pci/pcie_sriov.c @@ -20,9 +20,16 @@ #include "qapi/error.h" #include "trace.h" -static PCIDevice *register_vf(PCIDevice *pf, int devfn, - const char *name, uint16_t vf_num); -static void unregister_vfs(PCIDevice *dev); +static void unparent_vfs(PCIDevice *dev, uint16_t total_vfs) +{ + for (uint16_t i = 0; i < total_vfs; i++) { + PCIDevice *vf = dev->exp.sriov_pf.vf[i]; + object_unparent(OBJECT(vf)); + object_unref(OBJECT(vf)); + } + g_free(dev->exp.sriov_pf.vf); + dev->exp.sriov_pf.vf = NULL; +} bool pcie_sriov_pf_init(PCIDevice *dev, uint16_t offset, const char *vfname, uint16_t vf_dev_id, @@ -30,6 +37,7 @@ bool pcie_sriov_pf_init(PCIDevice *dev, uint16_t offset, uint16_t vf_offset, uint16_t vf_stride, Error **errp) { + BusState *bus = qdev_get_parent_bus(&dev->qdev); int32_t devfn = dev->devfn + vf_offset; uint8_t *cfg = dev->config + offset; uint8_t *wmask; @@ -44,7 +52,6 @@ bool pcie_sriov_pf_init(PCIDevice *dev, uint16_t offset, offset, PCI_EXT_CAP_SRIOV_SIZEOF); dev->exp.sriov_cap = offset; dev->exp.sriov_pf.num_vfs = 0; - dev->exp.sriov_pf.vfname = g_strdup(vfname); dev->exp.sriov_pf.vf = NULL; pci_set_word(cfg + PCI_SRIOV_VF_OFFSET, vf_offset); @@ -78,14 +85,34 @@ bool pcie_sriov_pf_init(PCIDevice *dev, uint16_t offset, qdev_prop_set_bit(&dev->qdev, "multifunction", true); + dev->exp.sriov_pf.vf = g_new(PCIDevice *, total_vfs); + + for (uint16_t i = 0; i < total_vfs; i++) { + PCIDevice *vf = pci_new(devfn, vfname); + vf->exp.sriov_vf.pf = dev; + vf->exp.sriov_vf.vf_number = i; + + if (!qdev_realize(&vf->qdev, bus, errp)) { + unparent_vfs(dev, i); + return false; + } + + /* set vid/did according to sr/iov spec - they are not used */ + pci_config_set_vendor_id(vf->config, 0xffff); + pci_config_set_device_id(vf->config, 0xffff); + + dev->exp.sriov_pf.vf[i] = vf; + devfn += vf_stride; + } + return true; } void pcie_sriov_pf_exit(PCIDevice *dev) { - unregister_vfs(dev); - g_free((char *)dev->exp.sriov_pf.vfname); - dev->exp.sriov_pf.vfname = NULL; + uint8_t *cfg = dev->config + dev->exp.sriov_cap; + + unparent_vfs(dev, pci_get_word(cfg + PCI_SRIOV_TOTAL_VF)); } void pcie_sriov_pf_init_vf_bar(PCIDevice *dev, int region_num, @@ -151,38 +178,11 @@ void pcie_sriov_vf_register_bar(PCIDevice *dev, int region_num, } } -static PCIDevice *register_vf(PCIDevice *pf, int devfn, const char *name, - uint16_t vf_num) -{ - PCIDevice *dev = pci_new(devfn, name); - dev->exp.sriov_vf.pf = pf; - dev->exp.sriov_vf.vf_number = vf_num; - PCIBus *bus = pci_get_bus(pf); - Error *local_err = NULL; - - qdev_realize(&dev->qdev, &bus->qbus, &local_err); - if (local_err) { - error_report_err(local_err); - return NULL; - } - - /* set vid/did according to sr/iov spec - they are not used */ - pci_config_set_vendor_id(dev->config, 0xffff); - pci_config_set_device_id(dev->config, 0xffff); - - return dev; -} - static void register_vfs(PCIDevice *dev) { uint16_t num_vfs; uint16_t i; uint16_t sriov_cap = dev->exp.sriov_cap; - uint16_t vf_offset = - pci_get_word(dev->config + sriov_cap + PCI_SRIOV_VF_OFFSET); - uint16_t vf_stride = - pci_get_word(dev->config + sriov_cap + PCI_SRIOV_VF_STRIDE); - int32_t devfn = dev->devfn + vf_offset; assert(sriov_cap > 0); num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF); @@ -190,18 +190,10 @@ static void register_vfs(PCIDevice *dev) return; } - dev->exp.sriov_pf.vf = g_new(PCIDevice *, num_vfs); - trace_sriov_register_vfs(dev->name, PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn), num_vfs); for (i = 0; i < num_vfs; i++) { - dev->exp.sriov_pf.vf[i] = register_vf(dev, devfn, - dev->exp.sriov_pf.vfname, i); - if (!dev->exp.sriov_pf.vf[i]) { - num_vfs = i; - break; - } - devfn += vf_stride; + pci_set_enabled(dev->exp.sriov_pf.vf[i], true); } dev->exp.sriov_pf.num_vfs = num_vfs; } @@ -214,12 +206,8 @@ static void unregister_vfs(PCIDevice *dev) trace_sriov_unregister_vfs(dev->name, PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn), num_vfs); for (i = 0; i < num_vfs; i++) { - PCIDevice *vf = dev->exp.sriov_pf.vf[i]; - object_unparent(OBJECT(vf)); - object_unref(OBJECT(vf)); + pci_set_enabled(dev->exp.sriov_pf.vf[i], false); } - g_free(dev->exp.sriov_pf.vf); - dev->exp.sriov_pf.vf = NULL; dev->exp.sriov_pf.num_vfs = 0; } @@ -241,14 +229,10 @@ void pcie_sriov_config_write(PCIDevice *dev, uint32_t address, PCI_FUNC(dev->devfn), off, val, len); if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) { - if (dev->exp.sriov_pf.num_vfs) { - if (!(val & PCI_SRIOV_CTRL_VFE)) { - unregister_vfs(dev); - } + if (val & PCI_SRIOV_CTRL_VFE) { + register_vfs(dev); } else { - if (val & PCI_SRIOV_CTRL_VFE) { - register_vfs(dev); - } + unregister_vfs(dev); } } }