[RFC,18/29] nvkm/vgpu: introduce pci_driver.sriov_configure() in nvkm

Message ID 20240922124951.1946072-19-zhiw@nvidia.com (mailing list archive)
State New, archived
Series Introduce NVIDIA GPU Virtualization (vGPU) Support

Commit Message

Zhi Wang Sept. 22, 2024, 12:49 p.m. UTC
The kernel PCI core provides a sysfs UAPI for the user to specify the
number of VFs to be enabled.

To support this UAPI, the driver is required to implement
pci_driver.sriov_configure(). The number of VFs can only be changed
when there are no active vGPUs.

Implement pci_driver.sriov_configure() in nvkm. Introduce a blocking
call chain to let the NVIDIA vGPU manager handle this event.

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h  |  1 +
 drivers/gpu/drm/nouveau/nvkm/device/pci.c     | 14 +++++++++++
 .../gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c  | 25 +++++++++++++++++++
 include/drm/nvkm_vgpu_mgr_vfio.h              |  5 ++++
 4 files changed, 45 insertions(+)
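
The commit message above mentions a blocking call chain for this event. For
reference, a blocking notifier chain in the kernel is usually wired up as in
the minimal sketch below; the names vgpu_sriov_chain, vgpu_mgr_sriov_notify,
and vgpu_sriov_notify_example are hypothetical, and this is only an
illustration of the mechanism, not the code in this series.

/* Sketch of a blocking notifier chain; names are hypothetical. */
#include <linux/notifier.h>
#include <linux/printk.h>

static BLOCKING_NOTIFIER_HEAD(vgpu_sriov_chain);

/* Runs in the vGPU manager when sriov_configure() fires the event. */
static int vgpu_mgr_sriov_notify(struct notifier_block *nb,
				 unsigned long action, void *data)
{
	int num_vfs = *(int *)data;

	if (action == NVIDIA_VGPU_EVENT_PCI_SRIOV_CONFIGURE)
		pr_debug("vgpu mgr: configuring %d VFs\n", num_vfs);

	return NOTIFY_OK;
}

static struct notifier_block vgpu_mgr_nb = {
	.notifier_call = vgpu_mgr_sriov_notify,
};

/* Registration at vGPU manager init, notification from sriov_configure(). */
static void vgpu_sriov_notify_example(int num_vfs)
{
	blocking_notifier_chain_register(&vgpu_sriov_chain, &vgpu_mgr_nb);
	blocking_notifier_call_chain(&vgpu_sriov_chain,
				     NVIDIA_VGPU_EVENT_PCI_SRIOV_CONFIGURE,
				     &num_vfs);
}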

Comments

Jason Gunthorpe Sept. 26, 2024, 10:56 p.m. UTC | #1
On Sun, Sep 22, 2024 at 05:49:40AM -0700, Zhi Wang wrote:

> diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
> index d9ed2cd202ff..5c2c650c2df9 100644
> --- a/include/drm/nvkm_vgpu_mgr_vfio.h
> +++ b/include/drm/nvkm_vgpu_mgr_vfio.h
> @@ -6,8 +6,13 @@
>  #ifndef __NVKM_VGPU_MGR_VFIO_H__
>  #define __NVKM_VGPU_MGR_VFIO_H__
>  
> +enum {
> +	NVIDIA_VGPU_EVENT_PCI_SRIOV_CONFIGURE = 0,
> +};
> +
>  struct nvidia_vgpu_vfio_handle_data {
>  	void *priv;
> +	struct notifier_block notifier;
>  };

Nothing references this? Why would you need it?

It looks approx correct to me to just directly put your function in
the sriov_configure callback.

This is the callback that indicates the admin has decided to turn on
the SRIOV feature.

Jason
Zhi Wang Oct. 14, 2024, 8:32 a.m. UTC | #2
On 27/09/2024 1.56, Jason Gunthorpe wrote:
> On Sun, Sep 22, 2024 at 05:49:40AM -0700, Zhi Wang wrote:
> 
>> diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
>> index d9ed2cd202ff..5c2c650c2df9 100644
>> --- a/include/drm/nvkm_vgpu_mgr_vfio.h
>> +++ b/include/drm/nvkm_vgpu_mgr_vfio.h
>> @@ -6,8 +6,13 @@
>>   #ifndef __NVKM_VGPU_MGR_VFIO_H__
>>   #define __NVKM_VGPU_MGR_VFIO_H__
>>   
>> +enum {
>> +	NVIDIA_VGPU_EVENT_PCI_SRIOV_CONFIGURE = 0,
>> +};
>> +
>>   struct nvidia_vgpu_vfio_handle_data {
>>   	void *priv;
>> +	struct notifier_block notifier;
>>   };
> 
> Nothing references this? Why would you need it?
> 

Oops, these are leftovers of discarded changes. I will remove them in the
next iteration. Thanks so much for catching this.

> It looks approx correct to me to just directly put your function in
> the sriov_configure callback.
> 
> This is the callback that indicates the admin has decided to turn on
> the SRIOV feature.

Turning on the SRIOV feature is just a part of the process of enabling a
vGPU. The VF is not instantly usable until a vGPU type is chosen via
another userspace interface (e.g. fwctl).

Besides, the admin has to enable vGPU support by some means (a kernel
parameter is just one candidate) and the GSP firmware needs to be
configured accordingly when it is loaded.

As this is related to the user space interface, I am leaning towards
putting some restrictions/checks for the pre-condition in
driver.sriov_configure(), so the admin knows something is wrong in his
configuration as early as possible, instead of failing to create vGPUs
again and again before finding out he forgot to enable vGPU support.
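
For illustration, such an early check could look like the sketch below. It
reuses nvkm_vgpu_mgr_is_enabled() and nvkm_vgpu_mgr_pci_sriov_configure()
from the patch below, but the pci_err() message and the choice of -ENODEV
are only assumptions about what the check might report, not part of the
posted series.

/*
 * Illustrative sketch only: reject sriov_configure() early with a clear
 * message when vGPU support is not enabled, so the admin sees the root
 * cause in dmesg right away instead of failing later.
 */
static int nvkm_device_pci_sriov_configure(struct pci_dev *dev, int num_vfs)
{
	struct nvkm_device *device = pci_get_drvdata(dev);

	if (!nvkm_vgpu_mgr_is_enabled(device)) {
		pci_err(dev, "vGPU support is not enabled, refusing to configure %d VFs\n",
			num_vfs);
		return -ENODEV;
	}

	return nvkm_vgpu_mgr_pci_sriov_configure(device, num_vfs);
}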

Thanks,
Zhi.

> Jason
Zhi Wang Oct. 14, 2024, 8:36 a.m. UTC | #3
On 27/09/2024 1.56, Jason Gunthorpe wrote:

Re-sending this email, as there were weird spaces in the previous one.

> On Sun, Sep 22, 2024 at 05:49:40AM -0700, Zhi Wang wrote:
> 
>> diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
>> index d9ed2cd202ff..5c2c650c2df9 100644
>> --- a/include/drm/nvkm_vgpu_mgr_vfio.h
>> +++ b/include/drm/nvkm_vgpu_mgr_vfio.h
>> @@ -6,8 +6,13 @@
>>   #ifndef __NVKM_VGPU_MGR_VFIO_H__
>>   #define __NVKM_VGPU_MGR_VFIO_H__
>>   
>> +enum {
>> +	NVIDIA_VGPU_EVENT_PCI_SRIOV_CONFIGURE = 0,
>> +};
>> +
>>   struct nvidia_vgpu_vfio_handle_data {
>>   	void *priv;
>> +	struct notifier_block notifier;
>>   };
> 
> Nothing references this? Why would you need it?
> 
> It looks approx correct to me to just directly put your function in
> the sriov_configure callback.
> 

Oops, these are leftovers of discarded changes. I will remove them in the
next iteration. Thanks so much for catching this.

> This is the callback that indicates the admin has decided to turn on
> the SRIOV feature.
> 

As this is related to the user space interface, I am leaning towards putting
some restrictions/checks for the pre-condition in
driver.sriov_configure(), so the admin knows something is wrong in his
configuration as early as possible, instead of failing to create vGPUs
again and again before finding out he forgot to enable vGPU support.

Thanks,
Zhi.

> Jason
Jason Gunthorpe Oct. 15, 2024, 12:27 p.m. UTC | #4
On Mon, Oct 14, 2024 at 08:32:03AM +0000, Zhi Wang wrote:

> Turning on the SRIOV feature is just a part of the process enabling a 
> vGPU. The VF is not instantly usable before a vGPU type is chosen via 
> another userspace interface (e.g. fwctl).

That's OK, that has become pretty normal now that VFs are just empty
handles when they are created until they are properly profiled.

> Besides, admin has to enable the vGPU support by some means (e.g. a 
> kernel parameter is just one candidate) and GSP firmware needs to be 
> configured accordingly when being loaded.

Definitely not a kernel parameter..

> As this is related to user space interface, I am leaning towards putting 
> some restriction/checks for the pre-condition in the 
> driver.sriov_configure(), so admin would know there is something wrong 
> in his configuration as early as possible, instead of he failed to 
> creating vGPUs again and again, then he found he forgot to enable the 
> vGPU support.

Well, as I've said, this is poor, you shouldn't have a FW SRIOV enable
bit at all, or at least it shouldn't be user configurable.

If the PCI function supports SRIOV then it should work to turn it on.

Jason
Zhi Wang Oct. 15, 2024, 3:14 p.m. UTC | #5
On 15/10/2024 15.27, Jason Gunthorpe wrote:
> On Mon, Oct 14, 2024 at 08:32:03AM +0000, Zhi Wang wrote:
> 
>> Turning on the SRIOV feature is just a part of the process of enabling a
>> vGPU. The VF is not instantly usable until a vGPU type is chosen via
>> another userspace interface (e.g. fwctl).
> 
> That's OK, that has become pretty normal now that VFs are just empty
> handles when they are created until they are properly profiled.
> 
>> Besides, the admin has to enable vGPU support by some means (a kernel
>> parameter is just one candidate) and the GSP firmware needs to be
>> configured accordingly when it is loaded.
> 
> Definitely not a kernel parameter..
> 
>> As this is related to the user space interface, I am leaning towards putting
>> some restrictions/checks for the pre-condition in
>> driver.sriov_configure(), so the admin knows something is wrong in his
>> configuration as early as possible, instead of failing to create vGPUs
>> again and again before finding out he forgot to enable vGPU support.
> 
> Well, as I've said, this is poor, you shouldn't have a FW SRIOV enable
> bit at all, or at least it shouldn't be user configurable.
> 
> If the PCI function supports SRIOV then it should work to turn it on.
> 
> Jason

Makes sense. Then we don't need a user-configurable option for
enabling/disabling SRIOV, at least so far. We just enable it when we see
that the HW supports SRIOV.
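
A minimal sketch of that detection, assuming it runs on the probe path with
the PF's struct pci_dev in hand; pci_sriov_get_totalvfs() is the stock PCI
core helper, while the "enabled" field on struct nvkm_vgpu_mgr and the
function name below are placeholders, not something defined by this series.

/*
 * Sketch only: enable vGPU support automatically when the function
 * advertises SR-IOV, rather than gating it behind a user-visible knob.
 */
#include <linux/pci.h>

static void nvkm_vgpu_mgr_detect_sriov(struct nvkm_vgpu_mgr *vgpu_mgr,
				       struct pci_dev *pdev)
{
	/* TotalVFs from the SR-IOV capability; 0 if the capability is absent. */
	if (pci_sriov_get_totalvfs(pdev) > 0)
		vgpu_mgr->enabled = true;
}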

Patch

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
index 882965fd25ce..388758fa7ce8 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/vgpu_mgr/vgpu_mgr.h
@@ -40,5 +40,6 @@  void nvkm_vgpu_mgr_fini(struct nvkm_device *device);
 void nvkm_vgpu_mgr_populate_gsp_vf_info(struct nvkm_device *device,
 					void *info);
 void nvkm_vgpu_mgr_init_vfio_ops(struct nvkm_vgpu_mgr *vgpu_mgr);
+int nvkm_vgpu_mgr_pci_sriov_configure(struct nvkm_device *device, int num_vfs);
 
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/device/pci.c b/drivers/gpu/drm/nouveau/nvkm/device/pci.c
index 1543902b20e9..f39d2727d653 100644
--- a/drivers/gpu/drm/nouveau/nvkm/device/pci.c
+++ b/drivers/gpu/drm/nouveau/nvkm/device/pci.c
@@ -1766,6 +1766,9 @@  nvkm_device_pci_probe(struct pci_dev *pci_dev, const struct pci_device_id *id)
 	struct nvkm_device *device;
 	int ret, bits;
 
+	if (pci_dev->is_virtfn)
+		return -EINVAL;
+
 	if (vga_switcheroo_client_probe_defer(pci_dev))
 		return -EPROBE_DEFER;
 
@@ -1867,6 +1870,16 @@  nvkm_device_pci_probe(struct pci_dev *pci_dev, const struct pci_device_id *id)
 	return ret;
 }
 
+static int nvkm_device_pci_sriov_configure(struct pci_dev *dev, int num_vfs)
+{
+	struct nvkm_device *device = pci_get_drvdata(dev);
+
+	if (!nvkm_vgpu_mgr_is_enabled(device))
+		return -ENODEV;
+
+	return nvkm_vgpu_mgr_pci_sriov_configure(device, num_vfs);
+}
+
 static struct pci_device_id
 nvkm_device_pci_id_table[] = {
 	{
@@ -1889,6 +1902,7 @@  nvkm_device_pci_driver = {
 	.probe = nvkm_device_pci_probe,
 	.remove = nvkm_device_pci_remove,
 	.driver.pm = &nvkm_device_pci_pm,
+	.sriov_configure = nvkm_device_pci_sriov_configure,
 };
 
 MODULE_DEVICE_TABLE(pci, nvkm_device_pci_id_table);
diff --git a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
index 3654bd43b68a..47c459f93950 100644
--- a/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
+++ b/drivers/gpu/drm/nouveau/nvkm/vgpu_mgr/vgpu_mgr.c
@@ -207,3 +207,28 @@  void nvkm_vgpu_mgr_populate_gsp_vf_info(struct nvkm_device *device,
 	v = nvkm_rd32(device, 0x88000 + 0xbfc);
 	vf_info->b64bitBar2 = IS_BAR_64(v);
 }
+
+/**
+ * nvkm_vgpu_mgr_pci_sriov_configure - Configure SRIOV VFs
+ * @device: the nvkm_device pointer
+ * @num_vfs: Number of VFs
+ *
+ * Returns: num_vfs on success, negative error code on failure.
+ */
+int nvkm_vgpu_mgr_pci_sriov_configure(struct nvkm_device *device, int num_vfs)
+{
+	struct nvkm_vgpu_mgr *vgpu_mgr = &device->vgpu_mgr;
+	struct nvidia_vgpu_vfio_handle_data *vfio = &vgpu_mgr->vfio_handle_data;
+	struct pci_dev *pdev = nvkm_to_pdev(device);
+	int ret = 0;
+
+	if (vfio->priv)
+		return -EBUSY;
+
+	if (num_vfs)
+		ret = pci_enable_sriov(pdev, num_vfs);
+	else
+		pci_disable_sriov(pdev);
+
+	return ret ? ret : num_vfs;
+}
diff --git a/include/drm/nvkm_vgpu_mgr_vfio.h b/include/drm/nvkm_vgpu_mgr_vfio.h
index d9ed2cd202ff..5c2c650c2df9 100644
--- a/include/drm/nvkm_vgpu_mgr_vfio.h
+++ b/include/drm/nvkm_vgpu_mgr_vfio.h
@@ -6,8 +6,13 @@ 
 #ifndef __NVKM_VGPU_MGR_VFIO_H__
 #define __NVKM_VGPU_MGR_VFIO_H__
 
+enum {
+	NVIDIA_VGPU_EVENT_PCI_SRIOV_CONFIGURE = 0,
+};
+
 struct nvidia_vgpu_vfio_handle_data {
 	void *priv;
+	struct notifier_block notifier;
 };
 
 /* A combo of handles of RmClient and RmDevice */