[mlx5-next,v5,3/4] net/mlx5: Dynamically assign MSI-X vectors count

Message ID 20210126085730.1165673-4-leon@kernel.org (mailing list archive)
State Awaiting Upstream
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/cover_letter             success  Link
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/subject_prefix           success  Link
netdev/cc_maintainers           success  CCed 6 of 6 maintainers
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 130 lines checked
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/header_inline            success  Link
netdev/stable                   success  Stable not CCed

Commit Message

Leon Romanovsky Jan. 26, 2021, 8:57 a.m. UTC
From: Leon Romanovsky <leonro@nvidia.com>

The number of MSI-X vectors is a PCI property visible through lspci. That
field is read-only and configured by the device. A static assignment of
MSI-X vectors does not make good use of a newly created VF, because the
device cannot know the future load and configuration under which that VF
will be used.

To overcome the inefficiency in the spread of such MSI-X vectors, we
allow the kernel to instruct the device with the needed number of such
vectors.

This change immediately increases the number of MSI-X vectors on a
system with @ VFs from 12 vectors per VF to 32 vectors per VF.

Before this patch:
[root@server ~]# lspci -vs 0000:08:00.2
08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
....
	Capabilities: [9c] MSI-X: Enable- Count=12 Masked-

After this patch:
[root@server ~]# lspci -vs 0000:08:00.2
08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
....
	Capabilities: [9c] MSI-X: Enable- Count=32 Masked-

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/main.c    |  4 ++
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  5 ++
 .../net/ethernet/mellanox/mlx5/core/pci_irq.c | 72 +++++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/sriov.c   | 13 +++-
 4 files changed, 92 insertions(+), 2 deletions(-)
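
For readers unfamiliar with the lspci line quoted in the commit message: the
Count value is derived from the Table Size field of the MSI-X Message Control
register, which stores the number of vectors minus one. The sketch below
decodes that register in plain C. The bit masks follow the PCI specification
(Linux names them PCI_MSIX_FLAGS_QSIZE, PCI_MSIX_FLAGS_MASKALL and
PCI_MSIX_FLAGS_ENABLE); the struct and function names are made up for this
illustration and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSI-X Message Control bits, as laid out in the PCI specification. */
#define MSIX_FLAGS_QSIZE   0x07FFu /* Table Size, encoded as N - 1 */
#define MSIX_FLAGS_MASKALL 0x4000u /* Function Mask ("Masked" in lspci) */
#define MSIX_FLAGS_ENABLE  0x8000u /* MSI-X Enable ("Enable" in lspci) */

/* Hypothetical container for the three values lspci prints. */
struct msix_info {
	unsigned int count;
	bool enabled;
	bool masked;
};

/* Decode a raw 16-bit Message Control value. */
static struct msix_info msix_decode(uint16_t ctrl)
{
	struct msix_info info = {
		.count   = (ctrl & MSIX_FLAGS_QSIZE) + 1u,
		.enabled = (ctrl & MSIX_FLAGS_ENABLE) != 0,
		.masked  = (ctrl & MSIX_FLAGS_MASKALL) != 0,
	};

	return info;
}
```

A register value of 0x000B decodes to "Enable- Count=12 Masked-" and 0x001F
to "Enable- Count=32 Masked-", matching the two lspci outputs above.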

Comments

Bjorn Helgaas Feb. 2, 2021, 5:25 p.m. UTC | #1
On Tue, Jan 26, 2021 at 10:57:29AM +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> The number of MSI-X vectors is PCI property visible through lspci, that
> field is read-only and configured by the device. The static assignment
> of an amount of MSI-X vectors doesn't allow utilize the newly created
> VF because it is not known to the device the future load and configuration
> where that VF will be used.
> 
> To overcome the inefficiency in the spread of such MSI-X vectors, we
> allow the kernel to instruct the device with the needed number of such
> vectors.
> 
> Such change immediately increases the amount of MSI-X vectors for the
> system with @ VFs from 12 vectors per-VF, to be 32 vectors per-VF.

Not knowing anything about mlx5, it looks like maybe this gets some
parameters from firmware on the device, then changes the way MSI-X
vectors are distributed among VFs?

I don't understand the implications above about "static assignment"
and "inefficiency in the spread."  I guess maybe this takes advantage
of the fact that you know how many VFs are enabled, so if NumVFs is
less than TotalVFs, you can assign more vectors to each VF?

If that's the case, spell it out a little bit.  The current text makes
it sound like you discovered brand new MSI-X vectors somewhere,
regardless of how many VFs are enabled, which doesn't sound right.

> Before this patch:
> [root@server ~]# lspci -vs 0000:08:00.2
> 08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
> ....
> 	Capabilities: [9c] MSI-X: Enable- Count=12 Masked-
> 
> After this patch:
> [root@server ~]# lspci -vs 0000:08:00.2
> 08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
> ....
> 	Capabilities: [9c] MSI-X: Enable- Count=32 Masked-
> 
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  .../net/ethernet/mellanox/mlx5/core/main.c    |  4 ++
>  .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  5 ++
>  .../net/ethernet/mellanox/mlx5/core/pci_irq.c | 72 +++++++++++++++++++
>  .../net/ethernet/mellanox/mlx5/core/sriov.c   | 13 +++-
>  4 files changed, 92 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> index ca6f2fc39ea0..79cfcc844156 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> @@ -567,6 +567,10 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx)
>  	if (MLX5_CAP_GEN_MAX(dev, mkey_by_name))
>  		MLX5_SET(cmd_hca_cap, set_hca_cap, mkey_by_name, 1);
>  
> +	if (MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix))
> +		MLX5_SET(cmd_hca_cap, set_hca_cap, num_total_dynamic_vf_msix,
> +			 MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix));
> +
>  	return set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE);
>  }
>  
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
> index 0a0302ce7144..5babb4434a87 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
> @@ -172,6 +172,11 @@ int mlx5_irq_attach_nb(struct mlx5_irq_table *irq_table, int vecidx,
>  		       struct notifier_block *nb);
>  int mlx5_irq_detach_nb(struct mlx5_irq_table *irq_table, int vecidx,
>  		       struct notifier_block *nb);
> +
> +int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int devfn,
> +			    int msix_vec_count);
> +int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs);
> +
>  struct cpumask *
>  mlx5_irq_get_affinity_mask(struct mlx5_irq_table *irq_table, int vecidx);
>  struct cpu_rmap *mlx5_irq_get_rmap(struct mlx5_irq_table *table);
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
> index 6fd974920394..2a35888fcff0 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
> @@ -55,6 +55,78 @@ static struct mlx5_irq *mlx5_irq_get(struct mlx5_core_dev *dev, int vecidx)
>  	return &irq_table->irq[vecidx];
>  }
>  
> +/**
> + * mlx5_get_default_msix_vec_count() - Get defaults of number of MSI-X vectors
> + * to be set

s/defaults of number of/default number of/
s/to be set/to be assigned to each VF/ ?

> + * @dev: PF to work on
> + * @num_vfs: Number of VFs was asked when SR-IOV was enabled

s/Number of VFs was asked when SR-IOV was enabled/Number of enabled VFs/ ?

> + **/

Documentation/doc-guide/kernel-doc.rst says kernel-doc comments end
with just "*/" (not "**/").

> +int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs)
> +{
> +	int num_vf_msix, min_msix, max_msix;
> +
> +	num_vf_msix = MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix);
> +	if (!num_vf_msix)
> +		return 0;
> +
> +	min_msix = MLX5_CAP_GEN(dev, min_dynamic_vf_msix_table_size);
> +	max_msix = MLX5_CAP_GEN(dev, max_dynamic_vf_msix_table_size);
> +
> +	/* Limit maximum number of MSI-X to leave some of them free in the
> +	 * pool and ready to be assigned by the users without need to resize
> +	 * other Vfs.

s/number of MSI-X/number of MSI-X vectors/
s/Vfs/VFs/

> +	 */
> +	return max(min(num_vf_msix / num_vfs, max_msix / 2), min_msix);
> +}
> +
> +/**
> + * mlx5_set_msix_vec_count() - Set dynamically allocated MSI-X to the VF
> + * @dev: PF to work on
> + * @function_id: Internal PCI VF function id
> + * @msix_vec_count: Number of MSI-X to set

s/id/ID/
s/Number of MSI-X/Number of MSI-X vectors/

> + **/
> +int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int function_id,
> +			    int msix_vec_count)
> +{
> +	int sz = MLX5_ST_SZ_BYTES(set_hca_cap_in);
> +	int num_vf_msix, min_msix, max_msix;
> +	void *hca_cap, *cap;
> +	int ret;
> +
> +	num_vf_msix = MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix);
> +	if (!num_vf_msix)
> +		return 0;
> +
> +	if (!MLX5_CAP_GEN(dev, vport_group_manager) || !mlx5_core_is_pf(dev))
> +		return -EOPNOTSUPP;
> +
> +	min_msix = MLX5_CAP_GEN(dev, min_dynamic_vf_msix_table_size);
> +	max_msix = MLX5_CAP_GEN(dev, max_dynamic_vf_msix_table_size);
> +
> +	if (msix_vec_count < min_msix)
> +		return -EINVAL;
> +
> +	if (msix_vec_count > max_msix)
> +		return -EOVERFLOW;
> +
> +	hca_cap = kzalloc(sz, GFP_KERNEL);
> +	if (!hca_cap)
> +		return -ENOMEM;
> +
> +	cap = MLX5_ADDR_OF(set_hca_cap_in, hca_cap, capability);
> +	MLX5_SET(cmd_hca_cap, cap, dynamic_msix_table_size, msix_vec_count);
> +
> +	MLX5_SET(set_hca_cap_in, hca_cap, opcode, MLX5_CMD_OP_SET_HCA_CAP);
> +	MLX5_SET(set_hca_cap_in, hca_cap, other_function, 1);
> +	MLX5_SET(set_hca_cap_in, hca_cap, function_id, function_id);
> +
> +	MLX5_SET(set_hca_cap_in, hca_cap, op_mod,
> +		 MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE << 1);
> +	ret = mlx5_cmd_exec_in(dev, set_hca_cap, hca_cap);
> +	kfree(hca_cap);
> +	return ret;
> +}
> +
>  int mlx5_irq_attach_nb(struct mlx5_irq_table *irq_table, int vecidx,
>  		       struct notifier_block *nb)
>  {
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
> index 3094d20297a9..f0ec86a1c8a6 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
> @@ -71,8 +71,7 @@ static int sriov_restore_guids(struct mlx5_core_dev *dev, int vf)
>  static int mlx5_device_enable_sriov(struct mlx5_core_dev *dev, int num_vfs)
>  {
>  	struct mlx5_core_sriov *sriov = &dev->priv.sriov;
> -	int err;
> -	int vf;
> +	int err, vf, num_msix_count;
>  
>  	if (!MLX5_ESWITCH_MANAGER(dev))
>  		goto enable_vfs_hca;
> @@ -85,12 +84,22 @@ static int mlx5_device_enable_sriov(struct mlx5_core_dev *dev, int num_vfs)
>  	}
>  
>  enable_vfs_hca:
> +	num_msix_count = mlx5_get_default_msix_vec_count(dev, num_vfs);
>  	for (vf = 0; vf < num_vfs; vf++) {
>  		err = mlx5_core_enable_hca(dev, vf + 1);
>  		if (err) {
>  			mlx5_core_warn(dev, "failed to enable VF %d (%d)\n", vf, err);
>  			continue;
>  		}
> +
> +		err = mlx5_set_msix_vec_count(dev, vf + 1, num_msix_count);
> +		if (err) {
> +			mlx5_core_warn(dev,
> +				       "failed to set MSI-X vector counts VF %d, err %d\n",
> +				       vf, err);
> +			continue;
> +		}
> +
>  		sriov->vfs_ctx[vf].enabled = 1;
>  		if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_IB) {
>  			err = sriov_restore_guids(dev, vf);
> -- 
> 2.29.2
>
Leon Romanovsky Feb. 2, 2021, 7:11 p.m. UTC | #2
On Tue, Feb 02, 2021 at 11:25:08AM -0600, Bjorn Helgaas wrote:
> On Tue, Jan 26, 2021 at 10:57:29AM +0200, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > The number of MSI-X vectors is PCI property visible through lspci, that
> > field is read-only and configured by the device. The static assignment
> > of an amount of MSI-X vectors doesn't allow utilize the newly created
> > VF because it is not known to the device the future load and configuration
> > where that VF will be used.
> >
> > To overcome the inefficiency in the spread of such MSI-X vectors, we
> > allow the kernel to instruct the device with the needed number of such
> > vectors.
> >
> > Such change immediately increases the amount of MSI-X vectors for the
> > system with @ VFs from 12 vectors per-VF, to be 32 vectors per-VF.
>
> Not knowing anything about mlx5, it looks like maybe this gets some
> parameters from firmware on the device, then changes the way MSI-X
> vectors are distributed among VFs?

The mlx5 devices can operate in one of two modes: a static MSI-X vector
table size or a dynamic one.

For the same number of VFs, the device gets 12 vectors per VF in static
mode. In dynamic mode, the total number of vectors is higher, and we can
distribute them better.

>
> I don't understand the implications above about "static assignment"
> and "inefficiency in the spread."  I guess maybe this takes advantage
> of the fact that you know how many VFs are enabled, so if NumVFs is
> less than TotalVFs, you can assign more vectors to each VF?

Internally, in the FW, we use a different pool and configuration scheme
for such distribution. In static mode, the amount is pre-configured through
our FW configuration tool (nvconfig); in dynamic mode, the driver is fully
responsible. And yes, NumVFs helps to utilize it better.
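
To make the role of NumVFs concrete, here is a standalone sketch of the
default computed by mlx5_get_default_msix_vec_count() in this patch: an equal
share of the pool, capped at half of the per-VF maximum and floored at the
per-VF minimum. The function name, the helpers and the plain int parameters
are stand-ins for the MLX5_CAP_GEN() reads, and the assert is an addition of
this sketch; the function in the patch has no such check.

```c
#include <assert.h>

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/*
 * pool_size: total vectors in the dynamic pool
 *            (num_total_dynamic_vf_msix in the patch)
 * min_msix, max_msix: per-VF table size limits
 *            (min/max_dynamic_vf_msix_table_size in the patch)
 */
static int default_msix_per_vf(int pool_size, int num_vfs,
			       int min_msix, int max_msix)
{
	if (pool_size == 0)
		return 0;	/* device has no dynamic pool */

	assert(num_vfs > 0);	/* sketch-only guard against division by zero */

	/* Equal share, capped at max_msix / 2, never below min_msix. */
	return max_int(min_int(pool_size / num_vfs, max_msix / 2), min_msix);
}
```

With an invented pool of 256 vectors and invented limits of 2 and 64, this
gives 32 vectors per VF for 4 VFs and 8 per VF for 32 VFs. For 256 VFs the
floor applies and each VF gets 2, so the sum of the defaults can exceed the
pool; how the firmware handles that case is not stated in this thread.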

>
> If that's the case, spell it out a little bit.  The current text makes
> it sound like you discovered brand new MSI-X vectors somewhere,
> regardless of how many VFs are enabled, which doesn't sound right.

Will do.

>
> > Before this patch:
> > [root@server ~]# lspci -vs 0000:08:00.2
> > 08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
> > ....
> > 	Capabilities: [9c] MSI-X: Enable- Count=12 Masked-
> >
> > After this patch:
> > [root@server ~]# lspci -vs 0000:08:00.2
> > 08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
> > ....
> > 	Capabilities: [9c] MSI-X: Enable- Count=32 Masked-
> >
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> >  .../net/ethernet/mellanox/mlx5/core/main.c    |  4 ++
> >  .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  5 ++
> >  .../net/ethernet/mellanox/mlx5/core/pci_irq.c | 72 +++++++++++++++++++
> >  .../net/ethernet/mellanox/mlx5/core/sriov.c   | 13 +++-
> >  4 files changed, 92 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> > index ca6f2fc39ea0..79cfcc844156 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> > @@ -567,6 +567,10 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx)
> >  	if (MLX5_CAP_GEN_MAX(dev, mkey_by_name))
> >  		MLX5_SET(cmd_hca_cap, set_hca_cap, mkey_by_name, 1);
> >
> > +	if (MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix))
> > +		MLX5_SET(cmd_hca_cap, set_hca_cap, num_total_dynamic_vf_msix,
> > +			 MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix));
> > +
> >  	return set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE);
> >  }
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
> > index 0a0302ce7144..5babb4434a87 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
> > @@ -172,6 +172,11 @@ int mlx5_irq_attach_nb(struct mlx5_irq_table *irq_table, int vecidx,
> >  		       struct notifier_block *nb);
> >  int mlx5_irq_detach_nb(struct mlx5_irq_table *irq_table, int vecidx,
> >  		       struct notifier_block *nb);
> > +
> > +int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int devfn,
> > +			    int msix_vec_count);
> > +int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs);
> > +
> >  struct cpumask *
> >  mlx5_irq_get_affinity_mask(struct mlx5_irq_table *irq_table, int vecidx);
> >  struct cpu_rmap *mlx5_irq_get_rmap(struct mlx5_irq_table *table);
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
> > index 6fd974920394..2a35888fcff0 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
> > @@ -55,6 +55,78 @@ static struct mlx5_irq *mlx5_irq_get(struct mlx5_core_dev *dev, int vecidx)
> >  	return &irq_table->irq[vecidx];
> >  }
> >
> > +/**
> > + * mlx5_get_default_msix_vec_count() - Get defaults of number of MSI-X vectors
> > + * to be set
>
> s/defaults of number of/default number of/
> s/to be set/to be assigned to each VF/ ?
>
> > + * @dev: PF to work on
> > + * @num_vfs: Number of VFs was asked when SR-IOV was enabled
>
> s/Number of VFs was asked when SR-IOV was enabled/Number of enabled VFs/ ?
>
> > + **/
>
> Documentation/doc-guide/kernel-doc.rst says kernel-doc comments end
> with just "*/" (not "**/").

The netdev uses this style all over the place. Also, it is an internal API
call and kdoc is not needed here, so I followed the existing format.

I'll fix all comments and resubmit.

Thanks
Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index ca6f2fc39ea0..79cfcc844156 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -567,6 +567,10 @@  static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx)
 	if (MLX5_CAP_GEN_MAX(dev, mkey_by_name))
 		MLX5_SET(cmd_hca_cap, set_hca_cap, mkey_by_name, 1);
 
+	if (MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix))
+		MLX5_SET(cmd_hca_cap, set_hca_cap, num_total_dynamic_vf_msix,
+			 MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix));
+
 	return set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE);
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 0a0302ce7144..5babb4434a87 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -172,6 +172,11 @@  int mlx5_irq_attach_nb(struct mlx5_irq_table *irq_table, int vecidx,
 		       struct notifier_block *nb);
 int mlx5_irq_detach_nb(struct mlx5_irq_table *irq_table, int vecidx,
 		       struct notifier_block *nb);
+
+int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int devfn,
+			    int msix_vec_count);
+int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs);
+
 struct cpumask *
 mlx5_irq_get_affinity_mask(struct mlx5_irq_table *irq_table, int vecidx);
 struct cpu_rmap *mlx5_irq_get_rmap(struct mlx5_irq_table *table);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
index 6fd974920394..2a35888fcff0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
@@ -55,6 +55,78 @@  static struct mlx5_irq *mlx5_irq_get(struct mlx5_core_dev *dev, int vecidx)
 	return &irq_table->irq[vecidx];
 }
 
+/**
+ * mlx5_get_default_msix_vec_count() - Get defaults of number of MSI-X vectors
+ * to be set
+ * @dev: PF to work on
+ * @num_vfs: Number of VFs was asked when SR-IOV was enabled
+ **/
+int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs)
+{
+	int num_vf_msix, min_msix, max_msix;
+
+	num_vf_msix = MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix);
+	if (!num_vf_msix)
+		return 0;
+
+	min_msix = MLX5_CAP_GEN(dev, min_dynamic_vf_msix_table_size);
+	max_msix = MLX5_CAP_GEN(dev, max_dynamic_vf_msix_table_size);
+
+	/* Limit maximum number of MSI-X to leave some of them free in the
+	 * pool and ready to be assigned by the users without need to resize
+	 * other Vfs.
+	 */
+	return max(min(num_vf_msix / num_vfs, max_msix / 2), min_msix);
+}
+
+/**
+ * mlx5_set_msix_vec_count() - Set dynamically allocated MSI-X to the VF
+ * @dev: PF to work on
+ * @function_id: Internal PCI VF function id
+ * @msix_vec_count: Number of MSI-X to set
+ **/
+int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int function_id,
+			    int msix_vec_count)
+{
+	int sz = MLX5_ST_SZ_BYTES(set_hca_cap_in);
+	int num_vf_msix, min_msix, max_msix;
+	void *hca_cap, *cap;
+	int ret;
+
+	num_vf_msix = MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix);
+	if (!num_vf_msix)
+		return 0;
+
+	if (!MLX5_CAP_GEN(dev, vport_group_manager) || !mlx5_core_is_pf(dev))
+		return -EOPNOTSUPP;
+
+	min_msix = MLX5_CAP_GEN(dev, min_dynamic_vf_msix_table_size);
+	max_msix = MLX5_CAP_GEN(dev, max_dynamic_vf_msix_table_size);
+
+	if (msix_vec_count < min_msix)
+		return -EINVAL;
+
+	if (msix_vec_count > max_msix)
+		return -EOVERFLOW;
+
+	hca_cap = kzalloc(sz, GFP_KERNEL);
+	if (!hca_cap)
+		return -ENOMEM;
+
+	cap = MLX5_ADDR_OF(set_hca_cap_in, hca_cap, capability);
+	MLX5_SET(cmd_hca_cap, cap, dynamic_msix_table_size, msix_vec_count);
+
+	MLX5_SET(set_hca_cap_in, hca_cap, opcode, MLX5_CMD_OP_SET_HCA_CAP);
+	MLX5_SET(set_hca_cap_in, hca_cap, other_function, 1);
+	MLX5_SET(set_hca_cap_in, hca_cap, function_id, function_id);
+
+	MLX5_SET(set_hca_cap_in, hca_cap, op_mod,
+		 MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE << 1);
+	ret = mlx5_cmd_exec_in(dev, set_hca_cap, hca_cap);
+	kfree(hca_cap);
+	return ret;
+}
+
 int mlx5_irq_attach_nb(struct mlx5_irq_table *irq_table, int vecidx,
 		       struct notifier_block *nb)
 {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
index 3094d20297a9..f0ec86a1c8a6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
@@ -71,8 +71,7 @@  static int sriov_restore_guids(struct mlx5_core_dev *dev, int vf)
 static int mlx5_device_enable_sriov(struct mlx5_core_dev *dev, int num_vfs)
 {
 	struct mlx5_core_sriov *sriov = &dev->priv.sriov;
-	int err;
-	int vf;
+	int err, vf, num_msix_count;
 
 	if (!MLX5_ESWITCH_MANAGER(dev))
 		goto enable_vfs_hca;
@@ -85,12 +84,22 @@  static int mlx5_device_enable_sriov(struct mlx5_core_dev *dev, int num_vfs)
 	}
 
 enable_vfs_hca:
+	num_msix_count = mlx5_get_default_msix_vec_count(dev, num_vfs);
 	for (vf = 0; vf < num_vfs; vf++) {
 		err = mlx5_core_enable_hca(dev, vf + 1);
 		if (err) {
 			mlx5_core_warn(dev, "failed to enable VF %d (%d)\n", vf, err);
 			continue;
 		}
+
+		err = mlx5_set_msix_vec_count(dev, vf + 1, num_msix_count);
+		if (err) {
+			mlx5_core_warn(dev,
+				       "failed to set MSI-X vector counts VF %d, err %d\n",
+				       vf, err);
+			continue;
+		}
+
 		sriov->vfs_ctx[vf].enabled = 1;
 		if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_IB) {
 			err = sriov_restore_guids(dev, vf);
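
As a reading aid for the range checks in mlx5_set_msix_vec_count() above, the
following standalone sketch isolates how a requested count maps to an error
code. The function name and the plain int limits are assumptions of this
sketch; in the patch the limits come from MLX5_CAP_GEN().

```c
#include <errno.h>

/*
 * Returns 0 when the requested per-VF count lies within [min_msix, max_msix],
 * -EINVAL when it is below the minimum, and -EOVERFLOW when it is above the
 * maximum, mirroring the order of the checks in the patch.
 */
static int check_msix_vec_count(int requested, int min_msix, int max_msix)
{
	if (requested < min_msix)
		return -EINVAL;

	if (requested > max_msix)
		return -EOVERFLOW;

	return 0;
}
```

In the sriov.c hunk, any non-zero result from mlx5_set_msix_vec_count() makes
the loop log a warning and continue before vfs_ctx[vf].enabled is set, even
though mlx5_core_enable_hca() has already succeeded for that VF.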