[1/4] net: mellanox: drop mlx5_cpumask_default_spread()

Message ID	20230925020528.777578-2-yury.norov@gmail.com (mailing list archive)
State	Not Applicable
Headers	show Return-Path: <linux-rdma-owner@vger.kernel.org> From: Yury Norov <yury.norov@gmail.com> To: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, linux-rdma@vger.kernel.org Cc: Yury Norov <yury.norov@gmail.com>, Tariq Toukan <ttoukan.linux@gmail.com>, Valentin Schneider <vschneid@redhat.com>, Maher Sanalla <msanalla@nvidia.com>, Ingo Molnar <mingo@kernel.org>, Mel Gorman <mgorman@suse.de>, Saeed Mahameed <saeedm@nvidia.com>, Leon Romanovsky <leon@kernel.org>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, Peter Zijlstra <peterz@infradead.org>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Steven Rostedt <rostedt@goodmis.org>, Ben Segall <bsegall@google.com>, Daniel Bristot de Oliveira <bristot@redhat.com>, Pawel Chmielewski <pawel.chmielewski@intel.com>, Jacob Keller <jacob.e.keller@intel.com>, Yury Norov <ynorov@nvidia.com> Subject: [PATCH 1/4] net: mellanox: drop mlx5_cpumask_default_spread() Date: Sun, 24 Sep 2023 19:05:25 -0700 Message-Id: <20230925020528.777578-2-yury.norov@gmail.com> In-Reply-To: <20230925020528.777578-1-yury.norov@gmail.com> References: <20230925020528.777578-1-yury.norov@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	sched: drop for_each_numa_hop_mask() \| expand [0/4] sched: drop for_each_numa_hop_mask() [1/4] net: mellanox: drop mlx5_cpumask_default_spread() [2/4] Revert "sched/topology: Introduce for_each_numa_hop_mask()" [3/4] Revert "sched/topology: Introduce sched_numa_hop_mask()" [4/4] lib/cpumask: don't mention for_each_numa_hop_mask in cpumask_local_spread()"

Yury Norov Sept. 25, 2023, 2:05 a.m. UTC

The function duplicates existing cpumask_local_spread(), and it's O(N),
while cpumask_local_spread() implementation is based on bsearch, and
thus is O(log n), so drop mlx5_cpumask_default_spread() and use generic
cpumask_local_spread().

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Yury Norov <ynorov@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 28 ++------------------
 1 file changed, 2 insertions(+), 26 deletions(-)

Jacob Keller Sept. 25, 2023, 10:45 p.m. UTC | #1

On 9/24/2023 7:05 PM, Yury Norov wrote:
> The function duplicates existing cpumask_local_spread(), and it's O(N),
> while cpumask_local_spread() implementation is based on bsearch, and
> thus is O(log n), so drop mlx5_cpumask_default_spread() and use generic
> cpumask_local_spread().
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Yury Norov <ynorov@nvidia.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/eq.c | 28 ++------------------
>  1 file changed, 2 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> index ea0405e0a43f..bd9f857cc52d 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> @@ -828,30 +828,6 @@ static void comp_irq_release_pci(struct mlx5_core_dev *dev, u16 vecidx)
>  	mlx5_irq_release_vector(irq);
>  }
>  

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

> -static int mlx5_cpumask_default_spread(int numa_node, int index)
> -{
> -	const struct cpumask *prev = cpu_none_mask;
> -	const struct cpumask *mask;
> -	int found_cpu = 0;
> -	int i = 0;
> -	int cpu;
> -
> -	rcu_read_lock();
> -	for_each_numa_hop_mask(mask, numa_node) {
> -		for_each_cpu_andnot(cpu, mask, prev) {
> -			if (i++ == index) {
> -				found_cpu = cpu;
> -				goto spread_done;
> -			}
> -		}
> -		prev = mask;
> -	}
> -
> -spread_done:
> -	rcu_read_unlock();
> -	return found_cpu;
> -}
> -
>  static struct cpu_rmap *mlx5_eq_table_get_pci_rmap(struct mlx5_core_dev *dev)
>  {
>  #ifdef CONFIG_RFS_ACCEL
> @@ -873,7 +849,7 @@ static int comp_irq_request_pci(struct mlx5_core_dev *dev, u16 vecidx)
>  	int cpu;
>  
>  	rmap = mlx5_eq_table_get_pci_rmap(dev);
> -	cpu = mlx5_cpumask_default_spread(dev->priv.numa_node, vecidx);
> +	cpu = cpumask_local_spread(vecidx, dev->priv.numa_node);
>  	irq = mlx5_irq_request_vector(dev, cpu, vecidx, &rmap);
>  	if (IS_ERR(irq))
>  		return PTR_ERR(irq);
> @@ -1125,7 +1101,7 @@ int mlx5_comp_vector_get_cpu(struct mlx5_core_dev *dev, int vector)
>  	if (mask)
>  		cpu = cpumask_first(mask);
>  	else
> -		cpu = mlx5_cpumask_default_spread(dev->priv.numa_node, vector);
> +		cpu = cpumask_local_spread(vector, dev->priv.numa_node);
>  
>  	return cpu;
>  }

Paolo Abeni Oct. 3, 2023, 10:04 a.m. UTC | #2

On Sun, 2023-09-24 at 19:05 -0700, Yury Norov wrote:
> The function duplicates existing cpumask_local_spread(), and it's O(N),
> while cpumask_local_spread() implementation is based on bsearch, and
> thus is O(log n), so drop mlx5_cpumask_default_spread() and use generic
> cpumask_local_spread().
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Yury Norov <ynorov@nvidia.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/eq.c | 28 ++------------------
>  1 file changed, 2 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> index ea0405e0a43f..bd9f857cc52d 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> @@ -828,30 +828,6 @@ static void comp_irq_release_pci(struct mlx5_core_dev *dev, u16 vecidx)
>  	mlx5_irq_release_vector(irq);
>  }
>  
> -static int mlx5_cpumask_default_spread(int numa_node, int index)
> -{
> -	const struct cpumask *prev = cpu_none_mask;
> -	const struct cpumask *mask;
> -	int found_cpu = 0;
> -	int i = 0;
> -	int cpu;
> -
> -	rcu_read_lock();
> -	for_each_numa_hop_mask(mask, numa_node) {
> -		for_each_cpu_andnot(cpu, mask, prev) {
> -			if (i++ == index) {
> -				found_cpu = cpu;
> -				goto spread_done;
> -			}
> -		}
> -		prev = mask;
> -	}
> -
> -spread_done:
> -	rcu_read_unlock();
> -	return found_cpu;
> -}
> -
>  static struct cpu_rmap *mlx5_eq_table_get_pci_rmap(struct mlx5_core_dev *dev)
>  {
>  #ifdef CONFIG_RFS_ACCEL
> @@ -873,7 +849,7 @@ static int comp_irq_request_pci(struct mlx5_core_dev *dev, u16 vecidx)
>  	int cpu;
>  
>  	rmap = mlx5_eq_table_get_pci_rmap(dev);
> -	cpu = mlx5_cpumask_default_spread(dev->priv.numa_node, vecidx);
> +	cpu = cpumask_local_spread(vecidx, dev->priv.numa_node);
>  	irq = mlx5_irq_request_vector(dev, cpu, vecidx, &rmap);
>  	if (IS_ERR(irq))
>  		return PTR_ERR(irq);
> @@ -1125,7 +1101,7 @@ int mlx5_comp_vector_get_cpu(struct mlx5_core_dev *dev, int vector)
>  	if (mask)
>  		cpu = cpumask_first(mask);
>  	else
> -		cpu = mlx5_cpumask_default_spread(dev->priv.numa_node, vector);
> +		cpu = cpumask_local_spread(vector, dev->priv.numa_node);
>  
>  	return cpu;
>  }

It looks like this series is going to cause some later conflicts
regardless of the target tree. I think the whole series could go via
the net-next tree, am I missing any relevant point?

Thanks!

Paolo

Yury Norov Oct. 3, 2023, 1:46 p.m. UTC | #3

On Tue, Oct 03, 2023 at 12:04:01PM +0200, Paolo Abeni wrote:
> On Sun, 2023-09-24 at 19:05 -0700, Yury Norov wrote:
> > The function duplicates existing cpumask_local_spread(), and it's O(N),
> > while cpumask_local_spread() implementation is based on bsearch, and
> > thus is O(log n), so drop mlx5_cpumask_default_spread() and use generic
> > cpumask_local_spread().
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > Signed-off-by: Yury Norov <ynorov@nvidia.com>
> > ---
> >  drivers/net/ethernet/mellanox/mlx5/core/eq.c | 28 ++------------------
> >  1 file changed, 2 insertions(+), 26 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > index ea0405e0a43f..bd9f857cc52d 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > @@ -828,30 +828,6 @@ static void comp_irq_release_pci(struct mlx5_core_dev *dev, u16 vecidx)
> >  	mlx5_irq_release_vector(irq);
> >  }
> >  
> > -static int mlx5_cpumask_default_spread(int numa_node, int index)
> > -{
> > -	const struct cpumask *prev = cpu_none_mask;
> > -	const struct cpumask *mask;
> > -	int found_cpu = 0;
> > -	int i = 0;
> > -	int cpu;
> > -
> > -	rcu_read_lock();
> > -	for_each_numa_hop_mask(mask, numa_node) {
> > -		for_each_cpu_andnot(cpu, mask, prev) {
> > -			if (i++ == index) {
> > -				found_cpu = cpu;
> > -				goto spread_done;
> > -			}
> > -		}
> > -		prev = mask;
> > -	}
> > -
> > -spread_done:
> > -	rcu_read_unlock();
> > -	return found_cpu;
> > -}
> > -
> >  static struct cpu_rmap *mlx5_eq_table_get_pci_rmap(struct mlx5_core_dev *dev)
> >  {
> >  #ifdef CONFIG_RFS_ACCEL
> > @@ -873,7 +849,7 @@ static int comp_irq_request_pci(struct mlx5_core_dev *dev, u16 vecidx)
> >  	int cpu;
> >  
> >  	rmap = mlx5_eq_table_get_pci_rmap(dev);
> > -	cpu = mlx5_cpumask_default_spread(dev->priv.numa_node, vecidx);
> > +	cpu = cpumask_local_spread(vecidx, dev->priv.numa_node);
> >  	irq = mlx5_irq_request_vector(dev, cpu, vecidx, &rmap);
> >  	if (IS_ERR(irq))
> >  		return PTR_ERR(irq);
> > @@ -1125,7 +1101,7 @@ int mlx5_comp_vector_get_cpu(struct mlx5_core_dev *dev, int vector)
> >  	if (mask)
> >  		cpu = cpumask_first(mask);
> >  	else
> > -		cpu = mlx5_cpumask_default_spread(dev->priv.numa_node, vector);
> > +		cpu = cpumask_local_spread(vector, dev->priv.numa_node);
> >  
> >  	return cpu;
> >  }
> 
> It looks like this series is going to cause some later conflicts
> regardless of the target tree. I think the whole series could go via
> the net-next tree, am I missing any relevant point?

Hi Paolo,

Can you elaborate on the conflicts you see? For me it applies cleanly
on current master, and with some 3-way merging on latest -next...

Thanks,
Yury

Jakub Kicinski Oct. 3, 2023, 10:20 p.m. UTC | #4

On Tue, 3 Oct 2023 06:46:17 -0700 Yury Norov wrote:
> Can you elaborate on the conflicts you see? For me it applies cleanly
> on current master, and with some 3-way merging on latest -next...

We're half way thru the release cycle the conflicts can still come in.

There's no dependency for the first patch. The most normal way to
handle this would be to send patch 1 to the networking tree and send
the rest in the subsequent merge window.

Yury Norov Oct. 4, 2023, 1:31 a.m. UTC | #5

On Tue, Oct 03, 2023 at 03:20:30PM -0700, Jakub Kicinski wrote:
> On Tue, 3 Oct 2023 06:46:17 -0700 Yury Norov wrote:
> > Can you elaborate on the conflicts you see? For me it applies cleanly
> > on current master, and with some 3-way merging on latest -next...
> 
> We're half way thru the release cycle the conflicts can still come in.
> 
> There's no dependency for the first patch. The most normal way to
> handle this would be to send patch 1 to the networking tree and send
> the rest in the subsequent merge window.

Ah, I understand now. I didn't plan to move it in current merge
window. In fact, I'll be more comfortable to keep it in -next for
longer and merge it in v6.7.

But it's up to you. If you think it's better, I can resend 1st patch
separately.

Thanks,
Yury

Jakub Kicinski Oct. 4, 2023, 5:19 p.m. UTC | #6

On Tue, 3 Oct 2023 18:31:28 -0700 Yury Norov wrote:
> > We're half way thru the release cycle the conflicts can still come in.
> > 
> > There's no dependency for the first patch. The most normal way to
> > handle this would be to send patch 1 to the networking tree and send
> > the rest in the subsequent merge window.  
> 
> Ah, I understand now. I didn't plan to move it in current merge
> window. In fact, I'll be more comfortable to keep it in -next for
> longer and merge it in v6.7.
> 
> But it's up to you. If you think it's better, I can resend 1st patch
> separately.

Let's see if Saeed can help us.

Saeed, could you pick up patch 1 from this series for the mlx5 tree?

[1/4] net: mellanox: drop mlx5_cpumask_default_spread()

Commit Message

Comments

Patch