diff mbox series

[net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()

Message ID 20241025072356.56093-1-wenjia@linux.ibm.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net] net/smc: Fix lookup of netdev by using ib_device_get_netdev() | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 5 this patch: 5
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers fail 2 blamed authors not CCed: leon@kernel.org michaelgur@nvidia.com; 3 maintainers not CCed: leon@kernel.org michaelgur@nvidia.com horms@kernel.org
netdev/build_clang success Errors and warnings before: 3 this patch: 3
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 4 this patch: 4
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 30 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-10-26--21-00 (tests: 777)

Commit Message

Wenjia Zhang Oct. 25, 2024, 7:23 a.m. UTC
Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
alternative to get_netdev") introduced an API ib_device_get_netdev.
The SMC-R variant of the SMC protocol continued to use the old API
ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
ib_device_ops.get_netdev didn't work any more at least by using a mlx5
device driver. Thus, using ib_device_set_netdev() now became mandatory.

Replace ib_device_ops.get_netdev() with ib_device_get_netdev().

Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
Reported-by: Aswin K <aswin@linux.ibm.com>
Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
---
 net/smc/smc_ib.c   | 8 ++------
 net/smc/smc_pnet.c | 4 +---
 2 files changed, 3 insertions(+), 9 deletions(-)

Comments

Halil Pasic Oct. 25, 2024, 8:57 a.m. UTC | #1
On Fri, 25 Oct 2024 09:23:55 +0200
Wenjia Zhang <wenjia@linux.ibm.com> wrote:

> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>

Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Simon Horman Oct. 25, 2024, 2:01 p.m. UTC | #2
On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>

Reviewed-by: Simon Horman <horms@kernel.org>
Dust Li Oct. 26, 2024, 12:42 a.m. UTC | #3
On 2024-10-25 09:23:55, Wenjia Zhang wrote:
>Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
>alternative to get_netdev") introduced an API ib_device_get_netdev.
>The SMC-R variant of the SMC protocol continued to use the old API
>ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
>("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
>get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
>ib_device_ops.get_netdev didn't work any more at least by using a mlx5
>device driver. Thus, using ib_device_set_netdev() now became mandatory.
>
>Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>
>Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
>Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
>Reported-by: Aswin K <aswin@linux.ibm.com>
>Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
>Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>

Reviewed-by: Dust Li <dust.li@linux.alibaba.com>

>---
> net/smc/smc_ib.c   | 8 ++------
> net/smc/smc_pnet.c | 4 +---
> 2 files changed, 3 insertions(+), 9 deletions(-)
>
>diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
>index 9297dc20bfe2..9c563cdbea90 100644
>--- a/net/smc/smc_ib.c
>+++ b/net/smc/smc_ib.c
>@@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
> 	struct ib_device *ibdev = smcibdev->ibdev;
> 	struct net_device *ndev;
> 
>-	if (!ibdev->ops.get_netdev)
>-		return;
>-	ndev = ibdev->ops.get_netdev(ibdev, port + 1);
>+	ndev = ib_device_get_netdev(ibdev, port + 1);
> 	if (ndev) {
> 		smcibdev->ndev_ifidx[port] = ndev->ifindex;
> 		dev_put(ndev);
>@@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
> 		port_cnt = smcibdev->ibdev->phys_port_cnt;
> 		for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
> 			libdev = smcibdev->ibdev;
>-			if (!libdev->ops.get_netdev)
>-				continue;
>-			lndev = libdev->ops.get_netdev(libdev, i + 1);
>+			lndev = ib_device_get_netdev(libdev, i + 1);
> 			dev_put(lndev);
> 			if (lndev != ndev)
> 				continue;
>diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
>index 1dd362326c0a..8566937c8903 100644
>--- a/net/smc/smc_pnet.c
>+++ b/net/smc/smc_pnet.c
>@@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
> 		for (i = 1; i <= SMC_MAX_PORTS; i++) {
> 			if (!rdma_is_port_valid(ibdev->ibdev, i))
> 				continue;
>-			if (!ibdev->ibdev->ops.get_netdev)
>-				continue;
>-			ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
>+			ndev = ib_device_get_netdev(ibdev->ibdev, i);
> 			if (!ndev)
> 				continue;
> 			dev_put(ndev);
>-- 
>2.43.0
>
Wen Gu Oct. 27, 2024, 11:18 a.m. UTC | #4
On 2024/10/25 15:23, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>

LGTM!

Reviewed-by: Wen Gu <guwen@linux.alibaba.com>

> ---
>   net/smc/smc_ib.c   | 8 ++------
>   net/smc/smc_pnet.c | 4 +---
>   2 files changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 9297dc20bfe2..9c563cdbea90 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
>   	struct ib_device *ibdev = smcibdev->ibdev;
>   	struct net_device *ndev;
>   
> -	if (!ibdev->ops.get_netdev)
> -		return;
> -	ndev = ibdev->ops.get_netdev(ibdev, port + 1);
> +	ndev = ib_device_get_netdev(ibdev, port + 1);
>   	if (ndev) {
>   		smcibdev->ndev_ifidx[port] = ndev->ifindex;
>   		dev_put(ndev);
> @@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
>   		port_cnt = smcibdev->ibdev->phys_port_cnt;
>   		for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
>   			libdev = smcibdev->ibdev;
> -			if (!libdev->ops.get_netdev)
> -				continue;
> -			lndev = libdev->ops.get_netdev(libdev, i + 1);
> +			lndev = ib_device_get_netdev(libdev, i + 1);
>   			dev_put(lndev);
>   			if (lndev != ndev)
>   				continue;
> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
> index 1dd362326c0a..8566937c8903 100644
> --- a/net/smc/smc_pnet.c
> +++ b/net/smc/smc_pnet.c
> @@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
>   		for (i = 1; i <= SMC_MAX_PORTS; i++) {
>   			if (!rdma_is_port_valid(ibdev->ibdev, i))
>   				continue;
> -			if (!ibdev->ibdev->ops.get_netdev)
> -				continue;
> -			ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
> +			ndev = ib_device_get_netdev(ibdev->ibdev, i);
>   			if (!ndev)
>   				continue;
>   			dev_put(ndev);
Zhu Yanjun Oct. 27, 2024, 7:28 p.m. UTC | #5
在 2024/10/25 9:23, Wenjia Zhang 写道:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling

Thanks a lot.

Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Because the commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and 
get_netdev functions") removes the get_netdev callback from 
mlx5_ib_dev_common_roce_ops, in mlx4, get_netdev is still in 
mlx4_ib_dev_ops. So the following commit will follow mlx5 to remove 
get_netdev from mlx4 driver.

 From a59f2e01428640a332a51b8d910ec166704aa441 Mon Sep 17 00:00:00 2001
From: Zhu Yanjun <yanjun.zhu@linux.dev>
Date: Sun, 27 Oct 2024 20:21:27 +0100
Subject: [PATCH 1/1] RDMA/mlx4: Use IB get_netdev functions and remove
  get_netdev callback

In the commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
functions") removed the get_netdev callback from
mlx5_ib_dev_common_roce_ops, in mlx4, get_netdev callback should also
be removed.

Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
compile successfully only
---
  drivers/infiniband/hw/mlx4/main.c | 35 -------------------------------
  1 file changed, 35 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c 
b/drivers/infiniband/hw/mlx4/main.c
index 529db874d67c..cf34d92de7b1 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -123,40 +123,6 @@ static int num_ib_ports(struct mlx4_dev *dev)
         return ib_ports;
  }

-static struct net_device *mlx4_ib_get_netdev(struct ib_device *device,
-                                            u32 port_num)
-{
-       struct mlx4_ib_dev *ibdev = to_mdev(device);
-       struct net_device *dev, *ret = NULL;
-
-       rcu_read_lock();
-       for_each_netdev_rcu(&init_net, dev) {
-               if (dev->dev.parent != ibdev->ib_dev.dev.parent ||
-                   dev->dev_port + 1 != port_num)
-                       continue;
-
-               if (mlx4_is_bonded(ibdev->dev)) {
-                       struct net_device *upper;
-
-                       upper = netdev_master_upper_dev_get_rcu(dev);
-                       if (upper) {
-                               struct net_device *active;
-
-                               active = 
bond_option_active_slave_get_rcu(netdev_priv(upper));
-                               if (active)
-                                       dev = active;
-                       }
-               }
-
-               dev_hold(dev);
-               ret = dev;
-               break;
-       }
-
-       rcu_read_unlock();
-       return ret;
-}
-
  static int mlx4_ib_update_gids_v1(struct gid_entry *gids,
                                   struct mlx4_ib_dev *ibdev,
                                   u32 port_num)
@@ -2544,7 +2510,6 @@ static const struct ib_device_ops mlx4_ib_dev_ops = {
         .get_dev_fw_str = get_fw_ver_str,
         .get_dma_mr = mlx4_ib_get_dma_mr,
         .get_link_layer = mlx4_ib_port_link_layer,
-       .get_netdev = mlx4_ib_get_netdev,
         .get_port_immutable = mlx4_port_immutable,
         .map_mr_sg = mlx4_ib_map_mr_sg,
         .mmap = mlx4_ib_mmap,
--
2.34.1

Zhu Yanjun

> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
> ---
>   net/smc/smc_ib.c   | 8 ++------
>   net/smc/smc_pnet.c | 4 +---
>   2 files changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 9297dc20bfe2..9c563cdbea90 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
>   	struct ib_device *ibdev = smcibdev->ibdev;
>   	struct net_device *ndev;
>   
> -	if (!ibdev->ops.get_netdev)
> -		return;
> -	ndev = ibdev->ops.get_netdev(ibdev, port + 1);
> +	ndev = ib_device_get_netdev(ibdev, port + 1);
>   	if (ndev) {
>   		smcibdev->ndev_ifidx[port] = ndev->ifindex;
>   		dev_put(ndev);
> @@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
>   		port_cnt = smcibdev->ibdev->phys_port_cnt;
>   		for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
>   			libdev = smcibdev->ibdev;
> -			if (!libdev->ops.get_netdev)
> -				continue;
> -			lndev = libdev->ops.get_netdev(libdev, i + 1);
> +			lndev = ib_device_get_netdev(libdev, i + 1);
>   			dev_put(lndev);
>   			if (lndev != ndev)
>   				continue;
> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
> index 1dd362326c0a..8566937c8903 100644
> --- a/net/smc/smc_pnet.c
> +++ b/net/smc/smc_pnet.c
> @@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
>   		for (i = 1; i <= SMC_MAX_PORTS; i++) {
>   			if (!rdma_is_port_valid(ibdev->ibdev, i))
>   				continue;
> -			if (!ibdev->ibdev->ops.get_netdev)
> -				continue;
> -			ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
> +			ndev = ib_device_get_netdev(ibdev->ibdev, i);
>   			if (!ndev)
>   				continue;
>   			dev_put(ndev);
Leon Romanovsky Oct. 27, 2024, 8:18 p.m. UTC | #6
On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. 

I would say that calls to ibdev ops from ULPs was never been right
thing to do. The ib_device_set_netdev() was introduced for the drivers.

So the whole commit message is not accurate and better to be rewritten.

> As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver.

It is not a correct statement too. All modern drivers (for last 5 years)
don't have that .get_netdev() ops, so it is not mlx5 specific, but another
justification to say that SMC-R was doing it wrong.

> Thus, using ib_device_set_netdev() now became mandatory.

ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
with ULPs.

> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().

It is too late for me to do proper review for today, but I would say
that it is worth to pay attention to multiple dev_put() calls in the
functions around the ib_device_get_netdev().

> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")

It is not related to this change Fixes line.

> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
> ---
>  net/smc/smc_ib.c   | 8 ++------
>  net/smc/smc_pnet.c | 4 +---
>  2 files changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 9297dc20bfe2..9c563cdbea90 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
>  	struct ib_device *ibdev = smcibdev->ibdev;
>  	struct net_device *ndev;
>  
> -	if (!ibdev->ops.get_netdev)
> -		return;
> -	ndev = ibdev->ops.get_netdev(ibdev, port + 1);
> +	ndev = ib_device_get_netdev(ibdev, port + 1);
>  	if (ndev) {
>  		smcibdev->ndev_ifidx[port] = ndev->ifindex;
>  		dev_put(ndev);
> @@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
>  		port_cnt = smcibdev->ibdev->phys_port_cnt;
>  		for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
>  			libdev = smcibdev->ibdev;
> -			if (!libdev->ops.get_netdev)
> -				continue;
> -			lndev = libdev->ops.get_netdev(libdev, i + 1);
> +			lndev = ib_device_get_netdev(libdev, i + 1);
>  			dev_put(lndev);
>  			if (lndev != ndev)
>  				continue;
> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
> index 1dd362326c0a..8566937c8903 100644
> --- a/net/smc/smc_pnet.c
> +++ b/net/smc/smc_pnet.c
> @@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
>  		for (i = 1; i <= SMC_MAX_PORTS; i++) {
>  			if (!rdma_is_port_valid(ibdev->ibdev, i))
>  				continue;
> -			if (!ibdev->ibdev->ops.get_netdev)
> -				continue;
> -			ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
> +			ndev = ib_device_get_netdev(ibdev->ibdev, i);
>  			if (!ndev)
>  				continue;
>  			dev_put(ndev);
> -- 
> 2.43.0
> 
>
Leon Romanovsky Oct. 27, 2024, 8:30 p.m. UTC | #7
On Sun, Oct 27, 2024 at 10:18:57PM +0200, Leon Romanovsky wrote:
> On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> > Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> > alternative to get_netdev") introduced an API ib_device_get_netdev.
> > The SMC-R variant of the SMC protocol continued to use the old API
> > ib_device_ops.get_netdev() to lookup netdev. 
> 
> I would say that calls to ibdev ops from ULPs was never been right
> thing to do. The ib_device_set_netdev() was introduced for the drivers.
> 
> So the whole commit message is not accurate and better to be rewritten.
> 
> > As this commit 8d159eb2117b
> > ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> > get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> > ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> > device driver.
> 
> It is not a correct statement too. All modern drivers (for last 5 years)
> don't have that .get_netdev() ops, so it is not mlx5 specific, but another
> justification to say that SMC-R was doing it wrong.
> 
> > Thus, using ib_device_set_netdev() now became mandatory.
> 
> ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
> with ULPs.
> 
> > 
> > Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> It is too late for me to do proper review for today, but I would say
> that it is worth to pay attention to multiple dev_put() calls in the
> functions around the ib_device_get_netdev().
> 
> > 
> > Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")

Honestly, this patch in Fixes line doesn't look right to me. It pokes inside
of ib_device to get netdev index. For example call to smc_ib_ndev_change()
will return completely unpredictable results, due to races.

It is bad that RDMA ML wasn't even CCed back then, we would say NAK to
this patch.
https://lore.kernel.org/netdev/20201201192049.53517-6-kgraul@linux.ibm.com/

Thanks
D. Wythe Oct. 29, 2024, 8:43 a.m. UTC | #8
On 10/25/24 3:23 PM, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
> ---
>   net/smc/smc_ib.c   | 8 ++------
>   net/smc/smc_pnet.c | 4 +---
>   2 files changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 9297dc20bfe2..9c563cdbea90 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
>   	struct ib_device *ibdev = smcibdev->ibdev;
>   	struct net_device *ndev;
>   
> -	if (!ibdev->ops.get_netdev)
> -		return;
> -	ndev = ibdev->ops.get_netdev(ibdev, port + 1);
> +	ndev = ib_device_get_netdev(ibdev, port + 1);
>   	if (ndev) {
>   		smcibdev->ndev_ifidx[port] = ndev->ifindex;
>   		dev_put(ndev);
> @@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
>   		port_cnt = smcibdev->ibdev->phys_port_cnt;
>   		for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
>   			libdev = smcibdev->ibdev;
> -			if (!libdev->ops.get_netdev)
> -				continue;
> -			lndev = libdev->ops.get_netdev(libdev, i + 1);
> +			lndev = ib_device_get_netdev(libdev, i + 1);
>   			dev_put(lndev);
>   			if (lndev != ndev)
>   				continue;
> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
> index 1dd362326c0a..8566937c8903 100644
> --- a/net/smc/smc_pnet.c
> +++ b/net/smc/smc_pnet.c
> @@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
>   		for (i = 1; i <= SMC_MAX_PORTS; i++) {
>   			if (!rdma_is_port_valid(ibdev->ibdev, i))
>   				continue;
> -			if (!ibdev->ibdev->ops.get_netdev)
> -				continue;
> -			ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
> +			ndev = ib_device_get_netdev(ibdev->ibdev, i);
>   			if (!ndev)
>   				continue;
>   			dev_put(ndev);


Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
Paolo Abeni Oct. 31, 2024, 10:01 a.m. UTC | #9
On 10/25/24 09:23, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>

Please adjust the commit message as per Leon suggestion. You can retain
all the ack collected so far.

Thanks,

Paolo
Wenjia Zhang Nov. 5, 2024, 9:50 a.m. UTC | #10
On 27.10.24 21:18, Leon Romanovsky wrote:
> On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
>> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
>> alternative to get_netdev") introduced an API ib_device_get_netdev.
>> The SMC-R variant of the SMC protocol continued to use the old API
>> ib_device_ops.get_netdev() to lookup netdev.
> 
> I would say that calls to ibdev ops from ULPs was never been right
> thing to do. The ib_device_set_netdev() was introduced for the drivers.
> 
> So the whole commit message is not accurate and better to be rewritten.
> 
>> As this commit 8d159eb2117b
>> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
>> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
>> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
>> device driver.
> 
> It is not a correct statement too. All modern drivers (for last 5 years)
> don't have that .get_netdev() ops, so it is not mlx5 specific, but another
> justification to say that SMC-R was doing it wrong.
> 
>> Thus, using ib_device_set_netdev() now became mandatory.
> 
> ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
> with ULPs.
> 
>>
>> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> It is too late for me to do proper review for today, but I would say
> that it is worth to pay attention to multiple dev_put() calls in the
> functions around the ib_device_get_netdev().
> 
>>
>> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
>> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> 
> It is not related to this change Fixes line.
> 

Hi Leon,

Thank you for the review! I agree that SMC could do better. However, we 
should fix it and give enough information and reference on the changes, 
since the code has already existed and didn't work with the old way. I 
can rewrite the commit message.

What about:
"
The SMC-R variant of the SMC protocol still called 
ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device 
driver to run SMC-R, it failed to find a device, because in mlx5_ib the 
internal net device management for retrieving net devices was replaced 
by a common interface ib_device_get_netdev() in commit 8d159eb2117b 
("RDMA/mlx5: Use IB set_netdev and get_netdev functions"). Thus, replace 
ib_device_ops.get_netdev() with ib_device_get_netdev() in SMC.
"

Thanks,
Wenjia
Wenjia Zhang Nov. 5, 2024, 9:53 a.m. UTC | #11
On 31.10.24 11:01, Paolo Abeni wrote:
> On 10/25/24 09:23, Wenjia Zhang wrote:
>> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
>> alternative to get_netdev") introduced an API ib_device_get_netdev.
>> The SMC-R variant of the SMC protocol continued to use the old API
>> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
>> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
>> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
>> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
>> device driver. Thus, using ib_device_set_netdev() now became mandatory.
>>
>> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>>
>> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
>> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
>> Reported-by: Aswin K <aswin@linux.ibm.com>
>> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
>> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
> 
> Please adjust the commit message as per Leon suggestion. You can retain
> all the ack collected so far.
> 
> Thanks,
> 
> Paolo
> 
Hi Paolo,

thank you for the reminder!
Sure, I'll do it.

Thanks,
Wenjia
Leon Romanovsky Nov. 5, 2024, 11:23 a.m. UTC | #12
On Tue, Nov 05, 2024 at 10:50:45AM +0100, Wenjia Zhang wrote:
> 
> 
> On 27.10.24 21:18, Leon Romanovsky wrote:
> > On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> > > Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> > > alternative to get_netdev") introduced an API ib_device_get_netdev.
> > > The SMC-R variant of the SMC protocol continued to use the old API
> > > ib_device_ops.get_netdev() to lookup netdev.
> > 
> > I would say that calls to ibdev ops from ULPs was never been right
> > thing to do. The ib_device_set_netdev() was introduced for the drivers.
> > 
> > So the whole commit message is not accurate and better to be rewritten.
> > 
> > > As this commit 8d159eb2117b
> > > ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> > > get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> > > ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> > > device driver.
> > 
> > It is not a correct statement too. All modern drivers (for last 5 years)
> > don't have that .get_netdev() ops, so it is not mlx5 specific, but another
> > justification to say that SMC-R was doing it wrong.
> > 
> > > Thus, using ib_device_set_netdev() now became mandatory.
> > 
> > ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
> > with ULPs.
> > 
> > > 
> > > Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> > 
> > It is too late for me to do proper review for today, but I would say
> > that it is worth to pay attention to multiple dev_put() calls in the
> > functions around the ib_device_get_netdev().
> > 
> > > 
> > > Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> > > Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> > 
> > It is not related to this change Fixes line.
> > 
> 
> Hi Leon,
> 
> Thank you for the review! I agree that SMC could do better. However, we
> should fix it and give enough information and reference on the changes,
> since the code has already existed and didn't work with the old way. 

The code which you change worked by chance and was wrong from day one.

> I can rewrite the commit message.
> 
> What about:
> "
> The SMC-R variant of the SMC protocol still called
> ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device driver
> to run SMC-R, it failed to find a device, because in mlx5_ib the internal
> net device management for retrieving net devices was replaced by a common
> interface ib_device_get_netdev() in commit 8d159eb2117b ("RDMA/mlx5: Use IB
> set_netdev and get_netdev functions"). Thus, replace
> ib_device_ops.get_netdev() with ib_device_get_netdev() in SMC.
> "

 The SMC-R variant of the SMC protocol used direct call to ib_device_ops.get_netdev()
 function to lookup netdev. Such direct accesses are not correct for any
 usage outside of RDMA core code. 

 RDMA subsystem provides ib_device_get_netdev() function that works on
 all RDMA drivers returns valid netdev with proper locking an reference
 counting. The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
 functions") exposed that SMC-R didn't use that function.

 So update the SMC-R to use proper API,

Thanks
Wenjia Zhang Nov. 5, 2024, 12:30 p.m. UTC | #13
On 05.11.24 12:23, Leon Romanovsky wrote:
> On Tue, Nov 05, 2024 at 10:50:45AM +0100, Wenjia Zhang wrote:
>>
>>
>> On 27.10.24 21:18, Leon Romanovsky wrote:
>>> On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
>>>> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
>>>> alternative to get_netdev") introduced an API ib_device_get_netdev.
>>>> The SMC-R variant of the SMC protocol continued to use the old API
>>>> ib_device_ops.get_netdev() to lookup netdev.
>>>
>>> I would say that calls to ibdev ops from ULPs was never been right
>>> thing to do. The ib_device_set_netdev() was introduced for the drivers.
>>>
>>> So the whole commit message is not accurate and better to be rewritten.
>>>
>>>> As this commit 8d159eb2117b
>>>> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
>>>> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
>>>> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
>>>> device driver.
>>>
>>> It is not a correct statement too. All modern drivers (for last 5 years)
>>> don't have that .get_netdev() ops, so it is not mlx5 specific, but another
>>> justification to say that SMC-R was doing it wrong.
>>>
>>>> Thus, using ib_device_set_netdev() now became mandatory.
>>>
>>> ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
>>> with ULPs.
>>>
>>>>
>>>> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>>>
>>> It is too late for me to do proper review for today, but I would say
>>> that it is worth to pay attention to multiple dev_put() calls in the
>>> functions around the ib_device_get_netdev().
>>>
>>>>
>>>> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
>>>> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
>>>
>>> It is not related to this change Fixes line.
>>>
>>
>> Hi Leon,
>>
>> Thank you for the review! I agree that SMC could do better. However, we
>> should fix it and give enough information and reference on the changes,
>> since the code has already existed and didn't work with the old way.
> 
> The code which you change worked by chance and was wrong from day one.
> 
>> I can rewrite the commit message.
>>
>> What about:
>> "
>> The SMC-R variant of the SMC protocol still called
>> ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device driver
>> to run SMC-R, it failed to find a device, because in mlx5_ib the internal
>> net device management for retrieving net devices was replaced by a common
>> interface ib_device_get_netdev() in commit 8d159eb2117b ("RDMA/mlx5: Use IB
>> set_netdev and get_netdev functions"). Thus, replace
>> ib_device_ops.get_netdev() with ib_device_get_netdev() in SMC.
>> "
> 
>   The SMC-R variant of the SMC protocol used direct call to ib_device_ops.get_netdev()
>   function to lookup netdev. Such direct accesses are not correct for any
>   usage outside of RDMA core code.
> 
Is such an absolute statement documented somewhere? If not, I don't 
think it's convenient that I use it. Maybe you guys as RDMA core 
maintainer can, not I.
>   RDMA subsystem provides ib_device_get_netdev() function that works on
>   all RDMA drivers returns valid netdev with proper locking an reference
>   counting. The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
>   functions") exposed that SMC-R didn't use that function.
> 
>   So update the SMC-R to use proper API,
> 
> Thanks
> 
mhhh, I'd like to stick to my version, which sounds more neutral IMO. I 
think the purpose is the same.

Thanks,
Wenjia
Leon Romanovsky Nov. 5, 2024, 1:39 p.m. UTC | #14
On Tue, Nov 05, 2024 at 01:30:24PM +0100, Wenjia Zhang wrote:
> 
> 
> On 05.11.24 12:23, Leon Romanovsky wrote:
> > On Tue, Nov 05, 2024 at 10:50:45AM +0100, Wenjia Zhang wrote:
> > > 
> > > 
> > > On 27.10.24 21:18, Leon Romanovsky wrote:
> > > > On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> > > > > Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> > > > > alternative to get_netdev") introduced an API ib_device_get_netdev.
> > > > > The SMC-R variant of the SMC protocol continued to use the old API
> > > > > ib_device_ops.get_netdev() to lookup netdev.
> > > > 
> > > > I would say that calls to ibdev ops from ULPs was never been right
> > > > thing to do. The ib_device_set_netdev() was introduced for the drivers.
> > > > 
> > > > So the whole commit message is not accurate and better to be rewritten.
> > > > 
> > > > > As this commit 8d159eb2117b
> > > > > ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> > > > > get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> > > > > ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> > > > > device driver.
> > > > 
> > > > It is not a correct statement too. All modern drivers (for last 5 years)
> > > > don't have that .get_netdev() ops, so it is not mlx5 specific, but another
> > > > justification to say that SMC-R was doing it wrong.
> > > > 
> > > > > Thus, using ib_device_set_netdev() now became mandatory.
> > > > 
> > > > ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
> > > > with ULPs.
> > > > 
> > > > > 
> > > > > Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> > > > 
> > > > It is too late for me to do proper review for today, but I would say
> > > > that it is worth to pay attention to multiple dev_put() calls in the
> > > > functions around the ib_device_get_netdev().
> > > > 
> > > > > 
> > > > > Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> > > > > Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> > > > 
> > > > It is not related to this change Fixes line.
> > > > 
> > > 
> > > Hi Leon,
> > > 
> > > Thank you for the review! I agree that SMC could do better. However, we
> > > should fix it and give enough information and reference on the changes,
> > > since the code has already existed and didn't work with the old way.
> > 
> > The code which you change worked by chance and was wrong from day one.
> > 
> > > I can rewrite the commit message.
> > > 
> > > What about:
> > > "
> > > The SMC-R variant of the SMC protocol still called
> > > ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device driver
> > > to run SMC-R, it failed to find a device, because in mlx5_ib the internal
> > > net device management for retrieving net devices was replaced by a common
> > > interface ib_device_get_netdev() in commit 8d159eb2117b ("RDMA/mlx5: Use IB
> > > set_netdev and get_netdev functions"). Thus, replace
> > > ib_device_ops.get_netdev() with ib_device_get_netdev() in SMC.
> > > "
> > 
> >   The SMC-R variant of the SMC protocol used direct call to ib_device_ops.get_netdev()
> >   function to lookup netdev. Such direct accesses are not correct for any
> >   usage outside of RDMA core code.
> > 
> Is such an absolute statement documented somewhere? If not, I don't think
> it's convenient that I use it. Maybe you guys as RDMA core maintainer can,
> not I.

You can too as it is very clear. All functions which can be used have
EXPORT_SYMBOL near them, ops.get_netdev() has nothing like that.

> >   RDMA subsystem provides ib_device_get_netdev() function that works on
> >   all RDMA drivers returns valid netdev with proper locking an reference
> >   counting. The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
> >   functions") exposed that SMC-R didn't use that function.
> > 
> >   So update the SMC-R to use proper API,
> > 
> > Thanks
> > 
> mhhh, I'd like to stick to my version, which sounds more neutral IMO. I
> think the purpose is the same.

I don't want to argue about the words, my point is that get_netdev() was
never been the right interface.

Thanks

> 
> Thanks,
> Wenjia
Wenjia Zhang Nov. 5, 2024, 2:14 p.m. UTC | #15
On 05.11.24 14:39, Leon Romanovsky wrote:
> On Tue, Nov 05, 2024 at 01:30:24PM +0100, Wenjia Zhang wrote:
>>
>>
>> On 05.11.24 12:23, Leon Romanovsky wrote:
>>> On Tue, Nov 05, 2024 at 10:50:45AM +0100, Wenjia Zhang wrote:
>>>>
>>>>
>>>> On 27.10.24 21:18, Leon Romanovsky wrote:
>>>>> On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
>>>>>> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
>>>>>> alternative to get_netdev") introduced an API ib_device_get_netdev.
>>>>>> The SMC-R variant of the SMC protocol continued to use the old API
>>>>>> ib_device_ops.get_netdev() to lookup netdev.
>>>>>
>>>>> I would say that calls to ibdev ops from ULPs was never been right
>>>>> thing to do. The ib_device_set_netdev() was introduced for the drivers.
>>>>>
>>>>> So the whole commit message is not accurate and better to be rewritten.
>>>>>
>>>>>> As this commit 8d159eb2117b
>>>>>> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
>>>>>> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
>>>>>> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
>>>>>> device driver.
>>>>>
>>>>> It is not a correct statement too. All modern drivers (for last 5 years)
>>>>> don't have that .get_netdev() ops, so it is not mlx5 specific, but another
>>>>> justification to say that SMC-R was doing it wrong.
>>>>>
>>>>>> Thus, using ib_device_set_netdev() now became mandatory.
>>>>>
>>>>> ib_device_set_netdev() is mandatory for the drivers, it is nothing to do
>>>>> with ULPs.
>>>>>
>>>>>>
>>>>>> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>>>>>
>>>>> It is too late for me to do proper review for today, but I would say
>>>>> that it is worth to pay attention to multiple dev_put() calls in the
>>>>> functions around the ib_device_get_netdev().
>>>>>
>>>>>>
>>>>>> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
>>>>>> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
>>>>>
>>>>> It is not related to this change Fixes line.
>>>>>
>>>>
>>>> Hi Leon,
>>>>
>>>> Thank you for the review! I agree that SMC could do better. However, we
>>>> should fix it and give enough information and reference on the changes,
>>>> since the code has already existed and didn't work with the old way.
>>>
>>> The code which you change worked by chance and was wrong from day one.
>>>
>>>> I can rewrite the commit message.
>>>>
>>>> What about:
>>>> "
>>>> The SMC-R variant of the SMC protocol still called
>>>> ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device driver
>>>> to run SMC-R, it failed to find a device, because in mlx5_ib the internal
>>>> net device management for retrieving net devices was replaced by a common
>>>> interface ib_device_get_netdev() in commit 8d159eb2117b ("RDMA/mlx5: Use IB
>>>> set_netdev and get_netdev functions"). Thus, replace
>>>> ib_device_ops.get_netdev() with ib_device_get_netdev() in SMC.
>>>> "
>>>
>>>    The SMC-R variant of the SMC protocol used direct call to ib_device_ops.get_netdev()
>>>    function to lookup netdev. Such direct accesses are not correct for any
>>>    usage outside of RDMA core code.
>>>
>> Is such an absolute statement documented somewhere? If not, I don't think
>> it's convenient that I use it. Maybe you guys as RDMA core maintainer can,
>> not I.
> 
> You can too as it is very clear. All functions which can be used have
> EXPORT_SYMBOL near them, ops.get_netdev() has nothing like that.
> 
>>>    RDMA subsystem provides ib_device_get_netdev() function that works on
>>>    all RDMA drivers returns valid netdev with proper locking an reference
>>>    counting. The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
>>>    functions") exposed that SMC-R didn't use that function.
>>>
>>>    So update the SMC-R to use proper API,
>>>
>>> Thanks
>>>
>> mhhh, I'd like to stick to my version, which sounds more neutral IMO. I
>> think the purpose is the same.
> 
> I don't want to argue about the words, my point is that get_netdev() was
> never been the right interface.
> 
> Thanks
> 
Ok, I got you. I'll send v2 with a proper commit message.

Thanks,
Wenjia
Halil Pasic Nov. 6, 2024, 9:24 a.m. UTC | #16
On Tue, 5 Nov 2024 13:23:13 +0200
Leon Romanovsky <leon@kernel.org> wrote:

> On Tue, Nov 05, 2024 at 10:50:45AM +0100, Wenjia Zhang wrote:
> > 
> > 
> > On 27.10.24 21:18, Leon Romanovsky wrote:  
> > > On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:  
> > > > Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as
> > > > an alternative to get_netdev") introduced an API
> > > > ib_device_get_netdev. The SMC-R variant of the SMC protocol
> > > > continued to use the old API ib_device_ops.get_netdev() to
> > > > lookup netdev.  
> > > 
> > > I would say that calls to ibdev ops from ULPs was never been right
> > > thing to do. The ib_device_set_netdev() was introduced for the
> > > drivers.
> > > 
> > > So the whole commit message is not accurate and better to be
> > > rewritten. 
> > > > As this commit 8d159eb2117b
> > > > ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> > > > removed the get_netdev callback from
> > > > mlx5_ib_dev_common_roce_ops, calling ib_device_ops.get_netdev
> > > > didn't work any more at least by using a mlx5 device driver.  
> > > 
> > > It is not a correct statement too. All modern drivers (for last 5
> > > years) don't have that .get_netdev() ops, so it is not mlx5
> > > specific, but another justification to say that SMC-R was doing it
> > > wrong. 
> > > > Thus, using ib_device_set_netdev() now became mandatory.  
> > > 
> > > ib_device_set_netdev() is mandatory for the drivers, it is nothing
> > > to do with ULPs.
> > >   
> > > > 
> > > > Replace ib_device_ops.get_netdev() with ib_device_get_netdev().  
> > > 
> > > It is too late for me to do proper review for today, but I would
> > > say that it is worth to pay attention to multiple dev_put() calls
> > > in the functions around the ib_device_get_netdev().
> > >   
> > > > 
> > > > Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> > > > Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and
> > > > get_netdev functions")  
> > > 
> > > It is not related to this change Fixes line.
> > >   
> > 
> > Hi Leon,
> > 
> > Thank you for the review! I agree that SMC could do better. However,
> > we should fix it and give enough information and reference on the
> > changes, since the code has already existed and didn't work with the
> > old way.   
> 
> The code which you change worked by chance and was wrong from day one.

I absolutely agree with that statement. But please notice that the
commit date of commit c2261dd76b54 ("RDMA/device: Add
ib_device_set_netdev() as an alternative to get_netdev") predates the
commit date of commit 54903572c23c ("net/smc: allow pnetid-less
configuration") only by 9 days. And before commit c2261dd76b54
("RDMA/device: Add ib_device_set_netdev() as an alternative to
get_netdev") there was no 
ib_device_get_netdev() AFAICT.

Maybe the two patches crossed mid air so to say.

> 
> > I can rewrite the commit message.
> > 
> > What about:
> > "
> > The SMC-R variant of the SMC protocol still called
> > ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device
> > driver to run SMC-R, it failed to find a device, because in mlx5_ib
> > the internal net device management for retrieving net devices was
> > replaced by a common interface ib_device_get_netdev() in commit
> > 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
> > functions"). Thus, replace ib_device_ops.get_netdev() with
> > ib_device_get_netdev() in SMC. "  
> 
>  The SMC-R variant of the SMC protocol used direct call to
> ib_device_ops.get_netdev() function to lookup netdev. Such direct
> accesses are not correct for any usage outside of RDMA core code. 
> 

I agree, it is not correct since c2261dd76b54 ("RDMA/device: Add
ib_device_set_netdev() as an alternative to get_netdev").

Does fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?
I would guess it is not, and I would not actually mind sending a patch
but I have trouble figuring out the logic behind  commit ecce70cf17d9
("ksmbd: fix missing RDMA-capable flag for IPoIB device in
ksmbd_rdma_capable_netdev()").


>  RDMA subsystem provides ib_device_get_netdev() function that works on
>  all RDMA drivers returns valid netdev with proper locking an reference
>  counting. The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and
> get_netdev functions") exposed that SMC-R didn't use that function.
> 

I believe the intention was this all along. I think the commit message
was written with the idea that 54903572c23c happened before c2261dd76b54
which is not the case.

>  So update the SMC-R to use proper API,
> 

I believe this is exactly what the patch does! And I agree we need to
improve on the commit message.

Regards,
Halil
Leon Romanovsky Nov. 6, 2024, 1:59 p.m. UTC | #17
On Wed, Nov 06, 2024 at 10:24:39AM +0100, Halil Pasic wrote:
> On Tue, 5 Nov 2024 13:23:13 +0200
> Leon Romanovsky <leon@kernel.org> wrote:
> 
> > On Tue, Nov 05, 2024 at 10:50:45AM +0100, Wenjia Zhang wrote:
> > > 
> > > 
> > > On 27.10.24 21:18, Leon Romanovsky wrote:  
> > > > On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:  
> > > > > Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as
> > > > > an alternative to get_netdev") introduced an API
> > > > > ib_device_get_netdev. The SMC-R variant of the SMC protocol
> > > > > continued to use the old API ib_device_ops.get_netdev() to
> > > > > lookup netdev.  
> > > > 
> > > > I would say that calls to ibdev ops from ULPs was never been right
> > > > thing to do. The ib_device_set_netdev() was introduced for the
> > > > drivers.
> > > > 
> > > > So the whole commit message is not accurate and better to be
> > > > rewritten. 
> > > > > As this commit 8d159eb2117b
> > > > > ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> > > > > removed the get_netdev callback from
> > > > > mlx5_ib_dev_common_roce_ops, calling ib_device_ops.get_netdev
> > > > > didn't work any more at least by using a mlx5 device driver.  
> > > > 
> > > > It is not a correct statement too. All modern drivers (for last 5
> > > > years) don't have that .get_netdev() ops, so it is not mlx5
> > > > specific, but another justification to say that SMC-R was doing it
> > > > wrong. 
> > > > > Thus, using ib_device_set_netdev() now became mandatory.  
> > > > 
> > > > ib_device_set_netdev() is mandatory for the drivers, it is nothing
> > > > to do with ULPs.
> > > >   
> > > > > 
> > > > > Replace ib_device_ops.get_netdev() with ib_device_get_netdev().  
> > > > 
> > > > It is too late for me to do proper review for today, but I would
> > > > say that it is worth to pay attention to multiple dev_put() calls
> > > > in the functions around the ib_device_get_netdev().
> > > >   
> > > > > 
> > > > > Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> > > > > Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and
> > > > > get_netdev functions")  
> > > > 
> > > > It is not related to this change Fixes line.
> > > >   
> > > 
> > > Hi Leon,
> > > 
> > > Thank you for the review! I agree that SMC could do better. However,
> > > we should fix it and give enough information and reference on the
> > > changes, since the code has already existed and didn't work with the
> > > old way.   
> > 
> > The code which you change worked by chance and was wrong from day one.
> 
> I absolutely agree with that statement. But please notice that the
> commit date of commit c2261dd76b54 ("RDMA/device: Add
> ib_device_set_netdev() as an alternative to get_netdev") predates the
> commit date of commit 54903572c23c ("net/smc: allow pnetid-less
> configuration") only by 9 days. And before commit c2261dd76b54
> ("RDMA/device: Add ib_device_set_netdev() as an alternative to
> get_netdev") there was no 
> ib_device_get_netdev() AFAICT.

It doesn't make it right.

1. While commit c2261dd76b54 was submitted and discussed, RDMA was not CCed.
2. Author didn't try to add his version of ib_device_get_netdev() as it
is done for all APIs exposed by RDMA core.

> 
> Maybe the two patches crossed mid air so to say.
> 
> > 
> > > I can rewrite the commit message.
> > > 
> > > What about:
> > > "
> > > The SMC-R variant of the SMC protocol still called
> > > ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device
> > > driver to run SMC-R, it failed to find a device, because in mlx5_ib
> > > the internal net device management for retrieving net devices was
> > > replaced by a common interface ib_device_get_netdev() in commit
> > > 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
> > > functions"). Thus, replace ib_device_ops.get_netdev() with
> > > ib_device_get_netdev() in SMC. "  
> > 
> >  The SMC-R variant of the SMC protocol used direct call to
> > ib_device_ops.get_netdev() function to lookup netdev. Such direct
> > accesses are not correct for any usage outside of RDMA core code. 
> > 
> 
> I agree, it is not correct since c2261dd76b54 ("RDMA/device: Add
> ib_device_set_netdev() as an alternative to get_netdev").
> 
> Does fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?

RDMA core code is drivers/infiniband/core/*.

> I would guess it is not, and I would not actually mind sending a patch
> but I have trouble figuring out the logic behind  commit ecce70cf17d9
> ("ksmbd: fix missing RDMA-capable flag for IPoIB device in
> ksmbd_rdma_capable_netdev()").

It is strange version of RDMA-CM. All other ULPs use RDMA-CM to avoid
GID, netdev and fabric complexity.

> 
> 
> >  RDMA subsystem provides ib_device_get_netdev() function that works on
> >  all RDMA drivers returns valid netdev with proper locking an reference
> >  counting. The commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and
> > get_netdev functions") exposed that SMC-R didn't use that function.
> > 
> 
> I believe the intention was this all along. I think the commit message
> was written with the idea that 54903572c23c happened before c2261dd76b54
> which is not the case.
> 
> >  So update the SMC-R to use proper API,
> > 
> 
> I believe this is exactly what the patch does! And I agree we need to
> improve on the commit message.
> 
> Regards,
> Halil
Halil Pasic Nov. 7, 2024, 11:47 a.m. UTC | #18
On Wed, 6 Nov 2024 15:59:10 +0200
Leon Romanovsky <leon@kernel.org> wrote:

> > I absolutely agree with that statement. But please notice that the
> > commit date of commit c2261dd76b54 ("RDMA/device: Add
> > ib_device_set_netdev() as an alternative to get_netdev") predates the
> > commit date of commit 54903572c23c ("net/smc: allow pnetid-less
> > configuration") only by 9 days. And before commit c2261dd76b54
> > ("RDMA/device: Add ib_device_set_netdev() as an alternative to
> > get_netdev") there was no 
> > ib_device_get_netdev() AFAICT.  
> 
> It doesn't make it right.

I agree!
> 
> 1. While commit c2261dd76b54 was submitted and discussed, RDMA was not
> CCed.

Would the RDMA community agree with adding 
L:	linux-rdma@vger.kernel.org
to the "SHARED MEMORY COMMUNICATIONS (SMC) SOCKETS" section of the
MAINTAINERS file, so that get_maintainer.pl tells contributors to cc
RDMA?

In my personal opinion SMC would have benefited greatly from review by
the RDMA community, and this is not the first time where the RDMA
community was not included where it should have been.
 
> 2. Author didn't try to add his version of ib_device_get_netdev() as it
> is done for all APIs exposed by RDMA core.

I understand now that direct access to ops callbacks is off limits for
ULPs. I'm not sure I understand all the details, but I hope I don't have
to.

Regards,
Halil
Halil Pasic Nov. 7, 2024, 11:56 a.m. UTC | #19
On Wed, 6 Nov 2024 15:59:10 +0200
Leon Romanovsky <leon@kernel.org> wrote:

> > Does  fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?  
> 
> RDMA core code is drivers/infiniband/core/*.

Understood. So this is a violation of the no direct access to the
callbacks rule.

> 
> > I would guess it is not, and I would not actually mind sending a patch
> > but I have trouble figuring out the logic behind  commit ecce70cf17d9
> > ("ksmbd: fix missing RDMA-capable flag for IPoIB device in
> > ksmbd_rdma_capable_netdev()").  
> 
> It is strange version of RDMA-CM. All other ULPs use RDMA-CM to avoid
> GID, netdev and fabric complexity.

I'm not familiar enough with either of the subsystems. Based on your
answer my guess is that it ain't outright bugous but still a layering 
violation. Copying linux-cifs@vger.kernel.org so that 
the smb are aware.

Thank you very much for all the explanations!

Regards,
Halil
Leon Romanovsky Nov. 7, 2024, 12:08 p.m. UTC | #20
On Thu, Nov 07, 2024 at 12:47:11PM +0100, Halil Pasic wrote:
> On Wed, 6 Nov 2024 15:59:10 +0200
> Leon Romanovsky <leon@kernel.org> wrote:
> 
> > > I absolutely agree with that statement. But please notice that the
> > > commit date of commit c2261dd76b54 ("RDMA/device: Add
> > > ib_device_set_netdev() as an alternative to get_netdev") predates the
> > > commit date of commit 54903572c23c ("net/smc: allow pnetid-less
> > > configuration") only by 9 days. And before commit c2261dd76b54
> > > ("RDMA/device: Add ib_device_set_netdev() as an alternative to
> > > get_netdev") there was no 
> > > ib_device_get_netdev() AFAICT.  
> > 
> > It doesn't make it right.
> 
> I agree!
> > 
> > 1. While commit c2261dd76b54 was submitted and discussed, RDMA was not
> > CCed.
> 
> Would the RDMA community agree with adding 
> L:	linux-rdma@vger.kernel.org
> to the "SHARED MEMORY COMMUNICATIONS (SMC) SOCKETS" section of the
> MAINTAINERS file, so that get_maintainer.pl tells contributors to cc
> RDMA?

Yes, of course. We always curious to see how our in-kernel API works and
if it needs adjustments rather than ULP hacks to overcome its limitations.

> 
> In my personal opinion SMC would have benefited greatly from review by
> the RDMA community, and this is not the first time where the RDMA
> community was not included where it should have been.

Jakub pushes SMC authors to CC RDMA, which is great, but it wasn't in
the past and your idea of adding new entry to MAINTAINERS file will
help.

>  
> > 2. Author didn't try to add his version of ib_device_get_netdev() as it
> > is done for all APIs exposed by RDMA core.
> 
> I understand now that direct access to ops callbacks is off limits for
> ULPs. I'm not sure I understand all the details, but I hope I don't have
> to.
> 
> Regards,
> Halil
Leon Romanovsky Nov. 7, 2024, 12:13 p.m. UTC | #21
On Thu, Nov 07, 2024 at 12:56:43PM +0100, Halil Pasic wrote:
> On Wed, 6 Nov 2024 15:59:10 +0200
> Leon Romanovsky <leon@kernel.org> wrote:
> 
> > > Does  fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?  
> > 
> > RDMA core code is drivers/infiniband/core/*.
> 
> Understood. So this is a violation of the no direct access to the
> callbacks rule.

It is not rule, but more common sense. Callbacks don't provide any
module reference counting, module autoload e.t.c

It is very rare situation where you call device callbacks from one subsystem
in another. I'm not familiar with such situations.

> 
> > 
> > > I would guess it is not, and I would not actually mind sending a patch
> > > but I have trouble figuring out the logic behind  commit ecce70cf17d9
> > > ("ksmbd: fix missing RDMA-capable flag for IPoIB device in
> > > ksmbd_rdma_capable_netdev()").  
> > 
> > It is strange version of RDMA-CM. All other ULPs use RDMA-CM to avoid
> > GID, netdev and fabric complexity.
> 
> I'm not familiar enough with either of the subsystems. Based on your
> answer my guess is that it ain't outright bugous but still a layering 
> violation. Copying linux-cifs@vger.kernel.org so that 
> the smb are aware.
> 
> Thank you very much for all the explanations!
> 
> Regards,
> Halil
Namjae Jeon Nov. 7, 2024, 11:40 p.m. UTC | #22
On Thu, Nov 7, 2024 at 9:00 PM Halil Pasic <pasic@linux.ibm.com> wrote:
>
> On Wed, 6 Nov 2024 15:59:10 +0200
> Leon Romanovsky <leon@kernel.org> wrote:
>
> > > Does  fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?
> >
> > RDMA core code is drivers/infiniband/core/*.
>
> Understood. So this is a violation of the no direct access to the
> callbacks rule.
>
> >
> > > I would guess it is not, and I would not actually mind sending a patch
> > > but I have trouble figuring out the logic behind  commit ecce70cf17d9
> > > ("ksmbd: fix missing RDMA-capable flag for IPoIB device in
> > > ksmbd_rdma_capable_netdev()").
> >
> > It is strange version of RDMA-CM. All other ULPs use RDMA-CM to avoid
> > GID, netdev and fabric complexity.
>
> I'm not familiar enough with either of the subsystems. Based on your
> answer my guess is that it ain't outright bugous but still a layering
> violation. Copying linux-cifs@vger.kernel.org so that
> the smb are aware.
Could you please elaborate what the violation is ?
I would also appreciate it if you could suggest to me how to fix this.

Thanks.
>
> Thank you very much for all the explanations!
>
> Regards,
> Halil
>
Leon Romanovsky Nov. 8, 2024, 5:59 p.m. UTC | #23
On Fri, Nov 08, 2024 at 08:40:40AM +0900, Namjae Jeon wrote:
> On Thu, Nov 7, 2024 at 9:00 PM Halil Pasic <pasic@linux.ibm.com> wrote:
> >
> > On Wed, 6 Nov 2024 15:59:10 +0200
> > Leon Romanovsky <leon@kernel.org> wrote:
> >
> > > > Does  fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?
> > >
> > > RDMA core code is drivers/infiniband/core/*.
> >
> > Understood. So this is a violation of the no direct access to the
> > callbacks rule.
> >
> > >
> > > > I would guess it is not, and I would not actually mind sending a patch
> > > > but I have trouble figuring out the logic behind  commit ecce70cf17d9
> > > > ("ksmbd: fix missing RDMA-capable flag for IPoIB device in
> > > > ksmbd_rdma_capable_netdev()").
> > >
> > > It is strange version of RDMA-CM. All other ULPs use RDMA-CM to avoid
> > > GID, netdev and fabric complexity.
> >
> > I'm not familiar enough with either of the subsystems. Based on your
> > answer my guess is that it ain't outright bugous but still a layering
> > violation. Copying linux-cifs@vger.kernel.org so that
> > the smb are aware.
> Could you please elaborate what the violation is ?

There are many, but the most screaming is that ksmbd has logic to
differentiate IPoIB devices. These devices are pure netdev devices
and should be treated like that. ULPs should treat them exactly
as they treat netdev devices.

> I would also appreciate it if you could suggest to me how to fix this.
> 
> Thanks.
> >
> > Thank you very much for all the explanations!
> >
> > Regards,
> > Halil
> >
Namjae Jeon Nov. 9, 2024, 5:32 a.m. UTC | #24
On Sat, Nov 9, 2024 at 2:59 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Fri, Nov 08, 2024 at 08:40:40AM +0900, Namjae Jeon wrote:
> > On Thu, Nov 7, 2024 at 9:00 PM Halil Pasic <pasic@linux.ibm.com> wrote:
> > >
> > > On Wed, 6 Nov 2024 15:59:10 +0200
> > > Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > > > Does  fs/smb/server/transport_rdma.c qualify as inside of RDMA core code?
> > > >
> > > > RDMA core code is drivers/infiniband/core/*.
> > >
> > > Understood. So this is a violation of the no direct access to the
> > > callbacks rule.
> > >
> > > >
> > > > > I would guess it is not, and I would not actually mind sending a patch
> > > > > but I have trouble figuring out the logic behind  commit ecce70cf17d9
> > > > > ("ksmbd: fix missing RDMA-capable flag for IPoIB device in
> > > > > ksmbd_rdma_capable_netdev()").
> > > >
> > > > It is strange version of RDMA-CM. All other ULPs use RDMA-CM to avoid
> > > > GID, netdev and fabric complexity.
> > >
> > > I'm not familiar enough with either of the subsystems. Based on your
> > > answer my guess is that it ain't outright bugous but still a layering
> > > violation. Copying linux-cifs@vger.kernel.org so that
> > > the smb are aware.
> > Could you please elaborate what the violation is ?
>
> There are many, but the most screaming is that ksmbd has logic to
> differentiate IPoIB devices. These devices are pure netdev devices
> and should be treated like that. ULPs should treat them exactly
> as they treat netdev devices.
Okay, I'll discuss with Kangjing if there's another way to avoid this issue.
If not, I'll revert the patch.

Thanks.
>
> > I would also appreciate it if you could suggest to me how to fix this.
> >
> > Thanks.
> > >
> > > Thank you very much for all the explanations!
> > >
> > > Regards,
> > > Halil
> > >
diff mbox series

Patch

diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 9297dc20bfe2..9c563cdbea90 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -899,9 +899,7 @@  static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
 	struct ib_device *ibdev = smcibdev->ibdev;
 	struct net_device *ndev;
 
-	if (!ibdev->ops.get_netdev)
-		return;
-	ndev = ibdev->ops.get_netdev(ibdev, port + 1);
+	ndev = ib_device_get_netdev(ibdev, port + 1);
 	if (ndev) {
 		smcibdev->ndev_ifidx[port] = ndev->ifindex;
 		dev_put(ndev);
@@ -921,9 +919,7 @@  void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
 		port_cnt = smcibdev->ibdev->phys_port_cnt;
 		for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
 			libdev = smcibdev->ibdev;
-			if (!libdev->ops.get_netdev)
-				continue;
-			lndev = libdev->ops.get_netdev(libdev, i + 1);
+			lndev = ib_device_get_netdev(libdev, i + 1);
 			dev_put(lndev);
 			if (lndev != ndev)
 				continue;
diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
index 1dd362326c0a..8566937c8903 100644
--- a/net/smc/smc_pnet.c
+++ b/net/smc/smc_pnet.c
@@ -1054,9 +1054,7 @@  static void smc_pnet_find_rdma_dev(struct net_device *netdev,
 		for (i = 1; i <= SMC_MAX_PORTS; i++) {
 			if (!rdma_is_port_valid(ibdev->ibdev, i))
 				continue;
-			if (!ibdev->ibdev->ops.get_netdev)
-				continue;
-			ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
+			ndev = ib_device_get_netdev(ibdev->ibdev, i);
 			if (!ndev)
 				continue;
 			dev_put(ndev);