
[net] net/smc: fix listen processing for SMC-Rv2

Message ID: 20220523055056.2078994-1-liuyacan@corp.netease.com
State: Accepted
Commit: 8c3b8dc5cc9bf6d273ebe18b16e2d6882bcfb36d
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for net
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers           success  CCed 7 of 7 maintainers
netdev/build_clang              success  Errors and warnings before: 0 this patch: 0
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 86 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

liuyacan@corp.netease.com May 23, 2022, 5:50 a.m. UTC
From: liuyacan <liuyacan@corp.netease.com>

In the process of checking whether RDMAv2 is available, the current
implementation first sets ini->smcrv2.ib_dev_v2, and then allocates
smc buf desc, but the latter may fail. Unfortunately, the caller
will only check the former. In this case, a NULL pointer reference
will occur in smc_clc_send_confirm_accept() when accessing
conn->rmb_desc.

This patch does two things:
1. Use the return code to determine whether V2 is available.
2. If the return code is NODEV, continue to check whether V1 is
available.

Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
Signed-off-by: liuyacan <liuyacan@corp.netease.com>
---
 net/smc/af_smc.c | 44 +++++++++++++++++++++++++++-----------------
 1 file changed, 27 insertions(+), 17 deletions(-)
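
The failure mode and the two points of the commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `struct init_info`, `struct conn`, the `RC_*` and `USE_*` constants and all function names are hypothetical stand-ins, not the kernel's definitions, and the real caller also takes the prefix check result into account before trying V1.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's structures. */
struct init_info { const char *ib_dev_v2; };	/* V2 device, or NULL */
struct conn      { int *rmb_desc; };		/* receive buffer, or NULL */

enum { RC_OK = 0, RC_NODEV = 1, RC_NOMEM = 2 };
enum { USE_NONE = 0, USE_V1 = 1, USE_V2 = 2 };

/* Models the V2 lookup. "outcome" selects the simulated result. */
static int find_v2(struct init_info *ini, struct conn *c, int outcome)
{
	static int buf;

	ini->ib_dev_v2 = NULL;
	c->rmb_desc = NULL;
	if (outcome == RC_NODEV)
		return RC_NODEV;	/* no device, pointer stays NULL */
	ini->ib_dev_v2 = "dev0";	/* device is recorded first ... */
	if (outcome == RC_NOMEM)
		return RC_NOMEM;	/* ... then the allocation fails */
	c->rmb_desc = &buf;
	return RC_OK;
}

/* Pointer-based check: V2 counts as usable whenever a device was recorded. */
static int pick_by_pointer(struct init_info *ini, struct conn *c, int outcome)
{
	find_v2(ini, c, outcome);
	return ini->ib_dev_v2 ? USE_V2 : USE_V1;
}

/* Return-code-based check: V2 on success, V1 only after "no device". */
static int pick_by_rc(struct init_info *ini, struct conn *c, int outcome)
{
	int rc = find_v2(ini, c, outcome);

	if (rc == RC_OK)
		return USE_V2;
	return rc == RC_NODEV ? USE_V1 : USE_NONE;
}
```

In this model, `pick_by_pointer()` selects V2 after a failed allocation while `rmb_desc` is still NULL, which corresponds to the state that leads to the crash described above; `pick_by_rc()` does not.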

Comments

patchwork-bot+netdevbpf@kernel.org May 23, 2022, 9:10 a.m. UTC | #1
Hello:

This patch was applied to netdev/net.git (master)
by David S. Miller <davem@davemloft.net>:

On Mon, 23 May 2022 13:50:56 +0800 you wrote:
> From: liuyacan <liuyacan@corp.netease.com>
> 
> In the process of checking whether RDMAv2 is available, the current
> implementation first sets ini->smcrv2.ib_dev_v2, and then allocates
> smc buf desc, but the latter may fail. Unfortunately, the caller
> will only check the former. In this case, a NULL pointer reference
> will occur in smc_clc_send_confirm_accept() when accessing
> conn->rmb_desc.
> 
> [...]

Here is the summary with links:
  - [net] net/smc: fix listen processing for SMC-Rv2
    https://git.kernel.org/netdev/net/c/8c3b8dc5cc9b

You are awesome, thank you!
Karsten Graul May 23, 2022, 11:19 a.m. UTC | #2
On 23/05/2022 07:50, liuyacan@corp.netease.com wrote:
> From: liuyacan <liuyacan@corp.netease.com>
> 
> In the process of checking whether RDMAv2 is available, the current
> implementation first sets ini->smcrv2.ib_dev_v2, and then allocates
> smc buf desc, but the latter may fail. Unfortunately, the caller
> will only check the former. In this case, a NULL pointer reference
> will occur in smc_clc_send_confirm_accept() when accessing
> conn->rmb_desc.
> 
> This patch does two things:
> 1. Use the return code to determine whether V2 is available.
> 2. If the return code is NODEV, continue to check whether V1 is
> available.
> 
> Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
> Signed-off-by: liuyacan <liuyacan@corp.netease.com>
> ---

I am not happy with this patch. You are right that this is a problem,
but the fix should be much simpler: set ini->smcrv2.ib_dev_v2 = NULL in
smc_find_rdma_v2_device_serv() after the not_found label, similar to what
is done for the ISM device in smc_find_ism_v1_device_serv().

Your patch changes many more things, and besides that you eliminated the calls
to smc_find_ism_store_rc() completely, which is not correct.

Since your patch was already applied (by the way, 3 hours 20 minutes after you
submitted it), please revert it and resend. Thank you.
liuyacan@corp.netease.com May 23, 2022, 12:12 p.m. UTC | #3
> I am not happy with this patch. You are right that this is a problem,
> but the fix should be much simpler: set ini->smcrv2.ib_dev_v2 = NULL in
> smc_find_rdma_v2_device_serv() after the not_found label, similar to what
> is done for the ISM device in smc_find_ism_v1_device_serv().
>
> Your patch changes many more things, and besides that you eliminated the calls
> to smc_find_ism_store_rc() completely, which is not correct.
>
> Since your patch was already applied (by the way, 3 hours 20 minutes after you
> submitted it), please revert it and resend. Thank you.

I also considered this approach. One question: do we need to do more
rollback work before the V1 check?

Specifically, in smc_find_rdma_v2_device_serv(), there are the following steps:

1. smc_listen_rdma_init()
   1.1 smc_conn_create()
   1.2 smc_buf_create()   --> may fail
2. smc_listen_rdma_reg()  --> may fail

When a later step fails, do we need to roll back the previous steps?
Thank you.
Karsten Graul May 23, 2022, 12:37 p.m. UTC | #4
On 23/05/2022 14:12, liuyacan@corp.netease.com wrote:
> I also considered this approach. One question: do we need to do more
> rollback work before the V1 check?
>
> Specifically, in smc_find_rdma_v2_device_serv(), there are the following steps:
>
> 1. smc_listen_rdma_init()
>    1.1 smc_conn_create()
>    1.2 smc_buf_create()   --> may fail
> 2. smc_listen_rdma_reg()  --> may fail
>
> When a later step fails, do we need to roll back the previous steps?

That is a good question, and I think it is a different problem for another patch.
smc_listen_rdma_init() should maybe call smc_conn_abort(), similar to what smc_listen_ism_init()
does in this situation. And when smc_listen_rdma_reg() fails ... hmm, we need to think about this.

We will also discuss this here in our team.
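
The rollback question raised in this exchange can be pictured with a generic unwind-on-failure pattern. The thread leaves open whether and how the real code should roll back, so the following is only one possible shape, under stated assumptions: `struct conn_state`, `conn_abort()` and `listen_init()` are hypothetical names that model the three steps listed in comment #3, and they do not reflect the behaviour of smc_conn_abort() or any other kernel function.

```c
#include <stdbool.h>

/* Hypothetical connection state; not the kernel's struct smc_connection. */
struct conn_state {
	bool conn_created;
	bool buf_created;
	bool registered;
};

/* Undo whatever earlier steps completed, newest first. */
static void conn_abort(struct conn_state *s)
{
	s->registered = false;
	s->buf_created = false;
	s->conn_created = false;
}

/* Runs the three steps in order. fail_at selects the simulated failure:
 * 0 = none, 2 = buffer creation, 3 = registration.
 */
static int listen_init(struct conn_state *s, int fail_at)
{
	s->conn_created = true;		/* step 1.1: create the connection */
	if (fail_at == 2)
		goto err_abort;
	s->buf_created = true;		/* step 1.2: create the buffers */
	if (fail_at == 3)
		goto err_abort;
	s->registered = true;		/* step 2: register the buffers */
	return 0;

err_abort:
	conn_abort(s);			/* leave no partial state behind */
	return -1;
}
```

With a single error label, a failure in any later step leaves the state as it was before the first step, so a subsequent V1 attempt would start from a clean state in this model.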
liuyacan@corp.netease.com May 23, 2022, 1:25 p.m. UTC | #5
> That is a good question, and I think it is a different problem for another patch.
> smc_listen_rdma_init() should maybe call smc_conn_abort(), similar to what smc_listen_ism_init()
> does in this situation. And when smc_listen_rdma_reg() fails ... hmm, we need to think about this.
> 
> We will also discuss this here in our team.

Ok, I will revert this patch and resend a simpler one. Thank you.
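
The simpler fix proposed in comment #2 (clear the device pointer after the not_found label) can be sketched as follows. This is a standalone model and not the resent patch: `struct init_info` and `find_v2_with_reset()` are hypothetical stand-ins, and the two integer flags only simulate whether a device exists and whether a later step fails.

```c
#include <stddef.h>

/* Hypothetical stand-in; NOT the kernel's struct smc_init_info. */
struct init_info { const char *ib_dev_v2; };

/* dev_present: a V2 device exists; step_fails: a later step fails
 * after the device pointer has already been recorded.
 */
static void find_v2_with_reset(struct init_info *ini, int dev_present,
			       int step_fails)
{
	if (!dev_present)
		goto not_found;
	ini->ib_dev_v2 = "dev0";	/* recorded before the fallible step */
	if (step_fails)
		goto not_found;
	return;

not_found:
	/* Undo the early assignment so that a caller checking only this
	 * pointer no longer sees V2 as available.
	 */
	ini->ib_dev_v2 = NULL;
}
```

The caller's existing pointer check then stays valid without changing the function's signature, which is what keeps this variant smaller than the applied patch.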

Patch

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 45a24d242..d3de54b70 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -2093,13 +2093,13 @@  static int smc_listen_rdma_reg(struct smc_sock *new_smc, bool local_first)
 	return 0;
 }
 
-static void smc_find_rdma_v2_device_serv(struct smc_sock *new_smc,
-					 struct smc_clc_msg_proposal *pclc,
-					 struct smc_init_info *ini)
+static int smc_find_rdma_v2_device_serv(struct smc_sock *new_smc,
+					struct smc_clc_msg_proposal *pclc,
+					struct smc_init_info *ini)
 {
 	struct smc_clc_v2_extension *smc_v2_ext;
 	u8 smcr_version;
-	int rc;
+	int rc = 0;
 
 	if (!(ini->smcr_version & SMC_V2) || !smcr_indicated(ini->smc_type_v2))
 		goto not_found;
@@ -2117,26 +2117,31 @@  static void smc_find_rdma_v2_device_serv(struct smc_sock *new_smc,
 	ini->smcrv2.saddr = new_smc->clcsock->sk->sk_rcv_saddr;
 	ini->smcrv2.daddr = smc_ib_gid_to_ipv4(smc_v2_ext->roce);
 	rc = smc_find_rdma_device(new_smc, ini);
-	if (rc) {
-		smc_find_ism_store_rc(rc, ini);
+	if (rc)
 		goto not_found;
-	}
+
 	if (!ini->smcrv2.uses_gateway)
 		memcpy(ini->smcrv2.nexthop_mac, pclc->lcl.mac, ETH_ALEN);
 
 	smcr_version = ini->smcr_version;
 	ini->smcr_version = SMC_V2;
 	rc = smc_listen_rdma_init(new_smc, ini);
-	if (!rc)
-		rc = smc_listen_rdma_reg(new_smc, ini->first_contact_local);
-	if (!rc)
-		return;
-	ini->smcr_version = smcr_version;
-	smc_find_ism_store_rc(rc, ini);
+	if (rc) {
+		ini->smcr_version = smcr_version;
+		goto not_found;
+	}
+	rc = smc_listen_rdma_reg(new_smc, ini->first_contact_local);
+	if (rc) {
+		ini->smcr_version = smcr_version;
+		goto not_found;
+	}
+	return 0;
 
 not_found:
+	rc = rc ?: SMC_CLC_DECL_NOSMCDEV;
 	ini->smcr_version &= ~SMC_V2;
 	ini->check_smcrv2 = false;
+	return rc;
 }
 
 static int smc_find_rdma_v1_device_serv(struct smc_sock *new_smc,
@@ -2169,6 +2174,7 @@  static int smc_listen_find_device(struct smc_sock *new_smc,
 				  struct smc_init_info *ini)
 {
 	int prfx_rc;
+	int rc;
 
 	/* check for ISM device matching V2 proposed device */
 	smc_find_ism_v2_device_serv(new_smc, pclc, ini);
@@ -2196,14 +2202,18 @@  static int smc_listen_find_device(struct smc_sock *new_smc,
 		return ini->rc ?: SMC_CLC_DECL_NOSMCDDEV;
 
 	/* check if RDMA V2 is available */
-	smc_find_rdma_v2_device_serv(new_smc, pclc, ini);
-	if (ini->smcrv2.ib_dev_v2)
+	rc = smc_find_rdma_v2_device_serv(new_smc, pclc, ini);
+	if (!rc)
 		return 0;
 
+	/* skip V1 check if V2 is unavailable for non-Device reason */
+	if (rc != SMC_CLC_DECL_NOSMCDEV &&
+	    rc != SMC_CLC_DECL_NOSMCRDEV &&
+	    rc != SMC_CLC_DECL_NOSMCDDEV)
+		return rc;
+
 	/* check if RDMA V1 is available */
 	if (!prfx_rc) {
-		int rc;
-
 		rc = smc_find_rdma_v1_device_serv(new_smc, pclc, ini);
 		smc_find_ism_store_rc(rc, ini);
 		return (!rc) ? 0 : ini->rc;