Message ID | 1663656189-32090-1-git-send-email-guwen@linux.alibaba.com |
---|---|
State | Accepted |
Commit | e738455b2c6dcdab03e45d97de36476f93f557d2 |
Delegated to: | Netdev Maintainers |
Series | [net] net/smc: Stop the CLC flow if no link to map buffers on |
On 2022/9/20 14:43, Wen Gu wrote:
> There might be a potential race between SMC-R buffer map and
> link group termination.
>
> [...]

Any comments or reviews are welcome and appreciated.

Thanks,
Wen Gu
Hello:

This patch was applied to netdev/net.git (master)
by Paolo Abeni <pabeni@redhat.com>:

On Tue, 20 Sep 2022 14:43:09 +0800 you wrote:
> There might be a potential race between SMC-R buffer map and
> link group termination.
>
> smc_smcr_terminate_all()     | smc_connect_rdma()
> --------------------------------------------------------------
>                              | smc_conn_create()
> for links in smcibdev        |
>         schedule links down  |
>                              | smc_buf_create()
>                              |  \- smcr_buf_map_usable_links()
>                              |      \- no usable links found,
>                              |         (rmb->mr = NULL)
>                              |
>                              | smc_clc_send_confirm()
>                              |  \- access conn->rmb_desc->mr[]->rkey
>                              |     (panic)
>
> [...]

Here is the summary with links:
  - [net] net/smc: Stop the CLC flow if no link to map buffers on
    https://git.kernel.org/netdev/net/c/e738455b2c6d

You are awesome, thank you!
On 22.09.22 10:29, Wen Gu wrote:
> On 2022/9/20 14:43, Wen Gu wrote:
>> There might be a potential race between SMC-R buffer map and
>> link group termination.
>>
>> [...]
>
> Any comments or reviews are welcome and appreciated.
>
> Thanks,
> Wen Gu

Sorry for the late answer!

Good catch! Thank you!

Acked-by: Wenjia Zhang <wenjia@linux.ibm.com>
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index ebf56cd..df89c2e 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -2239,7 +2239,7 @@ static struct smc_buf_desc *smcr_new_buf_create(struct smc_link_group *lgr,
 static int smcr_buf_map_usable_links(struct smc_link_group *lgr,
 				     struct smc_buf_desc *buf_desc, bool is_rmb)
 {
-	int i, rc = 0;
+	int i, rc = 0, cnt = 0;
 
 	/* protect against parallel link reconfiguration */
 	mutex_lock(&lgr->llc_conf_mutex);
@@ -2252,9 +2252,12 @@ static int smcr_buf_map_usable_links(struct smc_link_group *lgr,
 			rc = -ENOMEM;
 			goto out;
 		}
+		cnt++;
 	}
 out:
 	mutex_unlock(&lgr->llc_conf_mutex);
+	if (!rc && !cnt)
+		rc = -EINVAL;
 	return rc;
 }
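For readers without the kernel tree at hand, the counting pattern in the hunk above can be tried in user space. What follows is only a minimal sketch, not the kernel code: the model_* types, MODEL_LINKS_MAX and the map_fails knob are hypothetical stand-ins, and the llc_conf_mutex locking is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_LINKS_MAX 3	/* hypothetical stand-in for the per-group link limit */

struct model_link {
	bool usable;		/* false once the link has been scheduled down */
	bool map_fails;		/* test knob: simulate a mapping failure */
};

struct model_buf {
	bool mapped[MODEL_LINKS_MAX];	/* stands in for the per-link mr[] entries */
};

/* Map the buffer on every usable link; fail if there was none to map on. */
static int model_buf_map_usable_links(const struct model_link *links,
				      struct model_buf *buf)
{
	int i, rc = 0, cnt = 0;

	for (i = 0; i < MODEL_LINKS_MAX; i++) {
		if (!links[i].usable)
			continue;	/* skipped links do not count */
		if (links[i].map_fails) {
			rc = -ENOMEM;
			goto out;
		}
		buf->mapped[i] = true;
		cnt++;
	}
out:
	/* the kernel function unlocks llc_conf_mutex here; omitted in this model */
	if (!rc && !cnt)
		rc = -EINVAL;	/* loop finished cleanly but mapped nothing */
	return rc;
}
```

With every link marked unusable, the model returns -EINVAL where the pre-patch logic would have returned 0. A mapping failure still reports -ENOMEM, since the new check only applies when rc is 0.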
There might be a potential race between SMC-R buffer map and
link group termination.

smc_smcr_terminate_all()     | smc_connect_rdma()
--------------------------------------------------------------
                             | smc_conn_create()
for links in smcibdev        |
        schedule links down  |
                             | smc_buf_create()
                             |  \- smcr_buf_map_usable_links()
                             |      \- no usable links found,
                             |         (rmb->mr = NULL)
                             |
                             | smc_clc_send_confirm()
                             |  \- access conn->rmb_desc->mr[]->rkey
                             |     (panic)

During reboot and IB device module remove, all links will be set
down and no usable links remain in link groups. In such situation
smcr_buf_map_usable_links() should return an error and stop the
CLC flow accessing to uninitialized mr.

Fixes: b9247544c1bc ("net/smc: convert static link ID instances to support multiple links")
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
---
 net/smc/smc_core.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
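The right-hand column of the diagram above can also be modelled from the caller's side. This is a rough user-space sketch under stated assumptions: it assumes the connect path checks the return code of buffer creation before sending the confirm message, as the commit message implies, and all model_* names are hypothetical rather than kernel API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct model_mr {
	uint32_t rkey;
};

struct model_conn {
	struct model_mr *rmb_mr;	/* NULL if the buffer was never mapped on a link */
};

/* Stand-in for buffer creation: succeeds only if a usable link provided an mr. */
static int model_buf_create(struct model_conn *conn, struct model_mr *mr_on_link)
{
	if (!mr_on_link)
		return -EINVAL;	/* post-patch behaviour: report "no usable link" */
	conn->rmb_mr = mr_on_link;
	return 0;
}

/* Stand-in for the confirm step, which reads the rkey through the mr pointer. */
static uint32_t model_send_confirm(const struct model_conn *conn)
{
	return conn->rmb_mr->rkey;	/* would fault if rmb_mr were NULL */
}

/* Connect flow: stop before the confirm step when buffer creation failed. */
static int model_connect(struct model_conn *conn, struct model_mr *mr_on_link,
			 uint32_t *rkey_out)
{
	int rc = model_buf_create(conn, mr_on_link);

	if (rc)
		return rc;	/* abort: never reaches the rkey access */
	*rkey_out = model_send_confirm(conn);
	return 0;
}
```

Passing NULL for mr_on_link models the "all links down" case: model_connect() returns -EINVAL and the confirm step is never reached. If buffer creation returned 0 in that case, as before the patch, the flow would go on to dereference the NULL mr.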