Message ID | 1641807505-54454-1-git-send-email-guwen@linux.alibaba.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] net/smc: Avoid setting clcsock options after clcsock released | expand |
On 10/01/2022 10:38, Wen Gu wrote: > We encountered a crash in smc_setsockopt() and it is caused by > accessing smc->clcsock after clcsock was released. > > BUG: kernel NULL pointer dereference, address: 0000000000000020 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 0 P4D 0 > Oops: 0000 [#1] PREEMPT SMP PTI > CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E 5.16.0-rc4+ #53 > RIP: 0010:smc_setsockopt+0x59/0x280 [smc] > Call Trace: > <TASK> > __sys_setsockopt+0xfc/0x190 > __x64_sys_setsockopt+0x20/0x30 > do_syscall_64+0x34/0x90 > entry_SYSCALL_64_after_hwframe+0x44/0xae > RIP: 0033:0x7f16ba83918e > </TASK> > > This patch tries to fix it by holding clcsock_release_lock and > checking whether clcsock has already been released. In case that > a crash of the same reason happens in smc_getsockopt(), this patch > also checkes smc->clcsock in smc_getsockopt(). > > Signed-off-by: Wen Gu <guwen@linux.alibaba.com> > --- > net/smc/af_smc.c | 16 +++++++++++++++- > 1 file changed, 15 insertions(+), 1 deletion(-) > > diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c > index 1c9289f..af423f4 100644 > --- a/net/smc/af_smc.c > +++ b/net/smc/af_smc.c > @@ -2441,6 +2441,11 @@ static int smc_setsockopt(struct socket *sock, int level, int optname, > /* generic setsockopts reaching us here always apply to the > * CLC socket > */ > + mutex_lock(&smc->clcsock_release_lock); > + if (!smc->clcsock) { > + mutex_unlock(&smc->clcsock_release_lock); > + return -EBADF; > + } > if (unlikely(!smc->clcsock->ops->setsockopt)) > rc = -EOPNOTSUPP; > else > @@ -2450,6 +2455,7 @@ static int smc_setsockopt(struct socket *sock, int level, int optname, > sk->sk_err = smc->clcsock->sk->sk_err; > sk_error_report(sk); > } > + mutex_unlock(&smc->clcsock_release_lock); In the switch() the function smc_switch_to_fallback() might be called which also accesses smc->clcsock without further checking. This should also be protected then? Also from all callers of smc_switch_to_fallback() ? There are more uses of smc->clcsock (e.g. smc_bind(), ...), so why does this problem happen in setsockopt() for you only? I suspect it depends on the test case. I wonder if it makes sense to check and protect smc->clcsock at all places in the code where it is used... as of now we had no such races like you encountered. But I see that in theory this problem could also happen in other code areas. > > if (optlen < sizeof(int)) > return -EINVAL; > @@ -2509,13 +2515,21 @@ static int smc_getsockopt(struct socket *sock, int level, int optname, > char __user *optval, int __user *optlen) > { > struct smc_sock *smc; > + int rc; > > smc = smc_sk(sock->sk); > + mutex_lock(&smc->clcsock_release_lock); > + if (!smc->clcsock) { > + mutex_unlock(&smc->clcsock_release_lock); > + return -EBADF; > + } > /* socket options apply to the CLC socket */ > if (unlikely(!smc->clcsock->ops->getsockopt)) > return -EOPNOTSUPP; > - return smc->clcsock->ops->getsockopt(smc->clcsock, level, optname, > + rc = smc->clcsock->ops->getsockopt(smc->clcsock, level, optname, > optval, optlen); > + mutex_unlock(&smc->clcsock_release_lock); > + return rc; > } > > static int smc_ioctl(struct socket *sock, unsigned int cmd,
Thanks for your reply. On 2022/1/11 6:03 pm, Karsten Graul wrote: > On 10/01/2022 10:38, Wen Gu wrote: >> We encountered a crash in smc_setsockopt() and it is caused by >> accessing smc->clcsock after clcsock was released. >> > > In the switch() the function smc_switch_to_fallback() might be called which also > accesses smc->clcsock without further checking. This should also be protected then? > Also from all callers of smc_switch_to_fallback() ? > > There are more uses of smc->clcsock (e.g. smc_bind(), ...), so why does this problem > happen in setsockopt() for you only? I suspect it depends on the test case. > Yes, it depends on the test case. The crash described here only happens one time when I run a stress test of nginx/wrk, accompanied with frequent RNIC up/down operations. Considering accessing smc->clcsock after its release is an uncommon, low probability issue and only happens in setsockopt() in my test, I choce an simple way to fix it, using the existing clcsock_release_lock, and only check in smc_setsockopt() and smc_getsockopt(). > I wonder if it makes sense to check and protect smc->clcsock at all places in the code where > it is used... as of now we had no such races like you encountered. But I see that in theory > this problem could also happen in other code areas. > But inspired by your questions, I think maybe we should treat the race as a general problem? Do you think it is necessary to find all the potential race related to the clcsock release and fix them in a unified approach? like define smc->clcsock as RCU pointer, hold rcu read lock before accessing smc->clcsock and call synchronize_rcu() before resetting smc->clcsock? just a rough idea :) Or we should decide it later, do some more tests to see the probability of recurrence of this problem? Thanks, Wen Gu
On Mon, 10 Jan 2022 17:38:25 +0800 Wen Gu wrote: > - return smc->clcsock->ops->getsockopt(smc->clcsock, level, optname, > + rc = smc->clcsock->ops->getsockopt(smc->clcsock, level, optname, > optval, optlen); Please do realign the continuation line when moving the opening bracket.
On 2022/1/12 2:14 am, Jakub Kicinski wrote: > On Mon, 10 Jan 2022 17:38:25 +0800 Wen Gu wrote: >> - return smc->clcsock->ops->getsockopt(smc->clcsock, level, optname, >> + rc = smc->clcsock->ops->getsockopt(smc->clcsock, level, optname, >> optval, optlen); > > Please do realign the continuation line when moving the opening bracket. Thanks for pointing this out. Will fix it. Thanks, Wen Gu
On Mon, Jan 10, 2022 at 05:38:25PM +0800, Wen Gu wrote: >We encountered a crash in smc_setsockopt() and it is caused by >accessing smc->clcsock after clcsock was released. > > BUG: kernel NULL pointer dereference, address: 0000000000000020 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 0 P4D 0 > Oops: 0000 [#1] PREEMPT SMP PTI > CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E 5.16.0-rc4+ #53 > RIP: 0010:smc_setsockopt+0x59/0x280 [smc] > Call Trace: > <TASK> > __sys_setsockopt+0xfc/0x190 > __x64_sys_setsockopt+0x20/0x30 > do_syscall_64+0x34/0x90 > entry_SYSCALL_64_after_hwframe+0x44/0xae > RIP: 0033:0x7f16ba83918e > </TASK> > >This patch tries to fix it by holding clcsock_release_lock and >checking whether clcsock has already been released. In case that >a crash of the same reason happens in smc_getsockopt(), this patch >also checkes smc->clcsock in smc_getsockopt(). > >Signed-off-by: Wen Gu <guwen@linux.alibaba.com> >--- > net/smc/af_smc.c | 16 +++++++++++++++- > 1 file changed, 15 insertions(+), 1 deletion(-) > >diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c >index 1c9289f..af423f4 100644 >--- a/net/smc/af_smc.c >+++ b/net/smc/af_smc.c >@@ -2441,6 +2441,11 @@ static int smc_setsockopt(struct socket *sock, int level, int optname, > /* generic setsockopts reaching us here always apply to the > * CLC socket > */ >+ mutex_lock(&smc->clcsock_release_lock); >+ if (!smc->clcsock) { >+ mutex_unlock(&smc->clcsock_release_lock); >+ return -EBADF; >+ } > if (unlikely(!smc->clcsock->ops->setsockopt)) > rc = -EOPNOTSUPP; > else >@@ -2450,6 +2455,7 @@ static int smc_setsockopt(struct socket *sock, int level, int optname, > sk->sk_err = smc->clcsock->sk->sk_err; > sk_error_report(sk); > } >+ mutex_unlock(&smc->clcsock_release_lock); > > if (optlen < sizeof(int)) > return -EINVAL; >@@ -2509,13 +2515,21 @@ static int smc_getsockopt(struct socket *sock, int level, int optname, > char __user *optval, int __user *optlen) > { > struct smc_sock *smc; >+ int rc; > > smc = smc_sk(sock->sk); >+ mutex_lock(&smc->clcsock_release_lock); >+ if (!smc->clcsock) { >+ mutex_unlock(&smc->clcsock_release_lock); >+ return -EBADF; >+ } > /* socket options apply to the CLC socket */ > if (unlikely(!smc->clcsock->ops->getsockopt)) Missed a mutex_unlock() here ? > return -EOPNOTSUPP; >- return smc->clcsock->ops->getsockopt(smc->clcsock, level, optname, >+ rc = smc->clcsock->ops->getsockopt(smc->clcsock, level, optname, > optval, optlen); >+ mutex_unlock(&smc->clcsock_release_lock); >+ return rc; > } > > static int smc_ioctl(struct socket *sock, unsigned int cmd, >-- >1.8.3.1
On 2022/1/12 3:11 pm, dust.li wrote: > On Mon, Jan 10, 2022 at 05:38:25PM +0800, Wen Gu wrote: >> >> This patch tries to fix it by holding clcsock_release_lock and >> checking whether clcsock has already been released. In case that >> a crash of the same reason happens in smc_getsockopt(), this patch >> also checkes smc->clcsock in smc_getsockopt(). >> @@ -2509,13 +2515,21 @@ static int smc_getsockopt(struct socket *sock, int level, int optname, >> char __user *optval, int __user *optlen) >> { >> struct smc_sock *smc; >> + int rc; >> >> smc = smc_sk(sock->sk); >> + mutex_lock(&smc->clcsock_release_lock); >> + if (!smc->clcsock) { >> + mutex_unlock(&smc->clcsock_release_lock); >> + return -EBADF; >> + } >> /* socket options apply to the CLC socket */ >> if (unlikely(!smc->clcsock->ops->getsockopt)) > Missed a mutex_unlock() here ? > >> return -EOPNOTSUPP; Thanks for pointing it out. Will add an additional mutex_unlock(). Thanks, Wen Gu
On 11/01/2022 17:34, Wen Gu wrote: > Thanks for your reply. > > On 2022/1/11 6:03 pm, Karsten Graul wrote: >> On 10/01/2022 10:38, Wen Gu wrote: >>> We encountered a crash in smc_setsockopt() and it is caused by >>> accessing smc->clcsock after clcsock was released. >>> >> >> In the switch() the function smc_switch_to_fallback() might be called which also >> accesses smc->clcsock without further checking. This should also be protected then? >> Also from all callers of smc_switch_to_fallback() ? >> >> There are more uses of smc->clcsock (e.g. smc_bind(), ...), so why does this problem >> happen in setsockopt() for you only? I suspect it depends on the test case. >> > > Yes, it depends on the test case. The crash described here only happens one time when > I run a stress test of nginx/wrk, accompanied with frequent RNIC up/down operations. > > Considering accessing smc->clcsock after its release is an uncommon, low probability > issue and only happens in setsockopt() in my test, I choce an simple way to fix it, using > the existing clcsock_release_lock, and only check in smc_setsockopt() and smc_getsockopt(). > >> I wonder if it makes sense to check and protect smc->clcsock at all places in the code where >> it is used... as of now we had no such races like you encountered. But I see that in theory >> this problem could also happen in other code areas. >> > > But inspired by your questions, I think maybe we should treat the race as a general problem? > Do you think it is necessary to find all the potential race related to the clcsock release and > fix them in a unified approach? like define smc->clcsock as RCU pointer, hold rcu read lock > before accessing smc->clcsock and call synchronize_rcu() before resetting smc->clcsock? just a rough idea :) > > Or we should decide it later, do some more tests to see the probability of recurrence of this problem? I like the idea to use RCU with rcu_assign_pointer() to protect that pointer! Lets go with your initial patch (improved to address the access in smc_switch_to_fallback()) for now because it solves your current problem. I put that RCU thing on our list, but if either of us here starts working on that please let the others know so we don't end up doing parallel work on this. But I doubt that we will be able to start working on that soon. Thanks for the good idea!
Thanks for your reply. On 2022/1/12 5:38 pm, Karsten Graul wrote: > On 11/01/2022 17:34, Wen Gu wrote: >> Thanks for your reply. >> >> On 2022/1/11 6:03 pm, Karsten Graul wrote: >>> On 10/01/2022 10:38, Wen Gu wrote: >>>> We encountered a crash in smc_setsockopt() and it is caused by >>>> accessing smc->clcsock after clcsock was released. > > I like the idea to use RCU with rcu_assign_pointer() to protect that pointer! > > Lets go with your initial patch (improved to address the access in smc_switch_to_fallback()) > for now because it solves your current problem. > OK, I will improve the patch, adding check before clcsock access in smc_switch_to_fallback() and return an error (-EBADF) if smc->clcsock is NULL. The caller of smc_switch_to_fallback() will check the return value to identify whether fallback is possible. > I put that RCU thing on our list, but if either of us here starts working on that please let the > others know so we don't end up doing parallel work on this. But I doubt that we will be able to start working > on that soon. > > Thanks for the good idea! Thank you! If I start working on the RCU things, I will send a RFC to let you know. Thanks, Wen Gu
On 2022/1/12 5:38 pm, Karsten Graul wrote: > On 11/01/2022 17:34, Wen Gu wrote: >> Thanks for your reply. >> >> On 2022/1/11 6:03 pm, Karsten Graul wrote: >>> On 10/01/2022 10:38, Wen Gu wrote: >>>> We encountered a crash in smc_setsockopt() and it is caused by >>>> accessing smc->clcsock after clcsock was released. >>>> >>> >>> In the switch() the function smc_switch_to_fallback() might be called which also >>> accesses smc->clcsock without further checking. This should also be protected then? >>> Also from all callers of smc_switch_to_fallback() ? >>> > > Lets go with your initial patch (improved to address the access in smc_switch_to_fallback()) > for now because it solves your current problem. > Since the upcoming v2 patch added clcsock check in smc_switch_to_fallback(), the original subject is inappropriate. So I sent a new patch named 'net/smc: Transitional solution for clcsock race issue' instead of a v2. Thanks, Wen Gu
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 1c9289f..af423f4 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -2441,6 +2441,11 @@ static int smc_setsockopt(struct socket *sock, int level, int optname, /* generic setsockopts reaching us here always apply to the * CLC socket */ + mutex_lock(&smc->clcsock_release_lock); + if (!smc->clcsock) { + mutex_unlock(&smc->clcsock_release_lock); + return -EBADF; + } if (unlikely(!smc->clcsock->ops->setsockopt)) rc = -EOPNOTSUPP; else @@ -2450,6 +2455,7 @@ static int smc_setsockopt(struct socket *sock, int level, int optname, sk->sk_err = smc->clcsock->sk->sk_err; sk_error_report(sk); } + mutex_unlock(&smc->clcsock_release_lock); if (optlen < sizeof(int)) return -EINVAL; @@ -2509,13 +2515,21 @@ static int smc_getsockopt(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen) { struct smc_sock *smc; + int rc; smc = smc_sk(sock->sk); + mutex_lock(&smc->clcsock_release_lock); + if (!smc->clcsock) { + mutex_unlock(&smc->clcsock_release_lock); + return -EBADF; + } /* socket options apply to the CLC socket */ if (unlikely(!smc->clcsock->ops->getsockopt)) return -EOPNOTSUPP; - return smc->clcsock->ops->getsockopt(smc->clcsock, level, optname, + rc = smc->clcsock->ops->getsockopt(smc->clcsock, level, optname, optval, optlen); + mutex_unlock(&smc->clcsock_release_lock); + return rc; } static int smc_ioctl(struct socket *sock, unsigned int cmd,
We encountered a crash in smc_setsockopt() and it is caused by accessing smc->clcsock after clcsock was released. BUG: kernel NULL pointer dereference, address: 0000000000000020 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E 5.16.0-rc4+ #53 RIP: 0010:smc_setsockopt+0x59/0x280 [smc] Call Trace: <TASK> __sys_setsockopt+0xfc/0x190 __x64_sys_setsockopt+0x20/0x30 do_syscall_64+0x34/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f16ba83918e </TASK> This patch tries to fix it by holding clcsock_release_lock and checking whether clcsock has already been released. In case that a crash of the same reason happens in smc_getsockopt(), this patch also checkes smc->clcsock in smc_getsockopt(). Signed-off-by: Wen Gu <guwen@linux.alibaba.com> --- net/smc/af_smc.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)