Message ID | tencent_941B48254CBA00BB4933069E391B6E4B5408@qq.com (mailing list archive) |
---|---|
State | New, archived |
Series | media/cec/core: fix task hung in cec_claim_log_addrs |
On 21/02/2024 15:20, Edward Adam Davis wrote:
> After unlocking adap->lock in cec_claim_log_addrs(), cec_claim_log_addrs() may
> re-enter, causing this issue to occur.

But if it is called again, then it should hit this at the start of the function:

	if (WARN_ON(adap->is_configuring || adap->is_configured))
		return;

I'm still not sure what causes the KASAN hung task since I cannot seem to reproduce
it, and because it is hard for me to find enough time to dig into this.

Regards,

	Hans

>
> In the thread function cec_config_thread_func() adap->lock is also used, so there
> is no need to unlock adap->lock in cec_claim_log_addrs(), and then use adap->lock
> in cec_config_thread_func() to protect.
>
> Reported-and-tested-by: syzbot+116b65a23bc791ae49a6@syzkaller.appspotmail.com
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
>  drivers/media/cec/core/cec-adap.c | 5 -----
>  1 file changed, 5 deletions(-)
>
> diff --git a/drivers/media/cec/core/cec-adap.c b/drivers/media/cec/core/cec-adap.c
> index 5741adf09a2e..21b3ff504524 100644
> --- a/drivers/media/cec/core/cec-adap.c
> +++ b/drivers/media/cec/core/cec-adap.c
> @@ -1436,7 +1436,6 @@ static int cec_config_thread_func(void *arg)
>  	int err;
>  	int i, j;
>
> -	mutex_lock(&adap->lock);
>  	dprintk(1, "physical address: %x.%x.%x.%x, claim %d logical addresses\n",
>  		cec_phys_addr_exp(adap->phys_addr), las->num_log_addrs);
>  	las->log_addr_mask = 0;
> @@ -1565,7 +1564,6 @@ static int cec_config_thread_func(void *arg)
>  	}
>  	adap->kthread_config = NULL;
>  	complete(&adap->config_completion);
> -	mutex_unlock(&adap->lock);
>  	call_void_op(adap, configured);
>  	return 0;
>
> @@ -1577,7 +1575,6 @@ static int cec_config_thread_func(void *arg)
>  	adap->must_reconfigure = false;
>  	adap->kthread_config = NULL;
>  	complete(&adap->config_completion);
> -	mutex_unlock(&adap->lock);
>  	return 0;
>  }
>
> @@ -1602,9 +1599,7 @@ static void cec_claim_log_addrs(struct cec_adapter *adap, bool block)
>  		adap->kthread_config = NULL;
>  		adap->is_configuring = false;
>  	} else if (block) {
> -		mutex_unlock(&adap->lock);
>  		wait_for_completion(&adap->config_completion);
> -		mutex_lock(&adap->lock);
>  	}
>  }
>
On Wed, 21 Feb 2024 15:38:47 +0100 Hans Verkuil <hverkuil-cisco@xs4all.nl>
> On 21/02/2024 15:20, Edward Adam Davis wrote:
> > After unlocking adap->lock in cec_claim_log_addrs(), cec_claim_log_addrs() may
> > re-enter, causing this issue to occur.
>
> But if it is called again, then it should hit this at the start of the function:
>
> 	if (WARN_ON(adap->is_configuring || adap->is_configured))
> 		return;
>
> I'm still not sure what causes the KASAN hung task since I cannot seem to reproduce
> it, and because it is hard for me to find enough time to dig into this.

Likely because of the window for initializing completion more than once [1].

[1] https://lore.kernel.org/lkml/00000000000054a54e0611f1bc01@google.com/
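As an illustration of why initializing a completion a second time can matter, here is a minimal userspace sketch. It is not the kernel code: `struct model_completion` and the `model_*` helpers are hypothetical names that model only the "done" counter of a completion, and the ordering of events is an assumed interleaving, not one confirmed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a completion: only the "done" counter. */
struct model_completion {
	unsigned int done;
};

/* Models init_completion(): resets the counter to "not completed". */
static void model_init_completion(struct model_completion *c)
{
	assert(c != NULL);
	c->done = 0;
}

/* Models complete(): records one completion event. */
static void model_complete(struct model_completion *c)
{
	assert(c != NULL);
	c->done++;
}

/*
 * Non-blocking stand-in for wait_for_completion(): consumes one event if
 * there is one. A false return means a real waiter would block here.
 */
static bool model_try_wait(struct model_completion *c)
{
	assert(c != NULL);
	if (c->done == 0)
		return false;
	c->done--;
	return true;
}

/* Returns true if the first waiter still finds its completion event. */
bool waiter_sees_completion(bool reinit_in_window)
{
	struct model_completion c;

	model_init_completion(&c);		/* first claim sets up the completion */
	model_complete(&c);			/* config thread signals it */
	if (reinit_in_window)
		model_init_completion(&c);	/* assumed second claim re-initializes */
	return model_try_wait(&c);		/* first claimer only now starts waiting */
}
```

In this model, `waiter_sees_completion(false)` finds the event, while `waiter_sees_completion(true)` does not, which in the real API would correspond to a waiter that sleeps with nobody left to wake it.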
On Wed, 21 Feb 2024 15:38:47 +0100, Hans Verkuil wrote:
> > After unlocking adap->lock in cec_claim_log_addrs(), cec_claim_log_addrs() may
> > re-enter, causing this issue to occur.
>
> But if it is called again, then it should hit this at the start of the function:
>
> 	if (WARN_ON(adap->is_configuring || adap->is_configured))
> 		return;
>
> I'm still not sure what causes the KASAN hung task since I cannot seem to reproduce
> it, and because it is hard for me to find enough time to dig into this.

Please pay attention to the following section of code in cec_config_thread_func():

unconfigure:
	for (i = 0; i < las->num_log_addrs; i++)
		las->log_addr[i] = CEC_LOG_ADDR_INVALID;
	cec_adap_unconfigure(adap);		// [1], is_configured = false;
	adap->is_configuring = false;		// [2], is_configuring = false;
	adap->must_reconfigure = false;
	adap->kthread_config = NULL;
	complete(&adap->config_completion);
	mutex_unlock(&adap->lock);		// [3], afterwards

And the following code is in cec_claim_log_addrs():

	} else if (block) {
		mutex_unlock(&adap->lock);
		wait_for_completion(&adap->config_completion);
		mutex_lock(&adap->lock);	// [4], during the period before re-acquiring adap->lock

How did cec_claim_log_addrs() re-enter?

BR,
edward
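The window between [3] and [4] can be replayed in a small single-threaded sketch. This is only a model under stated assumptions: `struct model_adap`, `model_lock()`, `model_unlock()` and `model_guard_passes()` are hypothetical stand-ins for the fields and the entry check quoted above, and the second caller is assumed to win the lock before the first caller reaches [4].

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the adapter fields discussed above. */
struct model_adap {
	bool lock_held;
	bool is_configuring;
	bool is_configured;
};

static void model_lock(struct model_adap *a)
{
	assert(!a->lock_held);	/* single-threaded model: the lock must be free */
	a->lock_held = true;
}

static void model_unlock(struct model_adap *a)
{
	assert(a->lock_held);
	a->lock_held = false;
}

/* Mirrors the quoted entry check: true means the call would proceed. */
static bool model_guard_passes(const struct model_adap *a)
{
	return !(a->is_configuring || a->is_configured);
}

/* Replays the interleaving; reports whether a second claim passes the guard. */
bool second_claim_passes_in_window(void)
{
	/* First caller holds the lock and has started configuring. */
	struct model_adap a = { .lock_held = true, .is_configuring = true };
	bool passes;

	/* First caller, block == true: drops the lock before waiting. */
	model_unlock(&a);

	/* Config thread takes the unconfigure path. */
	model_lock(&a);
	a.is_configured = false;	/* [1] */
	a.is_configuring = false;	/* [2] */
	model_unlock(&a);		/* [3] */

	/* Assumed second caller gets the lock before the first reaches [4]. */
	model_lock(&a);
	passes = model_guard_passes(&a);
	model_unlock(&a);

	/* First caller re-acquires the lock. */
	model_lock(&a);			/* [4] */
	return passes;
}
```

With both flags cleared at [1] and [2], the guard no longer rejects the second caller in this model, whereas it still rejects a caller that arrives while `is_configuring` is set.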
On 22/02/2024 11:43, Hillf Danton wrote:
> On Wed, 21 Feb 2024 15:38:47 +0100 Hans Verkuil <hverkuil-cisco@xs4all.nl>
>> On 21/02/2024 15:20, Edward Adam Davis wrote:
>>> After unlocking adap->lock in cec_claim_log_addrs(), cec_claim_log_addrs() may
>>> re-enter, causing this issue to occur.
>>
>> But if it is called again, then it should hit this at the start of the function:
>>
>> 	if (WARN_ON(adap->is_configuring || adap->is_configured))
>> 		return;
>>
>> I'm still not sure what causes the KASAN hung task since I cannot seem to reproduce
>> it, and because it is hard for me to find enough time to dig into this.
>
> Likely because of the window for initializing completion more than once [1].
>
> [1] https://lore.kernel.org/lkml/00000000000054a54e0611f1bc01@google.com/

I have been able to reproduce this by adding msleeps in several places.

When I have some more time I will start digging into this.

Regards,

	Hans
diff --git a/drivers/media/cec/core/cec-adap.c b/drivers/media/cec/core/cec-adap.c
index 5741adf09a2e..21b3ff504524 100644
--- a/drivers/media/cec/core/cec-adap.c
+++ b/drivers/media/cec/core/cec-adap.c
@@ -1436,7 +1436,6 @@ static int cec_config_thread_func(void *arg)
 	int err;
 	int i, j;
 
-	mutex_lock(&adap->lock);
 	dprintk(1, "physical address: %x.%x.%x.%x, claim %d logical addresses\n",
 		cec_phys_addr_exp(adap->phys_addr), las->num_log_addrs);
 	las->log_addr_mask = 0;
@@ -1565,7 +1564,6 @@ static int cec_config_thread_func(void *arg)
 	}
 	adap->kthread_config = NULL;
 	complete(&adap->config_completion);
-	mutex_unlock(&adap->lock);
 	call_void_op(adap, configured);
 	return 0;
 
@@ -1577,7 +1575,6 @@ static int cec_config_thread_func(void *arg)
 	adap->must_reconfigure = false;
 	adap->kthread_config = NULL;
 	complete(&adap->config_completion);
-	mutex_unlock(&adap->lock);
 	return 0;
 }
 
@@ -1602,9 +1599,7 @@ static void cec_claim_log_addrs(struct cec_adapter *adap, bool block)
 		adap->kthread_config = NULL;
 		adap->is_configuring = false;
 	} else if (block) {
-		mutex_unlock(&adap->lock);
 		wait_for_completion(&adap->config_completion);
-		mutex_lock(&adap->lock);
 	}
 }
 
After unlocking adap->lock in cec_claim_log_addrs(), cec_claim_log_addrs() may
re-enter, causing this issue to occur.

In the thread function cec_config_thread_func() adap->lock is also used, so there
is no need to unlock adap->lock in cec_claim_log_addrs(), and then use adap->lock
in cec_config_thread_func() to protect.

Reported-and-tested-by: syzbot+116b65a23bc791ae49a6@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
 drivers/media/cec/core/cec-adap.c | 5 -----
 1 file changed, 5 deletions(-)