diff mbox series

[11/11] arch: xtensa: platforms: Fix deadlock in rs_close()

Message ID 9ca3ab0b40c875b6019f32f031c68a1ae80dd73a.1649310812.git.duoming@zju.edu.cn (mailing list archive)
State Not Applicable
Series Fix deadlocks caused by del_timer_sync()

Commit Message

Duoming Zhou April 7, 2022, 6:37 a.m. UTC
There is a deadlock in rs_close(), which is shown
below:

   (Thread 1)              |      (Thread 2)
                           | rs_open()
rs_close()                 |  mod_timer()
 spin_lock_bh() //(1)      |  (wait a time)
 ...                       | rs_poll()
 del_timer_sync()          |  spin_lock() //(2)
 (wait timer to stop)      |  ...

We hold timer_lock at position (1) in thread 1 and call
del_timer_sync() to wait for the timer to stop, but the timer
handler also needs timer_lock at position (2) in thread 2.
As a result, rs_close() will block forever.

This patch moves del_timer_sync() out from under
spin_lock_bh(), which lets the timer handler obtain
the lock it needs.

Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
---
 arch/xtensa/platforms/iss/console.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
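
The scenario in the commit message can be modelled in plain userspace C. The sketch below is not kernel code: `struct model_state` and every function name in it are hypothetical, and the model encodes only the two facts stated above, namely that the timer handler needs timer_lock and that del_timer_sync() waits for a running handler to finish.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model; none of these names exist in the kernel. */
struct model_state {
	bool lock_held;       /* timer_lock is held by rs_close() */
	bool handler_running; /* rs_poll() has started and wants timer_lock */
};

/* rs_poll() takes timer_lock, so it can finish only while the lock is free. */
bool handler_can_finish(const struct model_state *s)
{
	return !s->lock_held;
}

/* del_timer_sync() returns only once a running handler has finished. */
bool del_timer_sync_would_return(const struct model_state *s)
{
	return !s->handler_running || handler_can_finish(s);
}

/* Original rs_close(): waits for the timer while still holding the lock. */
bool close_holding_lock(bool handler_running)
{
	struct model_state s = {
		.lock_held = true,
		.handler_running = handler_running,
	};

	return del_timer_sync_would_return(&s);
}

/* Intended behaviour: the lock is dropped before waiting for the timer. */
bool close_after_unlock(bool handler_running)
{
	struct model_state s = {
		.lock_held = false,
		.handler_running = handler_running,
	};

	return del_timer_sync_would_return(&s);
}
```

In this model `close_holding_lock(true)` is the only case that never returns, which corresponds to positions (1) and (2) in the diagram.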

Comments

Max Filippov April 7, 2022, 7:21 a.m. UTC | #1
Hi Duoming,

On Wed, Apr 6, 2022 at 11:38 PM Duoming Zhou <duoming@zju.edu.cn> wrote:
>
> There is a deadlock in rs_close(), which is shown
> below:
>
>    (Thread 1)              |      (Thread 2)
>                            | rs_open()
> rs_close()                 |  mod_timer()
>  spin_lock_bh() //(1)      |  (wait a time)
>  ...                       | rs_poll()
>  del_timer_sync()          |  spin_lock() //(2)
>  (wait timer to stop)      |  ...
>
> We hold timer_lock in position (1) of thread 1 and
> use del_timer_sync() to wait timer to stop, but timer handler
> also need timer_lock in position (2) of thread 2.
> As a result, rs_close() will block forever.

I agree with this.

> This patch extracts del_timer_sync() from the protection of
> spin_lock_bh(), which could let timer handler to obtain
> the needed lock.

Looking at the timer_lock I don't really understand what it protects.
It looks like it is not needed at all.

Also, I see that rs_poll rewinds the timer regardless of whether del_timer_sync
was called or not, which violates del_timer_sync requirements.

> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> ---
>  arch/xtensa/platforms/iss/console.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
> index 81d7c7e8f7e..d431b61ae3c 100644
> --- a/arch/xtensa/platforms/iss/console.c
> +++ b/arch/xtensa/platforms/iss/console.c
> @@ -51,8 +51,10 @@ static int rs_open(struct tty_struct *tty, struct file * filp)
>  static void rs_close(struct tty_struct *tty, struct file * filp)
>  {
>         spin_lock_bh(&timer_lock);
> -       if (tty->count == 1)
> +       if (tty->count == 1) {
> +               spin_unlock_bh(&timer_lock);
>                 del_timer_sync(&serial_timer);
> +       }
>         spin_unlock_bh(&timer_lock);

Now in case tty->count == 1 the timer_lock would be unlocked twice.

Sergey Shtylyov April 7, 2022, 9:42 a.m. UTC | #2
Hello!

On 4/7/22 9:37 AM, Duoming Zhou wrote:

> There is a deadlock in rs_close(), which is shown
> below:
> 
>    (Thread 1)              |      (Thread 2)
>                            | rs_open()
> rs_close()                 |  mod_timer()
>  spin_lock_bh() //(1)      |  (wait a time)
>  ...                       | rs_poll()
>  del_timer_sync()          |  spin_lock() //(2)
>  (wait timer to stop)      |  ...
> 
> We hold timer_lock in position (1) of thread 1 and
> use del_timer_sync() to wait timer to stop, but timer handler
> also need timer_lock in position (2) of thread 2.
> As a result, rs_close() will block forever.
> 
> This patch extracts del_timer_sync() from the protection of
> spin_lock_bh(), which could let timer handler to obtain
> the needed lock.
> 
> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> ---
>  arch/xtensa/platforms/iss/console.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
> index 81d7c7e8f7e..d431b61ae3c 100644
> --- a/arch/xtensa/platforms/iss/console.c
> +++ b/arch/xtensa/platforms/iss/console.c
> @@ -51,8 +51,10 @@ static int rs_open(struct tty_struct *tty, struct file * filp)
>  static void rs_close(struct tty_struct *tty, struct file * filp)
>  {
>  	spin_lock_bh(&timer_lock);
> -	if (tty->count == 1)
> +	if (tty->count == 1) {
> +		spin_unlock_bh(&timer_lock);
>  		del_timer_sync(&serial_timer);
> +	}
>  	spin_unlock_bh(&timer_lock);

   Double unlock iff tty->count == 1?

[...]

MBR, Sergey

Duoming Zhou April 7, 2022, 11:05 a.m. UTC | #3
Hello,

On Thu, 7 Apr 2022 00:21:58 -0700 Max Filippov wrote:

> > There is a deadlock in rs_close(), which is shown
> > below:
> >
> >    (Thread 1)              |      (Thread 2)
> >                            | rs_open()
> > rs_close()                 |  mod_timer()
> >  spin_lock_bh() //(1)      |  (wait a time)
> >  ...                       | rs_poll()
> >  del_timer_sync()          |  spin_lock() //(2)
> >  (wait timer to stop)      |  ...
> >
> > We hold timer_lock in position (1) of thread 1 and
> > use del_timer_sync() to wait timer to stop, but timer handler
> > also need timer_lock in position (2) of thread 2.
> > As a result, rs_close() will block forever.
> 
> I agree with this.
> 
> > This patch extracts del_timer_sync() from the protection of
> > spin_lock_bh(), which could let timer handler to obtain
> > the needed lock.
> 
> Looking at the timer_lock I don't really understand what it protects.
> It looks like it is not needed at all.

There is no race condition between rs_close() and rs_poll() (the timer handler),
so I think we could remove the timer_lock in rs_close(), rs_open() and rs_poll().

> Also, I see that rs_poll rewinds the timer regardless of whether del_timer_sync
> was called or not, which violates del_timer_sync requirements.

I wrote a kernel module to test whether del_timer_sync() could stop a timer whose
handler uses mod_timer() to rewind itself. The result is as follows.

# insmod del_timer_sync.ko 
[  929.374405] my_timer will be create.
[  929.374738] the jiffies is :4295595572
[  930.411581] In my_timer_function
[  930.411956] the jiffies is 4295596609
[  935.466643] In my_timer_function
[  935.467505] the jiffies is 4295601665
[  940.586538] In my_timer_function
[  940.586916] the jiffies is 4295606784
[  945.706579] In my_timer_function
[  945.706885] the jiffies is 4295611904

# 
# rmmod del_timer_sync.ko
[  948.507692] the del_timer_sync is :1
[  948.507692] 
# 
# 

The experiment shows that the timer handler no longer runs
once del_timer_sync() has been executed.
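
The return value of 1 printed above is consistent with del_timer_sync() deactivating a timer that was still pending because its handler had rewound it. A small userspace model of that sequence, as a sketch only: `struct model_timer` and the `model_*` functions are hypothetical stand-ins, not the kernel API, and the model ignores the concurrent case (a handler running on another CPU) that del_timer_sync() exists to handle.

```c
/* Hypothetical userspace stand-in for a kernel timer. */
struct model_timer {
	int pending; /* 1 while the timer is armed */
	int fired;   /* number of times the handler has run */
};

/* Stand-in for mod_timer(): arm the timer. */
void model_mod_timer(struct model_timer *t)
{
	t->pending = 1;
}

/* Stand-in for timer expiry: run the handler, which rewinds its own timer. */
void model_expire(struct model_timer *t)
{
	if (!t->pending)
		return;
	t->pending = 0;
	t->fired++;
	model_mod_timer(t); /* the handler re-arms itself */
}

/* Stand-in for del_timer_sync(): disarm and report whether it was pending. */
int model_del_timer_sync(struct model_timer *t)
{
	int was_pending = t->pending;

	t->pending = 0;
	return was_pending;
}

/* Replays the logged sequence: four expiries, then removal. */
int model_experiment(struct model_timer *t)
{
	int i;

	model_mod_timer(t);
	for (i = 0; i < 4; i++)
		model_expire(t);
	return model_del_timer_sync(t);
}
```

After `model_experiment()` the modelled timer is no longer pending, so a further `model_expire()` call does not run the handler again.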

> > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> > ---
> >  arch/xtensa/platforms/iss/console.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
> > index 81d7c7e8f7e..d431b61ae3c 100644
> > --- a/arch/xtensa/platforms/iss/console.c
> > +++ b/arch/xtensa/platforms/iss/console.c
> > @@ -51,8 +51,10 @@ static int rs_open(struct tty_struct *tty, struct file * filp)
> >  static void rs_close(struct tty_struct *tty, struct file * filp)
> >  {
> >         spin_lock_bh(&timer_lock);
> > -       if (tty->count == 1)
> > +       if (tty->count == 1) {
> > +               spin_unlock_bh(&timer_lock);
> >                 del_timer_sync(&serial_timer);
> > +       }
> >         spin_unlock_bh(&timer_lock);
> 
> Now in case tty->count == 1 the timer_lock would be unlocked twice.

I will remove the timer_lock in rs_close(), rs_open() and rs_poll().

Thanks a lot for your time and advice!

Best regards,
Duoming Zhou

Duoming Zhou April 7, 2022, 11:12 a.m. UTC | #4
Hello,

On Thu, 7 Apr 2022 12:42:31 +0300 Sergey Shtylyov wrote:

> > There is a deadlock in rs_close(), which is shown
> > below:
> > 
> >    (Thread 1)              |      (Thread 2)
> >                            | rs_open()
> > rs_close()                 |  mod_timer()
> >  spin_lock_bh() //(1)      |  (wait a time)
> >  ...                       | rs_poll()
> >  del_timer_sync()          |  spin_lock() //(2)
> >  (wait timer to stop)      |  ...
> > 
> > We hold timer_lock in position (1) of thread 1 and
> > use del_timer_sync() to wait timer to stop, but timer handler
> > also need timer_lock in position (2) of thread 2.
> > As a result, rs_close() will block forever.
> > 
> > This patch extracts del_timer_sync() from the protection of
> > spin_lock_bh(), which could let timer handler to obtain
> > the needed lock.
> > 
> > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
> > ---
> >  arch/xtensa/platforms/iss/console.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
> > index 81d7c7e8f7e..d431b61ae3c 100644
> > --- a/arch/xtensa/platforms/iss/console.c
> > +++ b/arch/xtensa/platforms/iss/console.c
> > @@ -51,8 +51,10 @@ static int rs_open(struct tty_struct *tty, struct file * filp)
> >  static void rs_close(struct tty_struct *tty, struct file * filp)
> >  {
> >  	spin_lock_bh(&timer_lock);
> > -	if (tty->count == 1)
> > +	if (tty->count == 1) {
> > +		spin_unlock_bh(&timer_lock);
> >  		del_timer_sync(&serial_timer);
> > +	}
> >  	spin_unlock_bh(&timer_lock);
> 
>    Double unlock iff tty->count == 1?

Yes. Thanks a lot for your time and advice. I found that there is no race condition
between rs_close() and rs_poll() (the timer handler), so I think we could remove the
timer_lock in rs_close(), rs_open() and rs_poll().

Best regards,
Duoming Zhou

Patch

diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
index 81d7c7e8f7e..d431b61ae3c 100644
--- a/arch/xtensa/platforms/iss/console.c
+++ b/arch/xtensa/platforms/iss/console.c
@@ -51,8 +51,10 @@  static int rs_open(struct tty_struct *tty, struct file * filp)
 static void rs_close(struct tty_struct *tty, struct file * filp)
 {
 	spin_lock_bh(&timer_lock);
-	if (tty->count == 1)
+	if (tty->count == 1) {
+		spin_unlock_bh(&timer_lock);
 		del_timer_sync(&serial_timer);
+	}
 	spin_unlock_bh(&timer_lock);
 }
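
Both reviewers point out that the hunk above releases timer_lock twice when tty->count == 1. The sketch below models only the lock bookkeeping in userspace C; `struct model_lock` and all function names are hypothetical, and the real spinlock and timer calls are replaced by comments. The second function shows one possible way to release the lock exactly once on every path. It is not the change proposed in the thread, which is to remove timer_lock altogether.

```c
/* Hypothetical stand-in for a spinlock: counts acquire minus release. */
struct model_lock {
	int depth;
};

void model_lock_acquire(struct model_lock *l)
{
	l->depth++;
}

void model_lock_release(struct model_lock *l)
{
	l->depth--;
}

/* Control flow of rs_close() as posted; returns the final lock depth. */
int rs_close_as_posted(int tty_count)
{
	struct model_lock lock = { 0 };

	model_lock_acquire(&lock);
	if (tty_count == 1) {
		model_lock_release(&lock);
		/* del_timer_sync(&serial_timer) would run here */
	}
	model_lock_release(&lock); /* second release when tty_count == 1 */
	return lock.depth;
}

/* Assumed alternative: decide under the lock, release once, then wait. */
int rs_close_balanced(int tty_count)
{
	struct model_lock lock = { 0 };
	int last_close;

	model_lock_acquire(&lock);
	last_close = (tty_count == 1);
	model_lock_release(&lock);

	if (last_close) {
		/* del_timer_sync(&serial_timer) would run here, lock already dropped */
	}
	return lock.depth;
}
```

A final depth of 0 means acquire and release are balanced; a negative depth corresponds to the double unlock raised in review.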