diff mbox series

[2/7] md: fix a race condition when stopping the sync thread

Message ID 8fb335e-6d2c-dbb5-d7-ded8db5145a@redhat.com (mailing list archive)
State Superseded, archived
Delegated to: Mike Snitzer
Series MD fixes for the LVM2 testsuite

Commit Message

Mikulas Patocka Jan. 17, 2024, 6:18 p.m. UTC
Note that md_wakeup_thread_directly is racy - it will do nothing if the
thread is already running or it may cause spurious wake-up if the thread
is blocked in another subsystem.

In order to eliminate the race condition, we will retry
md_wakeup_thread_directly after 0.1 seconds.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org	# v6.7
Fixes: 130443d60b1b ("md: refactor idle/frozen_sync_thread() to fix deadlock")

---
 drivers/md/md.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Yu Kuai Jan. 18, 2024, 1:32 a.m. UTC | #1
Hi,

On 2024/01/18 2:18, Mikulas Patocka wrote:
> Note that md_wakeup_thread_directly is racy - it will do nothing if the
> thread is already running or it may cause spurious wake-up if the thread
> is blocked in another subsystem.

No, as the comment says, md_wakeup_thread_directly() only covers the
case where md_wakeup_thread() can't wake up md_do_sync() because it is
waiting for a metadata update.
> 
> In order to eliminate the race condition, we will retry
> md_wakeup_thread_directly after 0.1 seconds.

And what you changed is not md_wakeup_thread_directly() at all.

Thanks,
Kuai

> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org	# v6.7
> Fixes: 130443d60b1b ("md: refactor idle/frozen_sync_thread() to fix deadlock")
> 
> ---
>   drivers/md/md.c |    7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6/drivers/md/md.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/md.c
> +++ linux-2.6/drivers/md/md.c
> @@ -4889,6 +4889,7 @@ static void stop_sync_thread(struct mdde
>   
>   	mddev_unlock(mddev);
>   
> +retry:
>   	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
>   	/*
>   	 * Thread might be blocked waiting for metadata update which will now
> @@ -4898,9 +4899,11 @@ static void stop_sync_thread(struct mdde
>   	if (work_pending(&mddev->sync_work))
>   		flush_work(&mddev->sync_work);
>   
> -	wait_event(resync_wait,
> +	if (!wait_event_timeout(resync_wait,
>   		   !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) ||
> -		   (check_seq && sync_seq != atomic_read(&mddev->sync_seq)));
> +		   (check_seq && sync_seq != atomic_read(&mddev->sync_seq)),
> +		   HZ / 10))
> +		goto retry;
>   
>   	if (locked)
>   		mddev_lock_nointr(mddev);
> 
> .
>

Mikulas Patocka Jan. 18, 2024, 1:07 p.m. UTC | #2
On Thu, 18 Jan 2024, Yu Kuai wrote:

> Hi,
> 
> On 2024/01/18 2:18, Mikulas Patocka wrote:
> > Note that md_wakeup_thread_directly is racy - it will do nothing if the
> > thread is already running or it may cause spurious wake-up if the thread
> > is blocked in another subsystem.
> 
> No, as the comment says, md_wakeup_thread_directly() only covers the
> case where md_wakeup_thread() can't wake up md_do_sync() because it is
> waiting for a metadata update.

Yes - but what happens if you wake up the thread just a few instructions 
before it is going to sleep for metadata update? wake_up_process does 
nothing on a running process and the thread proceeds with waiting. This is 
what I thought could happen when I was making the patch.
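
A toy userspace model of that interleaving, as a sketch only. It is not
kernel code, and every name in it (toy_task, toy_wake, toy_sleep and so
on) is hypothetical. It assumes only what is stated above: a wake-up
aimed at a task that is still running has no effect, so a task that goes
to sleep right afterwards stays asleep until something wakes it again.
Whether md_do_sync() can actually hit this window is the open question
in this thread.

```c
#include <stdbool.h>

/* Toy model, not kernel code: all names here are hypothetical. */
enum toy_state { TOY_RUNNING, TOY_SLEEPING };

struct toy_task {
	enum toy_state state;
};

/* Modelled on wake_up_process(): only a sleeping task is affected. */
static bool toy_wake(struct toy_task *t)
{
	if (t->state != TOY_SLEEPING)
		return false;	/* already running: the wake-up is a no-op */
	t->state = TOY_RUNNING;
	return true;
}

/* The task blocks on an event that, in this model, never arrives. */
static void toy_sleep(struct toy_task *t)
{
	t->state = TOY_SLEEPING;
}

/* Waker runs first, then the task sleeps: the wake-up is lost. */
static bool toy_lost_wakeup(void)
{
	struct toy_task t = { TOY_RUNNING };
	bool woke = toy_wake(&t);	/* no effect, task is still running */

	toy_sleep(&t);
	return !woke && t.state == TOY_SLEEPING;
}

/* A second wake-up, issued after a timeout, finds the task asleep. */
static bool toy_retry_wakes(void)
{
	struct toy_task t = { TOY_RUNNING };

	toy_wake(&t);		/* first attempt is lost, as above */
	toy_sleep(&t);
	return toy_wake(&t) && t.state == TOY_RUNNING;
}
```

In this model toy_lost_wakeup() reports that the task ends up asleep
after a single early wake-up, and toy_retry_wakes() reports that a
repeated wake-up gets it running again, which is the idea behind
retrying after a timeout.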

Mikulas

Yu Kuai Jan. 18, 2024, 1:20 p.m. UTC | #3
Hi,

On 2024/01/18 21:07, Mikulas Patocka wrote:
> 
> 
> On Thu, 18 Jan 2024, Yu Kuai wrote:
> 
>> Hi,
>>
>> On 2024/01/18 2:18, Mikulas Patocka wrote:
>>> Note that md_wakeup_thread_directly is racy - it will do nothing if the
>>> thread is already running or it may cause spurious wake-up if the thread
>>> is blocked in another subsystem.
>>
>> No, as the comment says, md_wakeup_thread_directly() only covers the
>> case where md_wakeup_thread() can't wake up md_do_sync() because it is
>> waiting for a metadata update.
> 
> Yes - but what happens if you wake up the thread just a few instructions
> before it is going to sleep for metadata update? wake_up_process does
> nothing on a running process and the thread proceeds with waiting. This is
> what I thought could happen when I was making the patch.

Please note that in the original code md_wakeup_thread_directly() is
used for sync_thread, and md_wakeup_thread() should be used for
*mddev->thread* (mddev_unlock() always does that) to clear
MD_RECOVERY_RUNNING.

By the way, the root cause of MD_RECOVERY_RUNNING not being cleared is
that mddev_suspend() never stops sync_thread at all, while
md_check_recovery() won't do anything when mddev is suspended.

Before:
1. suspend
2. call md_reap_sync_thread() directly to unregister sync_thread
     -> notice that this is not safe.
3. resume

Now:
1. suspend
2. call stop_sync_thread() to unregister sync_thread: interrupt
   md_do_sync() and wait for md_check_recovery() to clear
   MD_RECOVERY_RUNNING.
    -> which will never happen now;
3. resume
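
The "Now" sequence above can be modelled as a small userspace sketch. It
is not kernel code, and all names (toy_mddev, toy_check_recovery,
toy_stop_sync_thread) are hypothetical. It assumes only what is
described above: the flag is cleared by a check routine that does
nothing while the array is suspended. The wait is bounded by max_polls
so the model terminates, whereas the real wait would block forever.

```c
#include <stdbool.h>

/* Toy model, not kernel code: all names here are hypothetical. */
struct toy_mddev {
	bool suspended;
	bool recovery_running;	/* stands in for MD_RECOVERY_RUNNING */
	bool sync_done;		/* the interrupted sync thread has exited */
};

/* Stands in for md_check_recovery(): a no-op while suspended. */
static void toy_check_recovery(struct toy_mddev *m)
{
	if (m->suspended)
		return;
	if (m->sync_done)
		m->recovery_running = false;
}

/*
 * Stands in for the wait in stop_sync_thread(). Returns true if the
 * flag was cleared within max_polls attempts, false otherwise.
 */
static bool toy_stop_sync_thread(struct toy_mddev *m, int max_polls)
{
	m->sync_done = true;	/* the sync thread reacts to the interrupt */
	for (int i = 0; i < max_polls; i++) {
		toy_check_recovery(m);
		if (!m->recovery_running)
			return true;
	}
	return false;
}
```

With suspended set, toy_stop_sync_thread() gives up no matter how many
polls are allowed. With suspended cleared, the first poll clears the
flag.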

I fixed this locally and the test integrity-caching.sh passed in my VM.

Thanks,
Kuai
> 
> Mikulas
>

Mikulas Patocka Jan. 18, 2024, 1:28 p.m. UTC | #4
On Thu, 18 Jan 2024, Yu Kuai wrote:

> Hi,
> 
> On 2024/01/18 21:07, Mikulas Patocka wrote:
> > 
> > 
> > On Thu, 18 Jan 2024, Yu Kuai wrote:
> > 
> >> Hi,
> >>
> >> On 2024/01/18 2:18, Mikulas Patocka wrote:
> >>> Note that md_wakeup_thread_directly is racy - it will do nothing if the
> >>> thread is already running or it may cause spurious wake-up if the thread
> >>> is blocked in another subsystem.
> >>
> >> No, as the comment says, md_wakeup_thread_directly() only covers the
> >> case where md_wakeup_thread() can't wake up md_do_sync() because it is
> >> waiting for a metadata update.
> > 
> > Yes - but what happens if you wake up the thread just a few instructions
> > before it is going to sleep for metadata update? wake_up_process does
> > nothing on a running process and the thread proceeds with waiting. This is
> > what I thought could happen when I was making the patch.
> 
> Please note that in the original code md_wakeup_thread_directly() is
> used for sync_thread, and md_wakeup_thread() should be used for
> *mddev->thread* (mddev_unlock() always does that) to clear
> MD_RECOVERY_RUNNING.
> 
> By the way, the root cause of MD_RECOVERY_RUNNING not being cleared is
> that mddev_suspend() never stops sync_thread at all, while
> md_check_recovery() won't do anything when mddev is suspended.
> 
> Before:
> 1. suspend
> 2. call md_reap_sync_thread() directly to unregister sync_thread
>     -> notice that this is not safe.
> 3. resume
> 
> Now:
> 1. suspend
> 2. call stop_sync_thread() to unregister sync_thread: interrupt
>    md_do_sync() and wait for md_check_recovery() to clear
>    MD_RECOVERY_RUNNING.
>    -> which will never happen now;
> 3. resume
> 
> I fixed this locally and the test integrity-caching.sh passed in my VM.
> 
> Thanks,
> Kuai

OK, Thanks.

Mikulas

Patch

Index: linux-2.6/drivers/md/md.c
===================================================================
--- linux-2.6.orig/drivers/md/md.c
+++ linux-2.6/drivers/md/md.c
@@ -4889,6 +4889,7 @@  static void stop_sync_thread(struct mdde
 
 	mddev_unlock(mddev);
 
+retry:
 	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 	/*
 	 * Thread might be blocked waiting for metadata update which will now
@@ -4898,9 +4899,11 @@  static void stop_sync_thread(struct mdde
 	if (work_pending(&mddev->sync_work))
 		flush_work(&mddev->sync_work);
 
-	wait_event(resync_wait,
+	if (!wait_event_timeout(resync_wait,
 		   !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) ||
-		   (check_seq && sync_seq != atomic_read(&mddev->sync_seq)));
+		   (check_seq && sync_seq != atomic_read(&mddev->sync_seq)),
+		   HZ / 10))
+		goto retry;
 
 	if (locked)
 		mddev_lock_nointr(mddev);