Test generic/299 stalling forever

Message ID 20161023212408.cjqmnzw3547ujzil@thunk.org (mailing list archive)
State New, archived

Commit Message

Theodore Ts'o Oct. 23, 2016, 9:24 p.m. UTC
OK, perhaps this is helpful.  I reverted my last patch (so I could
repro the problem), and then added the following patch to fio/mutex.c
(see below).

When it hung, I got the following stack trace on the main thread:

(gdb) where
#0  0x00007ff8c018cf2d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007ff8c01b4fb4 in usleep (useconds=<optimized out>)
    at ../sysdeps/unix/sysv/linux/usleep.c:32
#2  0x000000000045b691 in thread_main (data=0x1bc64000) at backend.c:1738
#3  0x000000000045d623 in run_threads (sk_out=sk_out@entry=0x0) at backend.c:2268
#4  0x000000000045d95d in fio_backend (sk_out=sk_out@entry=0x0) at backend.c:2400
#5  0x000000000040cc18 in main (argc=2, argv=0x7fff8446f0d8, envp=<optimized out>) at fio.c:65

.... and it looks like the reason why we can't take the mutex is that
somehow some thread managed to call stat_exit(), which calls
fio_mutex_down() followed by fio_mutex_remove():

(gdb) p (void *) stat_mutex->owner
$4 = (void *) 0x42fd60 <stat_exit+16>

How this could have happened, I have no idea.  The only call to
stat_exit() appears to be at the end of fio_backend(), and the main
thread is still in the middle of calling thread_main() which is called
by run_threads().
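
For reference, stat_exit() in stat.c is essentially just the
down-then-remove pair (paraphrased from memory, not an exact quote):

    void stat_exit(void)
    {
            /*
             * Take the mutex so nothing else is mid-update, then
             * tear it down.  Anyone arriving after this point
             * blocks on a mutex that no longer exists.
             */
            fio_mutex_down(stat_mutex);
            fio_mutex_remove(stat_mutex);
    }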

Do you have any thoughts?  Let me know if there's something I can try.

						- Ted

P.P.S.  As before, the only other threads are the verify threads:

(gdb)  thread apply all bt

Thread 5 (Thread 0x7ff8c0e1a700 (LWP 13147)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000004398cb in verify_async_thread (data=0x7ff8aa3bd610) at verify.c:1241
#2  0x00007ff8c068a0a4 in start_thread (arg=0x7ff8c0e1a700) at pthread_create.c:309
#3  0x00007ff8c01bb62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 4 (Thread 0x7ff8c0e12700 (LWP 13148)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000004398cb in verify_async_thread (data=0x7ff8aa3bd610) at verify.c:1241
#2  0x00007ff8c068a0a4 in start_thread (arg=0x7ff8c0e12700) at pthread_create.c:309
#3  0x00007ff8c01bb62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3 (Thread 0x7ff8c0e0a700 (LWP 13149)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000004398cb in verify_async_thread (data=0x7ff8aa3bd610) at verify.c:1241
#2  0x00007ff8c068a0a4 in start_thread (arg=0x7ff8c0e0a700) at pthread_create.c:309
#3  0x00007ff8c01bb62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7ff8c0e02700 (LWP 13150)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000004398cb in verify_async_thread (data=0x7ff8aa3bd610) at verify.c:1241
#2  0x00007ff8c068a0a4 in start_thread (arg=0x7ff8c0e02700) at pthread_create.c:309
#3  0x00007ff8c01bb62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7ff8c0fbd740 (LWP 13139)):
#0  0x00007ff8c018cf2d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007ff8c01b4fb4 in usleep (useconds=<optimized out>)
    at ../sysdeps/unix/sysv/linux/usleep.c:32
#2  0x000000000045b691 in thread_main (data=0x1bc64000) at backend.c:1738
#3  0x000000000045d623 in run_threads (sk_out=sk_out@entry=0x0) at backend.c:2268
#4  0x000000000045d95d in fio_backend (sk_out=sk_out@entry=0x0) at backend.c:2400
#5  0x000000000040cc18 in main (argc=2, argv=0x7fff8446f0d8, envp=<optimized out>) at fio.c:65
(gdb) 

Comments

Jens Axboe Oct. 24, 2016, 1:41 a.m. UTC | #1
On 10/23/2016 03:24 PM, Theodore Ts'o wrote:
> OK, perhaps this is helpful.  I reverted my last patch (so I could
> repro the problem), and then added the following patch to fio/mutex.c
> (see below).
>
> When it hung, I got the following stack trace on the main thread:
>
> (gdb) where
> #0  0x00007ff8c018cf2d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
> #1  0x00007ff8c01b4fb4 in usleep (useconds=<optimized out>)
>     at ../sysdeps/unix/sysv/linux/usleep.c:32
> #2  0x000000000045b691 in thread_main (data=0x1bc64000) at backend.c:1738
> #3  0x000000000045d623 in run_threads (sk_out=sk_out@entry=0x0) at backend.c:2268
> #4  0x000000000045d95d in fio_backend (sk_out=sk_out@entry=0x0) at backend.c:2400
> #5  0x000000000040cc18 in main (argc=2, argv=0x7fff8446f0d8, envp=<optimized out>) at fio.c:65
>
> .... and it looks like the reason why we can't take the mutex is that
> somehow some thread managed to call stat_exit(), which calls
> fio_mutex_down() followed by fio_mutex_remove():
>
> (gdb) p (void *) stat_mutex->owner
> $4 = (void *) 0x42fd60 <stat_exit+16>
>
> How this could have happened, I have no idea.  The only call to
> stat_exit() appears to be at the end of fio_backend(), and the main
> thread is still in the middle of calling thread_main() which is called
> by run_threads().
>
> Do you have any thoughts?  Let me know if there's something I can try.

Wow, that is very odd. That should only be possible if the main thread
exits too early, and calls stat_exit() while we still have jobs running.
That should trigger other funkiness too. I have no idea how that is
happening, but it obviously is since your debugging patch showed that
stat_exit() owns the mutex.

But this gives me more to work from, very useful, thanks Ted! I'll pound
on this tomorrow.

Theodore Ts'o Oct. 24, 2016, 3:38 a.m. UTC | #2
I enabled some more debugging, and it's become clearer what's going
on.  (See attached for the full log.)

The main issue seems to be that once one of the fio jobs is done, it
kills off the other threads (actually, we're using processes):

process  31848 terminate group_id=0
process  31848 setting terminate on direct_aio/31846
process  31848 setting terminate on direct_aio/31848
process  31848 setting terminate on direct_aio/31849
process  31848 setting terminate on direct_aio/31851
process  31848 setting terminate on aio-dio-verifier/31852
process  31848 setting terminate on buffered-aio-verifier/31854
process  31851 pid=31851: runstate RUNNING -> FINISHING
process  31851 terminate group_id=0
process  31851 setting terminate on direct_aio/31846
process  31851 setting terminate on direct_aio/31848
process  31851 setting terminate on direct_aio/31849
process  31851 setting terminate on direct_aio/31851
process  31851 setting terminate on aio-dio-verifier/31852
process  31851 setting terminate on buffered-aio-verifier/31854
process  31852 pid=31852: runstate RUNNING -> FINISHING
process  31846 pid=31846: runstate RUNNING -> FINISHING
    ...

but one or more of the threads don't exit within 60 seconds:

fio: job 'direct_aio' (state=5) hasn't exited in 60 seconds, it appears to be stuck. Doing forceful exit of this job.
process  31794 pid=31849: runstate RUNNING -> REAPED
fio: job 'buffered-aio-verifier' (state=5) hasn't exited in 60 seconds, it appears to be stuck. Doing forceful exit of this job.
process  31794 pid=31854: runstate RUNNING -> REAPED
process  31794 terminate group_id=-1

The main thread then prints all of the statistics, and calls stat_exit():

stat_exit called by tid: 31794       <---- debugging message which prints gettid()

Unfortunately, these processes aren't actually killed; they are
marked as reaped, but they are still in the process listing:

root@xfstests:~# ps augxww | grep fio
root      1585  0.0  0.0      0     0 ?        S<   18:45   0:00 [dm_bufio_cache]
root      7191  0.0  0.0  12732  2200 pts/1    S+   23:05   0:00 grep fio
root     31849  1.5  0.2 407208 18876 ?        Ss   22:36   0:26 /root/xfstests/bin/fio /tmp/31503.fio
root     31854  1.2  0.1 398480 10240 ?        Ssl  22:36   0:22 /root/xfstests/bin/fio /tmp/31503.fio

And if you attach to them with gdb, they are spinning trying to grab
stat_mutex, which they will never get, because the main thread has
already called stat_exit() and then exited.  So these two processes
did eventually finish, but only some time after the 60-second timeout
had passed, and by then they were left hanging on a mutex that had
already been downed and destroyed.
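
In miniature, the failure mode looks like the sketch below.  This is
hypothetical, self-contained pthreads code: plain threads stand in
for fio's per-job processes, and an ordinary pthread mutex stands in
for the pshared fio_mutex.

    #include <pthread.h>
    #include <unistd.h>

    static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;

    /* A job that outlives the reap timeout, then tries to report stats. */
    static void *straggler(void *arg)
    {
            sleep(2);                        /* finishes, but too late */
            pthread_mutex_lock(&stats_lock); /* already destroyed: undefined
                                                behavior; with fio's mutex,
                                                a permanent hang */
            pthread_mutex_unlock(&stats_lock);
            return NULL;
    }

    int main(void)
    {
            pthread_t t;

            pthread_create(&t, NULL, straggler, NULL);
            sleep(1);                           /* the "60 second" reap timeout */
            pthread_mutex_lock(&stats_lock);    /* stat_exit(): down ... */
            pthread_mutex_destroy(&stats_lock); /* ... then tear down */
            pthread_exit(NULL);                 /* main exits; process lives on */
    }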

This probably also explains why you had trouble reproducing it.  It
requires a disk whose performance is variable enough that, under heavy
load, it might take more than 60 seconds for the direct_aio or
buffered-aio-verifier job to close itself out.

And I suspect that once the main thread exited, it also closed out
the debugging channel, so the deadlock detector probably did trip,
but we just never saw the output.

So I can imagine some possible fixes.  We could make the thread
timeout configurable, and/or increase it from 60 seconds to something
like 300 seconds.  We could make stat_exit() a no-op --- after all, if
the main thread is exiting, there's no real point in downing and then
destroying stat_mutex.  And/or we could change the forced reap to send
a kill -9 to the thread instead of merely marking it as reaped; a
rough sketch of that last option is below.
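
That last option might look something like the following in the
reaper.  This is a hypothetical sketch, not fio code: seconds_since()
is an assumed helper, and the td->pid / td->o.name / td_set_runstate()
usage is from memory.

    /* Forcefully kill a job that has ignored ->terminate for 60
     * seconds, instead of only marking it reaped, so it can never
     * come back later and touch stat_mutex after stat_exit() has
     * torn it down. */
    if (td->terminate && seconds_since(&td->terminate_time) >= 60) {
            log_err("fio: job '%s' stuck for 60s, sending SIGKILL\n",
                    td->o.name);
            if (td->pid > 0)
                    kill(td->pid, SIGKILL); /* jobs run as processes here */
            td_set_runstate(td, TD_REAPED);
    }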

Cheers,

						- Ted

Patch

diff --git a/mutex.c b/mutex.c
index 7580922..26e2955 100644
--- a/mutex.c
+++ b/mutex.c
@@ -108,6 +108,8 @@  int __fio_mutex_init(struct fio_mutex *mutex, int value)
 
 	mutex->value = value;
 	mutex->magic = FIO_MUTEX_MAGIC;
+	if (!value)
+		mutex->owner = __builtin_return_address(0);
 
 	ret = mutex_cond_init_pshared(&mutex->lock, &mutex->cond);
 	if (ret)
@@ -128,8 +130,11 @@  struct fio_mutex *fio_mutex_init(int value)
 		return NULL;
 	}
 
-	if (!__fio_mutex_init(mutex, value))
+	if (!__fio_mutex_init(mutex, value)) {
+		if (!value)
+			mutex->owner = __builtin_return_address(0);
 		return mutex;
+	}
 
 	fio_mutex_remove(mutex);
 	return NULL;
@@ -194,7 +199,13 @@  bool fio_mutex_down_trylock(struct fio_mutex *mutex)
 
 	pthread_mutex_lock(&mutex->lock);
 	if (mutex->value) {
+		if (mutex->value > 1) {
+			log_err("Completely bogus mutex value?!? (%d, %p)\n",
+				mutex->value, mutex->owner);
+			abort();
+		}
 		mutex->value--;
+		mutex->owner = __builtin_return_address(0);
 		ret = false;
 	}
 	pthread_mutex_unlock(&mutex->lock);
@@ -215,6 +226,7 @@  void fio_mutex_down(struct fio_mutex *mutex)
 	}
 
 	mutex->value--;
+	mutex->owner = __builtin_return_address(0);
 	pthread_mutex_unlock(&mutex->lock);
 }
 
@@ -228,6 +240,8 @@  void fio_mutex_up(struct fio_mutex *mutex)
 	read_barrier();
 	if (!mutex->value && mutex->waiters)
 		do_wake = 1;
+	if (!mutex->value)
+		mutex->owner = 0;
 	mutex->value++;
 	pthread_mutex_unlock(&mutex->lock);
 
diff --git a/mutex.h b/mutex.h
index 54009ba..ea43845 100644
--- a/mutex.h
+++ b/mutex.h
@@ -13,6 +13,7 @@  struct fio_mutex {
 	int value;
 	int waiters;
 	int magic;
+	void *owner;
 };
 
 struct fio_rwlock {