diff mbox

Test generic/299 stalling forever

Message ID 773e0780-6641-ec85-5e78-d04e5a82d6b1@fb.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jens Axboe Oct. 24, 2016, 4:28 p.m. UTC
On 10/23/2016 09:38 PM, Theodore Ts'o wrote:
> I enabled some more debugging and it's become more clear what's going
> on.   (See attached for the full log).
>
> The main issue seems to be that once one of fio is done, it kills off
> the other threads (actually, we're using processes):
>
> process  31848 terminate group_id=0
> process  31848 setting terminate on direct_aio/31846
> process  31848 setting terminate on direct_aio/31848
> process  31848 setting terminate on direct_aio/31849
> process  31848 setting terminate on direct_aio/31851
> process  31848 setting terminate on aio-dio-verifier/31852
> process  31848 setting terminate on buffered-aio-verifier/31854
> process  31851 pid=31851: runstate RUNNING -> FINISHING
> process  31851 terminate group_id=0
> process  31851 setting terminate on direct_aio/31846
> process  31851 setting terminate on direct_aio/31848
> process  31851 setting terminate on direct_aio/31849
> process  31851 setting terminate on direct_aio/31851
> process  31851 setting terminate on aio-dio-verifier/31852
> process  31851 setting terminate on buffered-aio-verifier/31854
> process  31852 pid=31852: runstate RUNNING -> FINISHING
> process  31846 pid=31846: runstate RUNNING -> FINISHING
>     ...
>
> but one or more of the threads doesn't exit within 60 seconds:
>
> fio: job 'direct_aio' (state=5) hasn't exited in 60 seconds, it appears to be stuck. Doing forceful exit of this job.
> process  31794 pid=31849: runstate RUNNING -> REAPED
> fio: job 'buffered-aio-verifier' (state=5) hasn't exited in 60 seconds, it appears to be stuck. Doing forceful exit of this job.
> process  31794 pid=31854: runstate RUNNING -> REAPED
> process  31794 terminate group_id=-1
>
> The main thread then prints all of the statistics, and calls stat_exit():
>
> stat_exit called by tid: 31794       <---- debugging message which prints gettid()
>
> Unfortunately, this process(es) aren't actually, killed, they are
> marked as reap, but they are still in the process listing:
>
> root@xfstests:~# ps augxww | grep fio
> root      1585  0.0  0.0      0     0 ?        S<   18:45   0:00 [dm_bufio_cache]
> root      7191  0.0  0.0  12732  2200 pts/1    S+   23:05   0:00 grep fio
> root     31849  1.5  0.2 407208 18876 ?        Ss   22:36   0:26 /root/xfstests/bin/fio /tmp/31503.fio
> root     31854  1.2  0.1 398480 10240 ?        Ssl  22:36   0:22 /root/xfstests/bin/fio /tmp/31503.fio
>
> And if you attach to them with a gdb, they are spinning trying to grab
> the stat_mutex(), which they can't get because the main thread has
> already called stat_exit() and then has exited.  So these two threads
> did eventually return, but some time after 60 seconds had passed, and
> then they hung waiting for stat_mutex(), which they will never get
> because the main thread has already called stat_exit().
>
> This probably also explains why you had trouble reproducing it.  It
> requires a disk whose performance is variable enougoh that under heavy
> load, it might take more than 60 seconds for the direct_aio or
> buffered-aio-verifier thread to close itself out.

Good catch! Yes, that could certainly explain why we are stuck on that 
stat_mutex and why the main thread just gave up on it and ended up in 
stat_exit() with a thread (or more) still running.

> And I suspect once the main thread exited, it probably also closed out
> the debugging channel so the deadlock detector did probably trip, but
> somehow we just didn't see the output.
>
> So I can imagine some possible fixes.  We could make the thread
> timeout configurable, and/or increase it from 60 seconds to something like
> 300 seconds.  We could make stat_exit() a no-op --- after all, if the
> main thread is exiting, there's no real point to down and then destroy
> the stat_mutex.  And/or we could change the forced reap to send a kill
> -9 to the thread, and instead of maring it as reaped.

We have to clean up - for normal runs, it's not a big deal, but if fio
is run as a client/server setup, the backend will persist across runs.
If we leak, then that could be a concern.

How about the below? Bump the timeout to 5 min, 1 min is a little on the
short side, we want normal error handling to be out of the way before
that happens. And additionally, break out if we have been marked as
reaped/exited, so we avoid grabbing the stat mutex again.

Comments

Theodore Ts'o Oct. 25, 2016, 2:54 a.m. UTC | #1
On Mon, Oct 24, 2016 at 10:28:14AM -0600, Jens Axboe wrote:

> How about the below? Bump the timeout to 5 min, 1 min is a little on the
> short side, we want normal error handling to be out of the way before
> that happens. And additionally, break out if we have been marked as
> reaped/exited, so we avoid grabbing the stat mutex again.

Yep, that works.  I tried a test with just the second change:

> +		/*
> +		 * If we took too long to shut down, the main thread could
> +		 * already consider us reaped/exited. If that happens, break
> +		 * out and clean up.
> +		 */
> +		if (td->runstate >= TD_EXITED)
> +			break;
> +

And that's sufficient to solve the problem.

Increasing the timeout to 5 minute also would be a good idea, so we
can let the worker threads exit cleanly so the reported stats will be
completely accurate.

Thanks for your help in figuring out this long-standing problem!

       	   	     		     	  		- Ted
--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jens Axboe Oct. 25, 2016, 2:59 a.m. UTC | #2
On 10/24/2016 08:54 PM, Theodore Ts'o wrote:
> On Mon, Oct 24, 2016 at 10:28:14AM -0600, Jens Axboe wrote:
>
>> How about the below? Bump the timeout to 5 min, 1 min is a little on the
>> short side, we want normal error handling to be out of the way before
>> that happens. And additionally, break out if we have been marked as
>> reaped/exited, so we avoid grabbing the stat mutex again.
>
> Yep, that works.  I tried a test with just the second change:
>
>> +		/*
>> +		 * If we took too long to shut down, the main thread could
>> +		 * already consider us reaped/exited. If that happens, break
>> +		 * out and clean up.
>> +		 */
>> +		if (td->runstate >= TD_EXITED)
>> +			break;
>> +
>
> And that's sufficient to solve the problem.

Yes, it should be, so glad that it is!

> Increasing the timeout to 5 minute also would be a good idea, so we
> can let the worker threads exit cleanly so the reported stats will be
> completely accurate.

I made that separate change as well. If the job is stuck in the kernel
for some sync operation, we could feasibly be uninterruptible for
minutes. So 1 minutes is too short in any case, and I'd rather just make
this check than sending kill signals since it won't fix the
uninterruptible problem.

> Thanks for your help in figuring out this long-standing problem!

It was easy based on all your info, since I could not reproduce. So
thanks for your help! Everything should be committed now, and I'll cut a
new release tomorrow so we can hopefully put this behind us.
diff mbox

Patch

diff --git a/backend.c b/backend.c
index 093b6a3a290e..f0927abfccb0 100644
--- a/backend.c
+++ b/backend.c
@@ -1723,6 +1723,14 @@  static void *thread_main(void *data)
  			}
  		}

+		/*
+		 * If we took too long to shut down, the main thread could
+		 * already consider us reaped/exited. If that happens, break
+		 * out and clean up.
+		 */
+		if (td->runstate >= TD_EXITED)
+			break;
+
  		clear_state = 1;

  		/*
diff --git a/fio.h b/fio.h
index 080842aef4f8..74c1b306af26 100644
--- a/fio.h
+++ b/fio.h
@@ -588,7 +588,7 @@  extern const char *runstate_to_name(int runstate);
   * Allow 60 seconds for a job to quit on its own, otherwise reap with
   * a vengeance.
   */
-#define FIO_REAP_TIMEOUT	60
+#define FIO_REAP_TIMEOUT	300

  #define TERMINATE_ALL		(-1U)
  extern void fio_terminate_threads(unsigned int);