diff mbox series

trace/simple: Fix hang when using simpletrace with fork()

Message ID 20250226085015.1143991-1-thuth@redhat.com (mailing list archive)
State New
Headers show
Series trace/simple: Fix hang when using simpletrace with fork() | expand

Commit Message

Thomas Huth Feb. 26, 2025, 8:50 a.m. UTC
When compiling QEMU with --enable-trace-backends=simple , the
iotest 233 is currently hanging. This happens because qemu-nbd
calls trace_init_backends() first - which causes simpletrace to
install its writer thread and the atexit() handler - before
calling fork(). But the simpletrace writer thread is then only
available in the parent process, not in the child process anymore.
Thus when the child process exits, its atexit handler waits forever
on the trace_empty_cond condition to be set by the non-existing
writer thread, so the process never finishes.

Fix it by installing a pthread_atfork() handler, too, which
makes sure that the trace_writeout_enabled variable gets set
to false again in the child process, so we can use it in the
atexit() handler to check whether we still need to wait on the
writer thread or not.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 trace/simple.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

Comments

Daniel P. Berrangé Feb. 26, 2025, 9:15 a.m. UTC | #1
On Wed, Feb 26, 2025 at 09:50:15AM +0100, Thomas Huth wrote:
> When compiling QEMU with --enable-trace-backends=simple , the
> iotest 233 is currently hanging. This happens because qemu-nbd
> calls trace_init_backends() first - which causes simpletrace to
> install its writer thread and the atexit() handler - before
> calling fork(). But the simpletrace writer thread is then only
> available in the parent process, not in the child process anymore.
> Thus when the child process exits, its atexit handler waits forever
> on the trace_empty_cond condition to be set by the non-existing
> writer thread, so the process never finishes.
> 
> Fix it by installing a pthread_atfork() handler, too, which
> makes sure that the trace_writeout_enabled variable gets set
> to false again in the child process, so we can use it in the
> atexit() handler to check whether we still need to wait on the
> writer thread or not.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  trace/simple.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/trace/simple.c b/trace/simple.c
> index c0aba00cb7f..269bbda69f1 100644
> --- a/trace/simple.c
> +++ b/trace/simple.c
> @@ -380,8 +380,22 @@ void st_print_trace_file_status(void)
>  
>  void st_flush_trace_buffer(void)
>  {
> -    flush_trace_file(true);
> +    flush_trace_file(trace_writeout_enabled);
> +}
> +
> +#ifndef _WIN32
> +static void trace_thread_atfork(void)
> +{
> +    /*
> +     * If we fork, the writer thread does not exist in the child, so
> +     * make sure to allow st_flush_trace_buffer() to clean up correctly.
> +     */
> +    g_mutex_lock(&trace_lock);
> +    trace_writeout_enabled = false;
> +    g_cond_signal(&trace_empty_cond);
> +    g_mutex_unlock(&trace_lock);
>  }
> +#endif

This doesn't seem right to me. This is being run in the child and while
it may avoid the hang when the child exits, surely it still leaves tracing
non-functional in the child as we're lacking the thread to write out the
trace data.

>  
>  /* Helper function to create a thread with signals blocked.  Use glib's
>   * portable threads since QEMU abstractions cannot be used due to reentrancy in
> @@ -396,6 +410,7 @@ static GThread *trace_thread_create(GThreadFunc fn)
>  
>      sigfillset(&set);
>      pthread_sigmask(SIG_SETMASK, &set, &oldset);
> +    pthread_atfork(NULL, NULL, trace_thread_atfork);
>  #endif
>  
>      thread = g_thread_new("trace-thread", fn, NULL);
> -- 
> 2.48.1
> 
> 

With regards,
Daniel
Kevin Wolf Feb. 26, 2025, 9:29 a.m. UTC | #2
Am 26.02.2025 um 09:50 hat Thomas Huth geschrieben:
> When compiling QEMU with --enable-trace-backends=simple , the
> iotest 233 is currently hanging. This happens because qemu-nbd
> calls trace_init_backends() first - which causes simpletrace to
> install its writer thread and the atexit() handler - before
> calling fork(). But the simpletrace writer thread is then only
> available in the parent process, not in the child process anymore.
> Thus when the child process exits, its atexit handler waits forever
> on the trace_empty_cond condition to be set by the non-existing
> writer thread, so the process never finishes.
> 
> Fix it by installing a pthread_atfork() handler, too, which
> makes sure that the trace_writeout_enabled variable gets set
> to false again in the child process, so we can use it in the
> atexit() handler to check whether we still need to wait on the
> writer thread or not.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>

I can see how this would fix the hang, but do we actually get the trace
events written out somewhere then? Or do we need to make sure that the
child process has a writer thread, too?

Of course, the question would then be how the two processes writing into
the same trace file interact.

Kevin
Thomas Huth Feb. 26, 2025, 9:38 a.m. UTC | #3
On 26/02/2025 10.15, Daniel P. Berrangé wrote:
> On Wed, Feb 26, 2025 at 09:50:15AM +0100, Thomas Huth wrote:
>> When compiling QEMU with --enable-trace-backends=simple , the
>> iotest 233 is currently hanging. This happens because qemu-nbd
>> calls trace_init_backends() first - which causes simpletrace to
>> install its writer thread and the atexit() handler - before
>> calling fork(). But the simpletrace writer thread is then only
>> available in the parent process, not in the child process anymore.
>> Thus when the child process exits, its atexit handler waits forever
>> on the trace_empty_cond condition to be set by the non-existing
>> writer thread, so the process never finishes.
>>
>> Fix it by installing a pthread_atfork() handler, too, which
>> makes sure that the trace_writeout_enabled variable gets set
>> to false again in the child process, so we can use it in the
>> atexit() handler to check whether we still need to wait on the
>> writer thread or not.
>>
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>   trace/simple.c | 17 ++++++++++++++++-
>>   1 file changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/trace/simple.c b/trace/simple.c
>> index c0aba00cb7f..269bbda69f1 100644
>> --- a/trace/simple.c
>> +++ b/trace/simple.c
>> @@ -380,8 +380,22 @@ void st_print_trace_file_status(void)
>>   
>>   void st_flush_trace_buffer(void)
>>   {
>> -    flush_trace_file(true);
>> +    flush_trace_file(trace_writeout_enabled);
>> +}
>> +
>> +#ifndef _WIN32
>> +static void trace_thread_atfork(void)
>> +{
>> +    /*
>> +     * If we fork, the writer thread does not exist in the child, so
>> +     * make sure to allow st_flush_trace_buffer() to clean up correctly.
>> +     */
>> +    g_mutex_lock(&trace_lock);
>> +    trace_writeout_enabled = false;
>> +    g_cond_signal(&trace_empty_cond);
>> +    g_mutex_unlock(&trace_lock);
>>   }
>> +#endif
> 
> This doesn't seem right to me. This is being run in the child and while
> it may avoid the hang when the child exits, surely it still leaves tracing
> non-functional in the child as we're lacking the thread to write out the
> trace data.

Well, you cannot write to the same file from the parent and child at the 
same time, so one of both needs to be shut up AFAIU. And the simpletrace 
code cannot now which one of the two processes should be allowed to continue 
with the logging, so we either have to disable tracing in one of the two 
processes, or think of something completely different, e.g. using 
pthread_atfork(abort, NULL, NULL) to make people aware that they are not 
allowed to start tracing before calling fork()...? But in that case we still 
need a qemu-nbd expert to fix qemu-nbd, so that it does not initialize the 
trace backend before calling fork().

  Thomas
Daniel P. Berrangé Feb. 26, 2025, 9:51 a.m. UTC | #4
On Wed, Feb 26, 2025 at 10:29:59AM +0100, Kevin Wolf wrote:
> Am 26.02.2025 um 09:50 hat Thomas Huth geschrieben:
> > When compiling QEMU with --enable-trace-backends=simple , the
> > iotest 233 is currently hanging. This happens because qemu-nbd
> > calls trace_init_backends() first - which causes simpletrace to
> > install its writer thread and the atexit() handler - before
> > calling fork(). But the simpletrace writer thread is then only
> > available in the parent process, not in the child process anymore.
> > Thus when the child process exits, its atexit handler waits forever
> > on the trace_empty_cond condition to be set by the non-existing
> > writer thread, so the process never finishes.
> > 
> > Fix it by installing a pthread_atfork() handler, too, which
> > makes sure that the trace_writeout_enabled variable gets set
> > to false again in the child process, so we can use it in the
> > atexit() handler to check whether we still need to wait on the
> > writer thread or not.
> > 
> > Signed-off-by: Thomas Huth <thuth@redhat.com>
> 
> I can see how this would fix the hang, but do we actually get the trace
> events written out somewhere then? Or do we need to make sure that the
> child process has a writer thread, too?
> 
> Of course, the question would then be how the two processes writing into
> the same trace file interact.

In st_set_trace_file, if no explicit filename is given by the user we
use a default filename "trace-$PID", so *if* the atfork handler in
the child were to reset its file after fork it'll get a distinct
filename. Happy days.

If the user provided an explicit filename though, we just use that as
is and would end up with dualling writers and likely data corruption.
While we could blindly append a pid to the user's filename it is 
somewhat unhelpful in the daemonize case to split it across files.

Whether we actually want tracing in a child also possibly depends on
the reason for fork.

If we're forking to daemonize, then we (mostly) want tracing in the
child and (mostly) no longer care about the parent.

If we're forking to exec another app, then we (mostly) want tracing in
the parent and (mostly) no longer care about the child.

With regards,
Daniel
Daniel P. Berrangé Feb. 26, 2025, 9:53 a.m. UTC | #5
On Wed, Feb 26, 2025 at 10:38:56AM +0100, Thomas Huth wrote:
> On 26/02/2025 10.15, Daniel P. Berrangé wrote:
> > On Wed, Feb 26, 2025 at 09:50:15AM +0100, Thomas Huth wrote:
> > > When compiling QEMU with --enable-trace-backends=simple , the
> > > iotest 233 is currently hanging. This happens because qemu-nbd
> > > calls trace_init_backends() first - which causes simpletrace to
> > > install its writer thread and the atexit() handler - before
> > > calling fork(). But the simpletrace writer thread is then only
> > > available in the parent process, not in the child process anymore.
> > > Thus when the child process exits, its atexit handler waits forever
> > > on the trace_empty_cond condition to be set by the non-existing
> > > writer thread, so the process never finishes.
> > > 
> > > Fix it by installing a pthread_atfork() handler, too, which
> > > makes sure that the trace_writeout_enabled variable gets set
> > > to false again in the child process, so we can use it in the
> > > atexit() handler to check whether we still need to wait on the
> > > writer thread or not.
> > > 
> > > Signed-off-by: Thomas Huth <thuth@redhat.com>
> > > ---
> > >   trace/simple.c | 17 ++++++++++++++++-
> > >   1 file changed, 16 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/trace/simple.c b/trace/simple.c
> > > index c0aba00cb7f..269bbda69f1 100644
> > > --- a/trace/simple.c
> > > +++ b/trace/simple.c
> > > @@ -380,8 +380,22 @@ void st_print_trace_file_status(void)
> > >   void st_flush_trace_buffer(void)
> > >   {
> > > -    flush_trace_file(true);
> > > +    flush_trace_file(trace_writeout_enabled);
> > > +}
> > > +
> > > +#ifndef _WIN32
> > > +static void trace_thread_atfork(void)
> > > +{
> > > +    /*
> > > +     * If we fork, the writer thread does not exist in the child, so
> > > +     * make sure to allow st_flush_trace_buffer() to clean up correctly.
> > > +     */
> > > +    g_mutex_lock(&trace_lock);
> > > +    trace_writeout_enabled = false;
> > > +    g_cond_signal(&trace_empty_cond);
> > > +    g_mutex_unlock(&trace_lock);
> > >   }
> > > +#endif
> > 
> > This doesn't seem right to me. This is being run in the child and while
> > it may avoid the hang when the child exits, surely it still leaves tracing
> > non-functional in the child as we're lacking the thread to write out the
> > trace data.
> 
> Well, you cannot write to the same file from the parent and child at the
> same time, so one of both needs to be shut up AFAIU. And the simpletrace
> code cannot now which one of the two processes should be allowed to continue
> with the logging, so we either have to disable tracing in one of the two
> processes, or think of something completely different, e.g. using
> pthread_atfork(abort, NULL, NULL) to make people aware that they are not
> allowed to start tracing before calling fork()...? But in that case we still
> need a qemu-nbd expert to fix qemu-nbd, so that it does not initialize the
> trace backend before calling fork().

As precedent, in system/vl.c we delay trace_init() until after daemonizing
which is the simple way to avoid the worst of the danger.

It would still be nice to have an atfork() handler to fully eliminate the
danger though

With regards,
Daniel
diff mbox series

Patch

diff --git a/trace/simple.c b/trace/simple.c
index c0aba00cb7f..269bbda69f1 100644
--- a/trace/simple.c
+++ b/trace/simple.c
@@ -380,8 +380,22 @@  void st_print_trace_file_status(void)
 
 void st_flush_trace_buffer(void)
 {
-    flush_trace_file(true);
+    flush_trace_file(trace_writeout_enabled);
+}
+
+#ifndef _WIN32
+static void trace_thread_atfork(void)
+{
+    /*
+     * If we fork, the writer thread does not exist in the child, so
+     * make sure to allow st_flush_trace_buffer() to clean up correctly.
+     */
+    g_mutex_lock(&trace_lock);
+    trace_writeout_enabled = false;
+    g_cond_signal(&trace_empty_cond);
+    g_mutex_unlock(&trace_lock);
 }
+#endif
 
 /* Helper function to create a thread with signals blocked.  Use glib's
  * portable threads since QEMU abstractions cannot be used due to reentrancy in
@@ -396,6 +410,7 @@  static GThread *trace_thread_create(GThreadFunc fn)
 
     sigfillset(&set);
     pthread_sigmask(SIG_SETMASK, &set, &oldset);
+    pthread_atfork(NULL, NULL, trace_thread_atfork);
 #endif
 
     thread = g_thread_new("trace-thread", fn, NULL);