diff mbox series

coroutine-sigaltstack: Keep SIGUSR2 handler up

Message ID 20210122102041.27031-1-mreitz@redhat.com (mailing list archive)
State New, archived
Headers show
Series coroutine-sigaltstack: Keep SIGUSR2 handler up | expand

Commit Message

Max Reitz Jan. 22, 2021, 10:20 a.m. UTC
Modifying signal handlers is a process-global operation.  When two
threads run coroutine-sigaltstack's qemu_coroutine_new() concurrently,
they may interfere with each other: One of them may revert the SIGUSR2
handler back to the default between the other thread setting up
coroutine_trampoline() as the handler and raising SIGUSR2.  That SIGUSR2
will then lead to the process exiting.

Outside of coroutine-sigaltstack, qemu does not use SIGUSR2.  We can
thus keep the signal handler installed all the time.
CoroutineThreadState.tr_handler tells coroutine_trampoline() whether its
stack is set up so a new coroutine is to be launched (i.e., it should
invoke sigsetjmp()), or not (i.e., the signal came from an external
source and we should just perform the default action, which is to exit
the process).

Note that in user-mode emulation, the guest can register signal handlers
for any signal but SIGSEGV and SIGBUS, so if it registers a SIGUSR2
handler, sigaltstack coroutines will break from then on.  However, we do
not use coroutines for user-mode emulation, so that is fine.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 util/coroutine-sigaltstack.c | 56 +++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 27 deletions(-)

Comments

Eric Blake Jan. 22, 2021, 2:55 p.m. UTC | #1
On 1/22/21 4:20 AM, Max Reitz wrote:
> Modifying signal handlers is a process-global operation.  When two
> threads run coroutine-sigaltstack's qemu_coroutine_new() concurrently,
> they may interfere with each other: One of them may revert the SIGUSR2
> handler back to the default between the other thread setting up
> coroutine_trampoline() as the handler and raising SIGUSR2.  That SIGUSR2
> will then lead to the process exiting.
> 
> Outside of coroutine-sigaltstack, qemu does not use SIGUSR2.  We can
> thus keep the signal handler installed all the time.
> CoroutineThreadState.tr_handler tells coroutine_trampoline() whether its
> stack is set up so a new coroutine is to be launched (i.e., it should
> invoke sigsetjmp()), or not (i.e., the signal came from an external
> source and we should just perform the default action, which is to exit
> the process).

Not just exit the process, but exit due to a signal.  It matters...

> 
> Note that in user-mode emulation, the guest can register signal handlers
> for any signal but SIGSEGV and SIGBUS, so if it registers a SIGUSR2
> handler, sigaltstack coroutines will break from then on.  However, we do
> not use coroutines for user-mode emulation, so that is fine.
> 
> Suggested-by: Laszlo Ersek <lersek@redhat.com>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  util/coroutine-sigaltstack.c | 56 +++++++++++++++++++-----------------
>  1 file changed, 29 insertions(+), 27 deletions(-)
> 

> @@ -121,7 +138,17 @@ static void coroutine_trampoline(int signal)
>      /* Get the thread specific information */
>      coTS = coroutine_get_thread_state();
>      self = coTS->tr_handler;
> +
> +    if (!self) {
> +        /*
> +         * This SIGUSR2 came from an external source, not from
> +         * qemu_coroutine_new(), so perform the default action.
> +         */
> +        exit(0);
> +    }

...here. Silently exiting with status 0 is wrong; the correct response
is to use signal(SIGUSR2, SIG_DFL); raise(SIGUSR2) to terminate the
process in the same manner as if we had not installed our hander at all.

With that fixed,

Reviewed-by: Eric Blake <eblake@redhat.com>
Laszlo Ersek Jan. 22, 2021, 4:38 p.m. UTC | #2
On 01/22/21 11:20, Max Reitz wrote:
> Modifying signal handlers is a process-global operation.  When two
> threads run coroutine-sigaltstack's qemu_coroutine_new() concurrently,
> they may interfere with each other: One of them may revert the SIGUSR2
> handler back to the default between the other thread setting up
> coroutine_trampoline() as the handler and raising SIGUSR2.  That SIGUSR2
> will then lead to the process exiting.
>
> Outside of coroutine-sigaltstack, qemu does not use SIGUSR2.  We can
> thus keep the signal handler installed all the time.
> CoroutineThreadState.tr_handler tells coroutine_trampoline() whether its
> stack is set up so a new coroutine is to be launched (i.e., it should
> invoke sigsetjmp()), or not (i.e., the signal came from an external
> source and we should just perform the default action, which is to exit
> the process).
>
> Note that in user-mode emulation, the guest can register signal handlers
> for any signal but SIGSEGV and SIGBUS, so if it registers a SIGUSR2
> handler, sigaltstack coroutines will break from then on.  However, we do
> not use coroutines for user-mode emulation, so that is fine.
>
> Suggested-by: Laszlo Ersek <lersek@redhat.com>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  util/coroutine-sigaltstack.c | 56 +++++++++++++++++++-----------------
>  1 file changed, 29 insertions(+), 27 deletions(-)

(1) With SIGUSR2 permanently dedicated to "coroutine-sigaltstack.c", the
comment on the qemu_init_main_loop() declaration, in
"include/qemu/main-loop.h", also needs to be updated. SIGUSR2 is no
longer a "free for thread-internal usage" signal.


>
> diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
> index aade82afb8..2d32afc322 100644
> --- a/util/coroutine-sigaltstack.c
> +++ b/util/coroutine-sigaltstack.c
> @@ -59,6 +59,8 @@ typedef struct {
>
>  static pthread_key_t thread_state_key;
>
> +static void coroutine_trampoline(int signal);
> +
>  static CoroutineThreadState *coroutine_get_thread_state(void)
>  {
>      CoroutineThreadState *s = pthread_getspecific(thread_state_key);
> @@ -80,6 +82,7 @@ static void qemu_coroutine_thread_cleanup(void *opaque)
>
>  static void __attribute__((constructor)) coroutine_init(void)
>  {
> +    struct sigaction sa;
>      int ret;
>
>      ret = pthread_key_create(&thread_state_key, qemu_coroutine_thread_cleanup);
> @@ -87,6 +90,20 @@ static void __attribute__((constructor)) coroutine_init(void)
>          fprintf(stderr, "unable to create leader key: %s\n", strerror(errno));
>          abort();
>      }
> +
> +    /*
> +     * Establish the SIGUSR2 signal handler.  This is a process-wide
> +     * operation, and so will apply to all threads from here on.
> +     */
> +    sa = (struct sigaction) {
> +        .sa_handler = coroutine_trampoline,
> +        .sa_flags   = SA_ONSTACK,
> +    };
> +
> +    if (sigaction(SIGUSR2, &sa, NULL) != 0) {
> +        perror("Unable to install SIGUSR2 handler");
> +        abort();
> +    }
>  }
>
>  /* "boot" function
> @@ -121,7 +138,17 @@ static void coroutine_trampoline(int signal)
>      /* Get the thread specific information */
>      coTS = coroutine_get_thread_state();
>      self = coTS->tr_handler;
> +
> +    if (!self) {
> +        /*
> +         * This SIGUSR2 came from an external source, not from
> +         * qemu_coroutine_new(), so perform the default action.
> +         */
> +        exit(0);
> +    }

(2) exit() is generally unsafe to call in signal handlers.

We could reason whether or not it is safe in this particular case (POSIX
describes the exact conditions --
<https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03>),
but it's much simpler to just call _exit().


(3) "Performing the default action" would be slightly different from
calling _exit(). When a process is terminated with a signal, the parent
can distinguish that, when reaping the child. See waitpid() /
WIFSIGNALED() / WTERMSIG(), versus WIFEXITED() / WEXITSTATUS().

So for the "default action", we'd have to:
- restore the SIGUSR2 handler to SIG_DFL, and
- re-raise the signal for the thread, and
- because the signal being handled is automatically blocked unless
  SA_NODEFER was set: unblock the signal for the thread.

The pthread_sigmask() call, made for the last step, would be the one
that would not return.

*However*, all of this complexity is not really useful here. We don't
really want to simulate being "forcefully terminated" by the unexpected
(asynchronous) SIGUSR2. We just want to exit.

Therefore, _exit() is fine. But, we should use an exit code different
from 0, as this is definitely not a pristine exit from QEMU. I'm not
sure if a convention exists for nonzero exit codes in QEMU; if not, then
just pass EXIT_FAILURE to _exit().


(4) Furthermore, please update the comment, because "perform the default
action" is not precise.


> +
>      coTS->tr_called = 1;
> +    coTS->tr_handler = NULL;
>      co = &self->base;
>
>      /*

(5) I see that "tr_called" has type "volatile sig_atomic_t", which is
great (I think that "sig_atomic_t" is not even necessary here, because
of the careful signal masking that we do, and because the signal is
raised synchronously).

"volatile" is certainly justified though (the compiler may not be able
to trace the call through the signal), and that's what I'm missing from
"tr_handler" too. IMO, the "tr_handler" field should be decalered as
follow:

  volatile void * volatile tr_handler;

wherein the second "volatile" serves just the same purpose as the
"volatile" in "tr_called", and the first "volatile" follows from *that*
-- wherever we chase the *chain of pointers* rooted in "tr_handler"
should not be optimized out by the compiler.

But: my point (5) is orthogonal to this patch. In practice, it may not
matter at all. So feel free to ignore now, I guess.


> @@ -150,12 +177,9 @@ Coroutine *qemu_coroutine_new(void)
>  {
>      CoroutineSigAltStack *co;
>      CoroutineThreadState *coTS;
> -    struct sigaction sa;
> -    struct sigaction osa;
>      stack_t ss;
>      stack_t oss;
>      sigset_t sigs;
> -    sigset_t osigs;
>      sigjmp_buf old_env;
>
>      /* The way to manipulate stack is with the sigaltstack function. We
> @@ -172,24 +196,6 @@ Coroutine *qemu_coroutine_new(void)
>      co->stack = qemu_alloc_stack(&co->stack_size);
>      co->base.entry_arg = &old_env; /* stash away our jmp_buf */
>
> -    coTS = coroutine_get_thread_state();
> -    coTS->tr_handler = co;
> -
> -    /*
> -     * Preserve the SIGUSR2 signal state, block SIGUSR2,
> -     * and establish our signal handler. The signal will
> -     * later transfer control onto the signal stack.
> -     */
> -    sigemptyset(&sigs);
> -    sigaddset(&sigs, SIGUSR2);
> -    pthread_sigmask(SIG_BLOCK, &sigs, &osigs);
> -    sa.sa_handler = coroutine_trampoline;
> -    sigfillset(&sa.sa_mask);
> -    sa.sa_flags = SA_ONSTACK;
> -    if (sigaction(SIGUSR2, &sa, &osa) != 0) {
> -        abort();
> -    }
> -
>      /*
>       * Set the new stack.
>       */
> @@ -207,6 +213,8 @@ Coroutine *qemu_coroutine_new(void)
>       * signal can be delivered the first time sigsuspend() is
>       * called.
>       */
> +    coTS = coroutine_get_thread_state();
> +    coTS->tr_handler = co;
>      coTS->tr_called = 0;
>      pthread_kill(pthread_self(), SIGUSR2);
>      sigfillset(&sigs);
> @@ -230,12 +238,6 @@ Coroutine *qemu_coroutine_new(void)
>          sigaltstack(&oss, NULL);
>      }
>
> -    /*
> -     * Restore the old SIGUSR2 signal handler and mask
> -     */
> -    sigaction(SIGUSR2, &osa, NULL);
> -    pthread_sigmask(SIG_SETMASK, &osigs, NULL);
> -
>      /*
>       * Now enter the trampoline again, but this time not as a signal
>       * handler. Instead we jump into it directly. The functionally
>

(6) This change, for qemu_coroutine_new(), is too heavy-handed, in my
opinion. I suggest (a) removing only the sigaction() calls and their
directly needed aux variables, and (b) *not* moving around operations.

In particular, you remove the blocking of SIGUSR2 for the thread -- the
pthread_sigmask() call. That means the sigsuspend() later on becomes
superfluous, as the signal will not be delivered inside sigsuspend(),
but inside pthread_kill(). With the signal not blocked, *generation* and
*delivery* will coalesce into a single event.

The general logic should stay the same, only the signal action
manipulation should be removed. IOW, for this function, I propose the
following change only:

> diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
> index aade82afb8c0..722fed7b2502 100644
> --- a/util/coroutine-sigaltstack.c
> +++ b/util/coroutine-sigaltstack.c
> @@ -149,107 +149,97 @@ static void coroutine_trampoline(int signal)
>  Coroutine *qemu_coroutine_new(void)
>  {
>      CoroutineSigAltStack *co;
>      CoroutineThreadState *coTS;
> -    struct sigaction sa;
> -    struct sigaction osa;
>      stack_t ss;
>      stack_t oss;
>      sigset_t sigs;
>      sigset_t osigs;
>      sigjmp_buf old_env;
>
>      /* The way to manipulate stack is with the sigaltstack function. We
>       * prepare a stack, with it delivering a signal to ourselves and then
>       * put sigsetjmp/siglongjmp where needed.
>       * This has been done keeping coroutine-ucontext as a model and with the
>       * pth ideas (GNU Portable Threads). See coroutine-ucontext for the basics
>       * of the coroutines and see pth_mctx.c (from the pth project) for the
>       * sigaltstack way of manipulating stacks.
>       */
>
>      co = g_malloc0(sizeof(*co));
>      co->stack_size = COROUTINE_STACK_SIZE;
>      co->stack = qemu_alloc_stack(&co->stack_size);
>      co->base.entry_arg = &old_env; /* stash away our jmp_buf */
>
>      coTS = coroutine_get_thread_state();
>      coTS->tr_handler = co;
>
>      /*
> -     * Preserve the SIGUSR2 signal state, block SIGUSR2,
> -     * and establish our signal handler. The signal will
> -     * later transfer control onto the signal stack.
> +     * Block SIGUSR2. The signal will later transfer control onto the signal
> +     * stack.
>       */
>      sigemptyset(&sigs);
>      sigaddset(&sigs, SIGUSR2);
>      pthread_sigmask(SIG_BLOCK, &sigs, &osigs);
> -    sa.sa_handler = coroutine_trampoline;
> -    sigfillset(&sa.sa_mask);
> -    sa.sa_flags = SA_ONSTACK;
> -    if (sigaction(SIGUSR2, &sa, &osa) != 0) {
> -        abort();
> -    }
>
>      /*
>       * Set the new stack.
>       */
>      ss.ss_sp = co->stack;
>      ss.ss_size = co->stack_size;
>      ss.ss_flags = 0;
>      if (sigaltstack(&ss, &oss) < 0) {
>          abort();
>      }
>
>      /*
>       * Now transfer control onto the signal stack and set it up.
>       * It will return immediately via "return" after the sigsetjmp()
>       * was performed. Be careful here with race conditions.  The
>       * signal can be delivered the first time sigsuspend() is
>       * called.
>       */
>      coTS->tr_called = 0;
>      pthread_kill(pthread_self(), SIGUSR2);
>      sigfillset(&sigs);
>      sigdelset(&sigs, SIGUSR2);
>      while (!coTS->tr_called) {
>          sigsuspend(&sigs);
>      }
>
>      /*
>       * Inform the system that we are back off the signal stack by
>       * removing the alternative signal stack. Be careful here: It
>       * first has to be disabled, before it can be removed.
>       */
>      sigaltstack(NULL, &ss);
>      ss.ss_flags = SS_DISABLE;
>      if (sigaltstack(&ss, NULL) < 0) {
>          abort();
>      }
>      sigaltstack(NULL, &ss);
>      if (!(oss.ss_flags & SS_DISABLE)) {
>          sigaltstack(&oss, NULL);
>      }
>
>      /*
> -     * Restore the old SIGUSR2 signal handler and mask
> +     * Restore the old mask
>       */
> -    sigaction(SIGUSR2, &osa, NULL);
>      pthread_sigmask(SIG_SETMASK, &osigs, NULL);
>
>      /*
>       * Now enter the trampoline again, but this time not as a signal
>       * handler. Instead we jump into it directly. The functionally
>       * redundant ping-pong pointer arithmetic is necessary to avoid
>       * type-conversion warnings related to the `volatile' qualifier and
>       * the fact that `jmp_buf' usually is an array type.
>       */
>      if (!sigsetjmp(old_env, 0)) {
>          siglongjmp(coTS->tr_reenter, 1);
>      }
>
>      /*
>       * Ok, we returned again, so now we're finished
>       */
>
>      return &co->base;
>  }


(7) The sigaction() call has not been moved entirely correctly from
qemu_coroutine_new() to coroutine_init(), in my opinion.

Namely, the original call site fills the "sa_mask" member, meaning that,
while the signal handler is executing, not only SIGUSR2 itself should be
blocked, but *all* signals.

This is missing from the new call site, in coroutine_init() -- the
"sa_mask" member is left zeroed.

Now, in practice, this may not matter a whole lot, because "sa_mask" is
additive, and at the only place we allow the signal to be delivered,
namely in sigsuspend(), we already have everything blocked, except for
SIGUSR2. So when "sa_mask" is ORed with the set of blocked signals, upon
delivery of SIGUSR2, there is no actual change to the signal mask.

*But*, I feel that such a change would really deserve its own argument,
i.e. a separate patch, or at least a separate paragraph in the commit
message. And I suggest not doing those; instead please just faithfully
transfer the "sa_mask" setting too, to coroutine_init(). (My impression
is that the effective removal of the "sa_mask" population was an
oversight in this patch, not a conscious decision.)

Thanks
Laszlo
Laszlo Ersek Jan. 22, 2021, 5:09 p.m. UTC | #3
On 01/22/21 11:20, Max Reitz wrote:
> Modifying signal handlers is a process-global operation.  When two
> threads run coroutine-sigaltstack's qemu_coroutine_new() concurrently,
> they may interfere with each other: One of them may revert the SIGUSR2
> handler back to the default between the other thread setting up
> coroutine_trampoline() as the handler and raising SIGUSR2.  That SIGUSR2
> will then lead to the process exiting.
> 
> Outside of coroutine-sigaltstack, qemu does not use SIGUSR2.  We can
> thus keep the signal handler installed all the time.
> CoroutineThreadState.tr_handler tells coroutine_trampoline() whether its
> stack is set up so a new coroutine is to be launched (i.e., it should
> invoke sigsetjmp()), or not (i.e., the signal came from an external
> source and we should just perform the default action, which is to exit
> the process).
> 
> Note that in user-mode emulation, the guest can register signal handlers
> for any signal but SIGSEGV and SIGBUS, so if it registers a SIGUSR2
> handler, sigaltstack coroutines will break from then on.  However, we do
> not use coroutines for user-mode emulation, so that is fine.
> 
> Suggested-by: Laszlo Ersek <lersek@redhat.com>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  util/coroutine-sigaltstack.c | 56 +++++++++++++++++++-----------------
>  1 file changed, 29 insertions(+), 27 deletions(-)
> 
> diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
> index aade82afb8..2d32afc322 100644
> --- a/util/coroutine-sigaltstack.c
> +++ b/util/coroutine-sigaltstack.c
> @@ -59,6 +59,8 @@ typedef struct {
>  
>  static pthread_key_t thread_state_key;
>  
> +static void coroutine_trampoline(int signal);
> +
>  static CoroutineThreadState *coroutine_get_thread_state(void)
>  {
>      CoroutineThreadState *s = pthread_getspecific(thread_state_key);
> @@ -80,6 +82,7 @@ static void qemu_coroutine_thread_cleanup(void *opaque)
>  
>  static void __attribute__((constructor)) coroutine_init(void)
>  {
> +    struct sigaction sa;
>      int ret;
>  
>      ret = pthread_key_create(&thread_state_key, qemu_coroutine_thread_cleanup);
> @@ -87,6 +90,20 @@ static void __attribute__((constructor)) coroutine_init(void)
>          fprintf(stderr, "unable to create leader key: %s\n", strerror(errno));
>          abort();
>      }
> +
> +    /*
> +     * Establish the SIGUSR2 signal handler.  This is a process-wide
> +     * operation, and so will apply to all threads from here on.
> +     */
> +    sa = (struct sigaction) {
> +        .sa_handler = coroutine_trampoline,
> +        .sa_flags   = SA_ONSTACK,
> +    };
> +
> +    if (sigaction(SIGUSR2, &sa, NULL) != 0) {
> +        perror("Unable to install SIGUSR2 handler");
> +        abort();
> +    }
>  }
>  
>  /* "boot" function
> @@ -121,7 +138,17 @@ static void coroutine_trampoline(int signal)
>      /* Get the thread specific information */
>      coTS = coroutine_get_thread_state();
>      self = coTS->tr_handler;
> +
> +    if (!self) {
> +        /*
> +         * This SIGUSR2 came from an external source, not from
> +         * qemu_coroutine_new(), so perform the default action.
> +         */
> +        exit(0);
> +    }
> +
>      coTS->tr_called = 1;
> +    coTS->tr_handler = NULL;
>      co = &self->base;
>  
>      /*

(8) There's a further complication here, assuming we really want to
recognize the case when the handler is executing unexpectedly:

- pthread_getspecific() is not necessarily async-signal-safe, according
to POSIX, so calling coroutine_get_thread_state() in the "unexpected"
case (e.g. in response to an asynchronously generated SIGUSR2) is
problematic in its own right,

- if the SIGUSR2 is delivered to a thread that has never called
coroutine_get_thread_state() before, then we'll reach g_malloc0() inside
coroutine_get_thread_state(), in signal handler context, which is very bad.

You'd have to block SIGUSR2 for the entire process (all threads) at all
times, and only temporarily unblock it for a particular coroutine
thread, with the sigsuspend(). The above check would suffice, that way.

Such blocking is possible by calling pthread_sigmask() from the main
thread, before any other thread is created (the signal mask is inherited
across pthread_create()). I guess it could be done in coroutine_init() too.

And *then* the pthread_sigmask() calls should indeed be removed from
qemu_coroutine_new().

(Apologies if my feedback is difficult to understand, it's my fault. I
could propose a patch, if (and only if) you want that.)

Thanks
Laszlo

> @@ -150,12 +177,9 @@ Coroutine *qemu_coroutine_new(void)
>  {
>      CoroutineSigAltStack *co;
>      CoroutineThreadState *coTS;
> -    struct sigaction sa;
> -    struct sigaction osa;
>      stack_t ss;
>      stack_t oss;
>      sigset_t sigs;
> -    sigset_t osigs;
>      sigjmp_buf old_env;
>  
>      /* The way to manipulate stack is with the sigaltstack function. We
> @@ -172,24 +196,6 @@ Coroutine *qemu_coroutine_new(void)
>      co->stack = qemu_alloc_stack(&co->stack_size);
>      co->base.entry_arg = &old_env; /* stash away our jmp_buf */
>  
> -    coTS = coroutine_get_thread_state();
> -    coTS->tr_handler = co;
> -
> -    /*
> -     * Preserve the SIGUSR2 signal state, block SIGUSR2,
> -     * and establish our signal handler. The signal will
> -     * later transfer control onto the signal stack.
> -     */
> -    sigemptyset(&sigs);
> -    sigaddset(&sigs, SIGUSR2);
> -    pthread_sigmask(SIG_BLOCK, &sigs, &osigs);
> -    sa.sa_handler = coroutine_trampoline;
> -    sigfillset(&sa.sa_mask);
> -    sa.sa_flags = SA_ONSTACK;
> -    if (sigaction(SIGUSR2, &sa, &osa) != 0) {
> -        abort();
> -    }
> -
>      /*
>       * Set the new stack.
>       */
> @@ -207,6 +213,8 @@ Coroutine *qemu_coroutine_new(void)
>       * signal can be delivered the first time sigsuspend() is
>       * called.
>       */
> +    coTS = coroutine_get_thread_state();
> +    coTS->tr_handler = co;
>      coTS->tr_called = 0;
>      pthread_kill(pthread_self(), SIGUSR2);
>      sigfillset(&sigs);
> @@ -230,12 +238,6 @@ Coroutine *qemu_coroutine_new(void)
>          sigaltstack(&oss, NULL);
>      }
>  
> -    /*
> -     * Restore the old SIGUSR2 signal handler and mask
> -     */
> -    sigaction(SIGUSR2, &osa, NULL);
> -    pthread_sigmask(SIG_SETMASK, &osigs, NULL);
> -
>      /*
>       * Now enter the trampoline again, but this time not as a signal
>       * handler. Instead we jump into it directly. The functionally
>
Max Reitz Jan. 22, 2021, 5:58 p.m. UTC | #4
On 22.01.21 17:38, Laszlo Ersek wrote:
> On 01/22/21 11:20, Max Reitz wrote:
>> Modifying signal handlers is a process-global operation.  When two
>> threads run coroutine-sigaltstack's qemu_coroutine_new() concurrently,
>> they may interfere with each other: One of them may revert the SIGUSR2
>> handler back to the default between the other thread setting up
>> coroutine_trampoline() as the handler and raising SIGUSR2.  That SIGUSR2
>> will then lead to the process exiting.
>>
>> Outside of coroutine-sigaltstack, qemu does not use SIGUSR2.  We can
>> thus keep the signal handler installed all the time.
>> CoroutineThreadState.tr_handler tells coroutine_trampoline() whether its
>> stack is set up so a new coroutine is to be launched (i.e., it should
>> invoke sigsetjmp()), or not (i.e., the signal came from an external
>> source and we should just perform the default action, which is to exit
>> the process).
>>
>> Note that in user-mode emulation, the guest can register signal handlers
>> for any signal but SIGSEGV and SIGBUS, so if it registers a SIGUSR2
>> handler, sigaltstack coroutines will break from then on.  However, we do
>> not use coroutines for user-mode emulation, so that is fine.
>>
>> Suggested-by: Laszlo Ersek <lersek@redhat.com>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   util/coroutine-sigaltstack.c | 56 +++++++++++++++++++-----------------
>>   1 file changed, 29 insertions(+), 27 deletions(-)
> 
> (1) With SIGUSR2 permanently dedicated to "coroutine-sigaltstack.c", the
> comment on the qemu_init_main_loop() declaration, in
> "include/qemu/main-loop.h", also needs to be updated. SIGUSR2 is no
> longer a "free for thread-internal usage" signal.

It’s possible that I haven’t understood that comment, but I haven’t 
adjusted it because I didn’t interpret it to mean that the signals 
listed there were free to use.  For example, SIGUSR1 (aliased to 
SIG_IPI) is generated by cpus_kick_thread() and caught by KVM and HVF, 
so it doesn’t seem like it would be free for thread-internal usage either.

Instead, I understood it to mean that the signals listed there do not 
have to be blocked by non-main threads, because it is OK for non-main 
threads to catch them.  I can’t think of a reason why SIGUSR2 should be 
blocked by any thread, so I decided not to change the comment.

But perhaps I just didn’t understand the whole comment.  That’s 
definitely possible, given that I don’t even see a place where non-main 
threads would block the signals not listed there.

>> diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
>> index aade82afb8..2d32afc322 100644
>> --- a/util/coroutine-sigaltstack.c
>> +++ b/util/coroutine-sigaltstack.c
>> @@ -59,6 +59,8 @@ typedef struct {
>>
>>   static pthread_key_t thread_state_key;
>>
>> +static void coroutine_trampoline(int signal);
>> +
>>   static CoroutineThreadState *coroutine_get_thread_state(void)
>>   {
>>       CoroutineThreadState *s = pthread_getspecific(thread_state_key);
>> @@ -80,6 +82,7 @@ static void qemu_coroutine_thread_cleanup(void *opaque)
>>
>>   static void __attribute__((constructor)) coroutine_init(void)
>>   {
>> +    struct sigaction sa;
>>       int ret;
>>
>>       ret = pthread_key_create(&thread_state_key, qemu_coroutine_thread_cleanup);
>> @@ -87,6 +90,20 @@ static void __attribute__((constructor)) coroutine_init(void)
>>           fprintf(stderr, "unable to create leader key: %s\n", strerror(errno));
>>           abort();
>>       }
>> +
>> +    /*
>> +     * Establish the SIGUSR2 signal handler.  This is a process-wide
>> +     * operation, and so will apply to all threads from here on.
>> +     */
>> +    sa = (struct sigaction) {
>> +        .sa_handler = coroutine_trampoline,
>> +        .sa_flags   = SA_ONSTACK,
>> +    };
>> +
>> +    if (sigaction(SIGUSR2, &sa, NULL) != 0) {
>> +        perror("Unable to install SIGUSR2 handler");
>> +        abort();
>> +    }
>>   }
>>
>>   /* "boot" function
>> @@ -121,7 +138,17 @@ static void coroutine_trampoline(int signal)
>>       /* Get the thread specific information */
>>       coTS = coroutine_get_thread_state();
>>       self = coTS->tr_handler;
>> +
>> +    if (!self) {
>> +        /*
>> +         * This SIGUSR2 came from an external source, not from
>> +         * qemu_coroutine_new(), so perform the default action.
>> +         */
>> +        exit(0);
>> +    }
> 
> (2) exit() is generally unsafe to call in signal handlers.
> 
> We could reason whether or not it is safe in this particular case (POSIX
> describes the exact conditions --
> <https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03>),
> but it's much simpler to just call _exit().
> 
> 
> (3) "Performing the default action" would be slightly different from
> calling _exit(). When a process is terminated with a signal, the parent
> can distinguish that, when reaping the child. See waitpid() /
> WIFSIGNALED() / WTERMSIG(), versus WIFEXITED() / WEXITSTATUS().
> 
> So for the "default action", we'd have to:
> - restore the SIGUSR2 handler to SIG_DFL, and
> - re-raise the signal for the thread, and
> - because the signal being handled is automatically blocked unless
>    SA_NODEFER was set: unblock the signal for the thread.
> 
> The pthread_sigmask() call, made for the last step, would be the one
> that would not return.
> 
> *However*, all of this complexity is not really useful here. We don't
> really want to simulate being "forcefully terminated" by the unexpected
> (asynchronous) SIGUSR2. We just want to exit.
> 
> Therefore, _exit() is fine. But, we should use an exit code different
> from 0, as this is definitely not a pristine exit from QEMU. I'm not
> sure if a convention exists for nonzero exit codes in QEMU; if not, then
> just pass EXIT_FAILURE to _exit().

I’m fine with calling _exit().  I hope, Eric is, too (as long as the 
comment no longer claims this were the default behavior).

Given that SIGUSR2 is not something that qemu is prepared to receive 
from the outside, EXIT_FAILURE seems right to me.  (Even abort() could 
be justified, but I don’t think generating a core dump does any good here.)

(As for qemu-specific exit code conventions, the only ones I know of are 
for certain qemu-img subcommands.  I’m sure you grepped, too, but I 
can’t find anything for the system emulator.)

> (4) Furthermore, please update the comment, because "perform the default
> action" is not precise.

Sure, that’s definitely easier than to re-raise SIGUSR2.

>> +
>>       coTS->tr_called = 1;
>> +    coTS->tr_handler = NULL;
>>       co = &self->base;
>>
>>       /*
> 
> (5) I see that "tr_called" has type "volatile sig_atomic_t", which is
> great (I think that "sig_atomic_t" is not even necessary here, because
> of the careful signal masking that we do, and because the signal is
> raised synchronously).
> 
> "volatile" is certainly justified though (the compiler may not be able
> to trace the call through the signal), and that's what I'm missing from
> "tr_handler" too. IMO, the "tr_handler" field should be decalered as
> follow:
> 
>    volatile void * volatile tr_handler;
> 
> wherein the second "volatile" serves just the same purpose as the
> "volatile" in "tr_called", and the first "volatile" follows from *that*
> -- wherever we chase the *chain of pointers* rooted in "tr_handler"
> should not be optimized out by the compiler.
> 
> But: my point (5) is orthogonal to this patch. In practice, it may not
> matter at all. So feel free to ignore now, I guess.

I suppose signal handlers are indeed preempting functions (i.e., the 
compiler is not aware of them executing).  I overlooked that, given that 
logically, we more or less explicitly invoke the SIGUSR2 handler, but of 
course, technically, we just trigger the signal and the handler is then 
invoked preemptively.

I suspect the pthread_kill() function is sufficiently complex that the 
compiler can’t prove that it won’t access *coTS (e.g. because perhaps it 
contains a syscall?), but better be safe than sorry.

But I don’t really like the “volatile void *volatile tr_handler”, 
particularly the “volatile void *” combinations.  I find volatile voids 
weird.

Why is tr_handler even a void pointer, and not a pointer to 
CoroutineSigAltStack, if all it can point to is such an object? 
“volatile CoroutineSigAltStack *” would make more sense to me.

Given that perhaps the CoroutineSigAltStack object should be volatile, 
shouldn’t also the CoroutineThreadState object be volatile?  If it were, 
we could drop the volatile from tr_called and wouldn’t need to make the 
pointer value tr_handler volatile.


OTOH, if I look more through the code, making everything volatile seems 
a bit excessive to me.  I understand correctly that 
sigsetjmp()/siglongjmp() act as barriers to the compiler, don’t they?

The only value that I can see the in-coroutine code writing that the 
calling code reads (before the next sigsetjmp()) is tr_called.  So 
shouldn’t it be sufficient to insert a barrier() before the 
pthread_kill(), and then it’d be sufficient to keep tr_called volatile, 
but the rest could stay as it is?

>> @@ -150,12 +177,9 @@ Coroutine *qemu_coroutine_new(void)
>>   {
>>       CoroutineSigAltStack *co;
>>       CoroutineThreadState *coTS;
>> -    struct sigaction sa;
>> -    struct sigaction osa;
>>       stack_t ss;
>>       stack_t oss;
>>       sigset_t sigs;
>> -    sigset_t osigs;
>>       sigjmp_buf old_env;
>>
>>       /* The way to manipulate stack is with the sigaltstack function. We
>> @@ -172,24 +196,6 @@ Coroutine *qemu_coroutine_new(void)
>>       co->stack = qemu_alloc_stack(&co->stack_size);
>>       co->base.entry_arg = &old_env; /* stash away our jmp_buf */
>>
>> -    coTS = coroutine_get_thread_state();
>> -    coTS->tr_handler = co;
>> -
>> -    /*
>> -     * Preserve the SIGUSR2 signal state, block SIGUSR2,
>> -     * and establish our signal handler. The signal will
>> -     * later transfer control onto the signal stack.
>> -     */
>> -    sigemptyset(&sigs);
>> -    sigaddset(&sigs, SIGUSR2);
>> -    pthread_sigmask(SIG_BLOCK, &sigs, &osigs);
>> -    sa.sa_handler = coroutine_trampoline;
>> -    sigfillset(&sa.sa_mask);
>> -    sa.sa_flags = SA_ONSTACK;
>> -    if (sigaction(SIGUSR2, &sa, &osa) != 0) {
>> -        abort();
>> -    }
>> -
>>       /*
>>        * Set the new stack.
>>        */
>> @@ -207,6 +213,8 @@ Coroutine *qemu_coroutine_new(void)
>>        * signal can be delivered the first time sigsuspend() is
>>        * called.
>>        */
>> +    coTS = coroutine_get_thread_state();
>> +    coTS->tr_handler = co;
>>       coTS->tr_called = 0;
>>       pthread_kill(pthread_self(), SIGUSR2);
>>       sigfillset(&sigs);
>> @@ -230,12 +238,6 @@ Coroutine *qemu_coroutine_new(void)
>>           sigaltstack(&oss, NULL);
>>       }
>>
>> -    /*
>> -     * Restore the old SIGUSR2 signal handler and mask
>> -     */
>> -    sigaction(SIGUSR2, &osa, NULL);
>> -    pthread_sigmask(SIG_SETMASK, &osigs, NULL);
>> -
>>       /*
>>        * Now enter the trampoline again, but this time not as a signal
>>        * handler. Instead we jump into it directly. The functionally
>>
> 
> (6) This change, for qemu_coroutine_new(), is too heavy-handed, in my
> opinion. I suggest (a) removing only the sigaction() calls and their
> directly needed aux variables, and (b) *not* moving around operations.
> 
> In particular, you remove the blocking of SIGUSR2 for the thread -- the
> pthread_sigmask() call. That means the sigsuspend() later on becomes
> superfluous, as the signal will not be delivered inside sigsuspend(),
> but inside pthread_kill(). With the signal not blocked, *generation* and
> *delivery* will coalesce into a single event.

Are you saying that (a) is a problem because this is too heavy-handed of 
a change for a single patch, or because it would actually be a problem 
to deliver the signal inside pthread_kill()?

(Either way will result in me dropping the change from this patch, so 
it’s just a question out of curiosity.)

As for (b), just FYI, I deliberately moved around the operation, so that 
tr_handler is only set once the coroutine stack is set up.  Otherwise it 
might be a problem if an external SIGUSR2 comes in before then.

But if we keep SIGUSR2 blocked, there is no reason for that movement, 
hence the “just FYI”.

> The general logic should stay the same, only the signal action
> manipulation should be removed. IOW, for this function, I propose the
> following change only:
> 
>> diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
>> index aade82afb8c0..722fed7b2502 100644
>> --- a/util/coroutine-sigaltstack.c
>> +++ b/util/coroutine-sigaltstack.c
>> @@ -149,107 +149,97 @@ static void coroutine_trampoline(int signal)
>>   Coroutine *qemu_coroutine_new(void)
>>   {
>>       CoroutineSigAltStack *co;
>>       CoroutineThreadState *coTS;
>> -    struct sigaction sa;
>> -    struct sigaction osa;
>>       stack_t ss;
>>       stack_t oss;
>>       sigset_t sigs;
>>       sigset_t osigs;
>>       sigjmp_buf old_env;
>>
>>       /* The way to manipulate stack is with the sigaltstack function. We
>>        * prepare a stack, with it delivering a signal to ourselves and then
>>        * put sigsetjmp/siglongjmp where needed.
>>        * This has been done keeping coroutine-ucontext as a model and with the
>>        * pth ideas (GNU Portable Threads). See coroutine-ucontext for the basics
>>        * of the coroutines and see pth_mctx.c (from the pth project) for the
>>        * sigaltstack way of manipulating stacks.
>>        */
>>
>>       co = g_malloc0(sizeof(*co));
>>       co->stack_size = COROUTINE_STACK_SIZE;
>>       co->stack = qemu_alloc_stack(&co->stack_size);
>>       co->base.entry_arg = &old_env; /* stash away our jmp_buf */
>>
>>       coTS = coroutine_get_thread_state();
>>       coTS->tr_handler = co;
>>
>>       /*
>> -     * Preserve the SIGUSR2 signal state, block SIGUSR2,
>> -     * and establish our signal handler. The signal will
>> -     * later transfer control onto the signal stack.
>> +     * Block SIGUSR2. The signal will later transfer control onto the signal
>> +     * stack.
>>        */
>>       sigemptyset(&sigs);
>>       sigaddset(&sigs, SIGUSR2);
>>       pthread_sigmask(SIG_BLOCK, &sigs, &osigs);
>> -    sa.sa_handler = coroutine_trampoline;
>> -    sigfillset(&sa.sa_mask);
>> -    sa.sa_flags = SA_ONSTACK;
>> -    if (sigaction(SIGUSR2, &sa, &osa) != 0) {
>> -        abort();
>> -    }
>>
>>       /*
>>        * Set the new stack.
>>        */
>>       ss.ss_sp = co->stack;
>>       ss.ss_size = co->stack_size;
>>       ss.ss_flags = 0;
>>       if (sigaltstack(&ss, &oss) < 0) {
>>           abort();
>>       }
>>
>>       /*
>>        * Now transfer control onto the signal stack and set it up.
>>        * It will return immediately via "return" after the sigsetjmp()
>>        * was performed. Be careful here with race conditions.  The
>>        * signal can be delivered the first time sigsuspend() is
>>        * called.
>>        */
>>       coTS->tr_called = 0;
>>       pthread_kill(pthread_self(), SIGUSR2);
>>       sigfillset(&sigs);
>>       sigdelset(&sigs, SIGUSR2);
>>       while (!coTS->tr_called) {
>>           sigsuspend(&sigs);
>>       }
>>
>>       /*
>>        * Inform the system that we are back off the signal stack by
>>        * removing the alternative signal stack. Be careful here: It
>>        * first has to be disabled, before it can be removed.
>>        */
>>       sigaltstack(NULL, &ss);
>>       ss.ss_flags = SS_DISABLE;
>>       if (sigaltstack(&ss, NULL) < 0) {
>>           abort();
>>       }
>>       sigaltstack(NULL, &ss);
>>       if (!(oss.ss_flags & SS_DISABLE)) {
>>           sigaltstack(&oss, NULL);
>>       }
>>
>>       /*
>> -     * Restore the old SIGUSR2 signal handler and mask
>> +     * Restore the old mask
>>        */
>> -    sigaction(SIGUSR2, &osa, NULL);
>>       pthread_sigmask(SIG_SETMASK, &osigs, NULL);
>>
>>       /*
>>        * Now enter the trampoline again, but this time not as a signal
>>        * handler. Instead we jump into it directly. The functionally
>>        * redundant ping-pong pointer arithmetic is necessary to avoid
>>        * type-conversion warnings related to the `volatile' qualifier and
>>        * the fact that `jmp_buf' usually is an array type.
>>        */
>>       if (!sigsetjmp(old_env, 0)) {
>>           siglongjmp(coTS->tr_reenter, 1);
>>       }
>>
>>       /*
>>        * Ok, we returned again, so now we're finished
>>        */
>>
>>       return &co->base;
>>   }
> 
> 
> (7) The sigaction() call has not been moved entirely correctly from
> qemu_coroutine_new() to coroutine_init(), in my opinion.
> 
> Namely, the original call site fills the "sa_mask" member, meaning that,
> while the signal handler is executing, not only SIGUSR2 itself should be
> blocked, but *all* signals.
> 
> This is missing from the new call site, in coroutine_init() -- the
> "sa_mask" member is left zeroed.
> 
> Now, in practice, this may not matter a whole lot, because "sa_mask" is
> additive, and at the only place we allow the signal to be delivered,
> namely in sigsuspend(), we already have everything blocked, except for
> SIGUSR2. So when "sa_mask" is ORed with the set of blocked signals, upon
> delivery of SIGUSR2, there is no actual change to the signal mask.
> 
> *But*, I feel that such a change would really deserve its own argument,
> i.e. a separate patch, or at least a separate paragraph in the commit
> message. And I suggest not doing those; instead please just faithfully
> transfer the "sa_mask" setting too, to coroutine_init(). (My impression
> is that the effective removal of the "sa_mask" population was an
> oversight in this patch, not a conscious decision.)

Yes, it was totally an oversight.

Thanks a lot for your detailed review!  I have absolutely no experience 
with coroutine-sigaltstack in particular, and very little experience 
with signal handling in projects where correctness actually matters. 
I’m grateful that you inspect this patch with great scrutiny.

(Btw, that’s why I was very close to just ending a new version of the 
mutex patch just with the commit message adjusted, because that felt 
like I could do much less wrong than here.  Turns out I was right. O:))

Max
Max Reitz Jan. 22, 2021, 6:05 p.m. UTC | #5
On 22.01.21 18:09, Laszlo Ersek wrote:
> On 01/22/21 11:20, Max Reitz wrote:
>> Modifying signal handlers is a process-global operation.  When two
>> threads run coroutine-sigaltstack's qemu_coroutine_new() concurrently,
>> they may interfere with each other: One of them may revert the SIGUSR2
>> handler back to the default between the other thread setting up
>> coroutine_trampoline() as the handler and raising SIGUSR2.  That SIGUSR2
>> will then lead to the process exiting.
>>
>> Outside of coroutine-sigaltstack, qemu does not use SIGUSR2.  We can
>> thus keep the signal handler installed all the time.
>> CoroutineThreadState.tr_handler tells coroutine_trampoline() whether its
>> stack is set up so a new coroutine is to be launched (i.e., it should
>> invoke sigsetjmp()), or not (i.e., the signal came from an external
>> source and we should just perform the default action, which is to exit
>> the process).
>>
>> Note that in user-mode emulation, the guest can register signal handlers
>> for any signal but SIGSEGV and SIGBUS, so if it registers a SIGUSR2
>> handler, sigaltstack coroutines will break from then on.  However, we do
>> not use coroutines for user-mode emulation, so that is fine.
>>
>> Suggested-by: Laszlo Ersek <lersek@redhat.com>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   util/coroutine-sigaltstack.c | 56 +++++++++++++++++++-----------------
>>   1 file changed, 29 insertions(+), 27 deletions(-)
>>
>> diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
>> index aade82afb8..2d32afc322 100644
>> --- a/util/coroutine-sigaltstack.c
>> +++ b/util/coroutine-sigaltstack.c
>> @@ -59,6 +59,8 @@ typedef struct {
>>   
>>   static pthread_key_t thread_state_key;
>>   
>> +static void coroutine_trampoline(int signal);
>> +
>>   static CoroutineThreadState *coroutine_get_thread_state(void)
>>   {
>>       CoroutineThreadState *s = pthread_getspecific(thread_state_key);
>> @@ -80,6 +82,7 @@ static void qemu_coroutine_thread_cleanup(void *opaque)
>>   
>>   static void __attribute__((constructor)) coroutine_init(void)
>>   {
>> +    struct sigaction sa;
>>       int ret;
>>   
>>       ret = pthread_key_create(&thread_state_key, qemu_coroutine_thread_cleanup);
>> @@ -87,6 +90,20 @@ static void __attribute__((constructor)) coroutine_init(void)
>>           fprintf(stderr, "unable to create leader key: %s\n", strerror(errno));
>>           abort();
>>       }
>> +
>> +    /*
>> +     * Establish the SIGUSR2 signal handler.  This is a process-wide
>> +     * operation, and so will apply to all threads from here on.
>> +     */
>> +    sa = (struct sigaction) {
>> +        .sa_handler = coroutine_trampoline,
>> +        .sa_flags   = SA_ONSTACK,
>> +    };
>> +
>> +    if (sigaction(SIGUSR2, &sa, NULL) != 0) {
>> +        perror("Unable to install SIGUSR2 handler");
>> +        abort();
>> +    }
>>   }
>>   
>>   /* "boot" function
>> @@ -121,7 +138,17 @@ static void coroutine_trampoline(int signal)
>>       /* Get the thread specific information */
>>       coTS = coroutine_get_thread_state();
>>       self = coTS->tr_handler;
>> +
>> +    if (!self) {
>> +        /*
>> +         * This SIGUSR2 came from an external source, not from
>> +         * qemu_coroutine_new(), so perform the default action.
>> +         */
>> +        exit(0);
>> +    }
>> +
>>       coTS->tr_called = 1;
>> +    coTS->tr_handler = NULL;
>>       co = &self->base;
>>   
>>       /*
> 
> (8) There's a further complication here, assuming we really want to
> recognize the case when the handler is executing unexpectedly:
> 
> - pthread_getspecific() is not necessarily async-signal-safe, according
> to POSIX, so calling coroutine_get_thread_state() in the "unexpected"
> case (e.g. in response to an asynchronously generated SIGUSR2) is
> problematic in its own right,

That’s a shame.

> - if the SIGUSR2 is delivered to a thread that has never called
> coroutine_get_thread_state() before, then we'll reach g_malloc0() inside
> coroutine_get_thread_state(), in signal handler context, which is very bad.

Could be solved with a coroutine_try_get_thread_state() that will never 
malloc, but return NULL then.

> You'd have to block SIGUSR2 for the entire process (all threads) at all
> times, and only temporarily unblock it for a particular coroutine
> thread, with the sigsuspend(). The above check would suffice, that way.

Yes, that’s what I was originally afraid of.  I feel like that may be 
the complexity drop that pushes this change too far out of my comfort 
zone.  (And as evidenced by your review, it already was pretty much 
outside as it was.)

> Such blocking is possible by calling pthread_sigmask() from the main
> thread, before any other thread is created (the signal mask is inherited
> across pthread_create()). I guess it could be done in coroutine_init() too.
> 
> And *then* the pthread_sigmask() calls should indeed be removed from
> qemu_coroutine_new().

OTOH, that does sound rather simple...

> (Apologies if my feedback is difficult to understand, it's my fault. I
> could propose a patch, if (and only if) you want that.)

I can’t say I wouldn’t be happy with a patch for this code that doesn’t 
bear my S-o-b. ;)

I feel conflicted.  I can send a v2 that addresses this (probably 
consisting of multiple patches then, e.g. I’d split the SIGUSR2 blocking 
off the main patch), but to me, this bug is really more of a nuisance 
that just blocks me from sending a pull request for my block branch... 
So I’d rather not drag it out forever.  OTOH, sending a quick and bad 
fix just because I can’t wait is just bad.

I suppose I’ll have to decide over the weekend.  Though if you’re 
itching to write a patch yourself, I’d definitely be grateful.

Max
Eric Blake Jan. 22, 2021, 6:19 p.m. UTC | #6
On 1/22/21 11:58 AM, Max Reitz wrote:

>>> +    if (!self) {
>>> +        /*
>>> +         * This SIGUSR2 came from an external source, not from
>>> +         * qemu_coroutine_new(), so perform the default action.
>>> +         */
>>> +        exit(0);
>>> +    }
>>
>> (2) exit() is generally unsafe to call in signal handlers.
>>
>> We could reason whether or not it is safe in this particular case (POSIX
>> describes the exact conditions --
>> <https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03>),
>>
>> but it's much simpler to just call _exit().
>>
>>
>> (3) "Performing the default action" would be slightly different from
>> calling _exit(). When a process is terminated with a signal, the parent
>> can distinguish that, when reaping the child. See waitpid() /
>> WIFSIGNALED() / WTERMSIG(), versus WIFEXITED() / WEXITSTATUS().
>>
>> So for the "default action", we'd have to:
>> - restore the SIGUSR2 handler to SIG_DFL, and
>> - re-raise the signal for the thread, and
>> - because the signal being handled is automatically blocked unless
>>    SA_NODEFER was set: unblock the signal for the thread.
>>
>> The pthread_sigmask() call, made for the last step, would be the one
>> that would not return.
>>
>> *However*, all of this complexity is not really useful here. We don't
>> really want to simulate being "forcefully terminated" by the unexpected
>> (asynchronous) SIGUSR2. We just want to exit.
>>
>> Therefore, _exit() is fine. But, we should use an exit code different
>> from 0, as this is definitely not a pristine exit from QEMU. I'm not
>> sure if a convention exists for nonzero exit codes in QEMU; if not, then
>> just pass EXIT_FAILURE to _exit().
> 
> I’m fine with calling _exit().  I hope, Eric is, too (as long as the
> comment no longer claims this were the default behavior).

Using _exit(nonzero) is fine by me as long as the comment is accurate.
There are signals like SIGINT where you really DO want to terminate by
signal rather than using _exit(SIGINT+128), because shells can tell the
difference [1]; but SIGUSR2 is not one of the signals where shells
special-case their behavior.

[1] https://www.cons.org/cracauer/sigint.html

> 
> Given that SIGUSR2 is not something that qemu is prepared to receive
> from the outside, EXIT_FAILURE seems right to me.  (Even abort() could
> be justified, but I don’t think generating a core dump does any good here.)
> 
> (As for qemu-specific exit code conventions, the only ones I know of are
> for certain qemu-img subcommands.  I’m sure you grepped, too, but I
> can’t find anything for the system emulator.)
> 
>> (4) Furthermore, please update the comment, because "perform the default
>> action" is not precise.
> 
> Sure, that’s definitely easier than to re-raise SIGUSR2.

Works for me as well.
Laszlo Ersek Jan. 22, 2021, 6:27 p.m. UTC | #7
On 01/22/21 18:58, Max Reitz wrote:
> On 22.01.21 17:38, Laszlo Ersek wrote:
>> On 01/22/21 11:20, Max Reitz wrote:
>>> Modifying signal handlers is a process-global operation.  When two
>>> threads run coroutine-sigaltstack's qemu_coroutine_new() concurrently,
>>> they may interfere with each other: One of them may revert the SIGUSR2
>>> handler back to the default between the other thread setting up
>>> coroutine_trampoline() as the handler and raising SIGUSR2.  That SIGUSR2
>>> will then lead to the process exiting.
>>>
>>> Outside of coroutine-sigaltstack, qemu does not use SIGUSR2.  We can
>>> thus keep the signal handler installed all the time.
>>> CoroutineThreadState.tr_handler tells coroutine_trampoline() whether its
>>> stack is set up so a new coroutine is to be launched (i.e., it should
>>> invoke sigsetjmp()), or not (i.e., the signal came from an external
>>> source and we should just perform the default action, which is to exit
>>> the process).
>>>
>>> Note that in user-mode emulation, the guest can register signal handlers
>>> for any signal but SIGSEGV and SIGBUS, so if it registers a SIGUSR2
>>> handler, sigaltstack coroutines will break from then on.  However, we do
>>> not use coroutines for user-mode emulation, so that is fine.
>>>
>>> Suggested-by: Laszlo Ersek <lersek@redhat.com>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>   util/coroutine-sigaltstack.c | 56 +++++++++++++++++++-----------------
>>>   1 file changed, 29 insertions(+), 27 deletions(-)
>>
>> (1) With SIGUSR2 permanently dedicated to "coroutine-sigaltstack.c", the
>> comment on the qemu_init_main_loop() declaration, in
>> "include/qemu/main-loop.h", also needs to be updated. SIGUSR2 is no
>> longer a "free for thread-internal usage" signal.
> 
> It’s possible that I haven’t understood that comment, but I haven’t
> adjusted it because I didn’t interpret it to mean that the signals
> listed there were free to use.  For example, SIGUSR1 (aliased to
> SIG_IPI) is generated by cpus_kick_thread() and caught by KVM and HVF,
> so it doesn’t seem like it would be free for thread-internal usage either.
> 
> Instead, I understood it to mean that the signals listed there do not
> have to be blocked by non-main threads, because it is OK for non-main
> threads to catch them.  I can’t think of a reason why SIGUSR2 should be
> blocked by any thread, so I decided not to change the comment.
> 
> But perhaps I just didn’t understand the whole comment.  That’s
> definitely possible, given that I don’t even see a place where non-main
> threads would block the signals not listed there.

OK, then I agree that I don't understand the comment on
qemu_init_main_loop() either. :) If you consciously decided not to
change the comment, then I won't request otherwise.

> 
>>> diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
>>> index aade82afb8..2d32afc322 100644
>>> --- a/util/coroutine-sigaltstack.c
>>> +++ b/util/coroutine-sigaltstack.c
>>> @@ -59,6 +59,8 @@ typedef struct {
>>>
>>>   static pthread_key_t thread_state_key;
>>>
>>> +static void coroutine_trampoline(int signal);
>>> +
>>>   static CoroutineThreadState *coroutine_get_thread_state(void)
>>>   {
>>>       CoroutineThreadState *s = pthread_getspecific(thread_state_key);
>>> @@ -80,6 +82,7 @@ static void qemu_coroutine_thread_cleanup(void
>>> *opaque)
>>>
>>>   static void __attribute__((constructor)) coroutine_init(void)
>>>   {
>>> +    struct sigaction sa;
>>>       int ret;
>>>
>>>       ret = pthread_key_create(&thread_state_key,
>>> qemu_coroutine_thread_cleanup);
>>> @@ -87,6 +90,20 @@ static void __attribute__((constructor))
>>> coroutine_init(void)
>>>           fprintf(stderr, "unable to create leader key: %s\n",
>>> strerror(errno));
>>>           abort();
>>>       }
>>> +
>>> +    /*
>>> +     * Establish the SIGUSR2 signal handler.  This is a process-wide
>>> +     * operation, and so will apply to all threads from here on.
>>> +     */
>>> +    sa = (struct sigaction) {
>>> +        .sa_handler = coroutine_trampoline,
>>> +        .sa_flags   = SA_ONSTACK,
>>> +    };
>>> +
>>> +    if (sigaction(SIGUSR2, &sa, NULL) != 0) {
>>> +        perror("Unable to install SIGUSR2 handler");
>>> +        abort();
>>> +    }
>>>   }
>>>
>>>   /* "boot" function
>>> @@ -121,7 +138,17 @@ static void coroutine_trampoline(int signal)
>>>       /* Get the thread specific information */
>>>       coTS = coroutine_get_thread_state();
>>>       self = coTS->tr_handler;
>>> +
>>> +    if (!self) {
>>> +        /*
>>> +         * This SIGUSR2 came from an external source, not from
>>> +         * qemu_coroutine_new(), so perform the default action.
>>> +         */
>>> +        exit(0);
>>> +    }
>>
>> (2) exit() is generally unsafe to call in signal handlers.
>>
>> We could reason whether or not it is safe in this particular case (POSIX
>> describes the exact conditions --
>> <https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03>),
>>
>> but it's much simpler to just call _exit().
>>
>>
>> (3) "Performing the default action" would be slightly different from
>> calling _exit(). When a process is terminated with a signal, the parent
>> can distinguish that, when reaping the child. See waitpid() /
>> WIFSIGNALED() / WTERMSIG(), versus WIFEXITED() / WEXITSTATUS().
>>
>> So for the "default action", we'd have to:
>> - restore the SIGUSR2 handler to SIG_DFL, and
>> - re-raise the signal for the thread, and
>> - because the signal being handled is automatically blocked unless
>>    SA_NODEFER was set: unblock the signal for the thread.
>>
>> The pthread_sigmask() call, made for the last step, would be the one
>> that would not return.
>>
>> *However*, all of this complexity is not really useful here. We don't
>> really want to simulate being "forcefully terminated" by the unexpected
>> (asynchronous) SIGUSR2. We just want to exit.
>>
>> Therefore, _exit() is fine. But, we should use an exit code different
>> from 0, as this is definitely not a pristine exit from QEMU. I'm not
>> sure if a convention exists for nonzero exit codes in QEMU; if not, then
>> just pass EXIT_FAILURE to _exit().
> 
> I’m fine with calling _exit().  I hope, Eric is, too (as long as the
> comment no longer claims this were the default behavior).
> 
> Given that SIGUSR2 is not something that qemu is prepared to receive
> from the outside, EXIT_FAILURE seems right to me.  (Even abort() could
> be justified, but I don’t think generating a core dump does any good here.)
> 
> (As for qemu-specific exit code conventions, the only ones I know of are
> for certain qemu-img subcommands.  I’m sure you grepped, too, but I
> can’t find anything for the system emulator.)
> 
>> (4) Furthermore, please update the comment, because "perform the default
>> action" is not precise.
> 
> Sure, that’s definitely easier than to re-raise SIGUSR2.
> 
>>> +
>>>       coTS->tr_called = 1;
>>> +    coTS->tr_handler = NULL;
>>>       co = &self->base;
>>>
>>>       /*
>>
>> (5) I see that "tr_called" has type "volatile sig_atomic_t", which is
>> great (I think that "sig_atomic_t" is not even necessary here, because
>> of the careful signal masking that we do, and because the signal is
>> raised synchronously).
>>
>> "volatile" is certainly justified though (the compiler may not be able
>> to trace the call through the signal), and that's what I'm missing from
>> "tr_handler" too. IMO, the "tr_handler" field should be decalered as
>> follow:
>>
>>    volatile void * volatile tr_handler;
>>
>> wherein the second "volatile" serves just the same purpose as the
>> "volatile" in "tr_called", and the first "volatile" follows from *that*
>> -- wherever we chase the *chain of pointers* rooted in "tr_handler"
>> should not be optimized out by the compiler.
>>
>> But: my point (5) is orthogonal to this patch. In practice, it may not
>> matter at all. So feel free to ignore now, I guess.
> 
> I suppose signal handlers are indeed preempting functions (i.e., the
> compiler is not aware of them executing).  I overlooked that, given that
> logically, we more or less explicitly invoke the SIGUSR2 handler, but of
> course, technically, we just trigger the signal and the handler is then
> invoked preemptively.
> 
> I suspect the pthread_kill() function is sufficiently complex that the
> compiler can’t prove that it won’t access *coTS (e.g. because perhaps it
> contains a syscall?), but better be safe than sorry.
> 
> But I don’t really like the “volatile void *volatile tr_handler”,
> particularly the “volatile void *” combinations.  I find volatile voids
> weird.
> 
> Why is tr_handler even a void pointer, and not a pointer to
> CoroutineSigAltStack, if all it can point to is such an object?
> “volatile CoroutineSigAltStack *” would make more sense to me.
> 
> Given that perhaps the CoroutineSigAltStack object should be volatile,
> shouldn’t also the CoroutineThreadState object be volatile?  If it were,
> we could drop the volatile from tr_called and wouldn’t need to make the
> pointer value tr_handler volatile.

Good point...

> 
> 
> OTOH, if I look more through the code, making everything volatile seems
> a bit excessive to me.  I understand correctly that
> sigsetjmp()/siglongjmp() act as barriers to the compiler, don’t they?

Hmmm...

https://pubs.opengroup.org/onlinepubs/9699919799/functions/longjmp.html

"I guess so".

> 
> The only value that I can see the in-coroutine code writing that the
> calling code reads (before the next sigsetjmp()) is tr_called.  So
> shouldn’t it be sufficient to insert a barrier() before the
> pthread_kill(), and then it’d be sufficient to keep tr_called volatile,
> but the rest could stay as it is?

Well, I guess one could argue that pthread_kill() itself should work as
a barrier. It's hard to find a reason why it should not be a barrier.

> 
>>> @@ -150,12 +177,9 @@ Coroutine *qemu_coroutine_new(void)
>>>   {
>>>       CoroutineSigAltStack *co;
>>>       CoroutineThreadState *coTS;
>>> -    struct sigaction sa;
>>> -    struct sigaction osa;
>>>       stack_t ss;
>>>       stack_t oss;
>>>       sigset_t sigs;
>>> -    sigset_t osigs;
>>>       sigjmp_buf old_env;
>>>
>>>       /* The way to manipulate stack is with the sigaltstack
>>> function. We
>>> @@ -172,24 +196,6 @@ Coroutine *qemu_coroutine_new(void)
>>>       co->stack = qemu_alloc_stack(&co->stack_size);
>>>       co->base.entry_arg = &old_env; /* stash away our jmp_buf */
>>>
>>> -    coTS = coroutine_get_thread_state();
>>> -    coTS->tr_handler = co;
>>> -
>>> -    /*
>>> -     * Preserve the SIGUSR2 signal state, block SIGUSR2,
>>> -     * and establish our signal handler. The signal will
>>> -     * later transfer control onto the signal stack.
>>> -     */
>>> -    sigemptyset(&sigs);
>>> -    sigaddset(&sigs, SIGUSR2);
>>> -    pthread_sigmask(SIG_BLOCK, &sigs, &osigs);
>>> -    sa.sa_handler = coroutine_trampoline;
>>> -    sigfillset(&sa.sa_mask);
>>> -    sa.sa_flags = SA_ONSTACK;
>>> -    if (sigaction(SIGUSR2, &sa, &osa) != 0) {
>>> -        abort();
>>> -    }
>>> -
>>>       /*
>>>        * Set the new stack.
>>>        */
>>> @@ -207,6 +213,8 @@ Coroutine *qemu_coroutine_new(void)
>>>        * signal can be delivered the first time sigsuspend() is
>>>        * called.
>>>        */
>>> +    coTS = coroutine_get_thread_state();
>>> +    coTS->tr_handler = co;
>>>       coTS->tr_called = 0;
>>>       pthread_kill(pthread_self(), SIGUSR2);
>>>       sigfillset(&sigs);
>>> @@ -230,12 +238,6 @@ Coroutine *qemu_coroutine_new(void)
>>>           sigaltstack(&oss, NULL);
>>>       }
>>>
>>> -    /*
>>> -     * Restore the old SIGUSR2 signal handler and mask
>>> -     */
>>> -    sigaction(SIGUSR2, &osa, NULL);
>>> -    pthread_sigmask(SIG_SETMASK, &osigs, NULL);
>>> -
>>>       /*
>>>        * Now enter the trampoline again, but this time not as a signal
>>>        * handler. Instead we jump into it directly. The functionally
>>>
>>
>> (6) This change, for qemu_coroutine_new(), is too heavy-handed, in my
>> opinion. I suggest (a) removing only the sigaction() calls and their
>> directly needed aux variables, and (b) *not* moving around operations.
>>
>> In particular, you remove the blocking of SIGUSR2 for the thread -- the
>> pthread_sigmask() call. That means the sigsuspend() later on becomes
>> superfluous, as the signal will not be delivered inside sigsuspend(),
>> but inside pthread_kill(). With the signal not blocked, *generation* and
>> *delivery* will coalesce into a single event.
> 
> Are you saying that (a) is a problem because this is too heavy-handed of
> a change for a single patch, or because it would actually be a problem
> to deliver the signal inside pthread_kill()?

The former: all the things together that the patch does to
qemu_coroutine_new() are too heavy-weight for a single patch. They also
render sigsuspend() useless, while keeping sigsuspend() in place.

> 
> (Either way will result in me dropping the change from this patch, so
> it’s just a question out of curiosity.)
> 
> As for (b), just FYI, I deliberately moved around the operation, so that
> tr_handler is only set once the coroutine stack is set up.  Otherwise it
> might be a problem if an external SIGUSR2 comes in before then.
> 
> But if we keep SIGUSR2 blocked, there is no reason for that movement,
> hence the “just FYI”.
> 
>> The general logic should stay the same, only the signal action
>> manipulation should be removed. IOW, for this function, I propose the
>> following change only:
>>
>>> diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
>>> index aade82afb8c0..722fed7b2502 100644
>>> --- a/util/coroutine-sigaltstack.c
>>> +++ b/util/coroutine-sigaltstack.c
>>> @@ -149,107 +149,97 @@ static void coroutine_trampoline(int signal)
>>>   Coroutine *qemu_coroutine_new(void)
>>>   {
>>>       CoroutineSigAltStack *co;
>>>       CoroutineThreadState *coTS;
>>> -    struct sigaction sa;
>>> -    struct sigaction osa;
>>>       stack_t ss;
>>>       stack_t oss;
>>>       sigset_t sigs;
>>>       sigset_t osigs;
>>>       sigjmp_buf old_env;
>>>
>>>       /* The way to manipulate stack is with the sigaltstack
>>> function. We
>>>        * prepare a stack, with it delivering a signal to ourselves
>>> and then
>>>        * put sigsetjmp/siglongjmp where needed.
>>>        * This has been done keeping coroutine-ucontext as a model and
>>> with the
>>>        * pth ideas (GNU Portable Threads). See coroutine-ucontext for
>>> the basics
>>>        * of the coroutines and see pth_mctx.c (from the pth project)
>>> for the
>>>        * sigaltstack way of manipulating stacks.
>>>        */
>>>
>>>       co = g_malloc0(sizeof(*co));
>>>       co->stack_size = COROUTINE_STACK_SIZE;
>>>       co->stack = qemu_alloc_stack(&co->stack_size);
>>>       co->base.entry_arg = &old_env; /* stash away our jmp_buf */
>>>
>>>       coTS = coroutine_get_thread_state();
>>>       coTS->tr_handler = co;
>>>
>>>       /*
>>> -     * Preserve the SIGUSR2 signal state, block SIGUSR2,
>>> -     * and establish our signal handler. The signal will
>>> -     * later transfer control onto the signal stack.
>>> +     * Block SIGUSR2. The signal will later transfer control onto
>>> the signal
>>> +     * stack.
>>>        */
>>>       sigemptyset(&sigs);
>>>       sigaddset(&sigs, SIGUSR2);
>>>       pthread_sigmask(SIG_BLOCK, &sigs, &osigs);
>>> -    sa.sa_handler = coroutine_trampoline;
>>> -    sigfillset(&sa.sa_mask);
>>> -    sa.sa_flags = SA_ONSTACK;
>>> -    if (sigaction(SIGUSR2, &sa, &osa) != 0) {
>>> -        abort();
>>> -    }
>>>
>>>       /*
>>>        * Set the new stack.
>>>        */
>>>       ss.ss_sp = co->stack;
>>>       ss.ss_size = co->stack_size;
>>>       ss.ss_flags = 0;
>>>       if (sigaltstack(&ss, &oss) < 0) {
>>>           abort();
>>>       }
>>>
>>>       /*
>>>        * Now transfer control onto the signal stack and set it up.
>>>        * It will return immediately via "return" after the sigsetjmp()
>>>        * was performed. Be careful here with race conditions.  The
>>>        * signal can be delivered the first time sigsuspend() is
>>>        * called.
>>>        */
>>>       coTS->tr_called = 0;
>>>       pthread_kill(pthread_self(), SIGUSR2);
>>>       sigfillset(&sigs);
>>>       sigdelset(&sigs, SIGUSR2);
>>>       while (!coTS->tr_called) {
>>>           sigsuspend(&sigs);
>>>       }
>>>
>>>       /*
>>>        * Inform the system that we are back off the signal stack by
>>>        * removing the alternative signal stack. Be careful here: It
>>>        * first has to be disabled, before it can be removed.
>>>        */
>>>       sigaltstack(NULL, &ss);
>>>       ss.ss_flags = SS_DISABLE;
>>>       if (sigaltstack(&ss, NULL) < 0) {
>>>           abort();
>>>       }
>>>       sigaltstack(NULL, &ss);
>>>       if (!(oss.ss_flags & SS_DISABLE)) {
>>>           sigaltstack(&oss, NULL);
>>>       }
>>>
>>>       /*
>>> -     * Restore the old SIGUSR2 signal handler and mask
>>> +     * Restore the old mask
>>>        */
>>> -    sigaction(SIGUSR2, &osa, NULL);
>>>       pthread_sigmask(SIG_SETMASK, &osigs, NULL);
>>>
>>>       /*
>>>        * Now enter the trampoline again, but this time not as a signal
>>>        * handler. Instead we jump into it directly. The functionally
>>>        * redundant ping-pong pointer arithmetic is necessary to avoid
>>>        * type-conversion warnings related to the `volatile' qualifier
>>> and
>>>        * the fact that `jmp_buf' usually is an array type.
>>>        */
>>>       if (!sigsetjmp(old_env, 0)) {
>>>           siglongjmp(coTS->tr_reenter, 1);
>>>       }
>>>
>>>       /*
>>>        * Ok, we returned again, so now we're finished
>>>        */
>>>
>>>       return &co->base;
>>>   }
>>
>>
>> (7) The sigaction() call has not been moved entirely correctly from
>> qemu_coroutine_new() to coroutine_init(), in my opinion.
>>
>> Namely, the original call site fills the "sa_mask" member, meaning that,
>> while the signal handler is executing, not only SIGUSR2 itself should be
>> blocked, but *all* signals.
>>
>> This is missing from the new call site, in coroutine_init() -- the
>> "sa_mask" member is left zeroed.
>>
>> Now, in practice, this may not matter a whole lot, because "sa_mask" is
>> additive, and at the only place we allow the signal to be delivered,
>> namely in sigsuspend(), we already have everything blocked, except for
>> SIGUSR2. So when "sa_mask" is ORed with the set of blocked signals, upon
>> delivery of SIGUSR2, there is no actual change to the signal mask.
>>
>> *But*, I feel that such a change would really deserve its own argument,
>> i.e. a separate patch, or at least a separate paragraph in the commit
>> message. And I suggest not doing those; instead please just faithfully
>> transfer the "sa_mask" setting too, to coroutine_init(). (My impression
>> is that the effective removal of the "sa_mask" population was an
>> oversight in this patch, not a conscious decision.)
> 
> Yes, it was totally an oversight.
> 
> Thanks a lot for your detailed review!  I have absolutely no experience
> with coroutine-sigaltstack in particular, and very little experience
> with signal handling in projects where correctness actually matters. I’m
> grateful that you inspect this patch with great scrutiny.
> 
> (Btw, that’s why I was very close to just ending a new version of the
> mutex patch just with the commit message adjusted, because that felt
> like I could do much less wrong than here.  Turns out I was right. O:))

TBH I started liking the mutex version too. :) It's possible we're
working just too hard for solving this issue.

Laszlo
Laszlo Ersek Jan. 22, 2021, 6:28 p.m. UTC | #8
On 01/22/21 19:19, Eric Blake wrote:
> On 1/22/21 11:58 AM, Max Reitz wrote:
> 
>>>> +    if (!self) {
>>>> +        /*
>>>> +         * This SIGUSR2 came from an external source, not from
>>>> +         * qemu_coroutine_new(), so perform the default action.
>>>> +         */
>>>> +        exit(0);
>>>> +    }
>>>
>>> (2) exit() is generally unsafe to call in signal handlers.
>>>
>>> We could reason whether or not it is safe in this particular case (POSIX
>>> describes the exact conditions --
>>> <https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03>),
>>>
>>> but it's much simpler to just call _exit().
>>>
>>>
>>> (3) "Performing the default action" would be slightly different from
>>> calling _exit(). When a process is terminated with a signal, the parent
>>> can distinguish that, when reaping the child. See waitpid() /
>>> WIFSIGNALED() / WTERMSIG(), versus WIFEXITED() / WEXITSTATUS().
>>>
>>> So for the "default action", we'd have to:
>>> - restore the SIGUSR2 handler to SIG_DFL, and
>>> - re-raise the signal for the thread, and
>>> - because the signal being handled is automatically blocked unless
>>>    SA_NODEFER was set: unblock the signal for the thread.
>>>
>>> The pthread_sigmask() call, made for the last step, would be the one
>>> that would not return.
>>>
>>> *However*, all of this complexity is not really useful here. We don't
>>> really want to simulate being "forcefully terminated" by the unexpected
>>> (asynchronous) SIGUSR2. We just want to exit.
>>>
>>> Therefore, _exit() is fine. But, we should use an exit code different
>>> from 0, as this is definitely not a pristine exit from QEMU. I'm not
>>> sure if a convention exists for nonzero exit codes in QEMU; if not, then
>>> just pass EXIT_FAILURE to _exit().
>>
>> I’m fine with calling _exit().  I hope, Eric is, too (as long as the
>> comment no longer claims this were the default behavior).
> 
> Using _exit(nonzero) is fine by me as long as the comment is accurate.
> There are signals like SIGINT where you really DO want to terminate by
> signal rather than using _exit(SIGINT+128), because shells can tell the
> difference [1]; but SIGUSR2 is not one of the signals where shells
> special-case their behavior.
> 
> [1] https://www.cons.org/cracauer/sigint.html

Good point!

> 
>>
>> Given that SIGUSR2 is not something that qemu is prepared to receive
>> from the outside, EXIT_FAILURE seems right to me.  (Even abort() could
>> be justified, but I don’t think generating a core dump does any good here.)
>>
>> (As for qemu-specific exit code conventions, the only ones I know of are
>> for certain qemu-img subcommands.  I’m sure you grepped, too, but I
>> can’t find anything for the system emulator.)
>>
>>> (4) Furthermore, please update the comment, because "perform the default
>>> action" is not precise.
>>
>> Sure, that’s definitely easier than to re-raise SIGUSR2.
> 
> Works for me as well.
> 

Thanks.
Laszlo
Laszlo Ersek Jan. 22, 2021, 6:29 p.m. UTC | #9
On 01/22/21 19:05, Max Reitz wrote:
> On 22.01.21 18:09, Laszlo Ersek wrote:
>> On 01/22/21 11:20, Max Reitz wrote:
>>> Modifying signal handlers is a process-global operation.  When two
>>> threads run coroutine-sigaltstack's qemu_coroutine_new() concurrently,
>>> they may interfere with each other: One of them may revert the SIGUSR2
>>> handler back to the default between the other thread setting up
>>> coroutine_trampoline() as the handler and raising SIGUSR2.  That SIGUSR2
>>> will then lead to the process exiting.
>>>
>>> Outside of coroutine-sigaltstack, qemu does not use SIGUSR2.  We can
>>> thus keep the signal handler installed all the time.
>>> CoroutineThreadState.tr_handler tells coroutine_trampoline() whether its
>>> stack is set up so a new coroutine is to be launched (i.e., it should
>>> invoke sigsetjmp()), or not (i.e., the signal came from an external
>>> source and we should just perform the default action, which is to exit
>>> the process).
>>>
>>> Note that in user-mode emulation, the guest can register signal handlers
>>> for any signal but SIGSEGV and SIGBUS, so if it registers a SIGUSR2
>>> handler, sigaltstack coroutines will break from then on.  However, we do
>>> not use coroutines for user-mode emulation, so that is fine.
>>>
>>> Suggested-by: Laszlo Ersek <lersek@redhat.com>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>   util/coroutine-sigaltstack.c | 56 +++++++++++++++++++-----------------
>>>   1 file changed, 29 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
>>> index aade82afb8..2d32afc322 100644
>>> --- a/util/coroutine-sigaltstack.c
>>> +++ b/util/coroutine-sigaltstack.c
>>> @@ -59,6 +59,8 @@ typedef struct {
>>>     static pthread_key_t thread_state_key;
>>>   +static void coroutine_trampoline(int signal);
>>> +
>>>   static CoroutineThreadState *coroutine_get_thread_state(void)
>>>   {
>>>       CoroutineThreadState *s = pthread_getspecific(thread_state_key);
>>> @@ -80,6 +82,7 @@ static void qemu_coroutine_thread_cleanup(void
>>> *opaque)
>>>     static void __attribute__((constructor)) coroutine_init(void)
>>>   {
>>> +    struct sigaction sa;
>>>       int ret;
>>>         ret = pthread_key_create(&thread_state_key,
>>> qemu_coroutine_thread_cleanup);
>>> @@ -87,6 +90,20 @@ static void __attribute__((constructor))
>>> coroutine_init(void)
>>>           fprintf(stderr, "unable to create leader key: %s\n",
>>> strerror(errno));
>>>           abort();
>>>       }
>>> +
>>> +    /*
>>> +     * Establish the SIGUSR2 signal handler.  This is a process-wide
>>> +     * operation, and so will apply to all threads from here on.
>>> +     */
>>> +    sa = (struct sigaction) {
>>> +        .sa_handler = coroutine_trampoline,
>>> +        .sa_flags   = SA_ONSTACK,
>>> +    };
>>> +
>>> +    if (sigaction(SIGUSR2, &sa, NULL) != 0) {
>>> +        perror("Unable to install SIGUSR2 handler");
>>> +        abort();
>>> +    }
>>>   }
>>>     /* "boot" function
>>> @@ -121,7 +138,17 @@ static void coroutine_trampoline(int signal)
>>>       /* Get the thread specific information */
>>>       coTS = coroutine_get_thread_state();
>>>       self = coTS->tr_handler;
>>> +
>>> +    if (!self) {
>>> +        /*
>>> +         * This SIGUSR2 came from an external source, not from
>>> +         * qemu_coroutine_new(), so perform the default action.
>>> +         */
>>> +        exit(0);
>>> +    }
>>> +
>>>       coTS->tr_called = 1;
>>> +    coTS->tr_handler = NULL;
>>>       co = &self->base;
>>>         /*
>>
>> (8) There's a further complication here, assuming we really want to
>> recognize the case when the handler is executing unexpectedly:
>>
>> - pthread_getspecific() is not necessarily async-signal-safe, according
>> to POSIX, so calling coroutine_get_thread_state() in the "unexpected"
>> case (e.g. in response to an asynchronously generated SIGUSR2) is
>> problematic in its own right,
> 
> That’s a shame.
> 
>> - if the SIGUSR2 is delivered to a thread that has never called
>> coroutine_get_thread_state() before, then we'll reach g_malloc0() inside
>> coroutine_get_thread_state(), in signal handler context, which is very
>> bad.
> 
> Could be solved with a coroutine_try_get_thread_state() that will never
> malloc, but return NULL then.
> 
>> You'd have to block SIGUSR2 for the entire process (all threads) at all
>> times, and only temporarily unblock it for a particular coroutine
>> thread, with the sigsuspend(). The above check would suffice, that way.
> 
> Yes, that’s what I was originally afraid of.  I feel like that may be
> the complexity drop that pushes this change too far out of my comfort
> zone.  (And as evidenced by your review, it already was pretty much
> outside as it was.)
> 
>> Such blocking is possible by calling pthread_sigmask() from the main
>> thread, before any other thread is created (the signal mask is inherited
>> across pthread_create()). I guess it could be done in coroutine_init()
>> too.
>>
>> And *then* the pthread_sigmask() calls should indeed be removed from
>> qemu_coroutine_new().
> 
> OTOH, that does sound rather simple...
> 
>> (Apologies if my feedback is difficult to understand, it's my fault. I
>> could propose a patch, if (and only if) you want that.)
> 
> I can’t say I wouldn’t be happy with a patch for this code that doesn’t
> bear my S-o-b. ;)
> 
> I feel conflicted.  I can send a v2 that addresses this (probably
> consisting of multiple patches then, e.g. I’d split the SIGUSR2 blocking
> off the main patch), but to me, this bug is really more of a nuisance
> that just blocks me from sending a pull request for my block branch...
> So I’d rather not drag it out forever.  OTOH, sending a quick and bad
> fix just because I can’t wait is just bad.
> 
> I suppose I’ll have to decide over the weekend.  Though if you’re
> itching to write a patch yourself, I’d definitely be grateful.

OK, I'll try my hand at it; I hope I won't be eating my words.

Laszlo
Laszlo Ersek Jan. 22, 2021, 7:02 p.m. UTC | #10
On 01/22/21 18:58, Max Reitz wrote:
> On 22.01.21 17:38, Laszlo Ersek wrote:

>> (1) With SIGUSR2 permanently dedicated to "coroutine-sigaltstack.c", the
>> comment on the qemu_init_main_loop() declaration, in
>> "include/qemu/main-loop.h", also needs to be updated. SIGUSR2 is no
>> longer a "free for thread-internal usage" signal.
> 
> It’s possible that I haven’t understood that comment, but I haven’t
> adjusted it because I didn’t interpret it to mean that the signals
> listed there were free to use.  For example, SIGUSR1 (aliased to
> SIG_IPI) is generated by cpus_kick_thread() and caught by KVM and HVF,
> so it doesn’t seem like it would be free for thread-internal usage either.
> 
> Instead, I understood it to mean that the signals listed there do not
> have to be blocked by non-main threads, because it is OK for non-main
> threads to catch them.  I can’t think of a reason why SIGUSR2 should be
> blocked by any thread, so I decided not to change the comment.
> 
> But perhaps I just didn’t understand the whole comment.  That’s
> definitely possible, given that I don’t even see a place where non-main
> threads would block the signals not listed there.

Upon re-reading the comment, and also looking at the
qemu_init_main_loop() and qemu_signal_init() function definitions, I now
think that the

  consider these signals to be safe

language simply means that the main thread does not intend to handle the
listed signals.

Your SIG_IPI example is great, because, while SIG_IPI does not satisfy
my original characterization ("free for thread-internal usage"), it does
satisfy "not handled by the main thread".

If we accept that interpretation, then we shouldn't in fact modify the
comment on qemu_init_main_loop(). Because, our dedicating SIGUSR2 to
coroutine-sigaltstack remains consistent with the main thread not
intending to handle SIGUSR2.

Put differently, what we're doing with SIGUSR2 now -- which is on the
list in the comment -- has been done *earlier* with SIG_IPI -- SIGUSR1,
also on the list --, namely, using the signal for a particular
inter-thread, or intra-thread, purpose, while making sure the main
thread does not intend to handle the signal.

Thanks
Laszlo
Laszlo Ersek Jan. 22, 2021, 9:26 p.m. UTC | #11
On 01/22/21 19:29, Laszlo Ersek wrote:

> OK, I'll try my hand at it; I hope I won't be eating my words.

The more I look at it, the less comfortable I am with the existent code.

For example, I don't understand why the original commit -- 3194c8ceeba0,
"coroutine: adding sigaltstack method (.c source)", 2012-03-12 --
wrapped the sigsuspend() in a loop, dependent on the "tr_called" flag.
That looks like a platform bug workaround -- it suggests that
sigsuspend() could wake spuriously, i.e., not in response to the pending
SIGUSR2.

That seems bogus, per POSIX, given that all signals except SIGUSR2 are
included in the mask passed to sigsuspend().

https://pubs.opengroup.org/onlinepubs/9699919799/functions/sigsuspend.html

Also, the comment says, "the signal can be delivered the first time
sigsuspend() is called", which is misleading -- the signal *IS*
delivered the first time sigsuspend() is called, given that we call
pthread_kill(pthread_self()) just before, with SIGUSR2 masked. So by the
time we reach sigsuspend(), the signal is pending.

(The synchronous nature of pthread_kill(pthread_self()) is confirmed by:

https://pubs.opengroup.org/onlinepubs/9699919799/functions/raise.html

which explains (a) the equivalence of raise() with
pthread_kill(pthread_self()), and (b) the fact that raise() does not
return until after the signal handler does, if a signal handler is
called. Given that delivery is dependent on generation, and delivery is
synchronous per description, generation *cannot* be asynchronous.)

All of this makes me super uncomfortable with the code. Either the
platform(s) where it was written & tested did not conform to POSIX, or
the original author missed something, or *I* am missing something. In
each case, I should not be modifying this code; I'm flying blind.

I'm drifting towards an overhaul of coroutine-sigaltstack, based on my
personal understanding of POSIX, but given that I can absolutely not
*test* coroutine-sigaltstack on the platforms where it actually matters,
an "overhaul" by me would be reckless.

I didn't expect these skeletons when I first read Max's "Thread safety
of coroutine-sigaltstack" email :/

Max, after having worked on top of your patch for a few hours, I
officially endorse your mutex approach. I can't encourage you or myself
to touch this code, in good conscience. It's not that it's "bad"; it's
inexplicable and (to me) untestable.

Laszlo
Laszlo Ersek Jan. 23, 2021, 12:41 a.m. UTC | #12
On 01/22/21 22:26, Laszlo Ersek wrote:

> I'm drifting towards an overhaul of coroutine-sigaltstack, based on my
> personal understanding of POSIX, but given that I can absolutely not
> *test* coroutine-sigaltstack on the platforms where it actually matters,
> an "overhaul" by me would be reckless.
> 
> I didn't expect these skeletons when I first read Max's "Thread safety
> of coroutine-sigaltstack" email :/
> 
> Max, after having worked on top of your patch for a few hours, I
> officially endorse your mutex approach. I can't encourage you or myself
> to touch this code, in good conscience. It's not that it's "bad"; it's
> inexplicable and (to me) untestable.

I'm attaching a patch (based on 0e3246263068). I'm not convinced that I
should take responsibility for this, given the lack of testability on my
end. So I'm not posting it stand-alone even as an RFC. I've built it and
have booted one of my existent domains with it, but that's all.

Thanks
Laszlo
Paolo Bonzini Jan. 23, 2021, 10:13 p.m. UTC | #13
On 22/01/21 22:26, Laszlo Ersek wrote:
> That seems bogus, per POSIX, given that all signals except SIGUSR2 are
> included in the mask passed to sigsuspend().

What happens if you get a SIGSTOP at exactly the wrong time?  (Yeah I 
know how incredibly unlikely that would be).

BTW if we are in a mood for cleanup, there's no reason to use 
pthread_key_t instead of __thread + qemu_thread_atexit_add (adding a 
Notifier to struct CoroutineThreadState).  That would fix the issue with 
async-signal safety of pthread_getspecific.

(It makes sense for the function not to be async-signal safe since it 
can in principle allocate memory for the data.  In practice it's most 
likely okay if the function has been called before on this thread).

Paolo
Max Reitz Jan. 25, 2021, 10:57 a.m. UTC | #14
On 23.01.21 01:41, Laszlo Ersek wrote:
> On 01/22/21 22:26, Laszlo Ersek wrote:
> 
>> I'm drifting towards an overhaul of coroutine-sigaltstack, based on my
>> personal understanding of POSIX, but given that I can absolutely not
>> *test* coroutine-sigaltstack on the platforms where it actually matters,
>> an "overhaul" by me would be reckless.
>>
>> I didn't expect these skeletons when I first read Max's "Thread safety
>> of coroutine-sigaltstack" email :/
>>
>> Max, after having worked on top of your patch for a few hours, I
>> officially endorse your mutex approach. I can't encourage you or myself
>> to touch this code, in good conscience. It's not that it's "bad"; it's
>> inexplicable and (to me) untestable.

On one hand, that’s too bad; on the other perhaps it’s just for the 
better to get all of this out of our minds again (for now)... O:)

> I'm attaching a patch (based on 0e3246263068). I'm not convinced that I
> should take responsibility for this, given the lack of testability on my
> end. So I'm not posting it stand-alone even as an RFC. I've built it and
> have booted one of my existent domains with it, but that's all.

FWIW, it looks good to me.  We should keep it in mind if in the future 
for some reason sigaltstack becomes more important, but for now I’m not 
too sad to abort any improvement efforts.

Max
Laszlo Ersek Jan. 25, 2021, 9:13 p.m. UTC | #15
On 01/23/21 23:13, Paolo Bonzini wrote:
> On 22/01/21 22:26, Laszlo Ersek wrote:
>> That seems bogus, per POSIX, given that all signals except SIGUSR2 are
>> included in the mask passed to sigsuspend().
> 
> What happens if you get a SIGSTOP at exactly the wrong time?  (Yeah I
> know how incredibly unlikely that would be).

Nothing; it is impossible to ignore or catch SIGSTOP:

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html

And therefore it is also impossible to block it (but an attempt to block
it will not cause issues); see e.g.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/sigsuspend.html

("It is not possible to block signals that cannot be ignored. This is
enforced by the system without causing an error to be indicated.")

So the process will simply stop executing.


The more interesting question is SIGCONT. SIGCONT will always behave
like it should:

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html

The interesting part is that it's possible to install a handler for
SIGCONT, which will act *in addition* to the normal SIGCONT effect.

Such a handler is actually useful for (e.g.) restarting timers that may
have been thrown off while the process was not scheduled for a while
(due to SIGSTOP).

If there is no handler installed for SIGCONT, nothing particular happens
when it is delivered (the process doesn't notice).

QEMU does seem to install a SIGCONT handler, in "chardev/char-stdio.c".
But that's not a problem (at least with the patch I had attached)
because the mask that sigsuspend() puts in place, temporarily, for
unblocking SIGUSR2, also blocks SIGCONT (together with everything else).
If a STOP/CONT occurs exactly then, then SIGCONT will resume the
process, but the handler for SIGCONT will run precisely *after*
sigsuspend() -- including the SIGUSR2 handling -- returns, and SIGCONT
becomes unblocked by virtue of sigsuspend() restoring the original mask.

(Side comment: I'm not convinced the existent SIGCONT handling in
"chardev/char-stdio.c" is safe, regardless of the topic at hand --
SIGCONT does not seem to be masked for all threads except one, and so
any thread could execute term_stdio_handler(). While the only called
function in that is tcsetattr(), which is async signal safe, I don't
know whether accessing "stdio_echo_state" and "stdio_allow_signal" is
also safe. It may be.)


BTW, the signal(7) Linux manual documents spurious wakeups for
sigtimedwait(2) and sigwaitinfo(2), indeed in connection with SIGCONT.
However, sigsuspend() is not listed in that section (I had checked):

https://man7.org/linux/man-pages/man7/signal.7.html

   Interruption of system calls and library functions by stop signals
       On Linux, even in the absence of signal handlers, certain
       blocking interfaces can fail with the error EINTR after the
       process is stopped by one of the stop signals and then resumed
       via SIGCONT.  This behavior is not sanctioned by POSIX.1, and
       doesn't occur on other systems.

       The Linux interfaces that display this behavior are:

       * "Input" socket interfaces, when a timeout (SO_RCVTIMEO) has
         been set on the socket using setsockopt(2): accept(2), recv(2),
         recvfrom(2), recvmmsg(2) (also with a non-NULL timeout
         argument), and recvmsg(2).

       * "Output" socket interfaces, when a timeout (SO_RCVTIMEO) has
         been set on the socket using setsockopt(2): connect(2),
         send(2), sendto(2), and sendmsg(2), if a send timeout
         (SO_SNDTIMEO) has been set.

       * epoll_wait(2), epoll_pwait(2).

       * semop(2), semtimedop(2).

       * sigtimedwait(2), sigwaitinfo(2).

       * Linux 3.7 and earlier: read(2) from an inotify(7) file
         descriptor

       * Linux 2.6.21 and earlier: futex(2) FUTEX_WAIT,
         sem_timedwait(3), sem_wait(3).

       * Linux 2.6.8 and earlier: msgrcv(2), msgsnd(2).

       * Linux 2.4 and earlier: nanosleep(2).

If other host OSes that QEMU supports have similar non-conformance
notes, and sigsuspend() is affected on them, then removing the loop
around sigsuspend() is unsafe.

I only know where to check POSIX and the Linux manuals.


> BTW if we are in a mood for cleanup, there's no reason to use
> pthread_key_t instead of __thread + qemu_thread_atexit_add (adding a
> Notifier to struct CoroutineThreadState).  That would fix the issue with
> async-signal safety of pthread_getspecific.
> 
> (It makes sense for the function not to be async-signal safe since it
> can in principle allocate memory for the data.  In practice it's most
> likely okay if the function has been called before on this thread).

This is a good idea, but with the patch I had attached, it's not urgent
-- pthread_getspecific is only called from a safe context. Per POSIX,
there is a problem only when an unsafe function is executed while the
signal interrupts an unsafe function:

https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03

    In the presence of signals, all functions defined by this volume of
    POSIX.1-2017 shall behave as defined when called from or interrupted
    by a signal-catching function, with the exception that when a signal
    interrupts an unsafe function or equivalent (such as the processing
    equivalent to exit() performed after a return from the initial call
    to main()) and the signal-catching function calls an unsafe
    function, the behavior is undefined.

And with the proposed patch, the only function that can be interrupted
by SIGUSR2 is sigsuspend(), and sigsuspend() is listed in the table of
safe functions.

Thanks
Laszlo
Laszlo Ersek Jan. 25, 2021, 9:16 p.m. UTC | #16
On 01/25/21 11:57, Max Reitz wrote:
> On 23.01.21 01:41, Laszlo Ersek wrote:
>> On 01/22/21 22:26, Laszlo Ersek wrote:
>>
>>> I'm drifting towards an overhaul of coroutine-sigaltstack, based on my
>>> personal understanding of POSIX, but given that I can absolutely not
>>> *test* coroutine-sigaltstack on the platforms where it actually matters,
>>> an "overhaul" by me would be reckless.
>>>
>>> I didn't expect these skeletons when I first read Max's "Thread safety
>>> of coroutine-sigaltstack" email :/
>>>
>>> Max, after having worked on top of your patch for a few hours, I
>>> officially endorse your mutex approach. I can't encourage you or myself
>>> to touch this code, in good conscience. It's not that it's "bad"; it's
>>> inexplicable and (to me) untestable.
> 
> On one hand, that’s too bad; on the other perhaps it’s just for the
> better to get all of this out of our minds again (for now)... O:)
> 
>> I'm attaching a patch (based on 0e3246263068). I'm not convinced that I
>> should take responsibility for this, given the lack of testability on my
>> end. So I'm not posting it stand-alone even as an RFC. I've built it and
>> have booted one of my existent domains with it, but that's all.
> 
> FWIW, it looks good to me.  We should keep it in mind if in the future
> for some reason sigaltstack becomes more important, but for now I’m not
> too sad to abort any improvement efforts.

OK -- so do you plan to post your mutex approach stand-alone? (Sorry if
that's already been done and I've missed it.)

I actually feel somewhat safe regarding my patch, as long as it is
*only* executed on Linux hosts, and on other hosts that claim
conformance to POSIX.

But, of course, this whole thing has come up because of OSX... and I
don't have the first clue about OSX. (For example whether sigsuspend()
on OSX is prone to a spurious wakeup in response to SIGCONT.)

Thanks
Laszlo
Laszlo Ersek Jan. 25, 2021, 10:08 p.m. UTC | #17
On 01/25/21 22:13, Laszlo Ersek wrote:
> On 01/23/21 23:13, Paolo Bonzini wrote:
>> On 22/01/21 22:26, Laszlo Ersek wrote:
>>> That seems bogus, per POSIX, given that all signals except SIGUSR2 are
>>> included in the mask passed to sigsuspend().
>>
>> What happens if you get a SIGSTOP at exactly the wrong time?  (Yeah I
>> know how incredibly unlikely that would be).
> 
> Nothing; it is impossible to ignore or catch SIGSTOP:
> 
> https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html
> 
> And therefore it is also impossible to block it (but an attempt to block
> it will not cause issues); see e.g.
> 
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/sigsuspend.html
> 
> ("It is not possible to block signals that cannot be ignored. This is
> enforced by the system without causing an error to be indicated.")
> 
> So the process will simply stop executing.
> 
> 
> The more interesting question is SIGCONT. SIGCONT will always behave
> like it should:
> 
> https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html
> 
> The interesting part is that it's possible to install a handler for
> SIGCONT, which will act *in addition* to the normal SIGCONT effect.
> 
> Such a handler is actually useful for (e.g.) restarting timers that may
> have been thrown off while the process was not scheduled for a while
> (due to SIGSTOP).
> 
> If there is no handler installed for SIGCONT, nothing particular happens
> when it is delivered (the process doesn't notice).

Sorry, I need to correct my terminology here (no change to my
description of the behavior, just the terms) -- basically SIGCONT
resumes the target process (if stopped) as soon as it is generated (=
made pending) for the target process; the disposition and/or masking of
SIGCONT in the target has no effect on this.

If a handler and/or masking are used in the target process, those only
affect delivery; and in that case, delivery of SIGCONT is no different
from that of other signals.

In the end, SIGCONT first resumes the process (if stopped), regardless
of disposition / masking, and *then* it is delivered, subject to
disposition and masking. (If SIGCONT is not blocked but it is caught,
then of course generation and delivery are indistinguishable in the
target process.)

(A program I had written earlier relied on this specifically, for
restarting a timer -- it would block SIGCONT at all times, except in a
very tight "event fetch" call, such as pselect(). So there could easily
be sections of code that would run with SIGCONT actively pending. In
fact, async SIGCONT could be used to force a timer restart, without ever
sending SIGSTOP!)

So the short paragraph I'm quoting right above is inaccurate because, if
no handler is installed, SIGCONT is not "delivered" at all.

Thanks
Laszlo

> 
> QEMU does seem to install a SIGCONT handler, in "chardev/char-stdio.c".
> But that's not a problem (at least with the patch I had attached)
> because the mask that sigsuspend() puts in place, temporarily, for
> unblocking SIGUSR2, also blocks SIGCONT (together with everything else).
> If a STOP/CONT occurs exactly then, then SIGCONT will resume the
> process, but the handler for SIGCONT will run precisely *after*
> sigsuspend() -- including the SIGUSR2 handling -- returns, and SIGCONT
> becomes unblocked by virtue of sigsuspend() restoring the original mask.
> 
> (Side comment: I'm not convinced the existent SIGCONT handling in
> "chardev/char-stdio.c" is safe, regardless of the topic at hand --
> SIGCONT does not seem to be masked for all threads except one, and so
> any thread could execute term_stdio_handler(). While the only called
> function in that is tcsetattr(), which is async signal safe, I don't
> know whether accessing "stdio_echo_state" and "stdio_allow_signal" is
> also safe. It may be.)
> 
> 
> BTW, the signal(7) Linux manual documents spurious wakeups for
> sigtimedwait(2) and sigwaitinfo(2), indeed in connection with SIGCONT.
> However, sigsuspend() is not listed in that section (I had checked):
> 
> https://man7.org/linux/man-pages/man7/signal.7.html
> 
>    Interruption of system calls and library functions by stop signals
>        On Linux, even in the absence of signal handlers, certain
>        blocking interfaces can fail with the error EINTR after the
>        process is stopped by one of the stop signals and then resumed
>        via SIGCONT.  This behavior is not sanctioned by POSIX.1, and
>        doesn't occur on other systems.
> 
>        The Linux interfaces that display this behavior are:
> 
>        * "Input" socket interfaces, when a timeout (SO_RCVTIMEO) has
>          been set on the socket using setsockopt(2): accept(2), recv(2),
>          recvfrom(2), recvmmsg(2) (also with a non-NULL timeout
>          argument), and recvmsg(2).
> 
>        * "Output" socket interfaces, when a timeout (SO_RCVTIMEO) has
>          been set on the socket using setsockopt(2): connect(2),
>          send(2), sendto(2), and sendmsg(2), if a send timeout
>          (SO_SNDTIMEO) has been set.
> 
>        * epoll_wait(2), epoll_pwait(2).
> 
>        * semop(2), semtimedop(2).
> 
>        * sigtimedwait(2), sigwaitinfo(2).
> 
>        * Linux 3.7 and earlier: read(2) from an inotify(7) file
>          descriptor
> 
>        * Linux 2.6.21 and earlier: futex(2) FUTEX_WAIT,
>          sem_timedwait(3), sem_wait(3).
> 
>        * Linux 2.6.8 and earlier: msgrcv(2), msgsnd(2).
> 
>        * Linux 2.4 and earlier: nanosleep(2).
> 
> If other host OSes that QEMU supports have similar non-conformance
> notes, and sigsuspend() is affected on them, then removing the loop
> around sigsuspend() is unsafe.
> 
> I only know where to check POSIX and the Linux manuals.
> 
> 
>> BTW if we are in a mood for cleanup, there's no reason to use
>> pthread_key_t instead of __thread + qemu_thread_atexit_add (adding a
>> Notifier to struct CoroutineThreadState).  That would fix the issue with
>> async-signal safety of pthread_getspecific.
>>
>> (It makes sense for the function not to be async-signal safe since it
>> can in principle allocate memory for the data.  In practice it's most
>> likely okay if the function has been called before on this thread).
> 
> This is a good idea, but with the patch I had attached, it's not urgent
> -- pthread_getspecific is only called from a safe context. Per POSIX,
> there is a problem only when an unsafe function is executed while the
> signal interrupts an unsafe function:
> 
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03
> 
>     In the presence of signals, all functions defined by this volume of
>     POSIX.1-2017 shall behave as defined when called from or interrupted
>     by a signal-catching function, with the exception that when a signal
>     interrupts an unsafe function or equivalent (such as the processing
>     equivalent to exit() performed after a return from the initial call
>     to main()) and the signal-catching function calls an unsafe
>     function, the behavior is undefined.
> 
> And with the proposed patch, the only function that can be interrupted
> by SIGUSR2 is sigsuspend(), and sigsuspend() is listed in the table of
> safe functions.
> 
> Thanks
> Laszlo
>
diff mbox series

Patch

diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
index aade82afb8..2d32afc322 100644
--- a/util/coroutine-sigaltstack.c
+++ b/util/coroutine-sigaltstack.c
@@ -59,6 +59,8 @@  typedef struct {
 
 static pthread_key_t thread_state_key;
 
+static void coroutine_trampoline(int signal);
+
 static CoroutineThreadState *coroutine_get_thread_state(void)
 {
     CoroutineThreadState *s = pthread_getspecific(thread_state_key);
@@ -80,6 +82,7 @@  static void qemu_coroutine_thread_cleanup(void *opaque)
 
 static void __attribute__((constructor)) coroutine_init(void)
 {
+    struct sigaction sa;
     int ret;
 
     ret = pthread_key_create(&thread_state_key, qemu_coroutine_thread_cleanup);
@@ -87,6 +90,20 @@  static void __attribute__((constructor)) coroutine_init(void)
         fprintf(stderr, "unable to create leader key: %s\n", strerror(errno));
         abort();
     }
+
+    /*
+     * Establish the SIGUSR2 signal handler.  This is a process-wide
+     * operation, and so will apply to all threads from here on.
+     */
+    sa = (struct sigaction) {
+        .sa_handler = coroutine_trampoline,
+        .sa_flags   = SA_ONSTACK,
+    };
+
+    if (sigaction(SIGUSR2, &sa, NULL) != 0) {
+        perror("Unable to install SIGUSR2 handler");
+        abort();
+    }
 }
 
 /* "boot" function
@@ -121,7 +138,17 @@  static void coroutine_trampoline(int signal)
     /* Get the thread specific information */
     coTS = coroutine_get_thread_state();
     self = coTS->tr_handler;
+
+    if (!self) {
+        /*
+         * This SIGUSR2 came from an external source, not from
+         * qemu_coroutine_new(), so perform the default action.
+         */
+        exit(0);
+    }
+
     coTS->tr_called = 1;
+    coTS->tr_handler = NULL;
     co = &self->base;
 
     /*
@@ -150,12 +177,9 @@  Coroutine *qemu_coroutine_new(void)
 {
     CoroutineSigAltStack *co;
     CoroutineThreadState *coTS;
-    struct sigaction sa;
-    struct sigaction osa;
     stack_t ss;
     stack_t oss;
     sigset_t sigs;
-    sigset_t osigs;
     sigjmp_buf old_env;
 
     /* The way to manipulate stack is with the sigaltstack function. We
@@ -172,24 +196,6 @@  Coroutine *qemu_coroutine_new(void)
     co->stack = qemu_alloc_stack(&co->stack_size);
     co->base.entry_arg = &old_env; /* stash away our jmp_buf */
 
-    coTS = coroutine_get_thread_state();
-    coTS->tr_handler = co;
-
-    /*
-     * Preserve the SIGUSR2 signal state, block SIGUSR2,
-     * and establish our signal handler. The signal will
-     * later transfer control onto the signal stack.
-     */
-    sigemptyset(&sigs);
-    sigaddset(&sigs, SIGUSR2);
-    pthread_sigmask(SIG_BLOCK, &sigs, &osigs);
-    sa.sa_handler = coroutine_trampoline;
-    sigfillset(&sa.sa_mask);
-    sa.sa_flags = SA_ONSTACK;
-    if (sigaction(SIGUSR2, &sa, &osa) != 0) {
-        abort();
-    }
-
     /*
      * Set the new stack.
      */
@@ -207,6 +213,8 @@  Coroutine *qemu_coroutine_new(void)
      * signal can be delivered the first time sigsuspend() is
      * called.
      */
+    coTS = coroutine_get_thread_state();
+    coTS->tr_handler = co;
     coTS->tr_called = 0;
     pthread_kill(pthread_self(), SIGUSR2);
     sigfillset(&sigs);
@@ -230,12 +238,6 @@  Coroutine *qemu_coroutine_new(void)
         sigaltstack(&oss, NULL);
     }
 
-    /*
-     * Restore the old SIGUSR2 signal handler and mask
-     */
-    sigaction(SIGUSR2, &osa, NULL);
-    pthread_sigmask(SIG_SETMASK, &osigs, NULL);
-
     /*
      * Now enter the trampoline again, but this time not as a signal
      * handler. Instead we jump into it directly. The functionally