Message ID | 20210125120305.19520-1-mreitz@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | coroutine-sigaltstack: Add SIGUSR2 mutex |
On 01/25/21 13:03, Max Reitz wrote:
> Disposition (action) for any given signal is global for the process.
> [...]

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Thanks!
Laszlo
25.01.2021 15:03, Max Reitz wrote:
> Disposition (action) for any given signal is global for the process.
> [...]

weak:
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Side thought: so, the sigaltstack coroutine implementation is not thread-safe. Is that the only bug? Or actually, should the whole implementation be revisited to check whether it can be used with iothreads or not?

Shouldn't we just state that the sigaltstack coroutine implementation doesn't support iothreads? And error out on iothread creation if sigaltstack coroutines are in use?
On 26.01.21 13:44, Vladimir Sementsov-Ogievskiy wrote:
> 25.01.2021 15:03, Max Reitz wrote:
>> Disposition (action) for any given signal is global for the process.
>> [...]
>
> weak:
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>
> Side thought: so, the sigaltstack coroutine implementation is not
> thread-safe. Is that the only bug?

It would be great if I could tell you for sure whether there’s no bug in some piece of code. :)

> Or actually, should the whole implementation be revisited to check
> whether it can be used with iothreads or not?

Judging from the discussion I had with Laszlo, I’m definitely not the right person to do so, because for example I don’t know the ins and outs of signal handling.

I can only tell you it’s the only issue I’ve seen, and that there’s just not much more code in coroutine-sigaltstack.c than the code around qemu_coroutine_new().

> Shouldn't we just state that the sigaltstack coroutine implementation
> doesn't support iothreads? And error out on iothread creation if
> sigaltstack coroutines are in use?

I’m not sure whether that would be better than potentially having a bug in it. What you’re proposing is effectively breaking all iothreads usage on macOS. If I were a macOS user, I’d rather risk encountering bugs than that.

(And it isn’t like we know it’s unstable with iothreads; I haven’t seen it breaking with this patch applied yet, and I don’t think there’s reason to believe it would be. qemu_coroutine_new() together with coroutine_trampoline() sets up a coroutine environment, and the rest of the code just consists of sigsetjmp() and siglongjmp(). I believe Laszlo had some open questions about signal masking done by those functions, but I don’t think that has anything to do with multithreading.)

Max
On Mon, Jan 25, 2021 at 01:03:05PM +0100, Max Reitz wrote:
> [...]
> Alternatively, we could for example have the SIGUSR2 handler always be
> coroutine_trampoline(), so there would be no need to invoke sigaction()
> in qemu_coroutine_new().  Laszlo has posted a patch to do so here:
>
>   https://lists.nongnu.org/archive/html/qemu-devel/2021-01/msg05962.html
> [...]

I slightly prefer Laszlo's patch: since the signal disposition is process-wide, it's cleaner to set it up globally and once only. That said, this patch is okay too.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
On 01/26/21 14:16, Max Reitz wrote:
> On 26.01.21 13:44, Vladimir Sementsov-Ogievskiy wrote:
>> [...]
>
> (And it isn’t like we know it’s unstable with iothreads; I haven’t seen
> it breaking with this patch applied yet, and I don’t think there’s
> reason to believe it would be.  qemu_coroutine_new() together with
> coroutine_trampoline() sets up a coroutine environment, and the rest of
> the code just consists of sigsetjmp() and siglongjmp().  I believe
> Laszlo had some open questions about signal masking done by those
> functions, but I don’t think that has anything to do with multithreading.)

I've no open questions regarding the signal masking done by sigsetjmp() and siglongjmp(). I was briefly confused by sigsetjmp() potentially saving the signal mask into the "env" buffer even if "savemask" were zero (POSIX allows this behavior), but then I re-learned that siglongjmp() is *required to ignore* that potentially-saved mask in "env" if "savemask" was 0 in the first place.

So the end result is as expected; it's just that the distribution of responsibilities is potentially non-intuitive (i.e., why permit the "save" function to stash some crap, under some circumstances, if the "load" function is required to ignore said crap under the same circumstances?). Of course the answer is that POSIX codifies existent practice, and some system does this. I guess I would have appreciated a hint right in sigsetjmp().

Anyway: no open questions on my end.

Thanks
Laszlo
On 1/25/21 6:03 AM, Max Reitz wrote:
> [...]
> Alternatively, we could for example have the SIGUSR2 handler always be
> coroutine_trampoline(), so there would be no need to invoke sigaction()
> in qemu_coroutine_new().  Laszlo has posted a patch to do so here:
>
>   https://lists.nongnu.org/archive/html/qemu-devel/2021-01/msg05962.html

I indeed like that one, but also concur that simplicity trumps the uncertainty of a larger patch. Let's get things unbroken before we worry about optimizing things to avoid the mutex.

> However, given that coroutine-sigaltstack is more of a fallback
> implementation for platforms that do not support ucontext, that change
> may be a bit too invasive to be comfortable with it.  The mutex proposed
> here may negatively impact performance, but the change is much simpler.
> [...]

Reviewed-by: Eric Blake <eblake@redhat.com>
diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
index aade82afb8..e99b8a4f9c 100644
--- a/util/coroutine-sigaltstack.c
+++ b/util/coroutine-sigaltstack.c
@@ -157,6 +157,7 @@ Coroutine *qemu_coroutine_new(void)
     sigset_t sigs;
     sigset_t osigs;
     sigjmp_buf old_env;
+    static pthread_mutex_t sigusr2_mutex = PTHREAD_MUTEX_INITIALIZER;
 
     /* The way to manipulate stack is with the sigaltstack function. We
      * prepare a stack, with it delivering a signal to ourselves and then
@@ -186,6 +187,12 @@ Coroutine *qemu_coroutine_new(void)
     sa.sa_handler = coroutine_trampoline;
     sigfillset(&sa.sa_mask);
     sa.sa_flags = SA_ONSTACK;
+
+    /*
+     * sigaction() is a process-global operation.  We must not run
+     * this code in multiple threads at once.
+     */
+    pthread_mutex_lock(&sigusr2_mutex);
     if (sigaction(SIGUSR2, &sa, &osa) != 0) {
         abort();
     }
@@ -234,6 +241,8 @@ Coroutine *qemu_coroutine_new(void)
      * Restore the old SIGUSR2 signal handler and mask
      */
     sigaction(SIGUSR2, &osa, NULL);
+    pthread_mutex_unlock(&sigusr2_mutex);
+
     pthread_sigmask(SIG_SETMASK, &osigs, NULL);
 
     /*
Disposition (action) for any given signal is global for the process.
When two threads run coroutine-sigaltstack's qemu_coroutine_new()
concurrently, they may interfere with each other: One of them may revert
the SIGUSR2 handler to SIG_DFL, between the other thread (a) setting up
coroutine_trampoline() as the handler and (b) raising SIGUSR2.  That
SIGUSR2 will then terminate the QEMU process abnormally.

We have to ensure that only one thread at a time can modify the
process-global SIGUSR2 handler.  To do so, wrap the whole section where
that is done in a mutex.

Alternatively, we could for example have the SIGUSR2 handler always be
coroutine_trampoline(), so there would be no need to invoke sigaction()
in qemu_coroutine_new().  Laszlo has posted a patch to do so here:

  https://lists.nongnu.org/archive/html/qemu-devel/2021-01/msg05962.html

However, given that coroutine-sigaltstack is more of a fallback
implementation for platforms that do not support ucontext, that change
may be a bit too invasive to be comfortable with it.  The mutex proposed
here may negatively impact performance, but the change is much simpler.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 util/coroutine-sigaltstack.c | 9 +++++++++
 1 file changed, 9 insertions(+)