Message ID | ad910973c593c5ac2fed3a10ea958f7e9c12f82c.1527935663.git.mprivozn@redhat.com (mailing list archive) |
---|---|
State | New, archived |
On Sat, Jun 02, 2018 at 12:34:52PM +0200, Michal Privoznik wrote:
> After 047f7038f586d215 it is possible for event loop to run two
> times. First time whilst parsing command line options (the idea
> is to bring up monitor early so that management applications can
> tweak config before machine is initialized). And the second time
> is after everything is set up (this is the usual place). In both
> cases the event loop is called as main_loop_wait(nonblocking =
> false) which causes the event loop to block until at least one
> event occurred.
>
> Now, consider that somebody (i.e. libvirt) calls us with
> -daemonize. This operation is split in two steps. The main()
> calls os_daemonize() which fork()-s and then waits in read()
> until child notifies it via write():
>
> /qemu.git $ ./x86_64-softmmu/qemu-system-x86_64 -S -daemonize \
>             -no-user-config -nodefaults -nographic
>
>   main():                          child:
>     os_daemonize():
>       read(pipe[0])
>
>                                    main_loop():
>                                      main_loop_wait(false)
>
>                                    os_setup_post():
>                                      write(pipe[1])
>
>                                    main_loop():
>                                      main_loop_wait(false)
>
> Here it can be clearly seen that main() does not exit until an
> event occurs, but at the same time nobody will touch the monitor
> socket until their exec("qemu-system-*") finishes. So the whole
> thing deadlocks.
>
> The solution is to not call main_loop() unless --preconfig was
> specified (in which case caller knows they must connect to the
> socket before exec() finishes).
>
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>  vl.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/vl.c b/vl.c
> index 70f090c823..cde2934c40 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4469,8 +4469,13 @@ int main(int argc, char **argv, char **envp)
>      }
>      parse_numa_opts(current_machine);
>
> -    /* do monitor/qmp handling at preconfig state if requested */
> -    main_loop();
> +    if (preconfig_exit_requested) {
> +        runstate_set(RUN_STATE_PRELAUNCH);
> +        preconfig_exit_requested = false;
> +    } else {
> +        /* do monitor/qmp handling at preconfig state if requested */
> +        main_loop();
> +    }

Avoiding the double-run of main_loop is good, however, I think we should
also not have put current_run_state in RUN_STATE_PRECONFIG in the first
place if --preconfig wasn't set. I've sent a patch to fix that problem
too, so if yours is also applied, it could be changed to just do:

   if (current_run_state == RUN_STATE_PRECONFIG) {
       main_loop();
   }

Regards,
Daniel
On Sat, 2 Jun 2018 12:34:52 +0200
Michal Privoznik <mprivozn@redhat.com> wrote:

> After 047f7038f586d215 it is possible for event loop to run two
> times. First time whilst parsing command line options (the idea
> is to bring up monitor early so that management applications can
> tweak config before machine is initialized). And the second time
> is after everything is set up (this is the usual place). In both
> cases the event loop is called as main_loop_wait(nonblocking =
> false) which causes the event loop to block until at least one
> event occurred.
>
> Now, consider that somebody (i.e. libvirt) calls us with
> -daemonize. This operation is split in two steps. The main()
> calls os_daemonize() which fork()-s and then waits in read()
> until child notifies it via write():
>
> /qemu.git $ ./x86_64-softmmu/qemu-system-x86_64 -S -daemonize \
>             -no-user-config -nodefaults -nographic
>
>   main():                          child:
>     os_daemonize():
>       read(pipe[0])
>
>                                    main_loop():
>                                      main_loop_wait(false)
>
>                                    os_setup_post():
>                                      write(pipe[1])
>
>                                    main_loop():
>                                      main_loop_wait(false)
>
> Here it can be clearly seen that main() does not exit until an
> event occurs, but at the same time nobody will touch the monitor
> socket until their exec("qemu-system-*") finishes. So the whole
> thing deadlocks.
>
> The solution is to not call main_loop() unless --preconfig was
> specified (in which case caller knows they must connect to the
> socket before exec() finishes).
>
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  vl.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/vl.c b/vl.c
> index 70f090c823..cde2934c40 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4469,8 +4469,13 @@ int main(int argc, char **argv, char **envp)
>      }
>      parse_numa_opts(current_machine);
>
> -    /* do monitor/qmp handling at preconfig state if requested */
> -    main_loop();
> +    if (preconfig_exit_requested) {
> +        runstate_set(RUN_STATE_PRELAUNCH);
> +        preconfig_exit_requested = false;
> +    } else {
> +        /* do monitor/qmp handling at preconfig state if requested */
> +        main_loop();
> +    }
>
>      /* from here on runstate is RUN_STATE_PRELAUNCH */
>      machine_run_board_init(current_machine);
On Mon, Jun 04, 2018 at 11:32:44AM +0100, Daniel P. Berrangé wrote:
[...]
> Avoiding the double-run of main_loop is good, however, I think we should
> also not have put current_run_state in RUN_STATE_PRECONFIG in the first
> place if --preconfig wasn't set. I've sent a patch to fix that problem
> too, so if yours is also applied, it could be changed to just do:
>
>    if (current_run_state == RUN_STATE_PRECONFIG) {
>        main_loop();
>    }

So, this patch is desirable even if we refactor the state machine
as suggested in the other threads, right?

I'm queueing it on machine-next right now.
On Mon, 4 Jun 2018 21:41:46 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Mon, Jun 04, 2018 at 11:32:44AM +0100, Daniel P. Berrangé wrote:
> [...]
> > Avoiding the double-run of main_loop is good, however, I think we should
> > also not have put current_run_state in RUN_STATE_PRECONFIG in the first
> > place if --preconfig wasn't set. I've sent a patch to fix that problem
> > too, so if yours is also applied, it could be changed to just do:
> >
> >    if (current_run_state == RUN_STATE_PRECONFIG) {
> >        main_loop();
> >    }
>
> So, this patch is desirable even if we refactor the state machine
> as suggested in the other threads, right?
>
> I'm queueing it on machine-next right now.
>
Maybe add a note to commit message that it also fixes
qemu hang when -nodefaults CLI option is used.

Also we would need Daniel's

  [PATCH v2 2/2] vl: fix use of --daemonize with --preconfig

to fix the same issue with daemonize but in case --preconfig was used
On Tue, 5 Jun 2018 13:43:14 +0200
Igor Mammedov <imammedo@redhat.com> wrote:

> On Mon, 4 Jun 2018 21:41:46 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
>
> > On Mon, Jun 04, 2018 at 11:32:44AM +0100, Daniel P. Berrangé wrote:
> > [...]
> > > Avoiding the double-run of main_loop is good, however, I think we should
> > > also not have put current_run_state in RUN_STATE_PRECONFIG in the first
> > > place if --preconfig wasn't set. I've sent a patch to fix that problem
> > > too, so if yours is also applied, it could be changed to just do:
> > >
> > >    if (current_run_state == RUN_STATE_PRECONFIG) {
> > >        main_loop();
> > >    }
> >
> > So, this patch is desirable even if we refactor the state machine
> > as suggested in the other threads, right?
> >
> > I'm queueing it on machine-next right now.
> >
> Maybe add a note to commit message that it also fixes
> qemu hang when -nodefaults CLI option is used.
>
> Also we would need Daniel's
>
>   [PATCH v2 2/2] vl: fix use of --daemonize with --preconfig
>
> to fix the same issue with daemonize but in case --preconfig was used
>
iotests are still broken even with it, so I'm looking into it right now
diff --git a/vl.c b/vl.c
index 70f090c823..cde2934c40 100644
--- a/vl.c
+++ b/vl.c
@@ -4469,8 +4469,13 @@ int main(int argc, char **argv, char **envp)
     }
     parse_numa_opts(current_machine);

-    /* do monitor/qmp handling at preconfig state if requested */
-    main_loop();
+    if (preconfig_exit_requested) {
+        runstate_set(RUN_STATE_PRELAUNCH);
+        preconfig_exit_requested = false;
+    } else {
+        /* do monitor/qmp handling at preconfig state if requested */
+        main_loop();
+    }

     /* from here on runstate is RUN_STATE_PRELAUNCH */
     machine_run_board_init(current_machine);
After 047f7038f586d215 it is possible for event loop to run two
times. First time whilst parsing command line options (the idea
is to bring up monitor early so that management applications can
tweak config before machine is initialized). And the second time
is after everything is set up (this is the usual place). In both
cases the event loop is called as main_loop_wait(nonblocking =
false) which causes the event loop to block until at least one
event occurred.

Now, consider that somebody (i.e. libvirt) calls us with
-daemonize. This operation is split in two steps. The main()
calls os_daemonize() which fork()-s and then waits in read()
until child notifies it via write():

/qemu.git $ ./x86_64-softmmu/qemu-system-x86_64 -S -daemonize \
            -no-user-config -nodefaults -nographic

  main():                          child:
    os_daemonize():
      read(pipe[0])

                                   main_loop():
                                     main_loop_wait(false)

                                   os_setup_post():
                                     write(pipe[1])

                                   main_loop():
                                     main_loop_wait(false)

Here it can be clearly seen that main() does not exit until an
event occurs, but at the same time nobody will touch the monitor
socket until their exec("qemu-system-*") finishes. So the whole
thing deadlocks.

The solution is to not call main_loop() unless --preconfig was
specified (in which case caller knows they must connect to the
socket before exec() finishes).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 vl.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
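The fork()/pipe handshake described in the commit message can be sketched in C roughly as follows. This is a minimal, hypothetical illustration and not QEMU code: daemonize_handshake(), its parameters, and the use of pause() as a stand-in for a blocking main_loop_wait(false) are assumptions made for the example. The parent here uses poll() with a timeout instead of a plain read(), only so that the sketch terminates where the real parent would hang.

```c
#include <assert.h>
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of QEMU): fork a child and wait for it
 * to report readiness over a pipe, in the style of os_daemonize() and
 * os_setup_post(). If child_blocks_first is nonzero, the child sits in
 * pause() before writing, standing in for a blocking
 * main_loop_wait(false) on which no event ever arrives.
 *
 * Returns 1 if the child reported ready, 0 if the parent timed out,
 * -1 if pipe() or fork() failed.
 */
static int daemonize_handshake(int child_blocks_first, int timeout_ms)
{
    int fds[2];

    assert(timeout_ms >= 0);
    if (pipe(fds) < 0) {
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: the process that would go on to run the guest. */
        char ok = 0;

        close(fds[0]);
        if (child_blocks_first) {
            pause();            /* blocks until a signal is delivered */
        }
        if (write(fds[1], &ok, 1) != 1) {   /* readiness notification */
            _exit(1);
        }
        _exit(0);
    }

    /* Parent: the real code blocks in read(); poll() bounds the wait. */
    close(fds[1]);
    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
    int ready = 0;

    if (poll(&pfd, 1, timeout_ms) > 0 && (pfd.revents & POLLIN)) {
        char status;
        ready = (read(fds[0], &status, 1) == 1);
    }
    close(fds[0]);

    if (!ready) {
        kill(pid, SIGKILL);     /* clean up the stuck child */
    }
    waitpid(pid, NULL, 0);
    return ready;
}
```

Under these assumptions, daemonize_handshake(0, 2000) is expected to return 1 because the child writes its byte straight away, while daemonize_handshake(1, 200) is expected to return 0. The second case corresponds to the parent that would otherwise sit in read() indefinitely because the child never reaches os_setup_post().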