Message ID | 20240317083704.23244-3-zhukeqian1@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Some fixes for pause and resume all vcpus | expand |
On 17.03.24 09:37, Keqian Zhu via wrote: > For vCPU being hotplugged, qemu_init_vcpu() is called. In this > function, we set vcpu state as stopped, and then wait vcpu thread > to be created. > > As the vcpu state is stopped, it will inform us it has been created > and then wait on halt_cond. After we has realized vcpu object, we > will resume the vcpu thread. > > However, during we wait vcpu thread to be created, the bql is > unlocked, and other thread is allowed to call resume_all_vcpus(), > which will resume the un-realized vcpu. > > This fixes the issue by filter out un-realized vcpu during > resume_all_vcpus(). Similar question: is there a reproducer? How could we currently hotplug a VCPU, and while it is being created, see pause_all_vcpus()/resume_all_vcpus() getting claled. If I am not getting this wrong, there seems to be some other mechanism missing that makes sure that this cannot happen. Dropping the BQL half-way through creating a VCPU might be the problem.
Hi David, On 17.03.24 09:37, Keqian Zhu via wrote: >> For vCPU being hotplugged, qemu_init_vcpu() is called. In this >> function, we set vcpu state as stopped, and then wait vcpu thread to >> be created. >> >> As the vcpu state is stopped, it will inform us it has been created >> and then wait on halt_cond. After we has realized vcpu object, we will >> resume the vcpu thread. >> >> However, during we wait vcpu thread to be created, the bql is >> unlocked, and other thread is allowed to call resume_all_vcpus(), >> which will resume the un-realized vcpu. >> >> This fixes the issue by filter out un-realized vcpu during >> resume_all_vcpus(). > >Similar question: is there a reproducer? > >How could we currently hotplug a VCPU, and while it is being created, see pause_all_vcpus()/resume_all_vcpus() getting claled. > I described the reason for this at patch 1. >If I am not getting this wrong, there seems to be some other mechanism missing that makes sure that this cannot happen. Dropping the BQL half-way through creating a VCPU might be the problem. > When we add retry mechanism in pause_all_vcpus(), we can solve this problem. With the sematic unchanged for user, which means: With bql, we can make sure all vcpus are paused after pause_all_vcpus() finish, and all vcpus are resumed after resume_all_vcpus() finish. Thanks, Keqian > > -- Cheers, David / dhildenb
On 19.03.24 06:11, zhukeqian wrote: > Hi David, > > On 17.03.24 09:37, Keqian Zhu via wrote: >>> For vCPU being hotplugged, qemu_init_vcpu() is called. In this >>> function, we set vcpu state as stopped, and then wait vcpu thread to >>> be created. >>> >>> As the vcpu state is stopped, it will inform us it has been created >>> and then wait on halt_cond. After we has realized vcpu object, we will >>> resume the vcpu thread. >>> >>> However, during we wait vcpu thread to be created, the bql is >>> unlocked, and other thread is allowed to call resume_all_vcpus(), >>> which will resume the un-realized vcpu. >>> >>> This fixes the issue by filter out un-realized vcpu during >>> resume_all_vcpus(). >> >> Similar question: is there a reproducer? >> >> How could we currently hotplug a VCPU, and while it is being created, see pause_all_vcpus()/resume_all_vcpus() getting claled. >> > I described the reason for this at patch 1. > >> If I am not getting this wrong, there seems to be some other mechanism missing that makes sure that this cannot happen. Dropping the BQL half-way through creating a VCPU might be the problem. >> > When we add retry mechanism in pause_all_vcpus(), we can solve this problem. With the sematic unchanged for user, which means: > With bql, we can make sure all vcpus are paused after pause_all_vcpus() finish, and all vcpus are resumed after resume_all_vcpus() finish. Okay, got it. As just replied to #1, please see if you can avoid messing with pause_all_vcpus() by inhibiting KVM IOCTLs like KVM does. That would be preferable.
diff --git a/system/cpus.c b/system/cpus.c index 4e41abe23e..8871f5dfa9 100644 --- a/system/cpus.c +++ b/system/cpus.c @@ -638,6 +638,9 @@ void resume_all_vcpus(void) qemu_clock_enable(QEMU_CLOCK_VIRTUAL, true); CPU_FOREACH(cpu) { + if (!object_property_get_bool(OBJECT(cpu), "realized", &error_abort)) { + continue; + } cpu_resume(cpu); } }
For vCPU being hotplugged, qemu_init_vcpu() is called. In this function, we set vcpu state as stopped, and then wait vcpu thread to be created. As the vcpu state is stopped, it will inform us it has been created and then wait on halt_cond. After we has realized vcpu object, we will resume the vcpu thread. However, during we wait vcpu thread to be created, the bql is unlocked, and other thread is allowed to call resume_all_vcpus(), which will resume the un-realized vcpu. This fixes the issue by filter out un-realized vcpu during resume_all_vcpus(). Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> --- system/cpus.c | 3 +++ 1 file changed, 3 insertions(+)