Message ID | 20231009112812.10612-2-salil.mehta@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add architecture agnostic code to support vCPU Hotplug | expand |
On 09.10.23 13:28, Salil Mehta wrote: > KVM vCPU creation is done once during the initialization of the VM when Qemu > thread is spawned. This is common to all the architectures. > > Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the > corresponding KVM vCPU object in the Host KVM is not destroyed and its > representative KVM vCPU object/context in Qemu is parked. > > Refactor common logic so that some APIs could be reused by vCPU Hotplug code. > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com> [...] > > int kvm_init_vcpu(CPUState *cpu, Error **errp) > @@ -395,19 +434,14 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp) > > trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu)); > > - ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu)); > + ret = kvm_create_vcpu(cpu); > if (ret < 0) { > - error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)", > + error_setg_errno(errp, -ret, > + "kvm_init_vcpu: kvm_create_vcpu failed (%lu)", Unrelated change. > kvm_arch_vcpu_id(cpu)); > goto err; > } > > - cpu->kvm_fd = ret; > - cpu->kvm_state = s; > - cpu->vcpu_dirty = true; > - cpu->dirty_pages = 0; > - cpu->throttle_us_per_full = 0; > - > mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0); > if (mmap_size < 0) { > ret = mmap_size; > diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events > index 399aaeb0ec..08e2dc253f 100644 > --- a/accel/kvm/trace-events > +++ b/accel/kvm/trace-events > @@ -9,6 +9,10 @@ kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p" > kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s" > kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s" > kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu" > +kvm_create_vcpu(int cpu_index, unsigned long arch_cpu_id) "creating KVM cpu: cpu_index: %d arch vcpu-id: %lu" > +kvm_get_vcpu(unsigned long arch_cpu_id) "unparking KVM vcpu: arch vcpu-id: %lu" > +kvm_destroy_vcpu(int cpu_index, unsigned long arch_cpu_id) "destroy vcpu: cpu_index: %d arch vcpu-id: %lu" > +kvm_park_vcpu(int cpu_index, unsigned long arch_cpu_id) "parking KVM vcpu: cpu_index: %d arch vcpu-id: %lu" It's a bit confusing that there is now 1) create (create new or return parked) 2) destroy (cleanup + park) 3) park (park only) Why would one use 2) instead of 3) or the other way around? But I suspect that kvm_destroy_vcpu() is only supposed to be a KVM-internal helper ... > kvm_irqchip_commit_routes(void) "" > kvm_irqchip_add_msi_route(char *name, int vector, int virq) "dev %s vector %d virq %d" > kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d" > diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h > index ee9025f8e9..57bd8f8fd6 100644 > --- a/include/sysemu/kvm.h > +++ b/include/sysemu/kvm.h > @@ -464,6 +464,20 @@ void kvm_set_sigmask_len(KVMState *s, unsigned int sigmask_len); > > int kvm_physical_memory_addr_from_host(KVMState *s, void *ram_addr, > hwaddr *phys_addr); > +/** > + * kvm_create_vcpu - Gets a parked KVM vCPU or creates a KVM vCPU > + * @cpu: QOM CPUState object for which KVM vCPU has to be fetched/created. > + * > + * @returns: 0 when success, errno (<0) when failed. > + */ > +int kvm_create_vcpu(CPUState *cpu); > +/** > + * kvm_park_vcpu - Gets a parked KVM vCPU if it exists ^ I suspect that description is wrong.
Hi David, Thanks for the review. > From: David Hildenbrand <david@redhat.com> > Sent: Monday, October 9, 2023 1:21 PM > To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org > Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron > <jonathan.cameron@huawei.com>; lpieralisi@kernel.org; > peter.maydell@linaro.org; richard.henderson@linaro.org; > imammedo@redhat.com; andrew.jones@linux.dev; philmd@linaro.org; > eric.auger@redhat.com; oliver.upton@linux.dev; pbonzini@redhat.com; > mst@redhat.com; will@kernel.org; gshan@redhat.com; rafael@kernel.org; > alex.bennee@linaro.org; linux@armlinux.org.uk; > darren@os.amperecomputing.com; ilkka@os.amperecomputing.com; > vishnu@os.amperecomputing.com; karl.heubaum@oracle.com; > miguel.luis@oracle.com; salil.mehta@opnsrc.net; zhukeqian > <zhukeqian1@huawei.com>; wangxiongfeng (C) <wangxiongfeng2@huawei.com>; > wangyanan (Y) <wangyanan55@huawei.com>; jiakernel2@gmail.com; > maobibo@loongson.cn; lixianglai@loongson.cn; Linuxarm <linuxarm@huawei.com> > Subject: Re: [PATCH V3 01/10] accel/kvm: Extract common KVM vCPU > {creation,parking} code > > On 09.10.23 13:28, Salil Mehta wrote: > > KVM vCPU creation is done once during the initialization of the VM when Qemu > > thread is spawned. This is common to all the architectures. > > > > Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the > > corresponding KVM vCPU object in the Host KVM is not destroyed and its > > representative KVM vCPU object/context in Qemu is parked. > > > > Refactor common logic so that some APIs could be reused by vCPU Hotplug code. > > > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com> > > [...] > > > > > int kvm_init_vcpu(CPUState *cpu, Error **errp) > > @@ -395,19 +434,14 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp) > > > > trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu)); > > > > - ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu)); > > + ret = kvm_create_vcpu(cpu); > > if (ret < 0) { > > - error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)", > > + error_setg_errno(errp, -ret, > > + "kvm_init_vcpu: kvm_create_vcpu failed (%lu)", > > Unrelated change. It is related. I think you missed kvm_get_vcpu -> kvm_create_vcpu change in the string. > > kvm_arch_vcpu_id(cpu)); > > goto err; > > } > > > > - cpu->kvm_fd = ret; > > - cpu->kvm_state = s; > > - cpu->vcpu_dirty = true; > > - cpu->dirty_pages = 0; > > - cpu->throttle_us_per_full = 0; > > - > > mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0); > > if (mmap_size < 0) { > > ret = mmap_size; > > diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events > > index 399aaeb0ec..08e2dc253f 100644 > > --- a/accel/kvm/trace-events > > +++ b/accel/kvm/trace-events > > @@ -9,6 +9,10 @@ kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p" > > kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s" > > kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s" > > kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu" > > +kvm_create_vcpu(int cpu_index, unsigned long arch_cpu_id) "creating KVM cpu: cpu_index: %d arch vcpu-id: %lu" > > +kvm_get_vcpu(unsigned long arch_cpu_id) "unparking KVM vcpu: arch vcpu-id: %lu" > > +kvm_destroy_vcpu(int cpu_index, unsigned long arch_cpu_id) "destroy vcpu: cpu_index: %d arch vcpu-id: %lu" > > +kvm_park_vcpu(int cpu_index, unsigned long arch_cpu_id) "parking KVM vcpu: cpu_index: %d arch vcpu-id: %lu" > > It's a bit confusing that there is now > > 1) create (create new or return parked) > 2) destroy (cleanup + park) > 3) park (park only) > > Why would one use 2) instead of 3) or the other way around? But I > suspect that kvm_destroy_vcpu() is only supposed to be a KVM-internal > helper ... kvm_destroy_vcpu is more than just parking: 1. Arch destroy vcpu 2. Unmap cpu->kvm_run 3. Parking logic To support virtual CPU Hotplug on ARM platforms we pre-create all the KVM vCPUs but their corresponding Qemu threads are not spawned (and hence cpu->kvm_run is not mapped). Unplugged vCPUs remains parked in the list. Hence, only step-3 is required. https://lore.kernel.org/qemu-devel/b9dd8569-e95d-2085-9965-08686ce6666d@redhat.com/ When a virtual CPU is plugged. QOM CPU object is realized and corresponding thread is spawned. kvm_init_vcpu ends up in unaprking the KVM vCPU, mapping of cpu->kvm_run and kvm_arch_init_vcpu. When a virtul CPU is un-plugged, reverse of step-1, 2 and 3 is required during un-realization of QOM CPU object. We do not destroy vCPU inside the KVM. > > kvm_irqchip_commit_routes(void) "" > > kvm_irqchip_add_msi_route(char *name, int vector, int virq) "dev %s vector %d virq %d" > > kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d" > > diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h > > index ee9025f8e9..57bd8f8fd6 100644 > > --- a/include/sysemu/kvm.h > > +++ b/include/sysemu/kvm.h > > @@ -464,6 +464,20 @@ void kvm_set_sigmask_len(KVMState *s, unsigned int sigmask_len); > > > > int kvm_physical_memory_addr_from_host(KVMState *s, void *ram_addr, > > hwaddr *phys_addr); > > +/** > > + * kvm_create_vcpu - Gets a parked KVM vCPU or creates a KVM vCPU > > + * @cpu: QOM CPUState object for which KVM vCPU has to be fetched/created. > > + * > > + * @returns: 0 when success, errno (<0) when failed. > > + */ > > +int kvm_create_vcpu(CPUState *cpu); > > +/** > > + * kvm_park_vcpu - Gets a parked KVM vCPU if it exists > > > ^ I suspect that description is wrong. Good catch. I think manual merge error while copying the change. Will fix. Thanks Salil.
On 09.10.23 15:42, Salil Mehta wrote: > Hi David, > Thanks for the review. > >> From: David Hildenbrand <david@redhat.com> >> Sent: Monday, October 9, 2023 1:21 PM >> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org >> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron >> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org; >> peter.maydell@linaro.org; richard.henderson@linaro.org; >> imammedo@redhat.com; andrew.jones@linux.dev; philmd@linaro.org; >> eric.auger@redhat.com; oliver.upton@linux.dev; pbonzini@redhat.com; >> mst@redhat.com; will@kernel.org; gshan@redhat.com; rafael@kernel.org; >> alex.bennee@linaro.org; linux@armlinux.org.uk; >> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com; >> vishnu@os.amperecomputing.com; karl.heubaum@oracle.com; >> miguel.luis@oracle.com; salil.mehta@opnsrc.net; zhukeqian >> <zhukeqian1@huawei.com>; wangxiongfeng (C) <wangxiongfeng2@huawei.com>; >> wangyanan (Y) <wangyanan55@huawei.com>; jiakernel2@gmail.com; >> maobibo@loongson.cn; lixianglai@loongson.cn; Linuxarm <linuxarm@huawei.com> >> Subject: Re: [PATCH V3 01/10] accel/kvm: Extract common KVM vCPU >> {creation,parking} code >> >> On 09.10.23 13:28, Salil Mehta wrote: >>> KVM vCPU creation is done once during the initialization of the VM when Qemu >>> thread is spawned. This is common to all the architectures. >>> >>> Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the >>> corresponding KVM vCPU object in the Host KVM is not destroyed and its >>> representative KVM vCPU object/context in Qemu is parked. >>> >>> Refactor common logic so that some APIs could be reused by vCPU Hotplug code. >>> >>> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> >> >> [...] >> >>> >>> int kvm_init_vcpu(CPUState *cpu, Error **errp) >>> @@ -395,19 +434,14 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp) >>> >>> trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu)); >>> >>> - ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu)); >>> + ret = kvm_create_vcpu(cpu); >>> if (ret < 0) { >>> - error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)", >>> + error_setg_errno(errp, -ret, >>> + "kvm_init_vcpu: kvm_create_vcpu failed (%lu)", >> >> Unrelated change. > > > It is related. I think you missed kvm_get_vcpu -> kvm_create_vcpu change > in the string. Indeed, I did :) > > >>> kvm_arch_vcpu_id(cpu)); >>> goto err; >>> } >>> >>> - cpu->kvm_fd = ret; >>> - cpu->kvm_state = s; >>> - cpu->vcpu_dirty = true; >>> - cpu->dirty_pages = 0; >>> - cpu->throttle_us_per_full = 0; >>> - >>> mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0); >>> if (mmap_size < 0) { >>> ret = mmap_size; >>> diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events >>> index 399aaeb0ec..08e2dc253f 100644 >>> --- a/accel/kvm/trace-events >>> +++ b/accel/kvm/trace-events >>> @@ -9,6 +9,10 @@ kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p" >>> kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s" >>> kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s" >>> kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu" >>> +kvm_create_vcpu(int cpu_index, unsigned long arch_cpu_id) "creating KVM cpu: cpu_index: %d arch vcpu-id: %lu" >>> +kvm_get_vcpu(unsigned long arch_cpu_id) "unparking KVM vcpu: arch vcpu-id: %lu" >>> +kvm_destroy_vcpu(int cpu_index, unsigned long arch_cpu_id) "destroy vcpu: cpu_index: %d arch vcpu-id: %lu" >>> +kvm_park_vcpu(int cpu_index, unsigned long arch_cpu_id) "parking KVM vcpu: cpu_index: %d arch vcpu-id: %lu" >> >> It's a bit confusing that there is now >> >> 1) create (create new or return parked) >> 2) destroy (cleanup + park) >> 3) park (park only) >> >> Why would one use 2) instead of 3) or the other way around? But I >> suspect that kvm_destroy_vcpu() is only supposed to be a KVM-internal >> helper ... > > kvm_destroy_vcpu is more than just parking: > > 1. Arch destroy vcpu > 2. Unmap cpu->kvm_run > 3. Parking logic > > To support virtual CPU Hotplug on ARM platforms we pre-create all > the KVM vCPUs but their corresponding Qemu threads are not spawned > (and hence cpu->kvm_run is not mapped). Unplugged vCPUs remains > parked in the list. Hence, only step-3 is required. IIUC, your current flow is going to be 1) Create 2) Park 3) Create [which ends up reusing the parked VCPU] 4) Destroy [when unplugging the CPU] If that's the case, that API really is suboptimal. What speaks against an API that models 1) and 2) in a single step kvm_precreate_vcpu kvm_create_vcpu kvm_destroy_vcpu One could even make kvm_create_vcpu() fail on ARM if the VCPU hasn't been pre-created. Or did I get it all wrong? :)
> From: David Hildenbrand <david@redhat.com> > Sent: Monday, October 9, 2023 3:11 PM > To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu- > arm@nongnu.org > Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron > <jonathan.cameron@huawei.com>; lpieralisi@kernel.org; > peter.maydell@linaro.org; richard.henderson@linaro.org; > imammedo@redhat.com; andrew.jones@linux.dev; philmd@linaro.org; > eric.auger@redhat.com; oliver.upton@linux.dev; pbonzini@redhat.com; > mst@redhat.com; will@kernel.org; gshan@redhat.com; rafael@kernel.org; > alex.bennee@linaro.org; linux@armlinux.org.uk; > darren@os.amperecomputing.com; ilkka@os.amperecomputing.com; > vishnu@os.amperecomputing.com; karl.heubaum@oracle.com; > miguel.luis@oracle.com; salil.mehta@opnsrc.net; zhukeqian > <zhukeqian1@huawei.com>; wangxiongfeng (C) <wangxiongfeng2@huawei.com>; > wangyanan (Y) <wangyanan55@huawei.com>; jiakernel2@gmail.com; > maobibo@loongson.cn; lixianglai@loongson.cn; Linuxarm <linuxarm@huawei.com> > Subject: Re: [PATCH V3 01/10] accel/kvm: Extract common KVM vCPU > {creation,parking} code > > On 09.10.23 15:42, Salil Mehta wrote: > > Hi David, > > Thanks for the review. > > > >> From: David Hildenbrand <david@redhat.com> > >> Sent: Monday, October 9, 2023 1:21 PM > >> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org > >> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron > >> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org; > >> peter.maydell@linaro.org; richard.henderson@linaro.org; > >> imammedo@redhat.com; andrew.jones@linux.dev; philmd@linaro.org; > >> eric.auger@redhat.com; oliver.upton@linux.dev; pbonzini@redhat.com; > >> mst@redhat.com; will@kernel.org; gshan@redhat.com; rafael@kernel.org; > >> alex.bennee@linaro.org; linux@armlinux.org.uk; > >> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com; > >> vishnu@os.amperecomputing.com; karl.heubaum@oracle.com; > >> miguel.luis@oracle.com; salil.mehta@opnsrc.net; zhukeqian > >> <zhukeqian1@huawei.com>; wangxiongfeng (C) <wangxiongfeng2@huawei.com>; > >> wangyanan (Y) <wangyanan55@huawei.com>; jiakernel2@gmail.com; > >> maobibo@loongson.cn; lixianglai@loongson.cn; Linuxarm > <linuxarm@huawei.com> > >> Subject: Re: [PATCH V3 01/10] accel/kvm: Extract common KVM vCPU > >> {creation,parking} code > >> > >> On 09.10.23 13:28, Salil Mehta wrote: > >>> KVM vCPU creation is done once during the initialization of the VM when Qemu > >>> thread is spawned. This is common to all the architectures. > >>> > >>> Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the > >>> corresponding KVM vCPU object in the Host KVM is not destroyed and its > >>> representative KVM vCPU object/context in Qemu is parked. > >>> > >>> Refactor common logic so that some APIs could be reused by vCPU Hotplug code. > >>> > >>> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> > >> > >> [...] > >> > >>> > >>> int kvm_init_vcpu(CPUState *cpu, Error **errp) > >>> @@ -395,19 +434,14 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp) > >>> > >>> trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu)); > >>> > >>> - ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu)); > >>> + ret = kvm_create_vcpu(cpu); > >>> if (ret < 0) { > >>> - error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)", > >>> + error_setg_errno(errp, -ret, > >>> + "kvm_init_vcpu: kvm_create_vcpu failed (%lu)", > >> > >> Unrelated change. > > > > > > It is related. I think you missed kvm_get_vcpu -> kvm_create_vcpu change > > in the string. > > Indeed, I did :) > > > > > > >>> kvm_arch_vcpu_id(cpu)); > >>> goto err; > >>> } > >>> > >>> - cpu->kvm_fd = ret; > >>> - cpu->kvm_state = s; > >>> - cpu->vcpu_dirty = true; > >>> - cpu->dirty_pages = 0; > >>> - cpu->throttle_us_per_full = 0; > >>> - > >>> mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0); > >>> if (mmap_size < 0) { > >>> ret = mmap_size; > >>> diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events > >>> index 399aaeb0ec..08e2dc253f 100644 > >>> --- a/accel/kvm/trace-events > >>> +++ b/accel/kvm/trace-events > >>> @@ -9,6 +9,10 @@ kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p" > >>> kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s" > >>> kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s" > >>> kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu" > >>> +kvm_create_vcpu(int cpu_index, unsigned long arch_cpu_id) "creating KVM cpu: cpu_index: %d arch vcpu-id: %lu" > >>> +kvm_get_vcpu(unsigned long arch_cpu_id) "unparking KVM vcpu: arch vcpu-id: %lu" > >>> +kvm_destroy_vcpu(int cpu_index, unsigned long arch_cpu_id) "destroy vcpu: cpu_index: %d arch vcpu-id: %lu" > >>> +kvm_park_vcpu(int cpu_index, unsigned long arch_cpu_id) "parking KVM vcpu: cpu_index: %d arch vcpu-id: %lu" > >> > >> It's a bit confusing that there is now > >> > >> 1) create (create new or return parked) > >> 2) destroy (cleanup + park) > >> 3) park (park only) > >> > >> Why would one use 2) instead of 3) or the other way around? But I > >> suspect that kvm_destroy_vcpu() is only supposed to be a KVM-internal > >> helper ... > > > > kvm_destroy_vcpu is more than just parking: > > > > 1. Arch destroy vcpu > > 2. Unmap cpu->kvm_run > > 3. Parking logic > > > > To support virtual CPU Hotplug on ARM platforms we pre-create all > > the KVM vCPUs but their corresponding Qemu threads are not spawned > > (and hence cpu->kvm_run is not mapped). Unplugged vCPUs remains > > parked in the list. Hence, only step-3 is required. > > IIUC, your current flow is going to be > > 1) Create > 2) Park > 3) Create [which ends up reusing the parked VCPU] > 4) Destroy [when unplugging the CPU] In the ARM specific code, Yes. > If that's the case, that API really is suboptimal. > > What speaks against an API that models 1) and 2) in a single step API is generic and is part of architecture agnostic code. > > kvm_precreate_vcpu pre-creation is very much specific to ARM right now. I am not sure if it is right to have an API with this name in the code which is common to other architectures. > kvm_create_vcpu > kvm_destroy_vcpu > > One could even make kvm_create_vcpu() fail on ARM if the VCPU hasn't > been pre-created. Right now, we abort the CPU initialization process if this happens. I am planning to change abort() into 'fatal_error' in RFC V3 though. > > Or did I get it all wrong? :) I won't say that it is just another point of view which is absolutely fine. But I would like to stick to current APIs. Thanks Salil.
> >> >> kvm_precreate_vcpu > > pre-creation is very much specific to ARM right now. I am not sure > if it is right to have an API with this name in the code which is > common to other architectures. I don't like exposing the concept of "parking" CPUs externally, which is so far handled completely internally. [...] > > >> kvm_create_vcpu >> kvm_destroy_vcpu >> >> One could even make kvm_create_vcpu() fail on ARM if the VCPU hasn't >> been pre-created. > > Right now, we abort the CPU initialization process if this happens. I > am planning to change abort() into 'fatal_error' in RFC V3 though. > > > >> >> Or did I get it all wrong? :) > > I won't say that it is just another point of view which is absolutely > fine. But I would like to stick to current APIs. No really strong opinion, I wouldn't do it that way. I'll let others chime in if they have an opinion.
> From: David Hildenbrand <david@redhat.com> > Sent: Monday, October 9, 2023 4:21 PM > To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu- > arm@nongnu.org > Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron > <jonathan.cameron@huawei.com>; lpieralisi@kernel.org; > peter.maydell@linaro.org; richard.henderson@linaro.org; > imammedo@redhat.com; andrew.jones@linux.dev; philmd@linaro.org; > eric.auger@redhat.com; oliver.upton@linux.dev; pbonzini@redhat.com; > mst@redhat.com; will@kernel.org; gshan@redhat.com; rafael@kernel.org; > alex.bennee@linaro.org; linux@armlinux.org.uk; > darren@os.amperecomputing.com; ilkka@os.amperecomputing.com; > vishnu@os.amperecomputing.com; karl.heubaum@oracle.com; > miguel.luis@oracle.com; salil.mehta@opnsrc.net; zhukeqian > <zhukeqian1@huawei.com>; wangxiongfeng (C) <wangxiongfeng2@huawei.com>; > wangyanan (Y) <wangyanan55@huawei.com>; jiakernel2@gmail.com; > maobibo@loongson.cn; lixianglai@loongson.cn; Linuxarm <linuxarm@huawei.com> > Subject: Re: [PATCH V3 01/10] accel/kvm: Extract common KVM vCPU > {creation,parking} code > > > > >> > >> kvm_precreate_vcpu > > > > pre-creation is very much specific to ARM right now. I am not sure > > if it is right to have an API with this name in the code which is > > common to other architectures. > > I don't like exposing the concept of "parking" CPUs externally, which is > so far handled completely internally. I understand your point of view. There is a subtle difference in the way parking logic has been used till now, say in x86 world and how it is being used in the ARM in the RFC patches being proposed. AFAICS, in x86 world we have a liberty to delay the creation of the vCPUs in KVM for the first time but once they are created cannot be destroyed in the KVM so are (un)parked for subsequent use during hot(un)plug. Because of the ARM CPU architecture limitations and that of GIC, we are not allowed to do this. Hence, we have to pre-create all the KVM vCPUs and size VGIC during initialization. Since some of the KVM vCPUs wont have any QOM CPU objects because they are 'yet-to-be-plugged' so need to be parked. Hence, we require that common parking logic. > > [...] > > > > > > >> kvm_create_vcpu > >> kvm_destroy_vcpu > >> > >> One could even make kvm_create_vcpu() fail on ARM if the VCPU hasn't > >> been pre-created. > > > > Right now, we abort the CPU initialization process if this happens. I > > am planning to change abort() into 'fatal_error' in RFC V3 though. > > > > > > > >> > >> Or did I get it all wrong? :) > > > > I won't say that it is just another point of view which is absolutely > > fine. But I would like to stick to current APIs. > > No really strong opinion, I wouldn't do it that way. I'll let others > chime in if they have an opinion. Ok, Thanks. Salil.
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index ff1578bb32..0dcaa15276 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -137,6 +137,7 @@ static QemuMutex kml_slots_lock; #define kvm_slots_unlock() qemu_mutex_unlock(&kml_slots_lock) static void kvm_slot_init_dirty_bitmap(KVMSlot *mem); +static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id); static inline void kvm_resample_fd_remove(int gsi) { @@ -320,14 +321,53 @@ err: return ret; } +void kvm_park_vcpu(CPUState *cpu) +{ + struct KVMParkedVcpu *vcpu; + + trace_kvm_park_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu)); + + vcpu = g_malloc0(sizeof(*vcpu)); + vcpu->vcpu_id = kvm_arch_vcpu_id(cpu); + vcpu->kvm_fd = cpu->kvm_fd; + QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node); +} + +int kvm_create_vcpu(CPUState *cpu) +{ + unsigned long vcpu_id = kvm_arch_vcpu_id(cpu); + KVMState *s = kvm_state; + int kvm_fd; + + trace_kvm_create_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu)); + + /* check if the KVM vCPU already exist but is parked */ + kvm_fd = kvm_get_vcpu(s, vcpu_id); + if (kvm_fd < 0) { + /* vCPU not parked: create a new KVM vCPU */ + kvm_fd = kvm_vm_ioctl(s, KVM_CREATE_VCPU, vcpu_id); + if (kvm_fd < 0) { + error_report("KVM_CREATE_VCPU IOCTL failed for vCPU %lu", vcpu_id); + return kvm_fd; + } + } + + cpu->kvm_fd = kvm_fd; + cpu->kvm_state = s; + cpu->vcpu_dirty = true; + cpu->dirty_pages = 0; + cpu->throttle_us_per_full = 0; + + return 0; +} + static int do_kvm_destroy_vcpu(CPUState *cpu) { KVMState *s = kvm_state; long mmap_size; - struct KVMParkedVcpu *vcpu = NULL; int ret = 0; - DPRINTF("kvm_destroy_vcpu\n"); + trace_kvm_destroy_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu)); ret = kvm_arch_destroy_vcpu(cpu); if (ret < 0) { @@ -353,10 +393,7 @@ static int do_kvm_destroy_vcpu(CPUState *cpu) } } - vcpu = g_malloc0(sizeof(*vcpu)); - vcpu->vcpu_id = kvm_arch_vcpu_id(cpu); - vcpu->kvm_fd = cpu->kvm_fd; - QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node); + kvm_park_vcpu(cpu); err: return ret; } @@ -377,6 +414,8 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id) if (cpu->vcpu_id == vcpu_id) { int kvm_fd; + trace_kvm_get_vcpu(vcpu_id); + QLIST_REMOVE(cpu, node); kvm_fd = cpu->kvm_fd; g_free(cpu); @@ -384,7 +423,7 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id) } } - return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id); + return -ENOENT; } int kvm_init_vcpu(CPUState *cpu, Error **errp) @@ -395,19 +434,14 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp) trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu)); - ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu)); + ret = kvm_create_vcpu(cpu); if (ret < 0) { - error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)", + error_setg_errno(errp, -ret, + "kvm_init_vcpu: kvm_create_vcpu failed (%lu)", kvm_arch_vcpu_id(cpu)); goto err; } - cpu->kvm_fd = ret; - cpu->kvm_state = s; - cpu->vcpu_dirty = true; - cpu->dirty_pages = 0; - cpu->throttle_us_per_full = 0; - mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0); if (mmap_size < 0) { ret = mmap_size; diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events index 399aaeb0ec..08e2dc253f 100644 --- a/accel/kvm/trace-events +++ b/accel/kvm/trace-events @@ -9,6 +9,10 @@ kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p" kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s" kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s" kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu" +kvm_create_vcpu(int cpu_index, unsigned long arch_cpu_id) "creating KVM cpu: cpu_index: %d arch vcpu-id: %lu" +kvm_get_vcpu(unsigned long arch_cpu_id) "unparking KVM vcpu: arch vcpu-id: %lu" +kvm_destroy_vcpu(int cpu_index, unsigned long arch_cpu_id) "destroy vcpu: cpu_index: %d arch vcpu-id: %lu" +kvm_park_vcpu(int cpu_index, unsigned long arch_cpu_id) "parking KVM vcpu: cpu_index: %d arch vcpu-id: %lu" kvm_irqchip_commit_routes(void) "" kvm_irqchip_add_msi_route(char *name, int vector, int virq) "dev %s vector %d virq %d" kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d" diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index ee9025f8e9..57bd8f8fd6 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -464,6 +464,20 @@ void kvm_set_sigmask_len(KVMState *s, unsigned int sigmask_len); int kvm_physical_memory_addr_from_host(KVMState *s, void *ram_addr, hwaddr *phys_addr); +/** + * kvm_create_vcpu - Gets a parked KVM vCPU or creates a KVM vCPU + * @cpu: QOM CPUState object for which KVM vCPU has to be fetched/created. + * + * @returns: 0 when success, errno (<0) when failed. + */ +int kvm_create_vcpu(CPUState *cpu); +/** + * kvm_park_vcpu - Gets a parked KVM vCPU if it exists + * @cpu: QOM CPUState object for which Qemu KVM vCPU context has to be parked. + * + * @returns: none + */ +void kvm_park_vcpu(CPUState *cpu); #endif /* NEED_CPU_H */
KVM vCPU creation is done once during the initialization of the VM when Qemu thread is spawned. This is common to all the architectures. Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the corresponding KVM vCPU object in the Host KVM is not destroyed and its representative KVM vCPU object/context in Qemu is parked. Refactor common logic so that some APIs could be reused by vCPU Hotplug code. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> --- accel/kvm/kvm-all.c | 64 ++++++++++++++++++++++++++++++++---------- accel/kvm/trace-events | 4 +++ include/sysemu/kvm.h | 14 +++++++++ 3 files changed, 67 insertions(+), 15 deletions(-)