diff mbox

cpu hotplug issue

Message ID CA+1DO-x=xdHePDtdkisiyM37G0pOi5NLrwwZUOw4JXkqDu3dMw@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Vasilis Liaskovitis July 21, 2011, 11:06 a.m. UTC
Hi,

On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
>> Hello,
>>
>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
>>
> You do everything right. It's qemu who is buggy. Since qemu need a patch
> for cpu hotplug to not crash it nobody tests it, so code bit rots.

thanks for your reply.

As I mentioned in the original email, onlining a hotplugged-cpu with
qemu-kvm/master results in:

>> echo 1 > /sys/devices/system/cpu/cpu1/online
>> bash: echo: write error: Input/output error
>>
>> in the guest, dmesg reports:
>>
>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
>> [ 2330.821306] CPU1: Not responding.

I tried to git-bisect between qemu-kvm-0.13.0 (last known version
where cpu hotplug works correctly
for me) and qemu-kvm/master.

More precisely: To enable cpu-hotplug at each bisect stage, I apply
this patch derived from:
http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Gleb Natapov July 21, 2011, 11:33 a.m. UTC | #1
On Thu, Jul 21, 2011 at 01:06:41PM +0200, Vasilis Liaskovitis wrote:
> Hi,
> 
> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
> > On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
> >> Hello,
> >>
> >> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
> >>
> > You do everything right. It's qemu who is buggy. Since qemu need a patch
> > for cpu hotplug to not crash it nobody tests it, so code bit rots.
> 
> thanks for your reply.
> 
> As I mentioned in the original email, onlining a hotplugged-cpu with
> qemu-kvm/master results in:
> 
> >> echo 1 > /sys/devices/system/cpu/cpu1/online
> >> bash: echo: write error: Input/output error
> >>
> >> in the guest, dmesg reports:
> >>
> >> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
> >> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
> >> [ 2330.821306] CPU1: Not responding.
> 
> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
> where cpu hotplug works correctly
> for me) and qemu-kvm/master.
> 
> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> this patch derived from:
> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> 
> diff --git a/hw/qdev.c b/hw/qdev.c
> index 1aa1ea0..aed48ce 100644
> --- a/hw/qdev.c
> +++ b/hw/qdev.c
> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>     if (!main_system_bus) {
>         main_system_bus = qbus_create(&system_bus_info, NULL,
>                                       "main-system-bus");
> +       main_system_bus->allow_hotplug = 1;
>     }
>     return main_system_bus;
> }
> 
> and test cpu hotplug functionality.
> The commit that appears to break CPU hotplug is:
> 
Thank you for going through the pain of bisecting it! Bisects that
require patch to be applied between each step are especially annoying.

> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
> Author: Jan Kiszka <jan.kiszka@siemens.com>
> Date:   Mon Feb 21 12:28:07 2011 +0100
>    qemu-kvm: Mark VCPU state dirty on creation
> 
Jan can you look at this please?

> Is it possible that kvm_vcpu_dirty should not be set to 1 for a CPU
> that's being hot-plugged?
> I.e. when kvm_cpu_exec() is called for the first time during
> initialization of a hotplugged-CPU,
> we shouldn't try to restore state with kvm_arch_put_registers().
> 
> The following patch enables hotplug and solves the non-responsive
> hotplugged CPU problem,
> by not setting the vcpu_dirty state when hotplugging. Enabling hotplug
> is still done through
> the main_system_bus (see above).
> Tested on amd64host/amd64guest combination. Comments?
> 
> Note that there is probably another bug in qemu-kvm/master regarding
> acpi-udev event delivery for
> a cpu-hotplug event (cpu_set in the qemu_monitor no longer triggers
> the event in the guest, see
> first issue in my original mail). This patch does not address that issue.
> 
Did this work in qemu-0.13?

> Signed-off-by: Vasilis Liaskovitis <vliaskov@gmail.com>
> ---
> 
>  hw/acpi_piix4.c   |    2 +-
>  hw/pc.c           |   13 +++++++++++--
>  hw/qdev.c         |    1 +
>  kvm-all.c         |   22 +++++++++++++++++++++-
>  kvm.h             |    3 +++
>  target-i386/cpu.h |    2 +-
>  6 files changed, 38 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> index c30a050..b0cac60 100644
> --- a/hw/acpi_piix4.c
> +++ b/hw/acpi_piix4.c
> @@ -584,7 +584,7 @@ void qemu_system_cpu_hot_add(int cpu, int state)
>      PIIX4PMState *s = global_piix4_pm_state;
> 
>      if (state && !qemu_get_cpu(cpu)) {
> -        env = pc_new_cpu(global_cpu_model);
> +        env = pc_new_cpu(global_cpu_model, 1);
>          if (!env) {
>              fprintf(stderr, "cpu %d creation failed\n", cpu);
>              return;
> diff --git a/hw/pc.c b/hw/pc.c
> index c0a88e1..8cfbf27 100644
> --- a/hw/pc.c
> +++ b/hw/pc.c
> @@ -924,7 +924,7 @@ static void pc_cpu_reset(void *opaque)
>      env->halted = !cpu_is_bsp(env);
>  }
> 
> -CPUState *pc_new_cpu(const char *cpu_model)
> +CPUState *pc_new_cpu(const char *cpu_model, int hotplugged)
>  {
>      CPUState *env;
> 
> @@ -936,7 +936,16 @@ CPUState *pc_new_cpu(const char *cpu_model)
>  #endif
>      }
> 
> +    if (hotplugged) {
> +        kvm_set_cpuhotplug_req();
> +    }
> +
>      env = cpu_init(cpu_model);
> +
> +    if (hotplugged) {
> +        kvm_reset_cpuhotplug_req();
> +    }
> +
>      if (!env) {
>          fprintf(stderr, "Unable to find x86 CPU definition\n");
>          exit(1);
> @@ -956,7 +965,7 @@ void pc_cpus_init(const char *cpu_model)
> 
>      /* init CPUs */
>      for(i = 0; i < smp_cpus; i++) {
> -        pc_new_cpu(cpu_model);
> +        pc_new_cpu(cpu_model, 0);
>      }
>  }
> 
> diff --git a/hw/qdev.c b/hw/qdev.c
> index 292b52f..f85ac0f 100644
> --- a/hw/qdev.c
> +++ b/hw/qdev.c
> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>      if (!main_system_bus) {
>          main_system_bus = qbus_create(&system_bus_info, NULL,
>                                        "main-system-bus");
> +	main_system_bus->allow_hotplug = 1;
>      }
>      return main_system_bus;
>  }
> diff --git a/kvm-all.c b/kvm-all.c
> index 3ad2459..8aae1d7 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -85,6 +85,7 @@ struct KVMState
>  #endif
>      void *used_gsi_bitmap;
>      int max_gsi;
> +    int cpu_hotplug_req;
>  };
> 
>  KVMState *kvm_state;
> @@ -220,7 +221,9 @@ int kvm_init_vcpu(CPUState *env)
> 
>      env->kvm_fd = ret;
>      env->kvm_state = s;
> -    env->kvm_vcpu_dirty = 1;
> +    if (!kvm_has_cpuhotplug_req()) {
> +        env->kvm_vcpu_dirty = 1;
> +    }
> 
>      mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
>      if (mmap_size < 0) {
> @@ -797,6 +800,8 @@ int kvm_init(void)
> 
>      s->pit_in_kernel = kvm_pit;
> 
> +    s->cpu_hotplug_req = 0;
> +
>      ret = kvm_arch_init(s);
>      if (ret < 0) {
>          goto err;
> @@ -1104,6 +1109,21 @@ int kvm_has_vcpu_events(void)
>      return kvm_state->vcpu_events;
>  }
> 
> +int kvm_has_cpuhotplug_req(void)
> +{
> +    return kvm_state->cpu_hotplug_req;
> +}
> +
> +void kvm_set_cpuhotplug_req(void)
> +{
> +    kvm_state->cpu_hotplug_req = 1;
> +}
> +
> +void kvm_reset_cpuhotplug_req(void)
> +{
> +    kvm_state->cpu_hotplug_req = 0;
> +}
> +
>  int kvm_has_robust_singlestep(void)
>  {
>      return kvm_state->robust_singlestep;
> diff --git a/kvm.h b/kvm.h
> index b15e1dd..7fa72fd 100644
> --- a/kvm.h
> +++ b/kvm.h
> @@ -52,6 +52,9 @@ int kvm_has_xsave(void);
>  int kvm_has_xcrs(void);
>  int kvm_has_many_ioeventfds(void);
>  int kvm_has_pit_state2(void);
> +int kvm_has_cpuhotplug_req(void);
> +void kvm_set_cpuhotplug_req(void);
> +void kvm_reset_cpuhotplug_req(void);
> 
>  #ifdef NEED_CPU_H
>  int kvm_init_vcpu(CPUState *env);
> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> index 935d08a..7e7839b 100644
> --- a/target-i386/cpu.h
> +++ b/target-i386/cpu.h
> @@ -923,7 +923,7 @@ void cpu_x86_update_cr4(CPUX86State *env, uint32_t new_cr4);
>  /* hw/pc.c */
>  void cpu_smm_update(CPUX86State *env);
>  uint64_t cpu_get_tsc(CPUX86State *env);
> -CPUState *pc_new_cpu(const char *cpu_model);
> +CPUState *pc_new_cpu(const char *cpu_model, int hotplugged);
> 
>  /* used to debug */
>  #define X86_DUMP_FPU  0x0001 /* dump FPU state too */

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka July 21, 2011, 11:36 a.m. UTC | #2
On 2011-07-21 13:06, Vasilis Liaskovitis wrote:
> Hi,
> 
> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
>> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
>>> Hello,
>>>
>>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
>>>
>> You do everything right. It's qemu who is buggy. Since qemu need a patch
>> for cpu hotplug to not crash it nobody tests it, so code bit rots.
> 
> thanks for your reply.
> 
> As I mentioned in the original email, onlining a hotplugged-cpu with
> qemu-kvm/master results in:
> 
>>> echo 1 > /sys/devices/system/cpu/cpu1/online
>>> bash: echo: write error: Input/output error
>>>
>>> in the guest, dmesg reports:
>>>
>>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
>>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
>>> [ 2330.821306] CPU1: Not responding.
> 
> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
> where cpu hotplug works correctly
> for me) and qemu-kvm/master.
> 
> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> this patch derived from:
> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> 
> diff --git a/hw/qdev.c b/hw/qdev.c
> index 1aa1ea0..aed48ce 100644
> --- a/hw/qdev.c
> +++ b/hw/qdev.c
> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>     if (!main_system_bus) {
>         main_system_bus = qbus_create(&system_bus_info, NULL,
>                                       "main-system-bus");
> +       main_system_bus->allow_hotplug = 1;
>     }
>     return main_system_bus;
> }
> 
> and test cpu hotplug functionality.
> The commit that appears to break CPU hotplug is:
> 
> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
> Author: Jan Kiszka <jan.kiszka@siemens.com>
> Date:   Mon Feb 21 12:28:07 2011 +0100
>    qemu-kvm: Mark VCPU state dirty on creation
> 
> Is it possible that kvm_vcpu_dirty should not be set to 1 for a CPU
> that's being hot-plugged?
> I.e. when kvm_cpu_exec() is called for the first time during
> initialization of a hotplugged-CPU,
> we shouldn't try to restore state with kvm_arch_put_registers().

We should because user space defines the CPU state on creation or after
reset, and that state has to be transferred to the kernel. If you get
further by skipping this, we likely create to wrong state in user space
while the kernel happens to have a working one. So the state needs
fixing, not the write back (IOW, your workaround likely papers over the
real issue).

So far for a high-level analysis without digging in this dirt on my own. :)

Jan
Jan Kiszka July 21, 2011, 11:42 a.m. UTC | #3
On 2011-07-21 13:33, Gleb Natapov wrote:
> On Thu, Jul 21, 2011 at 01:06:41PM +0200, Vasilis Liaskovitis wrote:
>> Hi,
>>
>> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
>>> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
>>>> Hello,
>>>>
>>>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
>>>>
>>> You do everything right. It's qemu who is buggy. Since qemu need a patch
>>> for cpu hotplug to not crash it nobody tests it, so code bit rots.
>>
>> thanks for your reply.
>>
>> As I mentioned in the original email, onlining a hotplugged-cpu with
>> qemu-kvm/master results in:
>>
>>>> echo 1 > /sys/devices/system/cpu/cpu1/online
>>>> bash: echo: write error: Input/output error
>>>>
>>>> in the guest, dmesg reports:
>>>>
>>>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
>>>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
>>>> [ 2330.821306] CPU1: Not responding.
>>
>> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
>> where cpu hotplug works correctly
>> for me) and qemu-kvm/master.
>>
>> More precisely: To enable cpu-hotplug at each bisect stage, I apply
>> this patch derived from:
>> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
>>
>> diff --git a/hw/qdev.c b/hw/qdev.c
>> index 1aa1ea0..aed48ce 100644
>> --- a/hw/qdev.c
>> +++ b/hw/qdev.c
>> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>>     if (!main_system_bus) {
>>         main_system_bus = qbus_create(&system_bus_info, NULL,
>>                                       "main-system-bus");
>> +       main_system_bus->allow_hotplug = 1;
>>     }
>>     return main_system_bus;
>> }
>>
>> and test cpu hotplug functionality.
>> The commit that appears to break CPU hotplug is:
>>
> Thank you for going through the pain of bisecting it! Bisects that
> require patch to be applied between each step are especially annoying.
> 
>> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>> Date:   Mon Feb 21 12:28:07 2011 +0100
>>    qemu-kvm: Mark VCPU state dirty on creation
>>
> Jan can you look at this please?

I can't promise to do debugging myself.

Also, as I never succeeded in getting anything working with CPU hotplug,
even back in the days it was supposed to work, I'm a bit clueless /wrt
to the right test cases.

Jan
Gleb Natapov July 21, 2011, 11:51 a.m. UTC | #4
On Thu, Jul 21, 2011 at 01:42:08PM +0200, Jan Kiszka wrote:
> On 2011-07-21 13:33, Gleb Natapov wrote:
> > On Thu, Jul 21, 2011 at 01:06:41PM +0200, Vasilis Liaskovitis wrote:
> >> Hi,
> >>
> >> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
> >>> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
> >>>> Hello,
> >>>>
> >>>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
> >>>>
> >>> You do everything right. It's qemu who is buggy. Since qemu need a patch
> >>> for cpu hotplug to not crash it nobody tests it, so code bit rots.
> >>
> >> thanks for your reply.
> >>
> >> As I mentioned in the original email, onlining a hotplugged-cpu with
> >> qemu-kvm/master results in:
> >>
> >>>> echo 1 > /sys/devices/system/cpu/cpu1/online
> >>>> bash: echo: write error: Input/output error
> >>>>
> >>>> in the guest, dmesg reports:
> >>>>
> >>>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
> >>>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
> >>>> [ 2330.821306] CPU1: Not responding.
> >>
> >> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
> >> where cpu hotplug works correctly
> >> for me) and qemu-kvm/master.
> >>
> >> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> >> this patch derived from:
> >> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> >>
> >> diff --git a/hw/qdev.c b/hw/qdev.c
> >> index 1aa1ea0..aed48ce 100644
> >> --- a/hw/qdev.c
> >> +++ b/hw/qdev.c
> >> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
> >>     if (!main_system_bus) {
> >>         main_system_bus = qbus_create(&system_bus_info, NULL,
> >>                                       "main-system-bus");
> >> +       main_system_bus->allow_hotplug = 1;
> >>     }
> >>     return main_system_bus;
> >> }
> >>
> >> and test cpu hotplug functionality.
> >> The commit that appears to break CPU hotplug is:
> >>
> > Thank you for going through the pain of bisecting it! Bisects that
> > require patch to be applied between each step are especially annoying.
> > 
> >> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
> >> Author: Jan Kiszka <jan.kiszka@siemens.com>
> >> Date:   Mon Feb 21 12:28:07 2011 +0100
> >>    qemu-kvm: Mark VCPU state dirty on creation
> >>
> > Jan can you look at this please?
> 
> I can't promise to do debugging myself.
> 
> Also, as I never succeeded in getting anything working with CPU hotplug,
> even back in the days it was supposed to work, I'm a bit clueless /wrt
> to the right test cases.
> 
CPU hotplug for Linux suppose to be easy (with allow_hotplug patch
applied). But we have two bugs currently. One is that ACPI interrupt
is not send when cpu is onlined (at least this appears to be the case).
I will look at that one. Another is that after new cpu is detected it
can't be onlined.

After fixing the first bug the test should look like this:
1. start vm with  -smp 1,macpus=2
2. wait for it to boot
3. do "cpu 1 online" in monitor.
4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"

If step 4 should succeed. It fails now.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka July 21, 2011, 11:55 a.m. UTC | #5
On 2011-07-21 13:51, Gleb Natapov wrote:
> On Thu, Jul 21, 2011 at 01:42:08PM +0200, Jan Kiszka wrote:
>> On 2011-07-21 13:33, Gleb Natapov wrote:
>>> On Thu, Jul 21, 2011 at 01:06:41PM +0200, Vasilis Liaskovitis wrote:
>>>> Hi,
>>>>
>>>> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
>>>>> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
>>>>>>
>>>>> You do everything right. It's qemu who is buggy. Since qemu need a patch
>>>>> for cpu hotplug to not crash it nobody tests it, so code bit rots.
>>>>
>>>> thanks for your reply.
>>>>
>>>> As I mentioned in the original email, onlining a hotplugged-cpu with
>>>> qemu-kvm/master results in:
>>>>
>>>>>> echo 1 > /sys/devices/system/cpu/cpu1/online
>>>>>> bash: echo: write error: Input/output error
>>>>>>
>>>>>> in the guest, dmesg reports:
>>>>>>
>>>>>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
>>>>>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
>>>>>> [ 2330.821306] CPU1: Not responding.
>>>>
>>>> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
>>>> where cpu hotplug works correctly
>>>> for me) and qemu-kvm/master.
>>>>
>>>> More precisely: To enable cpu-hotplug at each bisect stage, I apply
>>>> this patch derived from:
>>>> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
>>>>
>>>> diff --git a/hw/qdev.c b/hw/qdev.c
>>>> index 1aa1ea0..aed48ce 100644
>>>> --- a/hw/qdev.c
>>>> +++ b/hw/qdev.c
>>>> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>>>>     if (!main_system_bus) {
>>>>         main_system_bus = qbus_create(&system_bus_info, NULL,
>>>>                                       "main-system-bus");
>>>> +       main_system_bus->allow_hotplug = 1;
>>>>     }
>>>>     return main_system_bus;
>>>> }
>>>>
>>>> and test cpu hotplug functionality.
>>>> The commit that appears to break CPU hotplug is:
>>>>
>>> Thank you for going through the pain of bisecting it! Bisects that
>>> require patch to be applied between each step are especially annoying.
>>>
>>>> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>>> Date:   Mon Feb 21 12:28:07 2011 +0100
>>>>    qemu-kvm: Mark VCPU state dirty on creation
>>>>
>>> Jan can you look at this please?
>>
>> I can't promise to do debugging myself.
>>
>> Also, as I never succeeded in getting anything working with CPU hotplug,
>> even back in the days it was supposed to work, I'm a bit clueless /wrt
>> to the right test cases.
>>
> CPU hotplug for Linux suppose to be easy (with allow_hotplug patch
> applied). But we have two bugs currently. One is that ACPI interrupt

I bet we have multiple bugs for quite a while now, which accelerated bit
rotting even further.

> is not send when cpu is onlined (at least this appears to be the case).
> I will look at that one. Another is that after new cpu is detected it
> can't be onlined.
> 
> After fixing the first bug the test should look like this:
> 1. start vm with  -smp 1,macpus=2
> 2. wait for it to boot
> 3. do "cpu 1 online" in monitor.
> 4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"
> 
> If step 4 should succeed. It fails now.

Should the above work even with the current ACPI issue unfixed?

Jan
Gleb Natapov July 21, 2011, noon UTC | #6
On Thu, Jul 21, 2011 at 01:55:26PM +0200, Jan Kiszka wrote:
> On 2011-07-21 13:51, Gleb Natapov wrote:
> > On Thu, Jul 21, 2011 at 01:42:08PM +0200, Jan Kiszka wrote:
> >> On 2011-07-21 13:33, Gleb Natapov wrote:
> >>> On Thu, Jul 21, 2011 at 01:06:41PM +0200, Vasilis Liaskovitis wrote:
> >>>> Hi,
> >>>>
> >>>> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
> >>>>> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
> >>>>>>
> >>>>> You do everything right. It's qemu who is buggy. Since qemu need a patch
> >>>>> for cpu hotplug to not crash it nobody tests it, so code bit rots.
> >>>>
> >>>> thanks for your reply.
> >>>>
> >>>> As I mentioned in the original email, onlining a hotplugged-cpu with
> >>>> qemu-kvm/master results in:
> >>>>
> >>>>>> echo 1 > /sys/devices/system/cpu/cpu1/online
> >>>>>> bash: echo: write error: Input/output error
> >>>>>>
> >>>>>> in the guest, dmesg reports:
> >>>>>>
> >>>>>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
> >>>>>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
> >>>>>> [ 2330.821306] CPU1: Not responding.
> >>>>
> >>>> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
> >>>> where cpu hotplug works correctly
> >>>> for me) and qemu-kvm/master.
> >>>>
> >>>> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> >>>> this patch derived from:
> >>>> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> >>>>
> >>>> diff --git a/hw/qdev.c b/hw/qdev.c
> >>>> index 1aa1ea0..aed48ce 100644
> >>>> --- a/hw/qdev.c
> >>>> +++ b/hw/qdev.c
> >>>> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
> >>>>     if (!main_system_bus) {
> >>>>         main_system_bus = qbus_create(&system_bus_info, NULL,
> >>>>                                       "main-system-bus");
> >>>> +       main_system_bus->allow_hotplug = 1;
> >>>>     }
> >>>>     return main_system_bus;
> >>>> }
> >>>>
> >>>> and test cpu hotplug functionality.
> >>>> The commit that appears to break CPU hotplug is:
> >>>>
> >>> Thank you for going through the pain of bisecting it! Bisects that
> >>> require patch to be applied between each step are especially annoying.
> >>>
> >>>> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
> >>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
> >>>> Date:   Mon Feb 21 12:28:07 2011 +0100
> >>>>    qemu-kvm: Mark VCPU state dirty on creation
> >>>>
> >>> Jan can you look at this please?
> >>
> >> I can't promise to do debugging myself.
> >>
> >> Also, as I never succeeded in getting anything working with CPU hotplug,
> >> even back in the days it was supposed to work, I'm a bit clueless /wrt
> >> to the right test cases.
> >>
> > CPU hotplug for Linux suppose to be easy (with allow_hotplug patch
> > applied). But we have two bugs currently. One is that ACPI interrupt
> 
> I bet we have multiple bugs for quite a while now, which accelerated bit
> rotting even further.
> 
> > is not send when cpu is onlined (at least this appears to be the case).
> > I will look at that one. Another is that after new cpu is detected it
> > can't be onlined.
> > 
> > After fixing the first bug the test should look like this:
> > 1. start vm with  -smp 1,macpus=2
> > 2. wait for it to boot
> > 3. do "cpu 1 online" in monitor.
> > 4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"
> > 
> > If step 4 should succeed. It fails now.
> 
> Should the above work even with the current ACPI issue unfixed?
> 
Last I checked cpu hotplug worked. cpu unplug didn't. ACPI is only
responsible for making guest noticing new cpu, not starting it, and
according to Vasilis testing this is still working (kind of).

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity July 21, 2011, 12:18 p.m. UTC | #7
On 07/21/2011 02:55 PM, Jan Kiszka wrote:
> >>
> >  CPU hotplug for Linux suppose to be easy (with allow_hotplug patch
> >  applied). But we have two bugs currently. One is that ACPI interrupt
>
> I bet we have multiple bugs for quite a while now, which accelerated bit
> rotting even further.
>

kvm-autotest to the rescue!

Lucas, do we have a cpu hotplug test in kvm-autotest?

If so, I promise to add it to my autotest run once it is fixed, to 
prevent the spread of the disease.

btw, Anthony, are you running kvm-autotest these days? with qemu-kvm 
diverging less and less from qemu.git these days, it's becoming less 
effective to run it downstream.
Jan Kiszka July 21, 2011, 12:22 p.m. UTC | #8
On 2011-07-21 13:06, Vasilis Liaskovitis wrote:
> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> this patch derived from:
> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> 
> diff --git a/hw/qdev.c b/hw/qdev.c
> index 1aa1ea0..aed48ce 100644
> --- a/hw/qdev.c
> +++ b/hw/qdev.c
> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>     if (!main_system_bus) {
>         main_system_bus = qbus_create(&system_bus_info, NULL,
>                                       "main-system-bus");
> +       main_system_bus->allow_hotplug = 1;

BTW, this reminds me why "fixing" CPU hotplug won't help a lot. We
finally need to _design_ the infrastructure required for CPU hotplugging
and stop hacking away reasonable bits that do not fit in the current
simplistic CPU interface model.

The questions are:
 - How should the CPU-APIC-chipset topology should look like in an
   ideal (qdev reworked or qdev replaced with X) world should like?
 - How can we make useful step in that direction given the current
   mechanisms (e.g. a hotpluggable APIC bus created by the chipset)?

Jan
Gleb Natapov July 21, 2011, 12:22 p.m. UTC | #9
On Thu, Jul 21, 2011 at 03:18:03PM +0300, Avi Kivity wrote:
> On 07/21/2011 02:55 PM, Jan Kiszka wrote:
> >>>
> >>  CPU hotplug for Linux suppose to be easy (with allow_hotplug patch
> >>  applied). But we have two bugs currently. One is that ACPI interrupt
> >
> >I bet we have multiple bugs for quite a while now, which accelerated bit
> >rotting even further.
> >
> 
> kvm-autotest to the rescue!
> 
+1 but cpu hotplug does not work without patch in
http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html.
The patch was rejected and alternative wasn't proposed.

> Lucas, do we have a cpu hotplug test in kvm-autotest?
> 
> If so, I promise to add it to my autotest run once it is fixed, to
> prevent the spread of the disease.
> 
> btw, Anthony, are you running kvm-autotest these days? with qemu-kvm
> diverging less and less from qemu.git these days, it's becoming less
> effective to run it downstream.
> 
> -- 
> error compiling committee.c: too many arguments to function

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gleb Natapov July 21, 2011, 12:25 p.m. UTC | #10
On Thu, Jul 21, 2011 at 02:22:07PM +0200, Jan Kiszka wrote:
> On 2011-07-21 13:06, Vasilis Liaskovitis wrote:
> > More precisely: To enable cpu-hotplug at each bisect stage, I apply
> > this patch derived from:
> > http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> > 
> > diff --git a/hw/qdev.c b/hw/qdev.c
> > index 1aa1ea0..aed48ce 100644
> > --- a/hw/qdev.c
> > +++ b/hw/qdev.c
> > @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
> >     if (!main_system_bus) {
> >         main_system_bus = qbus_create(&system_bus_info, NULL,
> >                                       "main-system-bus");
> > +       main_system_bus->allow_hotplug = 1;
> 
> BTW, this reminds me why "fixing" CPU hotplug won't help a lot. We
I do not see relation between qdev not allowing cpu hotplug bug and cpu been
created with incorrect state bug. Fixing of former will not magically
make later disappear. Both should be solved.

> finally need to _design_ the infrastructure required for CPU hotplugging
> and stop hacking away reasonable bits that do not fit in the current
> simplistic CPU interface model.
> 
> The questions are:
>  - How should the CPU-APIC-chipset topology should look like in an
>    ideal (qdev reworked or qdev replaced with X) world should like?
>  - How can we make useful step in that direction given the current
>    mechanisms (e.g. a hotpluggable APIC bus created by the chipset)?
> 
> Jan
> 
> -- 
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka July 21, 2011, 12:35 p.m. UTC | #11
On 2011-07-21 14:25, Gleb Natapov wrote:
> On Thu, Jul 21, 2011 at 02:22:07PM +0200, Jan Kiszka wrote:
>> On 2011-07-21 13:06, Vasilis Liaskovitis wrote:
>>> More precisely: To enable cpu-hotplug at each bisect stage, I apply
>>> this patch derived from:
>>> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
>>>
>>> diff --git a/hw/qdev.c b/hw/qdev.c
>>> index 1aa1ea0..aed48ce 100644
>>> --- a/hw/qdev.c
>>> +++ b/hw/qdev.c
>>> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>>>     if (!main_system_bus) {
>>>         main_system_bus = qbus_create(&system_bus_info, NULL,
>>>                                       "main-system-bus");
>>> +       main_system_bus->allow_hotplug = 1;
>>
>> BTW, this reminds me why "fixing" CPU hotplug won't help a lot. We
> I do not see relation between qdev not allowing cpu hotplug bug

It's not a bug, it's a feature: The system bus we attach APICs to so far
is a statically configured beast, intentionally hotplug-incapable.
Hacking this away in qemu-kvm will be a step towards where we came from
and will break sooner or later again. We need a CPU/APIC bus, also for
other reasons, that can then be made hotplug-capable.

> and cpu been
> created with incorrect state bug. Fixing of former will not magically
> make later disappear. Both should be solved.

The issues hidden by the topology design problem should be solved
nevertheless, that's true.

Jan
Jan Kiszka July 21, 2011, 12:39 p.m. UTC | #12
On 2011-07-21 14:18, Avi Kivity wrote:
> On 07/21/2011 02:55 PM, Jan Kiszka wrote:
>>>>
>>>  CPU hotplug for Linux suppose to be easy (with allow_hotplug patch
>>>  applied). But we have two bugs currently. One is that ACPI interrupt
>>
>> I bet we have multiple bugs for quite a while now, which accelerated bit
>> rotting even further.
>>
> 
> kvm-autotest to the rescue!

CPU hotplug primarily suffers from lacking integration with QEMU
(hotpluggable CPU bus, qdev'ified CPUs to enable "device_add/del cpu",
and all that upstream). That needs to be finally solved, then regression
testing will indeed be very important.

Jan
Gleb Natapov July 21, 2011, 12:40 p.m. UTC | #13
On Thu, Jul 21, 2011 at 02:35:09PM +0200, Jan Kiszka wrote:
> On 2011-07-21 14:25, Gleb Natapov wrote:
> > On Thu, Jul 21, 2011 at 02:22:07PM +0200, Jan Kiszka wrote:
> >> On 2011-07-21 13:06, Vasilis Liaskovitis wrote:
> >>> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> >>> this patch derived from:
> >>> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> >>>
> >>> diff --git a/hw/qdev.c b/hw/qdev.c
> >>> index 1aa1ea0..aed48ce 100644
> >>> --- a/hw/qdev.c
> >>> +++ b/hw/qdev.c
> >>> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
> >>>     if (!main_system_bus) {
> >>>         main_system_bus = qbus_create(&system_bus_info, NULL,
> >>>                                       "main-system-bus");
> >>> +       main_system_bus->allow_hotplug = 1;
> >>
> >> BTW, this reminds me why "fixing" CPU hotplug won't help a lot. We
> > I do not see relation between qdev not allowing cpu hotplug bug
> 
> It's not a bug, it's a feature: The system bus we attach APICs to so far
> is a statically configured beast, intentionally hotplug-incapable.
> Hacking this away in qemu-kvm will be a step towards where we came from
> and will break sooner or later again. We need a CPU/APIC bus, also for
> other reasons, that can then be made hotplug-capable.
> 
Agree. I was calling the whole situation the "bug", not this missing line in
qdev in particular.

> > and cpu been
> > created with incorrect state bug. Fixing of former will not magically
> > make later disappear. Both should be solved.
> 
> The issues hidden by the topology design problem should be solved
> nevertheless, that's true.
> 
> Jan
> 
> -- 
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vasilis Liaskovitis July 21, 2011, 1:08 p.m. UTC | #14
Hi,

thanks for looking at this closer.

On Thu, Jul 21, 2011 at 1:33 PM, Gleb Natapov <gleb@redhat.com> wrote:

>> Note that there is probably another bug in qemu-kvm/master regarding
>> acpi-udev event delivery for
>> a cpu-hotplug event (cpu_set in the qemu_monitor no longer triggers
>> the event in the guest, see
>> first issue in my original mail). This patch does not address that issue.
>>
> Did this work in qemu-0.13?

yes, the acpi-event delivery worked fine for me in 0.13.0

I also tried to separately bisect the acpi-udev delivery issue between
0.13.0 and qemu-kvm/master but ended up
with a big merge commit as the culprit. Not sure if it would help.

FWIW, here's my "git bisect log" for the acpi-udev delivery issue:

git bisect start
# good: [2e1e73f3d746feffc1c12e9463b2fc04e10df860] Merge remote branch
'upstream/stable-0.13' into stable-0.13
git bisect good 2e1e73f3d746feffc1c12e9463b2fc04e10df860
# bad: [fda19064e889d4419dd3dc69ca8e6e7a1535fdf5] Merge branch 'upstream-merge'
git bisect bad fda19064e889d4419dd3dc69ca8e6e7a1535fdf5
# good: [01ac6428a04576ae6f84f07d82c98da304b9ac77] Fix memory leak in
register save load due to xsave support
git bisect good 01ac6428a04576ae6f84f07d82c98da304b9ac77
# good: [f14224a695f51576e33d96a4bc26b9a67899dbb9] kvm_stat: add 'x'
key for enabling/disabling "drilldown"
git bisect good f14224a695f51576e33d96a4bc26b9a67899dbb9
# good: [f14224a695f51576e33d96a4bc26b9a67899dbb9] kvm_stat: add 'x'
key for enabling/disabling "drilldown"
git bisect good f14224a695f51576e33d96a4bc26b9a67899dbb9
# bad: [f47b14bbbe43fbaef9137e2abea8d300a6e6287f] Merge commit
'45fe15c25a5c9feea6e0f78434f5e9f632de9d94' into upstream-merge
git bisect bad f47b14bbbe43fbaef9137e2abea8d300a6e6287f
# good: [d23948b15a9920fb7f6374b55a6db1ecff81f3ee] lm32: add Milkymist
VGAFB support
git bisect good d23948b15a9920fb7f6374b55a6db1ecff81f3ee
# good: [47c0143cdde080101e97a1b39f3ff13e33b5726c] target-i386: add
CPU86_LDouble <-> double conversion functions
git bisect good 47c0143cdde080101e97a1b39f3ff13e33b5726c
# good: [d2d979c628e4b2c4a3cb71a31841875795c79043] NBD: Avoid leaking
a couple of strings when the NBD device is closed
git bisect good d2d979c628e4b2c4a3cb71a31841875795c79043
# bad: [70757dcaa40e14978bf287084d8fab9efb815a2d] qemu-kvm: Limit MSI
vector walk to actual array size
git bisect bad 70757dcaa40e14978bf287084d8fab9efb815a2d
# good: [913f5649dc778a1635e3440edba3429d84caec89] Merge commit
'a9f8ad8f2acdb2398da5d32a5efc19cb0196d79f' into upstream-merge
git bisect good 913f5649dc778a1635e3440edba3429d84caec89
# bad: [ad16018ee64433754914b73782da94fe2bcc0dac] qemu-kvm: Move gsi
bits from kvm_msix_vector_add to kvm_msi_add_message
git bisect bad ad16018ee64433754914b73782da94fe2bcc0dac
# bad: [98e934f0ab47583e82e3cabaa1c6a48c816e7bd0] qemu-kvm:
Synchronize states before reset
git bisect bad 98e934f0ab47583e82e3cabaa1c6a48c816e7bd0
# bad: [dc080800baef853f5a4d383d4f00d6afa7a46ff4] Merge branch 'upstream-merge'
git bisect bad dc080800baef853f5a4d383d4f00d6afa7a46ff4
# bad: [7d821420734b3e1fed438ae41b5791eb281f9448] Merge commit
'4d9ad7f793605abd9806fc932b3e04e028894565' into upstream-merge
git bisect bad 7d821420734b3e1fed438ae41b5791eb281f9448
# bad: [fc58948644fe3975c541f8452c63dd2d257587bd] Merge commit
'23910d3f669d46073b403876e30a7314599633af' into upstream-merge
git bisect bad fc58948644fe3975c541f8452c63dd2d257587bd

fc58948644fe3975c541f8452c63dd2d257587bd is the first bad commit

git show fc58948644fe3975c541f8452c63dd2d257587bd

commit fc58948644fe3975c541f8452c63dd2d257587bd
Merge: 913f564 23910d3
Author: Marcelo Tosatti <mtosatti@redhat.com>
Date:   Fri Apr 15 08:35:38 2011 -0300

    Merge commit '23910d3f669d46073b403876e30a7314599633af' into upstream-merge

    * commit '23910d3f669d46073b403876e30a7314599633af': (109 commits)
      acpi, acpi_piix: factor out GPE logic
      arm: basic support for ARMv4/ARMv4T emulation
      Fix conversions from pointer to tcg_target_long
      vnc: tight: Fix crash after 2GB of output
      smbus_eeprom: consolidate smbus eeprom creation oc pc_piix,
mips_mapta, mips_fulong
      lan9118: Ignore write to MAC_VLAN1 register
      acpi, acpi_piix, vt82c686: factor out PM1_CNT logic
      acpi, acpi_piix, vt82c686: factor out PM1a EVT logic
      acpi, acpi_piix, vt82c686: factor out PM_TMR logic
      hw/pflash_cfi02: Fix lazy reset of ROMD mode
      configure: avoid basename usage message
      mpc85xx_pci_map_irq: change "unknow" to "unknown"
      event: trivial coding style fixes
      multiboot: Quote filename in error message
      ppce500_mpc8544ds: Fix compile with --enable-debug and --disable-kvm
      Use existing helper function to implement popcntd instruction
      Delay creation of pseries device tree until reset
      pseries: Abolish envs array
      spapr_vscsi: Set uninitialized variable
      Don't call cpu_synchronize_state() from machine init.
      ...

    Conflicts:
        hw/acpi_piix4.c

    Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

diff --cc Makefile.objs
index 1fc3708,44ce368..895d416
--- a/Makefile.objs
+++ b/Makefile.objs
@@@ -14,10 -14,9 +14,10 @@@ oslib-obj-$(CONFIG_POSIX) += oslib-posi
  # block-obj-y is code used by both qemu system emulation and qemu-img

  block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o
module.o async.o
- block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o
+ block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o
qemu-progress.o qemu-sockets.o
  block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
  block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 +block-obj-$(CONFIG_POSIX) += compatfd.o

  block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o
bochs.o vpc.o vvfat.o
  block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o
qcow2-snapshot.o qcow2-cache.o
diff --cc Makefile.target
index 3e70627,d5761b7..3a37205
--- a/Makefile.target



- Vasilis
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gleb Natapov July 21, 2011, 1:11 p.m. UTC | #15
On Thu, Jul 21, 2011 at 03:08:37PM +0200, Vasilis Liaskovitis wrote:
> Hi,
> 
> thanks for looking at this closer.
> 
> On Thu, Jul 21, 2011 at 1:33 PM, Gleb Natapov <gleb@redhat.com> wrote:
> 
> >> Note that there is probably another bug in qemu-kvm/master regarding
> >> acpi-udev event delivery for
> >> a cpu-hotplug event (cpu_set in the qemu_monitor no longer triggers
> >> the event in the guest, see
> >> first issue in my original mail). This patch does not address that issue.
> >>
> > Did this work in qemu-0.13?
> 
> yes, the acpi-event delivery worked fine for me in 0.13.0
> 
I've already sent a patch to fix this problem in one of the emails in this thread.

> I also tried to separately bisect the acpi-udev delivery issue between
> 0.13.0 and qemu-kvm/master but ended up
> with a big merge commit as the culprit. Not sure if it would help.
> 
> FWIW, here's my "git bisect log" for the acpi-udev delivery issue:
> 
> git bisect start
> # good: [2e1e73f3d746feffc1c12e9463b2fc04e10df860] Merge remote branch
> 'upstream/stable-0.13' into stable-0.13
> git bisect good 2e1e73f3d746feffc1c12e9463b2fc04e10df860
> # bad: [fda19064e889d4419dd3dc69ca8e6e7a1535fdf5] Merge branch 'upstream-merge'
> git bisect bad fda19064e889d4419dd3dc69ca8e6e7a1535fdf5
> # good: [01ac6428a04576ae6f84f07d82c98da304b9ac77] Fix memory leak in
> register save load due to xsave support
> git bisect good 01ac6428a04576ae6f84f07d82c98da304b9ac77
> # good: [f14224a695f51576e33d96a4bc26b9a67899dbb9] kvm_stat: add 'x'
> key for enabling/disabling "drilldown"
> git bisect good f14224a695f51576e33d96a4bc26b9a67899dbb9
> # good: [f14224a695f51576e33d96a4bc26b9a67899dbb9] kvm_stat: add 'x'
> key for enabling/disabling "drilldown"
> git bisect good f14224a695f51576e33d96a4bc26b9a67899dbb9
> # bad: [f47b14bbbe43fbaef9137e2abea8d300a6e6287f] Merge commit
> '45fe15c25a5c9feea6e0f78434f5e9f632de9d94' into upstream-merge
> git bisect bad f47b14bbbe43fbaef9137e2abea8d300a6e6287f
> # good: [d23948b15a9920fb7f6374b55a6db1ecff81f3ee] lm32: add Milkymist
> VGAFB support
> git bisect good d23948b15a9920fb7f6374b55a6db1ecff81f3ee
> # good: [47c0143cdde080101e97a1b39f3ff13e33b5726c] target-i386: add
> CPU86_LDouble <-> double conversion functions
> git bisect good 47c0143cdde080101e97a1b39f3ff13e33b5726c
> # good: [d2d979c628e4b2c4a3cb71a31841875795c79043] NBD: Avoid leaking
> a couple of strings when the NBD device is closed
> git bisect good d2d979c628e4b2c4a3cb71a31841875795c79043
> # bad: [70757dcaa40e14978bf287084d8fab9efb815a2d] qemu-kvm: Limit MSI
> vector walk to actual array size
> git bisect bad 70757dcaa40e14978bf287084d8fab9efb815a2d
> # good: [913f5649dc778a1635e3440edba3429d84caec89] Merge commit
> 'a9f8ad8f2acdb2398da5d32a5efc19cb0196d79f' into upstream-merge
> git bisect good 913f5649dc778a1635e3440edba3429d84caec89
> # bad: [ad16018ee64433754914b73782da94fe2bcc0dac] qemu-kvm: Move gsi
> bits from kvm_msix_vector_add to kvm_msi_add_message
> git bisect bad ad16018ee64433754914b73782da94fe2bcc0dac
> # bad: [98e934f0ab47583e82e3cabaa1c6a48c816e7bd0] qemu-kvm:
> Synchronize states before reset
> git bisect bad 98e934f0ab47583e82e3cabaa1c6a48c816e7bd0
> # bad: [dc080800baef853f5a4d383d4f00d6afa7a46ff4] Merge branch 'upstream-merge'
> git bisect bad dc080800baef853f5a4d383d4f00d6afa7a46ff4
> # bad: [7d821420734b3e1fed438ae41b5791eb281f9448] Merge commit
> '4d9ad7f793605abd9806fc932b3e04e028894565' into upstream-merge
> git bisect bad 7d821420734b3e1fed438ae41b5791eb281f9448
> # bad: [fc58948644fe3975c541f8452c63dd2d257587bd] Merge commit
> '23910d3f669d46073b403876e30a7314599633af' into upstream-merge
> git bisect bad fc58948644fe3975c541f8452c63dd2d257587bd
> 
> fc58948644fe3975c541f8452c63dd2d257587bd is the first bad commit
> 
> git show fc58948644fe3975c541f8452c63dd2d257587bd
> 
> commit fc58948644fe3975c541f8452c63dd2d257587bd
> Merge: 913f564 23910d3
> Author: Marcelo Tosatti <mtosatti@redhat.com>
> Date:   Fri Apr 15 08:35:38 2011 -0300
> 
>     Merge commit '23910d3f669d46073b403876e30a7314599633af' into upstream-merge
> 
>     * commit '23910d3f669d46073b403876e30a7314599633af': (109 commits)
>       acpi, acpi_piix: factor out GPE logic
>       arm: basic support for ARMv4/ARMv4T emulation
>       Fix conversions from pointer to tcg_target_long
>       vnc: tight: Fix crash after 2GB of output
>       smbus_eeprom: consolidate smbus eeprom creation oc pc_piix,
> mips_mapta, mips_fulong
>       lan9118: Ignore write to MAC_VLAN1 register
>       acpi, acpi_piix, vt82c686: factor out PM1_CNT logic
>       acpi, acpi_piix, vt82c686: factor out PM1a EVT logic
>       acpi, acpi_piix, vt82c686: factor out PM_TMR logic
>       hw/pflash_cfi02: Fix lazy reset of ROMD mode
>       configure: avoid basename usage message
>       mpc85xx_pci_map_irq: change "unknow" to "unknown"
>       event: trivial coding style fixes
>       multiboot: Quote filename in error message
>       ppce500_mpc8544ds: Fix compile with --enable-debug and --disable-kvm
>       Use existing helper function to implement popcntd instruction
>       Delay creation of pseries device tree until reset
>       pseries: Abolish envs array
>       spapr_vscsi: Set uninitialized variable
>       Don't call cpu_synchronize_state() from machine init.
>       ...
> 
>     Conflicts:
>         hw/acpi_piix4.c
> 
>     Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> diff --cc Makefile.objs
> index 1fc3708,44ce368..895d416
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@@ -14,10 -14,9 +14,10 @@@ oslib-obj-$(CONFIG_POSIX) += oslib-posi
>   # block-obj-y is code used by both qemu system emulation and qemu-img
> 
>   block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o
> module.o async.o
> - block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o
> + block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o
> qemu-progress.o qemu-sockets.o
>   block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
>   block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>  +block-obj-$(CONFIG_POSIX) += compatfd.o
> 
>   block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o
> bochs.o vpc.o vvfat.o
>   block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o
> qcow2-snapshot.o qcow2-cache.o
> diff --cc Makefile.target
> index 3e70627,d5761b7..3a37205
> --- a/Makefile.target
> 
> 
> 
> - Vasilis

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vasilis Liaskovitis July 21, 2011, 1:12 p.m. UTC | #16
On Thu, Jul 21, 2011 at 3:11 PM, Gleb Natapov <gleb@redhat.com> wrote:
> On Thu, Jul 21, 2011 at 03:08:37PM +0200, Vasilis Liaskovitis wrote:
> I've already sent a patch to fix this problem in one of the emails in this thread.

oops sorry I missed your last mail/patch. Got it.
thanks again.

- Vasilis
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gleb Natapov July 21, 2011, 1:13 p.m. UTC | #17
On Thu, Jul 21, 2011 at 03:12:39PM +0200, Vasilis Liaskovitis wrote:
> On Thu, Jul 21, 2011 at 3:11 PM, Gleb Natapov <gleb@redhat.com> wrote:
> > On Thu, Jul 21, 2011 at 03:08:37PM +0200, Vasilis Liaskovitis wrote:
> > I've already sent a patch to fix this problem in one of the emails in this thread.
> 
> oops sorry I missed your last mail/patch. Got it.
> thanks again.
> 
Why sorry? Thank you for your work.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity July 21, 2011, 1:15 p.m. UTC | #18
On 07/21/2011 04:08 PM, Vasilis Liaskovitis wrote:
> Hi,
>
> thanks for looking at this closer.
>
> On Thu, Jul 21, 2011 at 1:33 PM, Gleb Natapov<gleb@redhat.com>  wrote:
>
> >>  Note that there is probably another bug in qemu-kvm/master regarding
> >>  acpi-udev event delivery for
> >>  a cpu-hotplug event (cpu_set in the qemu_monitor no longer triggers
> >>  the event in the guest, see
> >>  first issue in my original mail). This patch does not address that issue.
> >>
> >  Did this work in qemu-0.13?
>
> yes, the acpi-event delivery worked fine for me in 0.13.0
>
> I also tried to separately bisect the acpi-udev delivery issue between
> 0.13.0 and qemu-kvm/master but ended up
> with a big merge commit as the culprit. Not sure if it would help.

You can bisect it further.  If you tag the bad commit "bad", then you do

$ git bisect start bad^2 $(git merge-base bad^1 bad^2)

   (for each commit bisect feeds you:)
   $ git merge --no-commit bad^1
<test>
   $ git checkout -f
   $ git bisect <good|bad>

What this does is bisect all commits added by the merge, but first 
re-does the merge before testing.  You shouldn't have any conflicts 
during these merging.

It is recommended to set up ccache to reduce build time, as those merges 
force a full rebuild every time.
Avi Kivity July 21, 2011, 1:15 p.m. UTC | #19
On 07/21/2011 04:15 PM, Avi Kivity wrote:
>> I also tried to separately bisect the acpi-udev delivery issue between
>> 0.13.0 and qemu-kvm/master but ended up
>> with a big merge commit as the culprit. Not sure if it would help.
>
>
> You can bisect it further.  If you tag the bad commit "bad", then you do

Err, I now see this is moot.
Lucas Meneghel Rodrigues July 21, 2011, 1:27 p.m. UTC | #20
On 07/21/2011 09:18 AM, Avi Kivity wrote:
> On 07/21/2011 02:55 PM, Jan Kiszka wrote:
>> >>
>> > CPU hotplug for Linux suppose to be easy (with allow_hotplug patch
>> > applied). But we have two bugs currently. One is that ACPI interrupt
>>
>> I bet we have multiple bugs for quite a while now, which accelerated bit
>> rotting even further.
>>
>
> kvm-autotest to the rescue!
>
> Lucas, do we have a cpu hotplug test in kvm-autotest?

Hey Avi, I just read the thread, and let me tell you guys, I have worked 
on a cpu_set test, that was supposed to test CPU hotplug. I have created 
the entire test, however, I could never ever get the test to pass due to 
the fact that the infrastructure on qemu is broken. I have revisited the 
test late 2010, and I've verified that although cpu_set does not crash 
qemu anymore, the guest OS has no idea whatsoever that a new CPU was 
added. I remember that one of the problems we had was lack of 
functionality in Seabios...

I can gladly get back to it and get it commited (I was reluctant in 
getting it upstream because I never ever got a successful test result), 
but I can put it into shape and get it commited.

Cheers,

Lucas

> If so, I promise to add it to my autotest run once it is fixed, to
> prevent the spread of the disease.
>
> btw, Anthony, are you running kvm-autotest these days? with qemu-kvm
> diverging less and less from qemu.git these days, it's becoming less
> effective to run it downstream.
>

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/hw/qdev.c b/hw/qdev.c
index 1aa1ea0..aed48ce 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -327,6 +327,7 @@  BusState *sysbus_get_default(void)
    if (!main_system_bus) {
        main_system_bus = qbus_create(&system_bus_info, NULL,
                                      "main-system-bus");
+       main_system_bus->allow_hotplug = 1;
    }
    return main_system_bus;
}

and test cpu hotplug functionality.
The commit that appears to break CPU hotplug is:

commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
Author: Jan Kiszka <jan.kiszka@siemens.com>
Date:   Mon Feb 21 12:28:07 2011 +0100
   qemu-kvm: Mark VCPU state dirty on creation

Is it possible that kvm_vcpu_dirty should not be set to 1 for a CPU
that's being hot-plugged?
I.e. when kvm_cpu_exec() is called for the first time during
initialization of a hotplugged-CPU,
we shouldn't try to restore state with kvm_arch_put_registers().

The following patch enables hotplug and solves the non-responsive
hotplugged CPU problem,
by not setting the vcpu_dirty state when hotplugging. Enabling hotplug
is still done through
the main_system_bus (see above).
Tested on amd64host/amd64guest combination. Comments?

Note that there is probably another bug in qemu-kvm/master regarding
acpi-udev event delivery for
a cpu-hotplug event (cpu_set in the qemu_monitor no longer triggers
the event in the guest, see
first issue in my original mail). This patch does not address that issue.

Signed-off-by: Vasilis Liaskovitis <vliaskov@gmail.com>
---

 hw/acpi_piix4.c   |    2 +-
 hw/pc.c           |   13 +++++++++++--
 hw/qdev.c         |    1 +
 kvm-all.c         |   22 +++++++++++++++++++++-
 kvm.h             |    3 +++
 target-i386/cpu.h |    2 +-
 6 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index c30a050..b0cac60 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -584,7 +584,7 @@  void qemu_system_cpu_hot_add(int cpu, int state)
     PIIX4PMState *s = global_piix4_pm_state;

     if (state && !qemu_get_cpu(cpu)) {
-        env = pc_new_cpu(global_cpu_model);
+        env = pc_new_cpu(global_cpu_model, 1);
         if (!env) {
             fprintf(stderr, "cpu %d creation failed\n", cpu);
             return;
diff --git a/hw/pc.c b/hw/pc.c
index c0a88e1..8cfbf27 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -924,7 +924,7 @@  static void pc_cpu_reset(void *opaque)
     env->halted = !cpu_is_bsp(env);
 }

-CPUState *pc_new_cpu(const char *cpu_model)
+CPUState *pc_new_cpu(const char *cpu_model, int hotplugged)
 {
     CPUState *env;

@@ -936,7 +936,16 @@  CPUState *pc_new_cpu(const char *cpu_model)
 #endif
     }

+    if (hotplugged) {
+        kvm_set_cpuhotplug_req();
+    }
+
     env = cpu_init(cpu_model);
+
+    if (hotplugged) {
+        kvm_reset_cpuhotplug_req();
+    }
+
     if (!env) {
         fprintf(stderr, "Unable to find x86 CPU definition\n");
         exit(1);
@@ -956,7 +965,7 @@  void pc_cpus_init(const char *cpu_model)

     /* init CPUs */
     for(i = 0; i < smp_cpus; i++) {
-        pc_new_cpu(cpu_model);
+        pc_new_cpu(cpu_model, 0);
     }
 }

diff --git a/hw/qdev.c b/hw/qdev.c
index 292b52f..f85ac0f 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -327,6 +327,7 @@  BusState *sysbus_get_default(void)
     if (!main_system_bus) {
         main_system_bus = qbus_create(&system_bus_info, NULL,
                                       "main-system-bus");
+	main_system_bus->allow_hotplug = 1;
     }
     return main_system_bus;
 }
diff --git a/kvm-all.c b/kvm-all.c
index 3ad2459..8aae1d7 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -85,6 +85,7 @@  struct KVMState
 #endif
     void *used_gsi_bitmap;
     int max_gsi;
+    int cpu_hotplug_req;
 };

 KVMState *kvm_state;
@@ -220,7 +221,9 @@  int kvm_init_vcpu(CPUState *env)

     env->kvm_fd = ret;
     env->kvm_state = s;
-    env->kvm_vcpu_dirty = 1;
+    if (!kvm_has_cpuhotplug_req()) {
+        env->kvm_vcpu_dirty = 1;
+    }

     mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
     if (mmap_size < 0) {
@@ -797,6 +800,8 @@  int kvm_init(void)

     s->pit_in_kernel = kvm_pit;

+    s->cpu_hotplug_req = 0;
+
     ret = kvm_arch_init(s);
     if (ret < 0) {
         goto err;
@@ -1104,6 +1109,21 @@  int kvm_has_vcpu_events(void)
     return kvm_state->vcpu_events;
 }

+int kvm_has_cpuhotplug_req(void)
+{
+    return kvm_state->cpu_hotplug_req;
+}
+
+void kvm_set_cpuhotplug_req(void)
+{
+    kvm_state->cpu_hotplug_req = 1;
+}
+
+void kvm_reset_cpuhotplug_req(void)
+{
+    kvm_state->cpu_hotplug_req = 0;
+}
+
 int kvm_has_robust_singlestep(void)
 {
     return kvm_state->robust_singlestep;
diff --git a/kvm.h b/kvm.h
index b15e1dd..7fa72fd 100644
--- a/kvm.h
+++ b/kvm.h
@@ -52,6 +52,9 @@  int kvm_has_xsave(void);
 int kvm_has_xcrs(void);
 int kvm_has_many_ioeventfds(void);
 int kvm_has_pit_state2(void);
+int kvm_has_cpuhotplug_req(void);
+void kvm_set_cpuhotplug_req(void);
+void kvm_reset_cpuhotplug_req(void);

 #ifdef NEED_CPU_H
 int kvm_init_vcpu(CPUState *env);
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 935d08a..7e7839b 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -923,7 +923,7 @@  void cpu_x86_update_cr4(CPUX86State *env, uint32_t new_cr4);
 /* hw/pc.c */
 void cpu_smm_update(CPUX86State *env);
 uint64_t cpu_get_tsc(CPUX86State *env);
-CPUState *pc_new_cpu(const char *cpu_model);
+CPUState *pc_new_cpu(const char *cpu_model, int hotplugged);

 /* used to debug */
 #define X86_DUMP_FPU  0x0001 /* dump FPU state too */