
KVM: kvm_set_slave_cpu: Invalid argument when trying direct interrupt delivery

Message ID 74961C475F61A9459FE12A5810EF11CE032E7924@szxeml538-mbx.china.huawei.com (mailing list archive)
State New, archived

Commit Message

Yangminqiang April 7, 2013, 10:09 a.m. UTC
Hi Tomoki,

I offlined cpu2 and cpu3 on my machine and continued trying your patch. I run the VM without a pass-through device because I only want to measure the interrupt latency improvement. (Am I right?)
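(In case it matters for reproducing this: the cores were taken offline through the standard sysfs CPU hotplug interface, roughly as follows; cpu2 and cpu3 match the slave mapping hard-coded in the qemu-kvm patch quoted at the end of this thread.)

   # run as root on the host; removes the cores from the scheduler so they
   # can be dedicated as slave cores for the guest vCPUs
   echo 0 > /sys/devices/system/cpu/cpu2/online
   echo 0 > /sys/devices/system/cpu/cpu3/online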

my qemu parameter:
./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 1024 -cpu qemu64,+x2apic -no-kvm-pit -serial pty -nographic -drive file=/mnt/sdb/vms/testfc/testfc.qcow2,if=virtio,index=0,format=qcow2 -spice port=12000,addr=186.100.8.171,disable-ticketing,plaintext-channel=main,plaintext-channel=playback,plaintext-channel=record,image-compression=auto_glz -no-kvm-pit

cyclictest:
   cyclictest -m -p 99 -n -l 100000 -h 3000 -q  


but I got a very bad result:
  avg latency: 20000+ us
  max latency: 50000+ us

and got

Message from syslogd@kvmsteven at Apr  7 05:43:30 ...
 kernel:[ 2201.151817] BUG: soft lockup - CPU#18 stuck for 22s! [qemu-system-x86:2365]

my setup:
host kernel: 3.6.0-rc4+ and your patches
guest kernel: 3.6.11.1-rt32
qemu: qemu-kvm-1.0 with your patch

BTW, I am sure that my rt-kernel works well; it gets a 12us max latency when running as the host OS.

Could you please provide more details about your benchmark so I can reproduce your result?

Thanks,
Steven

Comments

Yangminqiang April 23, 2013, 2:49 a.m. UTC | #1
Hi Tomoki,

Thanks for your config file, but it is for linux-3.5-rc4, while the patches you
posted to the community were based on linux-3.6, as described in the following
link:
  http://thread.gmane.org/gmane.linux.kernel/1353803
I also tested the config file on linux-3.6, but it still does not work.

Could you please provide the details of your setup, for example:
- kernel version and config
- the version and url of your patch
- qemu command line
- other info
so that I can try and reproduce your benchmark :-).

I think your patch would be a good workaround for real-time and
performance improvement on hardware that does not support APICv.

BTW, I noticed the benchmark result in your slides, which shows the
throughput of a 10Gbps NIC exceeding 10Gbps. How did that happen?

My benchmark also shows an interesting result: the throughput of a pass-
through device is sometimes higher than bare metal. (I use NetPIPE.)

Thanks a lot.

Yang Minqiang

> -----Original Message-----
> From: Tomoki Sekiyama [mailto:tomoki.sekiyama@hds.com]
> Sent: Thursday, April 11, 2013 4:37 AM
> To: Yangminqiang
> Cc: kvm@vger.kernel.org; Haofeng; Luohao (brian)
> Subject: Re: KVM: kvm_set_slave_cpu: Invalid argument when trying direct
> interrupt delivery
> 
> Hi,
> 
> On 4/7/13 6:09 , "Yangminqiang" <yangminqiang@huawei.com> wrote:
> >Hi Tomoki,
> >
> >I offlined cpu2 and cpu3 on my machine and continued trying your patch.
> >I run the VM without a pass-through device because I only want to
> >measure the interrupt latency improvement. (Am I right?)
> 
> This should be OK.
> 
> >my qemu parameter:
> >./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 1024 -cpu
> >qemu64,+x2apic -no-kvm-pit -serial pty -nographic -drive
> >file=/mnt/sdb/vms/testfc/testfc.qcow2,if=virtio,index=0,format=qcow2
> >-spice port=12000,addr=186.100.8.171,disable-ticketing,
> >plaintext-channel=main,plaintext-channel=playback,
> >plaintext-channel=record,image-compression=auto_glz -no-kvm-pit
> 
> Using video (-spice) can cause large latency, because non-passthrough
> devices have larger latency with this patch.
> Please use a serial console instead (but avoid serial I/O during the
> benchmark, because the serial device is also a non-passthrough device).
> 
> You may need to turn off CPU low-power features, or run an infinite
> loop (e.g. yes>/dev/null) to keep the CPU cores from going into power
> saving mode.
> 
> >cyclictest:
> >   cyclictest -m -p 99 -n -l 100000 -h 3000 -q
> >
> >but I got a very bad result:
> >  avg latency: 20000+ us
> >  max latency: 50000+ us
> >
> >and got
> >
> >Message from syslogd@kvmsteven at Apr  7 05:43:30 ...
> > kernel:[ 2201.151817] BUG: soft lockup - CPU#18 stuck for 22s!
> >[qemu-system-x86:2365]
> 
> This patch has not yet been tested with various kernel CONFIG_* settings,
> and some configs may cause issues like this...
> I was using a Fedora-based config (attached). Could you try with this?
> 
> >my setup:
> >host kernel: 3.6.0-rc4+ and your patches
> >guest kernel: 3.6.11.1-rt32
> >qemu: qemu-kvm-1.0 with your patch
> >
> >BTW, I am sure that my rt-kernel works well; it gets a 12us max latency
> >when running as the host OS.
> >
> >Could you please provide more details about your benchmark so I can
> >reproduce your result?
> >
> >Thanks,
> >Steven
Tomoki Sekiyama April 23, 2013, 5:30 p.m. UTC | #2
Hi, Minqiang


On 4/22/13 22:49 , "Yangminqiang" <yangminqiang@huawei.com> wrote:

>Hi Tomoki,
>
>Thanks for your config file, but it is for linux-3.5-rc4, while the patches
>you posted to the community were based on linux-3.6, as described in the
>following link:
>  http://thread.gmane.org/gmane.linux.kernel/1353803
>I also tested the config file on linux-3.6, but it still does not work.

Ah, that config was for patch v1.
Unfortunately I lost my config for 3.6; however, it is not much different
from the one I used for 3.6 + patch v2. (I just did `make oldconfig'.)

Please refer to the slides for the command-line information:
http://events.linuxfoundation.org/images/stories/pdf/lcna_co2012_sekiyama.pdf


Could you confirm that the VM is really running on the slave core?
The qemu-kvm process should use only a few percent of CPU even while the
guest is running an infinite loop.
Also, running an infinite loop on the HOST may help keep the cores from
going into power saving...
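
(A rough way to check both of these; the qemu process name and the host core
numbers below are just the ones used in this thread, so adjust them to your
setup.)

   # CPU usage of the qemu process on the host should stay low while the
   # guest runs a busy loop, if the vCPUs really execute on the slave cores
   top -b -n 1 -p $(pgrep -f qemu-system-x86_64 | head -n 1)

   # pin busy loops to the remaining online host cores (0 and 1 here) so
   # they do not drop into deep power-saving states during the measurement
   taskset -c 0 yes > /dev/null &
   taskset -c 1 yes > /dev/null &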

>BTW, I noticed the benchmark result in your slides, which shows the
>throughput of a 10Gbps NIC exceeding 10Gbps. How did that happen?

That's because I measured throughput between the host and the guest via
the internal switch of the SR-IOV 10G NIC, which is only limited by the
NIC controller chip's performance and the bus speed.

>My benchmark also shows an interesting result: the throughput of a pass-
>through device is sometimes higher than bare metal. (I use NetPIPE.)

Thanks,
Tomoki Sekiyama


>Thanks a lot.
>
>
>Yang Minqiang
>
>> -----Original Message-----
>> From: Tomoki Sekiyama [mailto:tomoki.sekiyama@hds.com]
>> Sent: Thursday, April 11, 2013 4:37 AM
>> To: Yangminqiang
>> Cc: kvm@vger.kernel.org; Haofeng; Luohao (brian)
>> Subject: Re: KVM: kvm_set_slave_cpu: Invalid argument when trying direct
>> interrupt delivery
>> 
>> Hi,
>> 
>> 
>> On 4/7/13 6:09 , "Yangminqiang" <yangminqiang@huawei.com> wrote:
>> >Hi Tomoki,
>> >
>> >I offlined cpu2 and cpu3 on my machine and continued trying your patch.
>> >I run the VM without a pass-through device because I only want to
>> >measure the interrupt latency improvement. (Am I right?)
>> 
>> This should be OK.
>> 
>> >my qemu parameter:
>> >./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 1024 -cpu
>> >qemu64,+x2apic -no-kvm-pit -serial pty -nographic -drive
>> >file=/mnt/sdb/vms/testfc/testfc.qcow2,if=virtio,index=0,format=qcow2
>> >-spice port=12000,addr=186.100.8.171,disable-ticketing,
>> >plaintext-channel=main,plaintext-channel=playback,
>> >plaintext-channel=record,image-compression=auto_glz -no-kvm-pit
>> 
>> Using video (-spice) can cause large latency, because non-passthrough
>> devices have larger latency with this patch.
>> Please use a serial console instead (but avoid serial I/O during the
>> benchmark, because the serial device is also a non-passthrough device).
>> 
>> You may need to turn off CPU low-power features, or run an infinite
>> loop (e.g. yes>/dev/null) to keep the CPU cores from going into power
>> saving mode.
>> 
>> >cyclictest:
>> >   cyclictest -m -p 99 -n -l 100000 -h 3000 -q
>> >
>> >but I got a very bad result:
>> >  avg latency: 20000+ us
>> >  max latency: 50000+ us
>> >
>> >and got
>> >
>> >Message from syslogd@kvmsteven at Apr  7 05:43:30 ...
>> > kernel:[ 2201.151817] BUG: soft lockup - CPU#18 stuck for 22s!
>> >[qemu-system-x86:2365]
>> 
>> This patch has not yet been tested with various kernel CONFIG_* settings,
>> and some configs may cause issues like this...
>> I was using a Fedora-based config (attached). Could you try with this?
>> 
>> >my setup:
>> >host kernel: 3.6.0-rc4+ and your patches
>> >guest kernel: 3.6.11.1-rt32
>> >qemu: qemu-kvm-1.0 with your patch
>> >
>> >BTW, I am sure that my rt-kernel works well; it gets a 12us max latency
>> >when running as the host OS.
>> >
>> >Could you please provide more details about your benchmark so I can
>> >reproduce your result?
>> >
>> >Thanks,
>> >Steven
>


Patch

diff -Narup a/qemu-kvm-1.0/qemu-kvm-x86.c b/qemu-kvm-1.0/qemu-kvm-x86.c
--- a/qemu-kvm-1.0/qemu-kvm-x86.c       2011-12-04 19:38:06.000000000 +0900
+++ b/qemu-kvm-1.0/qemu-kvm-x86.c       2012-09-06 20:19:44.828163734 +0900
@@ -139,12 +139,28 @@  static int kvm_enable_tpr_access_reporti
     return kvm_vcpu_ioctl(env, KVM_TPR_ACCESS_REPORTING, &tac);
 }

+static int kvm_set_slave_cpu(CPUState *env)
+{
+    int r, slave = env->cpu_index == 0 ? 2 : env->cpu_index == 1 ? 3 : -1;

`slave' is the offlined host CPU ID that gets assigned, and `env->cpu_index' is
the virtual CPU ID. You need to modify this mapping and recompile qemu-kvm
(or just offline CPUs 2 and 3 for a 2-vCPU guest ;) ).
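
(A quick sanity check before launching qemu -- presumably the ioctl rejects a
slave CPU that is still online or does not exist, which would explain the
"Invalid argument" error in the subject:)

   # both files must exist and read 0 before qemu-kvm is started with the
   # default 0->2, 1->3 mapping above
   grep . /sys/devices/system/cpu/cpu2/online /sys/devices/system/cpu/cpu3/online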

Thanks,
Tomoki Sekiyama