Message ID | 49F0CE65.4050005@web.de (mailing list archive) |
---|---|
State | New, archived |
Jan,

While the patch itself looks fine, IMO it would be better to move all of
the timer handling to userspace, except the performance critical parts,
since most of it is generic. Either periodic or one-shot timer, with:

- PIO or MMIO region returns remaining time for expiration.
- PIO or MMIO region programs the next event and timer mode.

Oversimplified of course (kvm_timer_ops was the first step in that
direction). I believe there will be a proposed HPET in-kernel driver.

I don't see what is the problem with partial components that Avi talks
about.

On Thu, Apr 23, 2009 at 10:24:05PM +0200, Jan Kiszka wrote:
> When using the in-kernel PIT the speaker emulation has to synchronize
> the PIT state with KVM. Enhance the existing speaker sound device and
> allow it to take over port 0x61 by using KVM_CREATE_PIT_NOSPKR when
> available. This unbreaks -soundhw pcspk in KVM mode.
>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
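The two-region interface proposed above can be sketched as a small timer model: one operation programs the next event and the mode, another returns the remaining time until expiration. This is only an illustration of the idea. The names `generic_timer`, `timer_program`, `timer_remaining` and `timer_advance` are hypothetical and are not taken from `kvm_timer_ops` or any KVM code.

```c
#include <stdint.h>

/* Hypothetical model of the proposed generic timer; not KVM code. */
enum timer_mode { TIMER_ONESHOT, TIMER_PERIODIC };

struct generic_timer {
    enum timer_mode mode;
    uint64_t period_ns;   /* reload value, used in periodic mode only */
    uint64_t expires_ns;  /* absolute time of the next expiration */
    int armed;            /* non-zero while an expiration is outstanding */
};

/* "Program" region: set the mode and the delay until the next event. */
static void timer_program(struct generic_timer *t, enum timer_mode mode,
                          uint64_t now_ns, uint64_t delta_ns)
{
    t->mode = mode;
    t->period_ns = (mode == TIMER_PERIODIC) ? delta_ns : 0;
    t->expires_ns = now_ns + delta_ns;
    t->armed = 1;
}

/* "Read" region: remaining time until expiration, 0 if expired or idle. */
static uint64_t timer_remaining(const struct generic_timer *t, uint64_t now_ns)
{
    if (!t->armed || t->expires_ns <= now_ns)
        return 0;
    return t->expires_ns - now_ns;
}

/* Account for all expirations up to now_ns; returns how many fired. */
static unsigned timer_advance(struct generic_timer *t, uint64_t now_ns)
{
    unsigned fired = 0;

    while (t->armed && t->expires_ns <= now_ns) {
        fired++;
        if (t->mode == TIMER_PERIODIC && t->period_ns > 0)
            t->expires_ns += t->period_ns;  /* re-arm for the next period */
        else
            t->armed = 0;                   /* one-shot: done after one event */
    }
    return fired;
}
```

In this sketch only `timer_advance` would have to sit next to the interrupt injection path; the other two operations correspond to the generic PIO/MMIO accesses mentioned in the mail.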
Marcelo Tosatti wrote:
> Jan,
>
> While the patch itself looks fine, IMO it would be better to move all of
> the timer handling to userspace, except the performance critical parts,
> since most of it is generic. Either periodic or one-shot timer, with:
>

The reason for having the PIT in-kernel is not performance. The PIT is
not performance sensitive.

It's because it was easier to do interrupt catch-up by pushing the PIT
into the kernel which IMHO was the wrong path to go down.

Regards,

Anthony Liguori
Anthony Liguori wrote:
> Marcelo Tosatti wrote:
>> Jan,
>>
>> While the patch itself looks fine, IMO it would be better to move all
>> of the timer handling to userspace, except the performance critical
>> parts,
>> since most of it is generic. Either periodic or one-shot timer, with:
>>
>
> The reason for having the PIT in-kernel is not performance. The PIT is
> not performance sensitive.

I think that depends. Some OSes (in some configurations) use the PIT
counter as clock source and/or program it regularly in one-shot mode. An
aging use case, but still a valid one.

>
> It's because it was easier to do interrupt catch-up by pushing the PIT
> into the kernel which IMHO was the wrong path to go down.

Pushing the emulation of port 0x61 into the kernel was a mistake we now
have to deal with. I'm not that sure about the PIT itself.

Jan
Jan Kiszka wrote:
> Anthony Liguori wrote:
>> Marcelo Tosatti wrote:
>>> Jan,
>>>
>>> While the patch itself looks fine, IMO it would be better to move all
>>> of the timer handling to userspace, except the performance critical
>>> parts,
>>> since most of it is generic. Either periodic or one-shot timer, with:
>>>
>> The reason for having the PIT in-kernel is not performance. The PIT is
>> not performance sensitive.
>>
>
> I think that depends. Some OSes (in some configurations) use the PIT
> counter as clock source and/or program it regularly in one-shot mode. An
> aging use case, but still a valid one.
>

I can't find the thread, but this has been discussed at length before.
The justification has always been for time drift correction. If you
crunch the numbers, even at 1024 Hz, there just aren't enough exits to
really make a difference from a performance perspective.

Just to state it more clearly, if you assume an additional 5us to drop
to userspace (which is absurdly high, but let's stick with it), 1024
exits per second comes out to about 5ms which is only 0.5% in terms of
CPU consumption.

The APIC is quite a bit more understandable because especially with SMP,
you can generate a very high number of interrupts per second and taking
a drop to userspace for every EOI can start to matter with exit rates
in the hundreds of thousands.

>> It's because it was easier to do interrupt catch-up by pushing the PIT
>> into the kernel which IMHO was the wrong path to go down.
>>
>
> Pushing the emulation of port 0x61 into the kernel was a mistake we now
> have to deal with. I'm not that sure about the PIT itself.
>

I agree re: port 0x61. I'm just saying that there is no point in moving
just the non "performance critical" components to userspace as Marcelo
suggests because the whole thing is non "performance critical".

Regards,

Anthony Liguori
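The estimate in the mail above is plain arithmetic: exits per second multiplied by the extra cost per exit gives the share of one CPU spent on userspace round trips. A minimal sketch of that calculation follows; the function name `exit_overhead_fraction` is made up for this illustration, and the 5 us cost is the deliberately pessimistic figure assumed in the mail, not a measurement.

```c
/* Fraction of one CPU consumed by exits to userspace.
 * exits_per_sec: how often the guest traps out per second.
 * cost_us:       assumed extra cost of one userspace exit, in microseconds. */
static double exit_overhead_fraction(double exits_per_sec, double cost_us)
{
    return exits_per_sec * cost_us / 1e6;  /* 1e6 microseconds per second */
}
```

With 1024 exits per second and 5 us each this gives 5120 us per second, about 0.5% of a CPU. At 200,000 exits per second, a rate in the range mentioned for APIC EOIs, the same 5 us would consume an entire CPU, which is why the APIC case is treated differently.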
On Sunday 26 April 2009 03:59:11 Anthony Liguori wrote:
> Jan Kiszka wrote:
> > Anthony Liguori wrote:
> >> Marcelo Tosatti wrote:
> >>> Jan,
> >>>
> >>> While the patch itself looks fine, IMO it would be better to move all
> >>> of the timer handling to userspace, except the performance critical
> >>> parts,
> >>> since most of it is generic. Either periodic or one-shot timer, with:
> >>
> >> The reason for having the PIT in-kernel is not performance. The PIT is
> >> not performance sensitive.
> >
> > I think that depends. Some OSes (in some configurations) use the PIT
> > counter as clock source and/or program it regularly in one-shot mode. An
> > aging use case, but still a valid one.
>
> I can't find the thread, but this has been discussed at length before.
> The justification has always been for time drift correction. If you
> crunch the numbers, even at 1024 Hz, there just aren't enough exits to
> really make a difference from a performance perspective.

I agree too. When I moved the PIT into the kernel, the direct reason was
that the timer in KVM was crappy at that time, mainly because of the
interrupt handling. The most obvious problem I remember is that the
userspace PIT injected one interrupt after another, regardless of whether
the previous interrupt had already been delivered to the guest, so some
interrupts were lost and the guest's timer became slower and slower.

We decided to depend on an in-kernel PIT to provide a stable time source,
so we moved the whole PIT into the kernel (rather than trying to provide
an interface to fix it, as Xen did at the time, which seemed much more
complex).

Now the KVM timer is much more mature and stable than it was then, so I
think it's OK to try to separate the timer interrupt logic and the I/O
logic now (though I also think it will still take some time to get an
elegant interface...).
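The lost-tick problem described above can be illustrated with a toy model. In the naive variant a tick that fires while the previous one is still undelivered is simply dropped; in the catch-up variant every tick is counted and delivered later. All names here (`tick_source`, `naive_fire`, `catchup_fire`, `deliver_one`) are hypothetical, and this is a sketch of the accounting idea only, not the actual KVM or QEMU logic.

```c
#include <stdint.h>

/* Toy model of timer tick delivery; not taken from KVM. */
struct tick_source {
    uint32_t pending;    /* ticks that fired but were not delivered yet */
    uint64_t delivered;  /* ticks the guest has actually received */
};

/* Naive injection: the interrupt line is already raised, so a second
 * tick that fires before delivery is merged with the first and lost. */
static void naive_fire(struct tick_source *t)
{
    if (t->pending == 0)
        t->pending = 1;
}

/* Catch-up injection: every tick is accounted for, so the guest can be
 * given the backlog once it is able to take interrupts again. */
static void catchup_fire(struct tick_source *t)
{
    t->pending++;
}

/* The guest accepts one interrupt; returns 0 if nothing was pending. */
static int deliver_one(struct tick_source *t)
{
    if (t->pending == 0)
        return 0;
    t->pending--;
    t->delivered++;
    return 1;
}
```

If three ticks fire while the guest cannot take interrupts, the naive source delivers one tick afterwards and the guest clock falls behind by two, while the catch-up source delivers all three.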
On Thu, Apr 23, 2009 at 10:24:05PM +0200, Jan Kiszka wrote:
> When using the in-kernel PIT the speaker emulation has to synchronize
> the PIT state with KVM. Enhance the existing speaker sound device and
> allow it to take over port 0x61 by using KVM_CREATE_PIT_NOSPKR when
> available. This unbreaks -soundhw pcspk in KVM mode.

ACK both patches.
Anthony Liguori wrote:
> Jan Kiszka wrote:
>> Anthony Liguori wrote:
>>> Marcelo Tosatti wrote:
>>>> Jan,
>>>>
>>>> While the patch itself looks fine, IMO it would be better to move all
>>>> of the timer handling to userspace, except the performance critical
>>>> parts,
>>>> since most of it is generic. Either periodic or one-shot timer, with:
>>>>
>>> The reason for having the PIT in-kernel is not performance. The PIT is
>>> not performance sensitive.
>>>
>>
>> I think that depends. Some OSes (in some configurations) use the PIT
>> counter as clock source and/or program it regularly in one-shot mode. An
>> aging use case, but still a valid one.
>>
>
> I can't find the thread, but this has been discussed at length before.
> The justification has always been for time drift correction. If you
> crunch the numbers, even at 1024 Hz, there just aren't enough exits to
> really make a difference from a performance perspective.
>
> Just to state it more clearly, if you assume an additional 5us to drop
> to userspace (which is absurdly high, but let's stick with it), 1024
> exits per second comes out to about 5ms which is only 0.5% in terms of
> CPU consumption.

You are considering timekeeping activities only.

RHEL4 for example reads the PIT for each gettimeofday call. For
applications that add timestamps to logging the PIT is a *HUGE* overhead
(and the PMTMR for that matter). I have one example where something like
15% of each second is wasted handling the ioport reads and writes for
get_offset_pit.

david

>
> The APIC is quite a bit more understandable because especially with SMP,
> you can generate a very high number of interrupts per second and taking
> a drop to userspace for every EOI can start to matter with exit rates
> in the hundreds of thousands.
>
>>> It's because it was easier to do interrupt catch-up by pushing the PIT
>>> into the kernel which IMHO was the wrong path to go down.
>>>
>>
>> Pushing the emulation of port 0x61 into the kernel was a mistake we now
>> have to deal with. I'm not that sure about the PIT itself.
>>
>
> I agree re: port 0x61. I'm just saying that there is no point in moving
> just the non "performance critical" components to userspace as Marcelo
> suggests because the whole thing is non "performance critical".
>
> Regards,
>
> Anthony Liguori
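To see why guest-initiated PIT reads scale differently from timer interrupts, note that the cost depends on how often the guest calls gettimeofday, not on the tick rate. The sketch below assumes that one counter read consists of three port accesses (a latch command written to the mode port and two byte reads from the channel 0 data port) and that each access traps. The function name, the per-access cost and the call rate in the note below are illustrative assumptions, not figures from the report above.

```c
/* Assumed port accesses per PIT counter read: one latch command write
 * plus two data reads (low byte, high byte). */
#define PIT_PORT_ACCESSES_PER_READ 3

/* Fraction of one CPU spent emulating PIT counter reads.
 * reads_per_sec:  how often the guest reads the counter per second.
 * cost_us:        assumed cost of one trapped port access, in microseconds. */
static double pit_read_overhead(double reads_per_sec, double cost_us)
{
    return reads_per_sec * PIT_PORT_ACCESSES_PER_READ * cost_us / 1e6;
}
```

As a purely illustrative data point, 10,000 counter reads per second at 5 us per trapped access would already amount to 15% of a CPU, far above the 0.5% computed for a 1024 Hz tick alone.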
David S. Ahern wrote:
> Anthony Liguori wrote:
>> Jan Kiszka wrote:
>>> Anthony Liguori wrote:
>>>> Marcelo Tosatti wrote:
>>>>> Jan,
>>>>>
>>>>> While the patch itself looks fine, IMO it would be better to move all
>>>>> of the timer handling to userspace, except the performance critical
>>>>> parts,
>>>>> since most of it is generic. Either periodic or one-shot timer, with:
>>>>>
>>>> The reason for having the PIT in-kernel is not performance. The PIT is
>>>> not performance sensitive.
>>>>
>>> I think that depends. Some OSes (in some configurations) use the PIT
>>> counter as clock source and/or program it regularly in one-shot mode. An
>>> aging use case, but still a valid one.
>>>
>> I can't find the thread, but this has been discussed at length before.
>> The justification has always been for time drift correction. If you
>> crunch the numbers, even at 1024 Hz, there just aren't enough exits to
>> really make a difference from a performance perspective.
>>
>> Just to state it more clearly, if you assume an additional 5us to drop
>> to userspace (which is absurdly high, but let's stick with it), 1024
>> exits per second comes out to about 5ms which is only 0.5% in terms of
>> CPU consumption.
>>
>
> You are considering timekeeping activities only.
>
> RHEL4 for example reads the PIT for each gettimeofday call. For
> applications that add timestamps to logging the PIT is a *HUGE* overhead
> (and the PMTMR for that matter). I have one example where something like
> 15% of each second is wasted handling the ioport reads and writes for
> get_offset_pit.
>
> david
>

I found the link to the previous discussion about moving the PIT to
userspace:
http://www.mail-archive.com/kvm@vger.kernel.org/msg02357.html

In the above discussion Marcelo pointed out that we need the PIT in the
kernel in order to have the timer and the vcpu thread running on the same
CPU. Otherwise IPIs will be sent from the io-thread to the vcpu thread in
order to inject the IRQ. I guess we could also do it using a dedicated
timer thread in userspace, but that gets more complex.

btw: I found a typo in the patch, in the line below:
"fprintf(stderr, "Create kernel PIC irqchip failed\n");"
s/PIC/PIT/

>
>> The APIC is quite a bit more understandable because especially with SMP,
>> you can generate a very high number of interrupts per second and taking
>> a drop to userspace for every EOI can start to matter with exit rates
>> in the hundreds of thousands.
>>
>>>> It's because it was easier to do interrupt catch-up by pushing the PIT
>>>> into the kernel which IMHO was the wrong path to go down.
>>>>
>>> Pushing the emulation of port 0x61 into the kernel was a mistake we now
>>> have to deal with. I'm not that sure about the PIT itself.
>>>
>> I agree re: port 0x61. I'm just saying that there is no point in moving
>> just the non "performance critical" components to userspace as Marcelo
>> suggests because the whole thing is non "performance critical".
>>
>> Regards,
>>
>> Anthony Liguori
Jan Kiszka wrote:
> When using the in-kernel PIT the speaker emulation has to synchronize
> the PIT state with KVM. Enhance the existing speaker sound device and
> allow it to take over port 0x61 by using KVM_CREATE_PIT_NOSPKR when
> available. This unbreaks -soundhw pcspk in KVM mode.
>
> diff --git a/qemu/hw/pcspk.c b/qemu/hw/pcspk.c
> index ec1d0c6..4752518 100644
> --- a/qemu/hw/pcspk.c
> +++ b/qemu/hw/pcspk.c
> @@ -27,6 +27,8 @@
>  #include "isa.h"
>  #include "audio/audio.h"
>  #include "qemu-timer.h"
> +#include "i8254.h"
> +#include "qemu-kvm.h"
>
>  #define PCSPK_BUF_LEN 1792
>  #define PCSPK_SAMPLE_RATE 32000
> @@ -71,7 +73,15 @@ static void pcspk_callback(void *opaque, int free)
>  {
>      PCSpkState *s = opaque;
>      unsigned int n;
> +#ifdef USE_KVM_PIT
> +    struct kvm_pit_state pit_state;
>
> +    if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
> +        kvm_get_pit(kvm_context, &pit_state);
> +        s->pit->channels[2].mode = pit_state.channels[2].mode;
> +        s->pit->channels[2].count = pit_state.channels[2].count;
> +    }
> +#endif
>      if (pit_get_mode(s->pit, 2) != 3)
>          return;
>
> @@ -120,7 +130,17 @@ static uint32_t pcspk_ioport_read(void *opaque, uint32_t addr)
>  {
>      PCSpkState *s = opaque;
>      int out;
> -
> +#ifdef USE_KVM_PIT
> +    struct kvm_pit_state pit_state;
> +
> +    if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
> +        kvm_get_pit(kvm_context, &pit_state);
> +        s->pit->channels[2].mode = pit_state.channels[2].mode;
> +        s->pit->channels[2].count = pit_state.channels[2].count;
> +        s->pit->channels[2].count_load_time = pit_state.channels[2].count_load_time;
> +        s->pit->channels[2].gate = pit_state.channels[2].gate;
> +    }
> +#endif
>      s->dummy_refresh_clock ^= (1 << 4);
>      out = pit_get_out(s->pit, 2, qemu_get_clock(vm_clock)) << 5;
>
> @@ -131,7 +151,17 @@ static void pcspk_ioport_write(void *opaque, uint32_t addr, uint32_t val)
>  {
>      PCSpkState *s = opaque;
>      const int gate = val & 1;
> -
> +#ifdef USE_KVM_PIT
> +    struct kvm_pit_state pit_state;
> +
> +    if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
> +        kvm_get_pit(kvm_context, &pit_state);
> +        s->pit->channels[2].mode = pit_state.channels[2].mode;
> +        s->pit->channels[2].count = pit_state.channels[2].count;
> +        s->pit->channels[2].count_load_time = pit_state.channels[2].count_load_time;
> +        s->pit->channels[2].gate = pit_state.channels[2].gate;
> +    }
> +#endif
>      s->data_on = (val >> 1) & 1;
>      pit_set_gate(s->pit, 2, gate);
>      if (s->voice) {
> @@ -139,6 +169,15 @@ static void pcspk_ioport_write(void *opaque, uint32_t addr, uint32_t val)
>          s->play_pos = 0;
>          AUD_set_active_out(s->voice, gate & s->data_on);
>      }
> +#ifdef USE_KVM_PIT
> +    if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
> +        pit_state.channels[2].mode = s->pit->channels[2].mode;
> +        pit_state.channels[2].count = s->pit->channels[2].count;
> +        pit_state.channels[2].count_load_time = s->pit->channels[2].count_load_time;
> +        pit_state.channels[2].gate = s->pit->channels[2].gate;
> +        kvm_set_pit(kvm_context, &pit_state);
> +    }
> +#endif
>  }
>
>  void pcspk_init(PITState *pit)
>

Please extract those bits into functions.
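One way the requested refactoring could look is a pair of helpers that copy the channel 2 fields in each direction, so that the callback and the two ioport handlers each make a single call. The sketch below is a guess at the shape only. The struct types are stand-ins that carry just the fields the patch touches, the helper names are hypothetical, and the `kvm_get_pit()`/`kvm_set_pit()` calls that would surround the copies in the real code are left out so that the example compiles on its own.

```c
#include <stdint.h>

/* Stand-in for one PIT channel; only the fields used by the patch. */
struct pit_chan_sketch {
    uint32_t count;
    int64_t count_load_time;
    uint8_t mode;
    uint8_t gate;
};

/* Stand-ins for struct kvm_pit_state and PITState (assumed layouts). */
struct kvm_pit_state_sketch { struct pit_chan_sketch channels[3]; };
struct pit_state_sketch { struct pit_chan_sketch channels[3]; };

/* Copy kernel channel 2 state into the userspace PIT model.
 * In the real code this would follow a kvm_get_pit() call. */
static void pcspk_pit_from_kernel(struct pit_state_sketch *pit,
                                  const struct kvm_pit_state_sketch *k)
{
    pit->channels[2].mode = k->channels[2].mode;
    pit->channels[2].count = k->channels[2].count;
    pit->channels[2].count_load_time = k->channels[2].count_load_time;
    pit->channels[2].gate = k->channels[2].gate;
}

/* Copy userspace channel 2 state back into the kernel structure.
 * In the real code this would be followed by a kvm_set_pit() call. */
static void pcspk_pit_to_kernel(struct kvm_pit_state_sketch *k,
                                const struct pit_state_sketch *pit)
{
    k->channels[2].mode = pit->channels[2].mode;
    k->channels[2].count = pit->channels[2].count;
    k->channels[2].count_load_time = pit->channels[2].count_load_time;
    k->channels[2].gate = pit->channels[2].gate;
}
```

With helpers like these, the `#ifdef USE_KVM_PIT` blocks in the patch would shrink to one guarded call per site, and the `kvm_enabled() && qemu_kvm_pit_in_kernel()` check could move inside the helpers.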
diff --git a/libkvm/libkvm-x86.c b/libkvm/libkvm-x86.c
index 2fc4fce..03b1939 100644
--- a/libkvm/libkvm-x86.c
+++ b/libkvm/libkvm-x86.c
@@ -59,6 +59,19 @@ int kvm_create_pit(kvm_context_t kvm)
 
 	kvm->pit_in_kernel = 0;
 	if (!kvm->no_pit_creation) {
+#ifdef KVM_CAP_PIT_NOSPKR
+		r = ioctl(kvm->fd, KVM_CHECK_EXTENSION, KVM_CAP_PIT_NOSPKR);
+		if (r > 0) {
+			r = ioctl(kvm->vm_fd, KVM_CREATE_PIT_NOSPKR);
+			if (r >= 0) {
+				kvm->pit_in_kernel = 1;
+				return 0;
+			} else {
+				fprintf(stderr, "Create kernel PIC irqchip failed\n");
+				return r;
+			}
+		}
+#endif
 		r = ioctl(kvm->fd, KVM_CHECK_EXTENSION, KVM_CAP_PIT);
 		if (r > 0) {
 			r = ioctl(kvm->vm_fd, KVM_CREATE_PIT);
diff --git a/qemu/hw/pcspk.c b/qemu/hw/pcspk.c
index ec1d0c6..4752518 100644
--- a/qemu/hw/pcspk.c
+++ b/qemu/hw/pcspk.c
@@ -27,6 +27,8 @@
 #include "isa.h"
 #include "audio/audio.h"
 #include "qemu-timer.h"
+#include "i8254.h"
+#include "qemu-kvm.h"
 
 #define PCSPK_BUF_LEN 1792
 #define PCSPK_SAMPLE_RATE 32000
@@ -71,7 +73,15 @@ static void pcspk_callback(void *opaque, int free)
 {
     PCSpkState *s = opaque;
     unsigned int n;
+#ifdef USE_KVM_PIT
+    struct kvm_pit_state pit_state;
 
+    if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+        kvm_get_pit(kvm_context, &pit_state);
+        s->pit->channels[2].mode = pit_state.channels[2].mode;
+        s->pit->channels[2].count = pit_state.channels[2].count;
+    }
+#endif
     if (pit_get_mode(s->pit, 2) != 3)
         return;
 
@@ -120,7 +130,17 @@ static uint32_t pcspk_ioport_read(void *opaque, uint32_t addr)
 {
     PCSpkState *s = opaque;
     int out;
-
+#ifdef USE_KVM_PIT
+    struct kvm_pit_state pit_state;
+
+    if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+        kvm_get_pit(kvm_context, &pit_state);
+        s->pit->channels[2].mode = pit_state.channels[2].mode;
+        s->pit->channels[2].count = pit_state.channels[2].count;
+        s->pit->channels[2].count_load_time = pit_state.channels[2].count_load_time;
+        s->pit->channels[2].gate = pit_state.channels[2].gate;
+    }
+#endif
     s->dummy_refresh_clock ^= (1 << 4);
     out = pit_get_out(s->pit, 2, qemu_get_clock(vm_clock)) << 5;
 
@@ -131,7 +151,17 @@ static void pcspk_ioport_write(void *opaque, uint32_t addr, uint32_t val)
 {
     PCSpkState *s = opaque;
     const int gate = val & 1;
-
+#ifdef USE_KVM_PIT
+    struct kvm_pit_state pit_state;
+
+    if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+        kvm_get_pit(kvm_context, &pit_state);
+        s->pit->channels[2].mode = pit_state.channels[2].mode;
+        s->pit->channels[2].count = pit_state.channels[2].count;
+        s->pit->channels[2].count_load_time = pit_state.channels[2].count_load_time;
+        s->pit->channels[2].gate = pit_state.channels[2].gate;
+    }
+#endif
     s->data_on = (val >> 1) & 1;
     pit_set_gate(s->pit, 2, gate);
     if (s->voice) {
@@ -139,6 +169,15 @@ static void pcspk_ioport_write(void *opaque, uint32_t addr, uint32_t val)
         s->play_pos = 0;
         AUD_set_active_out(s->voice, gate & s->data_on);
     }
+#ifdef USE_KVM_PIT
+    if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+        pit_state.channels[2].mode = s->pit->channels[2].mode;
+        pit_state.channels[2].count = s->pit->channels[2].count;
+        pit_state.channels[2].count_load_time = s->pit->channels[2].count_load_time;
+        pit_state.channels[2].gate = s->pit->channels[2].gate;
+        kvm_set_pit(kvm_context, &pit_state);
+    }
+#endif
 }
 
 void pcspk_init(PITState *pit)
When using the in-kernel PIT the speaker emulation has to synchronize
the PIT state with KVM. Enhance the existing speaker sound device and
allow it to take over port 0x61 by using KVM_CREATE_PIT_NOSPKR when
available. This unbreaks -soundhw pcspk in KVM mode.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---

 libkvm/libkvm-x86.c |   13 +++++++++++++
 qemu/hw/pcspk.c     |   43 +++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 54 insertions(+), 2 deletions(-)