Message ID | 20130620175914.4e4f9eb3.yoshikawa_takuya_b1@lab.ntt.co.jp (mailing list archive) |
---|---|
State | New, archived |
On 20/06/2013 10:59, Takuya Yoshikawa wrote:
> Without this information, users will just see unexpected performance
> problems and there is little chance we will get good reports from them:
> note that mmio generation is increased even when we just start, or stop,
> dirty logging for some memory slot, in which case users should never
> expect all shadow pages to be zapped.
>
> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> ---
>  arch/x86/kvm/mmu.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index c60c5da..bc8302f 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -4385,8 +4385,10 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm)
>  	 * The max value is MMIO_MAX_GEN - 1 since it is not called
>  	 * when mark memslot invalid.
>  	 */
> -	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1)))
> +	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1))) {
> +		printk(KERN_INFO "kvm: zapping shadow pages for mmio generation wraparound");

This should at least be rate-limited, because it is guest triggerable.

But why isn't the kvm_mmu_invalidate_zap_all_pages tracepoint enough?

Paolo

>  		kvm_mmu_invalidate_zap_all_pages(kvm);
> +	}
>  }
>
>  static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jun 20, 2013 at 12:59:54PM +0200, Paolo Bonzini wrote:
> On 20/06/2013 10:59, Takuya Yoshikawa wrote:
> > Without this information, users will just see unexpected performance
> > problems and there is little chance we will get good reports from them:
> > note that mmio generation is increased even when we just start, or stop,
> > dirty logging for some memory slot, in which case users should never
> > expect all shadow pages to be zapped.
> >
> > Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> > ---
> >  arch/x86/kvm/mmu.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > index c60c5da..bc8302f 100644
> > --- a/arch/x86/kvm/mmu.c
> > +++ b/arch/x86/kvm/mmu.c
> > @@ -4385,8 +4385,10 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm)
> >  	 * The max value is MMIO_MAX_GEN - 1 since it is not called
> >  	 * when mark memslot invalid.
> >  	 */
> > -	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1)))
> > +	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1))) {
> > +		printk(KERN_INFO "kvm: zapping shadow pages for mmio generation wraparound");
>
> This should at least be rate-limited, because it is guest triggerable.
>
It will be hard for a guest to trigger it 1 << 19 times too fast though.

> But why isn't the kvm_mmu_invalidate_zap_all_pages tracepoint enough?
>
This one will trigger during slot deletion/move too.

I would put it in to see if it actually triggers in some real world
workloads (skipping the first wraparound since it is intentional),
we can always drop it if it will turn out to create a lot of noise.

--
			Gleb.
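[Editorial sketch, not part of the thread.] The "1 << 19 times" estimate can be illustrated with a small userspace model of a masked generation counter. The 19-bit width is taken from the message above; the `sketch_*` names, the `SKETCH_*` constants, and the idea of seeding the counter near its maximum are assumptions made for illustration and are not the kernel's code. The model also does not cover what happens once the threshold has been reached.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a 19-bit generation field, per the "1 << 19" in the thread. */
#define SKETCH_GEN_SHIFT 19
#define SKETCH_GEN_MASK  ((1u << SKETCH_GEN_SHIFT) - 1)
#define SKETCH_MAX_GEN   SKETCH_GEN_MASK

/* Hypothetical stand-in for the per-VM state that carries the generation. */
struct sketch_vm {
	uint64_t memslots_generation;	/* bumped on every memslot update */
};

/* Only the low 19 bits of the counter are visible as the mmio generation. */
static uint32_t sketch_current_mmio_generation(const struct sketch_vm *vm)
{
	return (uint32_t)(vm->memslots_generation & SKETCH_GEN_MASK);
}

/*
 * Model one memslot update. Returns true when the masked generation has
 * reached the threshold used in the patch (>= MAX_GEN - 1), i.e. when a
 * "zap all shadow pages" would be requested.
 */
static bool sketch_memslot_update(struct sketch_vm *vm)
{
	vm->memslots_generation++;
	return sketch_current_mmio_generation(vm) >= SKETCH_MAX_GEN - 1;
}

/* Count how many updates it takes to reach the threshold for the first time. */
static unsigned long sketch_updates_until_zap(struct sketch_vm *vm)
{
	unsigned long n = 0;

	do {
		n++;
	} while (!sketch_memslot_update(vm));
	return n;
}
```

Starting from zero, the model needs 524286 updates (just under 1 << 19) before the threshold is hit; seeding the counter 150 below the maximum brings that down to 149, which is one way a first wraparound can arrive early.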
On Thu, 20 Jun 2013 14:45:04 +0300
Gleb Natapov <gleb@redhat.com> wrote:

> On Thu, Jun 20, 2013 at 12:59:54PM +0200, Paolo Bonzini wrote:
> > On 20/06/2013 10:59, Takuya Yoshikawa wrote:
> > > Without this information, users will just see unexpected performance
> > > problems and there is little chance we will get good reports from them:
> > > note that mmio generation is increased even when we just start, or stop,
> > > dirty logging for some memory slot, in which case users should never
> > > expect all shadow pages to be zapped.
> > >
> > > Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> > > ---
> > >  arch/x86/kvm/mmu.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > > index c60c5da..bc8302f 100644
> > > --- a/arch/x86/kvm/mmu.c
> > > +++ b/arch/x86/kvm/mmu.c
> > > @@ -4385,8 +4385,10 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm)
> > >  	 * The max value is MMIO_MAX_GEN - 1 since it is not called
> > >  	 * when mark memslot invalid.
> > >  	 */
> > > -	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1)))
> > > +	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1))) {
> > > +		printk(KERN_INFO "kvm: zapping shadow pages for mmio generation wraparound");
> >
> > This should at least be rate-limited, because it is guest triggerable.
> >
> It will be hard for a guest to trigger it 1 << 19 times too fast though.

I think guest-triggerable zap_all itself is a threat for the host, rather
than a matter of log flooding, even if it can be preempted.

>
> > But why isn't the kvm_mmu_invalidate_zap_all_pages tracepoint enough?
> >
> This one will trigger during slot deletion/move too.
>
> I would put it in to see if it actually triggers in some real world
> workloads (skipping the first wraparound since it is intentional),
> we can always drop it if it will turn out to create a lot of noise.
>

This patch is not for developers but for end users: of course they do not
use tracers during running their services normally.

If they see mysterious performance problems induced by this wraparound, the only
way to know the cause later is by this kind of information in the syslog.
So even the first wraparound may better be printed out IMO.

I want to let administrators know the cause if possible, any better way?

	Takuya
On 20/06/2013 13:45, Gleb Natapov wrote:
> On Thu, Jun 20, 2013 at 12:59:54PM +0200, Paolo Bonzini wrote:
>> On 20/06/2013 10:59, Takuya Yoshikawa wrote:
>>> Without this information, users will just see unexpected performance
>>> problems and there is little chance we will get good reports from them:
>>> note that mmio generation is increased even when we just start, or stop,
>>> dirty logging for some memory slot, in which case users should never
>>> expect all shadow pages to be zapped.
>>>
>>> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
>>> ---
>>>  arch/x86/kvm/mmu.c | 4 +++-
>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>> index c60c5da..bc8302f 100644
>>> --- a/arch/x86/kvm/mmu.c
>>> +++ b/arch/x86/kvm/mmu.c
>>> @@ -4385,8 +4385,10 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm)
>>>  	 * The max value is MMIO_MAX_GEN - 1 since it is not called
>>>  	 * when mark memslot invalid.
>>>  	 */
>>> -	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1)))
>>> +	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1))) {
>>> +		printk(KERN_INFO "kvm: zapping shadow pages for mmio generation wraparound");
>>
>> This should at least be rate-limited, because it is guest triggerable.
>>
> It will be hard for a guest to trigger it 1 << 19 times too fast though.
>
>> But why isn't the kvm_mmu_invalidate_zap_all_pages tracepoint enough?
>
> This one will trigger during slot deletion/move too.
>
> I would put it in to see if it actually triggers in some real world
> workloads (skipping the first wraparound since it is intentional),
> we can always drop it if it will turn out to create a lot of noise.

Reading a ROM in a loop can trigger it in less than 5 minutes on my
machine. Not a lot of noise, but enough to be annoying. I think the
existing tracepoint is enough, or we can add a more specific one here.

Paolo
On Thu, Jun 20, 2013 at 02:35:36PM +0200, Paolo Bonzini wrote:
> On 20/06/2013 13:45, Gleb Natapov wrote:
> > On Thu, Jun 20, 2013 at 12:59:54PM +0200, Paolo Bonzini wrote:
> >> On 20/06/2013 10:59, Takuya Yoshikawa wrote:
> >>> Without this information, users will just see unexpected performance
> >>> problems and there is little chance we will get good reports from them:
> >>> note that mmio generation is increased even when we just start, or stop,
> >>> dirty logging for some memory slot, in which case users should never
> >>> expect all shadow pages to be zapped.
> >>>
> >>> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> >>> ---
> >>>  arch/x86/kvm/mmu.c | 4 +++-
> >>>  1 file changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> >>> index c60c5da..bc8302f 100644
> >>> --- a/arch/x86/kvm/mmu.c
> >>> +++ b/arch/x86/kvm/mmu.c
> >>> @@ -4385,8 +4385,10 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm)
> >>>  	 * The max value is MMIO_MAX_GEN - 1 since it is not called
> >>>  	 * when mark memslot invalid.
> >>>  	 */
> >>> -	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1)))
> >>> +	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1))) {
> >>> +		printk(KERN_INFO "kvm: zapping shadow pages for mmio generation wraparound");
> >>
> >> This should at least be rate-limited, because it is guest triggerable.
> >>
> > It will be hard for a guest to trigger it 1 << 19 times too fast though.
> >
> >> But why isn't the kvm_mmu_invalidate_zap_all_pages tracepoint enough?
> >
> > This one will trigger during slot deletion/move too.
> >
> > I would put it in to see if it actually triggers in some real world
> > workloads (skipping the first wraparound since it is intentional),
> > we can always drop it if it will turn out to create a lot of noise.
>
> Reading a ROM in a loop can trigger it in less than 5 minutes on my
> machine. Not a lot of noise, but enough to be annoying. I think the
> existing tracepoint is enough, or we can add a more specific one here.
>
5 minutes after first wraparound?

--
			Gleb.
On Thu, Jun 20, 2013 at 09:28:37PM +0900, Takuya Yoshikawa wrote:
> On Thu, 20 Jun 2013 14:45:04 +0300
> Gleb Natapov <gleb@redhat.com> wrote:
>
> > On Thu, Jun 20, 2013 at 12:59:54PM +0200, Paolo Bonzini wrote:
> > > On 20/06/2013 10:59, Takuya Yoshikawa wrote:
> > > > Without this information, users will just see unexpected performance
> > > > problems and there is little chance we will get good reports from them:
> > > > note that mmio generation is increased even when we just start, or stop,
> > > > dirty logging for some memory slot, in which case users should never
> > > > expect all shadow pages to be zapped.
> > > >
> > > > Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> > > > ---
> > > >  arch/x86/kvm/mmu.c | 4 +++-
> > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > > > index c60c5da..bc8302f 100644
> > > > --- a/arch/x86/kvm/mmu.c
> > > > +++ b/arch/x86/kvm/mmu.c
> > > > @@ -4385,8 +4385,10 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm)
> > > >  	 * The max value is MMIO_MAX_GEN - 1 since it is not called
> > > >  	 * when mark memslot invalid.
> > > >  	 */
> > > > -	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1)))
> > > > +	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1))) {
> > > > +		printk(KERN_INFO "kvm: zapping shadow pages for mmio generation wraparound");
> > >
> > > This should at least be rate-limited, because it is guest triggerable.
> > >
> > It will be hard for a guest to trigger it 1 << 19 times too fast though.
>
> I think guest-triggerable zap_all itself is a threat for the host, rather
> than a matter of log flooding, even if it can be preempted.
>
There's not much we can do about it. Slot removal/creation is triggerable
through HW emulation registers.

> >
> > > But why isn't the kvm_mmu_invalidate_zap_all_pages tracepoint enough?
> > >
> > This one will trigger during slot deletion/move too.
> >
> > I would put it in to see if it actually triggers in some real world
> > workloads (skipping the first wraparound since it is intentional),
> > we can always drop it if it will turn out to create a lot of noise.
> >
>
> This patch is not for developers but for end users: of course they do not
> use tracers during running their services normally.
>
> If they see mysterious performance problems induced by this wraparound, the only
> way to know the cause later is by this kind of information in the syslog.
> So even the first wraparound may better be printed out IMO.
Think about starting hundreds of VMs on a freshly booted host. You will see
hundreds of those pretty quickly.

>
> I want to let administrators know the cause if possible, any better way?
>
Not that I can think of. Paolo, what about print_once() and ignoring the
first wraparound?

--
			Gleb.
On 20/06/2013 14:54, Gleb Natapov wrote:
>> If they see mysterious performance problems induced by this wraparound, the only
>> way to know the cause later is by this kind of information in the syslog.
>> So even the first wraparound may better be printed out IMO.
> Think about starting hundreds of VMs on a freshly booted host. You will see
> hundreds of those pretty quickly.

With the change I made to Xiao's patch (changing -13 to -150) you won't
see it immediately after startup, but the first wraparound may still
come very soon with a loop that reads the ROM. (The second takes 5
minutes).

>> I want to let administrators know the cause if possible, any better way?
>>
> Not that I can think of. Paolo, what about print_once() and ignoring the
> first wraparound?

printk_ratelimited is enough, even without ignoring the first
wraparound. It will handle the case of multiple VMs too.

Paolo
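[Editorial sketch, not part of the thread.] The reason a rate-limited print copes with both a looping guest and hundreds of VMs is that the limit applies to the call site, not to each VM. Below is a minimal userspace model of an interval-and-burst limiter in that spirit. It is not the kernel's implementation of printk_ratelimited; the `sketch_ratelimit` type, its fields, and the numbers in the usage note are assumptions chosen for illustration, and time is passed in explicitly so the behaviour is easy to check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limiter state shared by every caller of one log site. */
struct sketch_ratelimit {
	long interval;	/* window length, in caller-defined time units */
	int burst;	/* messages allowed per window */
	long begin;	/* start time of the current window */
	int printed;	/* messages emitted in the current window */
	int missed;	/* messages suppressed in the current window */
};

/*
 * Returns true if a message may be emitted at time `now`, false if it
 * should be suppressed. A new window opens once `interval` has elapsed.
 */
static bool sketch_ratelimit_ok(struct sketch_ratelimit *rs, long now)
{
	assert(rs->interval > 0 && rs->burst > 0);

	if (now - rs->begin >= rs->interval) {
		/* Window expired: start a fresh one and reset the counters. */
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}
```

With an interval of 5 and a burst of 10, 300 wraparound reports arriving at the same instant (for example from 300 VMs) yield 10 log lines and 290 suppressed ones; the next report after the window expires is printed again.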
On Thu, 20 Jun 2013 15:54:38 +0300
Gleb Natapov <gleb@redhat.com> wrote:

> On Thu, Jun 20, 2013 at 09:28:37PM +0900, Takuya Yoshikawa wrote:
> > On Thu, 20 Jun 2013 14:45:04 +0300
> > Gleb Natapov <gleb@redhat.com> wrote:
> >
> > > On Thu, Jun 20, 2013 at 12:59:54PM +0200, Paolo Bonzini wrote:
> > > > On 20/06/2013 10:59, Takuya Yoshikawa wrote:
> > > > > Without this information, users will just see unexpected performance
> > > > > problems and there is little chance we will get good reports from them:
> > > > > note that mmio generation is increased even when we just start, or stop,
> > > > > dirty logging for some memory slot, in which case users should never
> > > > > expect all shadow pages to be zapped.
> > > > >
> > > > > Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> > > > > ---
> > > > >  arch/x86/kvm/mmu.c | 4 +++-
> > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > > > > index c60c5da..bc8302f 100644
> > > > > --- a/arch/x86/kvm/mmu.c
> > > > > +++ b/arch/x86/kvm/mmu.c
> > > > > @@ -4385,8 +4385,10 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm)
> > > > >  	 * The max value is MMIO_MAX_GEN - 1 since it is not called
> > > > >  	 * when mark memslot invalid.
> > > > >  	 */
> > > > > -	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1)))
> > > > > +	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1))) {
> > > > > +		printk(KERN_INFO "kvm: zapping shadow pages for mmio generation wraparound");
> > > >
> > > > This should at least be rate-limited, because it is guest triggerable.
> > > >
> > > It will be hard for a guest to trigger it 1 << 19 times too fast though.
> >
> > I think guest-triggerable zap_all itself is a threat for the host, rather
> > than a matter of log flooding, even if it can be preempted.
> >
> There's not much we can do about it. Slot removal/creation is triggerable
> through HW emulation registers.

OK, I see.

>
> > >
> > > > But why isn't the kvm_mmu_invalidate_zap_all_pages tracepoint enough?
> > > >
> > > This one will trigger during slot deletion/move too.
> > >
> > > I would put it in to see if it actually triggers in some real world
> > > workloads (skipping the first wraparound since it is intentional),
> > > we can always drop it if it will turn out to create a lot of noise.
> > >
> >
> > This patch is not for developers but for end users: of course they do not
> > use tracers during running their services normally.
> >
> > If they see mysterious performance problems induced by this wraparound, the only
> > way to know the cause later is by this kind of information in the syslog.
> > So even the first wraparound may better be printed out IMO.
> Think about starting hundreds of VMs on a freshly booted host. You will see
> hundreds of those pretty quickly.

Yes.

>
> >
> > I want to let administrators know the cause if possible, any better way?
> >
> Not that I can think of. Paolo, what about print_once() and ignoring the
> first wraparound?

Assuming that the first one will be removed someday, it's for debugging anyway,
we can just do print_once() in the future?

That way, admins can check if there is any guest which did some problematic
things.

	Takuya
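[Editorial sketch, not part of the thread.] The alternative floated here, reporting once while ignoring the first (intentional) wraparound, comes down to two pieces of state: a wraparound count and a "already reported" flag. The userspace model below uses hypothetical names (`sketch_wrap_log`, `sketch_should_report`) and is only meant to show the decision logic, not how the kernel would implement a print-once helper.

```c
#include <stdbool.h>

/* Hypothetical state for one log site. */
struct sketch_wrap_log {
	unsigned long wraparounds;	/* wraparounds seen so far */
	bool reported;			/* has the message been emitted yet? */
};

/*
 * Called on every wraparound. Returns true exactly once: on the first
 * wraparound after the initial one, which the thread describes as
 * intentional and therefore not worth reporting.
 */
static bool sketch_should_report(struct sketch_wrap_log *log)
{
	log->wraparounds++;
	if (log->wraparounds == 1)
		return false;	/* skip the intentional first wraparound */
	if (log->reported)
		return false;	/* message already emitted once */
	log->reported = true;
	return true;
}
```

The trade-off against a rate-limited print is that a single report cannot show whether the wraparound keeps recurring, only that it happened at least once after startup.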
On Thu, 20 Jun 2013 15:14:42 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 20/06/2013 14:54, Gleb Natapov wrote:
> >> If they see mysterious performance problems induced by this wraparound, the only
> >> way to know the cause later is by this kind of information in the syslog.
> >> So even the first wraparound may better be printed out IMO.
> > Think about starting hundreds of VMs on a freshly booted host. You will see
> > hundreds of those pretty quickly.
>
> With the change I made to Xiao's patch (changing -13 to -150) you won't
> see it immediately after startup, but the first wraparound may still
> come very soon with a loop that reads the ROM. (The second takes 5
> minutes).
>
> >> I want to let administrators know the cause if possible, any better way?
> >>
> > Not that I can think of. Paolo, what about print_once() and ignoring the
> > first wraparound?
>
> printk_ratelimited is enough, even without ignoring the first
> wraparound. It will handle the case of multiple VMs too.

OK, I'm now trying the printk_ratelimited() version.
Will send v2 if it works.

	Takuya
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c60c5da..bc8302f 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4385,8 +4385,10 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm)
 	 * The max value is MMIO_MAX_GEN - 1 since it is not called
 	 * when mark memslot invalid.
 	 */
-	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1)))
+	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1))) {
+		printk(KERN_INFO "kvm: zapping shadow pages for mmio generation wraparound");
 		kvm_mmu_invalidate_zap_all_pages(kvm);
+	}
 }

 static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
Without this information, users will just see unexpected performance
problems and there is little chance we will get good reports from them:
note that mmio generation is increased even when we just start, or stop,
dirty logging for some memory slot, in which case users should never
expect all shadow pages to be zapped.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)