Message ID | 1423579310-24555-7-git-send-email-riel@redhat.com (mailing list archive) |
---|---|
State | New, archived |
On 02/10/2015 06:41 AM, riel@redhat.com wrote:
> From: Rik van Riel <riel@redhat.com>
>
> The host kernel is not doing anything while the CPU is executing
> a KVM guest VCPU, so it can be marked as being in an extended
> quiescent state, identical to that used when running user space
> code.
>
> The only exception to that rule is when the host handles an
> interrupt, which is already handled by the irq code, which
> calls rcu_irq_enter and rcu_irq_exit.
>
> The guest_enter and guest_exit functions already switch vtime
> accounting independent of context tracking. Leave those calls
> where they are, instead of moving them into the context tracking
> code.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
>  include/linux/context_tracking.h       | 6 ++++++
>  include/linux/context_tracking_state.h | 1 +
>  include/linux/kvm_host.h               | 3 ++-
>  3 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> index 954253283709..b65fd1420e53 100644
> --- a/include/linux/context_tracking.h
> +++ b/include/linux/context_tracking.h
> @@ -80,10 +80,16 @@ static inline void guest_enter(void)
>  		vtime_guest_enter(current);
>  	else
>  		current->flags |= PF_VCPU;
> +
> +	if (context_tracking_is_enabled())
> +		context_tracking_enter(IN_GUEST);

Why the if statement?

Also, have you checked how much this hurts guest lightweight
entry/exit latency? Context tracking is shockingly expensive for
reasons I don't fully understand, but hopefully most of it is the
vtime stuff. (Context tracking is *so* expensive that I almost
think we should set the performance taint flag if we enable it,
assuming that flag ended up getting merged. Also, we should make
context tracking faster.)

--Andy
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On 02/10/2015 02:59 PM, Andy Lutomirski wrote:
> On 02/10/2015 06:41 AM, riel@redhat.com wrote:
>> From: Rik van Riel <riel@redhat.com>
>>
>> The host kernel is not doing anything while the CPU is executing
>> a KVM guest VCPU, so it can be marked as being in an extended
>> quiescent state, identical to that used when running user space
>> code.
>>
>> The only exception to that rule is when the host handles an
>> interrupt, which is already handled by the irq code, which
>> calls rcu_irq_enter and rcu_irq_exit.
>>
>> The guest_enter and guest_exit functions already switch vtime
>> accounting independent of context tracking. Leave those calls
>> where they are, instead of moving them into the context tracking
>> code.
>>
>> Signed-off-by: Rik van Riel <riel@redhat.com>
>> ---
>>  include/linux/context_tracking.h       | 6 ++++++
>>  include/linux/context_tracking_state.h | 1 +
>>  include/linux/kvm_host.h               | 3 ++-
>>  3 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/context_tracking.h
>> b/include/linux/context_tracking.h
>> index 954253283709..b65fd1420e53 100644
>> --- a/include/linux/context_tracking.h
>> +++ b/include/linux/context_tracking.h
>> @@ -80,10 +80,16 @@ static inline void guest_enter(void)
>>  		vtime_guest_enter(current);
>>  	else
>>  		current->flags |= PF_VCPU;
>> +
>> +	if (context_tracking_is_enabled())
>> +		context_tracking_enter(IN_GUEST);
>
> Why the if statement?
>
> Also, have you checked how much this hurts guest lightweight entry/exit
> latency? Context tracking is shockingly expensive for reasons I don't
> fully understand, but hopefully most of it is the vtime stuff.

Guest_enter and guest_exit already do the vtime stuff today.

This patch series adds the rcu stuff, and modifies
context_tracking_enter & context_tracking_exit to not
do the vtime stuff twice.

> (Context tracking is *so* expensive that I almost think we should set the
> performance taint flag if we enable it, assuming that flag ended up
> getting merged. Also, we should make context tracking faster.)

I am all for making it faster :)

On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
> On 02/10/2015 06:41 AM, riel@redhat.com wrote:
> >From: Rik van Riel <riel@redhat.com>
> >
> >The host kernel is not doing anything while the CPU is executing
> >a KVM guest VCPU, so it can be marked as being in an extended
> >quiescent state, identical to that used when running user space
> >code.
> >
> >The only exception to that rule is when the host handles an
> >interrupt, which is already handled by the irq code, which
> >calls rcu_irq_enter and rcu_irq_exit.
> >
> >The guest_enter and guest_exit functions already switch vtime
> >accounting independent of context tracking. Leave those calls
> >where they are, instead of moving them into the context tracking
> >code.
> >
> >Signed-off-by: Rik van Riel <riel@redhat.com>
> >---
> > include/linux/context_tracking.h       | 6 ++++++
> > include/linux/context_tracking_state.h | 1 +
> > include/linux/kvm_host.h               | 3 ++-
> > 3 files changed, 9 insertions(+), 1 deletion(-)
> >
> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> >index 954253283709..b65fd1420e53 100644
> >--- a/include/linux/context_tracking.h
> >+++ b/include/linux/context_tracking.h
> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
> > 		vtime_guest_enter(current);
> > 	else
> > 		current->flags |= PF_VCPU;
> >+
> >+	if (context_tracking_is_enabled())
> >+		context_tracking_enter(IN_GUEST);
>
> Why the if statement?
>
> Also, have you checked how much this hurts guest lightweight
> entry/exit latency? Context tracking is shockingly expensive for
> reasons I don't fully understand, but hopefully most of it is the
> vtime stuff. (Context tracking is *so* expensive that I almost
> think we should set the performance taint flag if we enable it,
> assuming that flag ended up getting merged. Also, we should make
> context tracking faster.)

It turns out that context_tracking_is_enabled() is a static inline
that uses a static_key, so the overhead should be minimal on platforms
having a full implementation of static keys.

							Thanx, Paul

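A minimal user-space sketch of the caller-side guard Paul describes, under stated assumptions. The names `tracking_key_model`, `tracking_is_enabled_model()`, `tracking_enter_model()` and `guarded_enter_model()` are hypothetical, and a plain `bool` stands in for the static key: the kernel's static keys patch the branch site at runtime, which ordinary C cannot reproduce. The sketch only shows why the disabled case never reaches the out-of-line function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the static key; the kernel patches code instead. */
static bool tracking_key_model;

/* Counts how often the out-of-line path actually runs. */
static int slow_path_calls;

/* Model of the inline check: cheap enough to sit in every caller. */
static inline bool tracking_is_enabled_model(void)
{
	return tracking_key_model;
}

/* Model of the out-of-line work, only worth paying for when enabled. */
static void tracking_enter_model(void)
{
	/* The guard in the caller must have filtered out the disabled case. */
	assert(tracking_key_model);
	slow_path_calls++;
}

/* Caller-side guard, shaped like the hunk added to guest_enter(). */
static inline void guarded_enter_model(void)
{
	if (tracking_is_enabled_model())
		tracking_enter_model();
}
```

With the flag clear, `guarded_enter_model()` returns without a function call; with the flag set, it takes the out-of-line path once per call. On architectures without jump-label support the kernel falls back to an ordinary load and conditional branch, which is roughly what this model does everywhere.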
On Tue, Feb 10, 2015 at 12:14 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
>> On 02/10/2015 06:41 AM, riel@redhat.com wrote:
>> >From: Rik van Riel <riel@redhat.com>
>> >
>> >The host kernel is not doing anything while the CPU is executing
>> >a KVM guest VCPU, so it can be marked as being in an extended
>> >quiescent state, identical to that used when running user space
>> >code.
>> >
>> >The only exception to that rule is when the host handles an
>> >interrupt, which is already handled by the irq code, which
>> >calls rcu_irq_enter and rcu_irq_exit.
>> >
>> >The guest_enter and guest_exit functions already switch vtime
>> >accounting independent of context tracking. Leave those calls
>> >where they are, instead of moving them into the context tracking
>> >code.
>> >
>> >Signed-off-by: Rik van Riel <riel@redhat.com>
>> >---
>> > include/linux/context_tracking.h       | 6 ++++++
>> > include/linux/context_tracking_state.h | 1 +
>> > include/linux/kvm_host.h               | 3 ++-
>> > 3 files changed, 9 insertions(+), 1 deletion(-)
>> >
>> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
>> >index 954253283709..b65fd1420e53 100644
>> >--- a/include/linux/context_tracking.h
>> >+++ b/include/linux/context_tracking.h
>> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
>> > 		vtime_guest_enter(current);
>> > 	else
>> > 		current->flags |= PF_VCPU;
>> >+
>> >+	if (context_tracking_is_enabled())
>> >+		context_tracking_enter(IN_GUEST);
>>
>> Why the if statement?
>>
>> Also, have you checked how much this hurts guest lightweight
>> entry/exit latency? Context tracking is shockingly expensive for
>> reasons I don't fully understand, but hopefully most of it is the
>> vtime stuff. (Context tracking is *so* expensive that I almost
>> think we should set the performance taint flag if we enable it,
>> assuming that flag ended up getting merged. Also, we should make
>> context tracking faster.)
>
> It turns out that context_tracking_is_enabled() is a static inline
> that uses a static_key, so the overhead should be minimal on platforms
> having a full implementation of static keys.

Shouldn't we just fold that into context_tracking_xyz_enter?

Also, why does the vtime stuff depend on RCU extended quiescent
states? To me, they seem mostly orthogonal other than the fact that
they hook into the same places.

--Andy

On Tue, Feb 10, 2015 at 12:19:28PM -0800, Andy Lutomirski wrote:
> On Tue, Feb 10, 2015 at 12:14 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
> >> On 02/10/2015 06:41 AM, riel@redhat.com wrote:
> >> >From: Rik van Riel <riel@redhat.com>
> >> >
> >> >The host kernel is not doing anything while the CPU is executing
> >> >a KVM guest VCPU, so it can be marked as being in an extended
> >> >quiescent state, identical to that used when running user space
> >> >code.
> >> >
> >> >The only exception to that rule is when the host handles an
> >> >interrupt, which is already handled by the irq code, which
> >> >calls rcu_irq_enter and rcu_irq_exit.
> >> >
> >> >The guest_enter and guest_exit functions already switch vtime
> >> >accounting independent of context tracking. Leave those calls
> >> >where they are, instead of moving them into the context tracking
> >> >code.
> >> >
> >> >Signed-off-by: Rik van Riel <riel@redhat.com>
> >> >---
> >> > include/linux/context_tracking.h       | 6 ++++++
> >> > include/linux/context_tracking_state.h | 1 +
> >> > include/linux/kvm_host.h               | 3 ++-
> >> > 3 files changed, 9 insertions(+), 1 deletion(-)
> >> >
> >> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> >> >index 954253283709..b65fd1420e53 100644
> >> >--- a/include/linux/context_tracking.h
> >> >+++ b/include/linux/context_tracking.h
> >> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
> >> > 		vtime_guest_enter(current);
> >> > 	else
> >> > 		current->flags |= PF_VCPU;
> >> >+
> >> >+	if (context_tracking_is_enabled())
> >> >+		context_tracking_enter(IN_GUEST);
> >>
> >> Why the if statement?
> >>
> >> Also, have you checked how much this hurts guest lightweight
> >> entry/exit latency? Context tracking is shockingly expensive for
> >> reasons I don't fully understand, but hopefully most of it is the
> >> vtime stuff. (Context tracking is *so* expensive that I almost
> >> think we should set the performance taint flag if we enable it,
> >> assuming that flag ended up getting merged. Also, we should make
> >> context tracking faster.)
> >
> > It turns out that context_tracking_is_enabled() is a static inline
> > that uses a static_key, so the overhead should be minimal on platforms
> > having a full implementation of static keys.
>
> Shouldn't we just fold that into context_tracking_xyz_enter?

If I am not getting too confused, Rik did that initially, but it caused
some pain for the ARM guys. I don't see a performance downside, at
least not for a modern compiler that does a decent job of inlining.

> Also, why does the vtime stuff depend on RCU extended quiescent
> states? To me, they seem mostly orthogonal other than the fact that
> they hook into the same places.

I might be missing your point, but...

If there are no scheduling-clock interrupts, then the CPU needs to be
in an extended quiescent state, otherwise you will get RCU CPU stall
warnings and eventually OOM. Similarly, if there are no scheduling-clock
interrupts, then you need to compute the vtime stuff based on start times
and deltas instead of relying on a scheduling-clock interrupt that never
comes. So it isn't that the vtime and RCU stuff are directly related,
but rather that they both must take evasive action if there are to be
no scheduling-clock interrupts for an extended time period. Therefore,
they really need to key off of the same conditions.

							Thanx, Paul

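Paul's point about "start times and deltas" can be sketched in user-space C. This is an illustration under stated assumptions, not the kernel's vtime code: `struct vtime_model`, `enum cpu_mode_model` and `vtime_switch_model()` are hypothetical names, and timestamps are passed in by the caller rather than read from a clock. Time is charged only at mode transitions, so no periodic tick is needed to keep the counters correct.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical execution modes for the model. */
enum cpu_mode_model { MODEL_MODE_KERNEL, MODEL_MODE_USER };

struct vtime_model {
	enum cpu_mode_model mode;  /* what the CPU is running right now */
	uint64_t last_switch_ns;   /* timestamp of the last mode change */
	uint64_t user_ns;          /* accumulated user time */
	uint64_t kernel_ns;        /* accumulated kernel time */
};

/*
 * Charge the time since the last transition to the mode that is ending,
 * then record the new mode and the new start time.
 */
static void vtime_switch_model(struct vtime_model *vt,
			       enum cpu_mode_model next, uint64_t now_ns)
{
	uint64_t delta;

	/* Timestamps are assumed to be monotonic. */
	assert(now_ns >= vt->last_switch_ns);
	delta = now_ns - vt->last_switch_ns;

	if (vt->mode == MODEL_MODE_USER)
		vt->user_ns += delta;
	else
		vt->kernel_ns += delta;

	vt->mode = next;
	vt->last_switch_ns = now_ns;
}
```

A tick-based scheme would instead add one tick's worth of time to whichever mode the interrupt happened to land in; with the tick stopped, that scheme accumulates nothing, which is why the delta-based path has to key off the same condition as the RCU extended quiescent state.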
On Tue, Feb 10, 2015 at 12:42 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Feb 10, 2015 at 12:19:28PM -0800, Andy Lutomirski wrote:
>> On Tue, Feb 10, 2015 at 12:14 PM, Paul E. McKenney
>> <paulmck@linux.vnet.ibm.com> wrote:
>> > On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
>> >> On 02/10/2015 06:41 AM, riel@redhat.com wrote:
>> >> >From: Rik van Riel <riel@redhat.com>
>> >> >
>> >> >The host kernel is not doing anything while the CPU is executing
>> >> >a KVM guest VCPU, so it can be marked as being in an extended
>> >> >quiescent state, identical to that used when running user space
>> >> >code.
>> >> >
>> >> >The only exception to that rule is when the host handles an
>> >> >interrupt, which is already handled by the irq code, which
>> >> >calls rcu_irq_enter and rcu_irq_exit.
>> >> >
>> >> >The guest_enter and guest_exit functions already switch vtime
>> >> >accounting independent of context tracking. Leave those calls
>> >> >where they are, instead of moving them into the context tracking
>> >> >code.
>> >> >
>> >> >Signed-off-by: Rik van Riel <riel@redhat.com>
>> >> >---
>> >> > include/linux/context_tracking.h       | 6 ++++++
>> >> > include/linux/context_tracking_state.h | 1 +
>> >> > include/linux/kvm_host.h               | 3 ++-
>> >> > 3 files changed, 9 insertions(+), 1 deletion(-)
>> >> >
>> >> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
>> >> >index 954253283709..b65fd1420e53 100644
>> >> >--- a/include/linux/context_tracking.h
>> >> >+++ b/include/linux/context_tracking.h
>> >> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
>> >> > 		vtime_guest_enter(current);
>> >> > 	else
>> >> > 		current->flags |= PF_VCPU;
>> >> >+
>> >> >+	if (context_tracking_is_enabled())
>> >> >+		context_tracking_enter(IN_GUEST);
>> >>
>> >> Why the if statement?
>> >>
>> >> Also, have you checked how much this hurts guest lightweight
>> >> entry/exit latency? Context tracking is shockingly expensive for
>> >> reasons I don't fully understand, but hopefully most of it is the
>> >> vtime stuff. (Context tracking is *so* expensive that I almost
>> >> think we should set the performance taint flag if we enable it,
>> >> assuming that flag ended up getting merged. Also, we should make
>> >> context tracking faster.)
>> >
>> > It turns out that context_tracking_is_enabled() is a static inline
>> > that uses a static_key, so the overhead should be minimal on platforms
>> > having a full implementation of static keys.
>>
>> Shouldn't we just fold that into context_tracking_xyz_enter?
>
> If I am not getting too confused, Rik did that initially, but it caused
> some pain for the ARM guys. I don't see a performance downside, at
> least not for a modern compiler that does a decent job of inlining.

It's more of a tidiness issue to me than a performance issue.

>
>> Also, why does the vtime stuff depend on RCU extended quiescent
>> states? To me, they seem mostly orthogonal other than the fact that
>> they hook into the same places.
>
> I might be missing your point, but...
>
> If there are no scheduling-clock interrupts, then the CPU needs to be
> in an extended quiescent state, otherwise you will get RCU CPU stall
> warnings and eventually OOM. Similarly, if there are no scheduling-clock
> interrupts, then you need to compute the vtime stuff based on start times
> and deltas instead of relying on a scheduling-clock interrupt that never
> comes. So it isn't that the vtime and RCU stuff are directly related,
> but rather that they both must take evasive action if there are to be
> no scheduling-clock interrupts for an extended time period.

I'm probably missing something, but isn't vtime also used for accurate
CPU time stats?

--Andy

On Tue, Feb 10, 2015 at 01:00:35PM -0800, Andy Lutomirski wrote:
> On Tue, Feb 10, 2015 at 12:42 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Feb 10, 2015 at 12:19:28PM -0800, Andy Lutomirski wrote:
> >> On Tue, Feb 10, 2015 at 12:14 PM, Paul E. McKenney
> >> <paulmck@linux.vnet.ibm.com> wrote:
> >> > On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
> >> >> On 02/10/2015 06:41 AM, riel@redhat.com wrote:
> >> >> >From: Rik van Riel <riel@redhat.com>
> >> >> >
> >> >> >The host kernel is not doing anything while the CPU is executing
> >> >> >a KVM guest VCPU, so it can be marked as being in an extended
> >> >> >quiescent state, identical to that used when running user space
> >> >> >code.
> >> >> >
> >> >> >The only exception to that rule is when the host handles an
> >> >> >interrupt, which is already handled by the irq code, which
> >> >> >calls rcu_irq_enter and rcu_irq_exit.
> >> >> >
> >> >> >The guest_enter and guest_exit functions already switch vtime
> >> >> >accounting independent of context tracking. Leave those calls
> >> >> >where they are, instead of moving them into the context tracking
> >> >> >code.
> >> >> >
> >> >> >Signed-off-by: Rik van Riel <riel@redhat.com>
> >> >> >---
> >> >> > include/linux/context_tracking.h       | 6 ++++++
> >> >> > include/linux/context_tracking_state.h | 1 +
> >> >> > include/linux/kvm_host.h               | 3 ++-
> >> >> > 3 files changed, 9 insertions(+), 1 deletion(-)
> >> >> >
> >> >> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> >> >> >index 954253283709..b65fd1420e53 100644
> >> >> >--- a/include/linux/context_tracking.h
> >> >> >+++ b/include/linux/context_tracking.h
> >> >> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
> >> >> > 		vtime_guest_enter(current);
> >> >> > 	else
> >> >> > 		current->flags |= PF_VCPU;
> >> >> >+
> >> >> >+	if (context_tracking_is_enabled())
> >> >> >+		context_tracking_enter(IN_GUEST);
> >> >>
> >> >> Why the if statement?
> >> >>
> >> >> Also, have you checked how much this hurts guest lightweight
> >> >> entry/exit latency? Context tracking is shockingly expensive for
> >> >> reasons I don't fully understand, but hopefully most of it is the
> >> >> vtime stuff. (Context tracking is *so* expensive that I almost
> >> >> think we should set the performance taint flag if we enable it,
> >> >> assuming that flag ended up getting merged. Also, we should make
> >> >> context tracking faster.)
> >> >
> >> > It turns out that context_tracking_is_enabled() is a static inline
> >> > that uses a static_key, so the overhead should be minimal on platforms
> >> > having a full implementation of static keys.
> >>
> >> Shouldn't we just fold that into context_tracking_xyz_enter?
> >
> > If I am not getting too confused, Rik did that initially, but it caused
> > some pain for the ARM guys. I don't see a performance downside, at
> > least not for a modern compiler that does a decent job of inlining.
>
> It's more of a tidiness issue to me than a performance issue.

I feel that the current patch does a good job of optimizing global
tidiness.

> >> Also, why does the vtime stuff depend on RCU extended quiescent
> >> states? To me, they seem mostly orthogonal other than the fact that
> >> they hook into the same places.
> >
> > I might be missing your point, but...
> >
> > If there are no scheduling-clock interrupts, then the CPU needs to be
> > in an extended quiescent state, otherwise you will get RCU CPU stall
> > warnings and eventually OOM. Similarly, if there are no scheduling-clock
> > interrupts, then you need to compute the vtime stuff based on start times
> > and deltas instead of relying on a scheduling-clock interrupt that never
> > comes. So it isn't that the vtime and RCU stuff are directly related,
> > but rather that they both must take evasive action if there are to be
> > no scheduling-clock interrupts for an extended time period.
>
> I'm probably missing something, but isn't vtime also used for accurate
> CPU time stats?

Right. In my previous email, I only talked about what happens if there
is no scheduling-clock interrupt, and your question is instead about
what can happen if the scheduling-clock interrupt is enabled.

The accurate CPU time stats are optional if you leave the scheduling
clock on during userspace execution, but become mandatory in the
nohz_full case where the scheduling clock is disabled across userspace
execution. So the accurate CPU time stats are mandatory in the same
situation where you have an RCU extended quiescent state.

							Thanx, Paul

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 954253283709..b65fd1420e53 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -80,10 +80,16 @@ static inline void guest_enter(void)
 		vtime_guest_enter(current);
 	else
 		current->flags |= PF_VCPU;
+
+	if (context_tracking_is_enabled())
+		context_tracking_enter(IN_GUEST);
 }
 
 static inline void guest_exit(void)
 {
+	if (context_tracking_is_enabled())
+		context_tracking_exit(IN_GUEST);
+
 	if (vtime_accounting_enabled())
 		vtime_guest_exit(current);
 	else
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 72ab10fe1e46..90a7bab8779e 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -15,6 +15,7 @@ struct context_tracking {
 	enum ctx_state {
 		IN_KERNEL = 0,
 		IN_USER,
+		IN_GUEST,
 	} state;
 };
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 26f106022c88..c7828a6a9614 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -772,7 +772,8 @@ static inline void kvm_guest_enter(void)
 	 * one time slice). Lets treat guest mode as quiescent state, just like
 	 * we do with user-mode execution.
 	 */
-	rcu_virt_note_context_switch(smp_processor_id());
+	if (!context_tracking_cpu_is_enabled())
+		rcu_virt_note_context_switch(smp_processor_id());
 }
 
 static inline void kvm_guest_exit(void)
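The control flow the patch sets up can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not kernel code: every name ending in `_model` is hypothetical, a single `tracking_enabled` flag stands in for both `context_tracking_is_enabled()` and `context_tracking_cpu_is_enabled()`, `rcu_watching` is a simplified stand-in for whether RCU still expects quiescent states from the CPU, and the model assumes that `kvm_guest_enter()` calls `guest_enter()` before the check shown in the last hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the three states in the patched enum ctx_state. */
enum ctx_state_model { MODEL_IN_KERNEL = 0, MODEL_IN_USER, MODEL_IN_GUEST };

struct cpu_model {
	bool tracking_enabled;       /* assumed: one flag for both checks */
	enum ctx_state_model state;  /* current context-tracking state */
	bool rcu_watching;           /* false while in an extended quiescent state */
	int rcu_notes;               /* one-off quiescent-state reports */
};

/* With tracking on, entering the guest starts an extended quiescent state. */
static void guest_enter_model(struct cpu_model *cpu)
{
	if (cpu->tracking_enabled) {
		cpu->state = MODEL_IN_GUEST;
		cpu->rcu_watching = false;
	}
}

/* Leaving the guest ends the extended quiescent state again. */
static void guest_exit_model(struct cpu_model *cpu)
{
	if (cpu->tracking_enabled) {
		assert(cpu->state == MODEL_IN_GUEST);
		cpu->state = MODEL_IN_KERNEL;
		cpu->rcu_watching = true;
	}
}

/* Without tracking, fall back to a single quiescent-state report. */
static void kvm_guest_enter_model(struct cpu_model *cpu)
{
	guest_enter_model(cpu);
	if (!cpu->tracking_enabled)
		cpu->rcu_notes++;
}
```

The two paths are mutually exclusive in this model: a CPU with tracking enabled is marked as being in an extended quiescent state for the whole guest run, while a CPU without it only reports a single quiescent state at entry, which matches the old behaviour that the last hunk keeps as the fallback.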