Message ID | 149633845700.12814.7130992212550379105.stgit@Solace.fritz.box (mailing list archive)
---|---
State | New, archived
On 01/06/17 18:34, Dario Faggioli wrote: > Trace when interrupts are disabled and (re)enabled. > Basically, we replace the IRQ disabling and enabling > functions with helpers that does the same, but also > output the proper trace record. > > For putting in the record something that will let > us identify _where_ in the code (i.e., in what function) > the IRQ manipulation is happening, use either: > - current_text_addr(), > - or __builtin_return_address(0). > > In fact, depending on whether the disabling/enabling > happens in macros (like for local_irq_disable() and > local_irq_enable()) or in actual functions (like in > spin_lock_irq*()), it is either: > - the actual content of the instruction pointer when > IRQ are disabled/enabled, > - or the return address of the utility function where > IRQ are disabled/enabled, > that will tell us what it is the actual piece of code > that is asking for the IRQ manipulation operation. > > Gate this with its specific Kconfig option, and keep > it in disabled state by default (i.e., don't build it, > if not explicitly specified), as the impact on > performance may be non negligible. > > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> This reminds me that I need to dust off my series which fixes local_irq_save() to not capture its variables by name. I.e. it would be used as "flags = local_irq_save();" in the normal C way. I'm fairly sure I can also improve the generated assembly. > --- > Cc: George Dunlap <george.dunlap@eu.citrix.com> > Cc: Stefano Stabellini <sstabellini@kernel.org> > Cc: Julien Grall <julien.grall@arm.com> > Cc: Jan Beulich <jbeulich@suse.com> > Cc: Andrew Cooper <andrew.cooper3@citrix.com> > Cc: Jennifer Herbert <jennifer.herbert@citrix.com> > --- > xen/Kconfig.debug | 11 ++++- > xen/common/spinlock.c | 16 +++++-- > xen/common/trace.c | 10 +++- > xen/include/asm-arm/arm32/system.h | 12 +++++ > xen/include/asm-arm/arm64/system.h | 12 +++++ > xen/include/asm-x86/system.h | 85 ++++++++++++++++++++++++++++++++++-- > xen/include/public/trace.h | 2 + > xen/include/xen/rwlock.h | 33 +++++++++++--- > 8 files changed, 161 insertions(+), 20 deletions(-) > > diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug > index 374c1c0..81910c9 100644 > --- a/xen/Kconfig.debug > +++ b/xen/Kconfig.debug > @@ -98,7 +98,7 @@ config PERF_ARRAYS > ---help--- > Enables software performance counter array histograms. > > -config TRACING > +menuconfig TRACING > bool "Tracing" > default y > ---help--- > @@ -106,6 +106,15 @@ config TRACING > in per-CPU ring buffers. The 'xentrace' tool can be used to read > the buffers and dump the content on the disk. > > +config TRACE_IRQSOFF > + bool "Trace when IRQs are disabled and (re)enabled" if EXPERT = "y" > + default n > + depends on TRACING > + ---help--- > + Makes it possible to generate events _every_ time IRQs are disabled > + and (re)enabled, with also an indication of where that happened. > + Note that this comes with high overead and produces huge amount of > + tracing data. You've got mixed tabs/spaces here. 
> > config VERBOSE_DEBUG > bool "Verbose debug messages" > diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c > index 2a06406..33b903e 100644 > --- a/xen/common/spinlock.c > +++ b/xen/common/spinlock.c > @@ -150,7 +150,9 @@ void _spin_lock(spinlock_t *lock) > void _spin_lock_irq(spinlock_t *lock) > { > ASSERT(local_irq_is_enabled()); > - local_irq_disable(); > + _local_irq_disable(); > + if ( unlikely(tb_init_done) ) > + trace_irq_disable_ret(); > _spin_lock(lock); > } Is it sensible to do this way around? By writing the trace record while interrupts are disabled, you do prevent nesting in the general case (but not in NMIs/MCEs or the irqsave() variants), but you increase the critical region while interrupts are disabled. Alternatively, writing the trace record outside of the critical region would reduce the size of the region. Interpretation logic already needs to cope with nesting, so is one extra level of nesting in this corner case a problem? Does the logic cope with the fact that interrupt gates automatically disable interrupts? ~Andrew
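For reference, the two orderings under discussion look like this (a minimal sketch for illustration, reusing the helpers introduced by the patch; it is not itself part of the series):

    /* Variant A (as in the patch): write the record with IRQs already off.
     * Nothing can interleave with the record, but the IRQ-disabled
     * critical region gets longer. */
    void _spin_lock_irq(spinlock_t *lock)
    {
        ASSERT(local_irq_is_enabled());
        _local_irq_disable();
        if ( unlikely(tb_init_done) )
            trace_irq_disable_ret();    /* traced inside the region */
        _spin_lock(lock);
    }

    /* Variant B (Andrew's alternative): write the record first.
     * The region stays minimal, but an interrupt can land between the
     * record and the actual disable, nesting one extra level. */
    void _spin_lock_irq(spinlock_t *lock)
    {
        ASSERT(local_irq_is_enabled());
        if ( unlikely(tb_init_done) )
            trace_irq_disable_ret();    /* traced with IRQs still on */
        _local_irq_disable();
        _spin_lock(lock);
    }

Dario's reply below explains why the patch picks variant A.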
On Thu, 2017-06-01 at 20:08 +0100, Andrew Cooper wrote: > On 01/06/17 18:34, Dario Faggioli wrote: > > diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c > > index 2a06406..33b903e 100644 > > --- a/xen/common/spinlock.c > > +++ b/xen/common/spinlock.c > > @@ -150,7 +150,9 @@ void _spin_lock(spinlock_t *lock) > > void _spin_lock_irq(spinlock_t *lock) > > { > > ASSERT(local_irq_is_enabled()); > > - local_irq_disable(); > > + _local_irq_disable(); > > + if ( unlikely(tb_init_done) ) > > + trace_irq_disable_ret(); > > _spin_lock(lock); > > } > > Is it sensible to do this way around? > Well, I guess either variant has its upsides and downsides, and it looks to me that there is no way to measure this without interfering with the behavior of the thing being measured ("Once upon a time, there was a cat, who lived in a box. His name was Schrödinger..." :-D :-D :-D) > By writing the trace record while interrupts are disabled, you do > prevent nesting in the general case (but not in NMIs/MCEs or the > irqsave() variants), > Forgive the ignorance, what's special about NMIs/MCAs that is relevant for this? About _irqsave(), I'm also not sure what you mean, as _irqsave() is already done differently than this. > but you increase the critical region while > interrupts are disabled. > I know. > Alternatively, writing the trace record outside of the critical > region > would reduce the size of the region. Interpretation logic already > needs > to cope with nesting, so is one extra level of nesting in this corner > case a problem? > TBH, I was indeed trying to offer the component that will be looking at and interpreting the data the clearest possible view of when IRQs are _actually_ disabled and enabled. As in, if nesting occurs, report only the events corresponding to: - the first local_irq_disable() - the last local_irq_enable() I.e., to not require that it (the interpretation logic) understand nesting. But more than this, what I was concerned about was the accuracy of the reporting itself. In fact, if I do as you suggest, I can be interrupted (as interrupts are still on) after having called trace_irq_disable(). That means the report will show a longer IRQ-disabled period, for this instance, than the reality. And the same is true on the enabling side, when I call trace_irq_enable() _before_ re-enabling interrupts, which has the same bad effect on the reported length of the IRQ-disabled section. Maybe, considering that anything that can interrupt me in these cases is another interrupt, which will most likely disable interrupts itself, I should not worry too much about this... What do you think? > Does the logic cope with the fact that interrupt gates automatically > disable interrupts? > Ah, right. No, it does not. I probably should mention this in the changelog. Any ideas of how to deal with that? If yes, I'm more than happy to fix this... Thanks and Regards, Dario
Hi Dario, On 01/06/17 18:34, Dario Faggioli wrote: > Trace when interrupts are disabled and (re)enabled. > Basically, we replace the IRQ disabling and enabling > functions with helpers that does the same, but also > output the proper trace record. > > For putting in the record something that will let > us identify _where_ in the code (i.e., in what function) > the IRQ manipulation is happening, use either: > - current_text_addr(), > - or __builtin_return_address(0). > > In fact, depending on whether the disabling/enabling > happens in macros (like for local_irq_disable() and > local_irq_enable()) or in actual functions (like in > spin_lock_irq*()), it is either: > - the actual content of the instruction pointer when > IRQ are disabled/enabled, > - or the return address of the utility function where > IRQ are disabled/enabled, > that will tell us what it is the actual piece of code > that is asking for the IRQ manipulation operation. > > Gate this with its specific Kconfig option, and keep > it in disabled state by default (i.e., don't build it, > if not explicitly specified), as the impact on > performance may be non negligible. > > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> > --- > Cc: George Dunlap <george.dunlap@eu.citrix.com> > Cc: Stefano Stabellini <sstabellini@kernel.org> > Cc: Julien Grall <julien.grall@arm.com> > Cc: Jan Beulich <jbeulich@suse.com> > Cc: Andrew Cooper <andrew.cooper3@citrix.com> > Cc: Jennifer Herbert <jennifer.herbert@citrix.com> > --- > xen/Kconfig.debug | 11 ++++- > xen/common/spinlock.c | 16 +++++-- > xen/common/trace.c | 10 +++- > xen/include/asm-arm/arm32/system.h | 12 +++++ > xen/include/asm-arm/arm64/system.h | 12 +++++ > xen/include/asm-x86/system.h | 85 ++++++++++++++++++++++++++++++++++-- > xen/include/public/trace.h | 2 + > xen/include/xen/rwlock.h | 33 +++++++++++--- > 8 files changed, 161 insertions(+), 20 deletions(-) > > diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug > index 374c1c0..81910c9 100644 > --- a/xen/Kconfig.debug > +++ b/xen/Kconfig.debug > @@ -98,7 +98,7 @@ config PERF_ARRAYS > ---help--- > Enables software performance counter array histograms. > > -config TRACING > +menuconfig TRACING > bool "Tracing" > default y > ---help--- > @@ -106,6 +106,15 @@ config TRACING > in per-CPU ring buffers. The 'xentrace' tool can be used to read > the buffers and dump the content on the disk. > > +config TRACE_IRQSOFF > + bool "Trace when IRQs are disabled and (re)enabled" if EXPERT = "y" > + default n > + depends on TRACING > + ---help--- > + Makes it possible to generate events _every_ time IRQs are disabled > + and (re)enabled, with also an indication of where that happened. > + Note that this comes with high overead and produces huge amount of > + tracing data. > > config VERBOSE_DEBUG > bool "Verbose debug messages" > diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c > index 2a06406..33b903e 100644 > --- a/xen/common/spinlock.c > +++ b/xen/common/spinlock.c > @@ -150,7 +150,9 @@ void _spin_lock(spinlock_t *lock) > void _spin_lock_irq(spinlock_t *lock) > { > ASSERT(local_irq_is_enabled()); > - local_irq_disable(); > + _local_irq_disable(); > + if ( unlikely(tb_init_done) ) __trace_var already contain a "if ( !tb_init_done )". It sounds pointless to do it twice, and also I think it is not obvious to understand the meaning of the check (i.e what is tb_init_done). Would it be possible to hide this check and avoid check tb_init_done twice? 
> diff --git a/xen/include/asm-arm/arm32/system.h b/xen/include/asm-arm/arm32/system.h > index c617b40..20871ad 100644 > --- a/xen/include/asm-arm/arm32/system.h > +++ b/xen/include/asm-arm/arm32/system.h > @@ -4,6 +4,8 @@ > > #include <asm/arm32/cmpxchg.h> > > +#include <xen/trace.h> > + > #define local_irq_disable() asm volatile ( "cpsid i @ local_irq_disable\n" : : : "cc" ) > #define local_irq_enable() asm volatile ( "cpsie i @ local_irq_enable\n" : : : "cc" ) > > @@ -41,6 +43,16 @@ static inline int local_irq_is_enabled(void) > #define local_abort_enable() __asm__("cpsie a @ __sta\n" : : : "memory", "cc") > #define local_abort_disable() __asm__("cpsid a @ __sta\n" : : : "memory", "cc") > > +/* We do not support tracing (at all) yet */ I know that some bits are missing for ARM, but the patch #5 contradicts this comment as you have CONFIG_TRACE=y by default. > +#define trace_irq_disable_ret() do { } while ( 0 ) > +#define trace_irq_enable_ret() do { } while ( 0 ) > +#define trace_irq_save_ret(_x) do { } while ( 0 ) > +#define trace_irq_restore_ret(_x) do { } while ( 0 ) > +#define _local_irq_disable() local_irq_disable() > +#define _local_irq_enable() local_irq_enable() > +#define _local_irq_save(_x) local_irq_save(_x) > +#define _local_irq_restore(_x) local_irq_restore(_x) > + This does not need to be duplicated in both asm-arm/arm{32,64}/system.h. You could directly implement them in asm-arm/system.h. Also, I would prefer if you stay consistent with x86. I.e non-underscore versions are calling the underscore versions and not the invert. Cheers,
On Wed, 2017-06-07 at 12:16 +0100, Julien Grall wrote: > Hi Dario, > Hi, > On 01/06/17 18:34, Dario Faggioli wrote: > > diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c > > index 2a06406..33b903e 100644 > > --- a/xen/common/spinlock.c > > +++ b/xen/common/spinlock.c > > @@ -150,7 +150,9 @@ void _spin_lock(spinlock_t *lock) > > void _spin_lock_irq(spinlock_t *lock) > > { > > ASSERT(local_irq_is_enabled()); > > - local_irq_disable(); > > + _local_irq_disable(); > > + if ( unlikely(tb_init_done) ) > > __trace_var already contain a "if ( !tb_init_done )". It sounds > pointless to do it twice, and also I think it is not obvious to > understand the meaning of the check (i.e what is tb_init_done). > > Would it be possible to hide this check and avoid check tb_init_done > twice? > I totally agree. And in fact, in another patch, I remove the tb_init_done check in __trace_var(). Reason behind this is that all the callers of __trace_var() (the main one being trace_var()), already checks for tb_init_done themselves. In fact, the explicit check in the caller, is the (only) basis on which one decides to call either trace_var() or __trace_var(). This here is one of the call sites where I think the check is better done in the caller. > > diff --git a/xen/include/asm-arm/arm32/system.h b/xen/include/asm- > > arm/arm32/system.h > > index c617b40..20871ad 100644 > > --- a/xen/include/asm-arm/arm32/system.h > > +++ b/xen/include/asm-arm/arm32/system.h > > @@ -4,6 +4,8 @@ > > > > #include <asm/arm32/cmpxchg.h> > > > > +#include <xen/trace.h> > > + > > #define local_irq_disable() asm volatile ( "cpsid i @ > > local_irq_disable\n" : : : "cc" ) > > #define local_irq_enable() asm volatile ( "cpsie i @ > > local_irq_enable\n" : : : "cc" ) > > > > @@ -41,6 +43,16 @@ static inline int local_irq_is_enabled(void) > > #define local_abort_enable() __asm__("cpsie a @ __sta\n" : : : > > "memory", "cc") > > #define local_abort_disable() __asm__("cpsid a @ __sta\n" : : : > > "memory", "cc") > > > > +/* We do not support tracing (at all) yet */ > > I know that some bits are missing for ARM, but the patch #5 > contradicts > this comment as you have CONFIG_TRACE=y by default. > No sure what you mean. Tracing is de facto on by default right now, despite it not being implemented for ARM. So what I'm doing is basically keeping things as they are. If you think it should be off by default, well, let's talk about it... but I'm not sure this is really what you are saying/asking. > > +#define trace_irq_disable_ret() do { } while ( 0 ) > > +#define trace_irq_enable_ret() do { } while ( 0 ) > > +#define trace_irq_save_ret(_x) do { } while ( 0 ) > > +#define trace_irq_restore_ret(_x) do { } while ( 0 ) > > +#define _local_irq_disable() local_irq_disable() > > +#define _local_irq_enable() local_irq_enable() > > +#define _local_irq_save(_x) local_irq_save(_x) > > +#define _local_irq_restore(_x) local_irq_restore(_x) > > + > > This does not need to be duplicated in both asm- > arm/arm{32,64}/system.h. > You could directly implement them in asm-arm/system.h. > Ok, thanks. > Also, I would prefer if you stay consistent with x86. I.e non- > underscore > versions are calling the underscore versions and not the invert. > Well, I know it is counterintuitive, but it's the easiest way of getting over this, i.e., without the need of a lot of stubs, and touching as few existing code as possible. But I certainly can give it a try doing it as you say, and see how it ends up looking like. Regards, Dario
On 02/06/17 00:42, Dario Faggioli wrote: > On Thu, 2017-06-01 at 20:08 +0100, Andrew Cooper wrote: >> On 01/06/17 18:34, Dario Faggioli wrote: >>> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c >>> index 2a06406..33b903e 100644 >>> --- a/xen/common/spinlock.c >>> +++ b/xen/common/spinlock.c >>> @@ -150,7 +150,9 @@ void _spin_lock(spinlock_t *lock) >>> void _spin_lock_irq(spinlock_t *lock) >>> { >>> ASSERT(local_irq_is_enabled()); >>> - local_irq_disable(); >>> + _local_irq_disable(); >>> + if ( unlikely(tb_init_done) ) >>> + trace_irq_disable_ret(); >>> _spin_lock(lock); >>> } >> >> Is it sensible to do this way around? >> > Well, I guess either variant has up and down sides, and it looks to me > that there is no way to measure this, without interfering with the > behavior of the thing being measured ("Once upon a time, there was a > cat, who lived in a box. His name was Schrödinger..." :-D :-D :-D) > >> By writing the trace record while interrupts are disabled, you do >> prevent nesting in the general case (but not in NMIs/MCEs or the >> irqsave() variants), >> > Forgive the ignorance, what's special about NMIs/MCAs that is relevant > for this? About _irqsave(), I'm also not sure what you mean, as > _irqsave() is already done differently than this. > >> but you increase the critical region while >> interrupts are disabled. >> > I know. > >> Alternatively, writing the trace record outside of the critical >> region >> would reduce the size of the region. Interpretation logic already >> needs >> to cope with nesting, so is one extra level of nesting in this corner >> case a problem? >> > TBH, I was indeed trying to offer to the component that will be looking > at and interpreting the data, the as clear as possible view on when > IRQs are _actually_ disabled and enabled. As in, if nesting occurs, > only the event corresponding to: > - the first local_irq_disable() > - the last local_irq_enable() > > I.e., to not require that it (the interpretation logic) does understand > nesting. > > But more than this, what I was concerned about was the accuracy of the > reporting itself. In fact, if I do as you suggest, I can be interrupted > (as interrupts are still on) after having called trace_irq_disable(). > That means the report will show higher IRQ disabled time period, for > this instance, than what the reality is. > > And the same is true on the enabling side, when I call > trace_irq_enable() _before_ re-enabling interrupts, which has the same > bad effect on the length of IRQ disabled section. > > Maybe, considering that anything that will interrupt me in these cases, > are other interrupts, that will most likely disable interrupts > themselves, I should not worry too much about this... What do you > think? I think it would make the analysis easier to do it the way you've done it. It certainly does increase the length of the actual critical section, but we're already assuming that all of this code is going to be disabled unless people specifically want to do IRQ analysis, so I don't think that should be an issue. -George
On 01/06/17 18:34, Dario Faggioli wrote: > Trace when interrupts are disabled and (re)enabled. > Basically, we replace the IRQ disabling and enabling > functions with helpers that does the same, but also > output the proper trace record. > > For putting in the record something that will let > us identify _where_ in the code (i.e., in what function) > the IRQ manipulation is happening, use either: > - current_text_addr(), > - or __builtin_return_address(0). > > In fact, depending on whether the disabling/enabling > happens in macros (like for local_irq_disable() and > local_irq_enable()) or in actual functions (like in > spin_lock_irq*()), it is either: > - the actual content of the instruction pointer when > IRQ are disabled/enabled, > - or the return address of the utility function where > IRQ are disabled/enabled, > that will tell us what it is the actual piece of code > that is asking for the IRQ manipulation operation. > > Gate this with its specific Kconfig option, and keep > it in disabled state by default (i.e., don't build it, > if not explicitly specified), as the impact on > performance may be non negligible. > > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> > --- > Cc: George Dunlap <george.dunlap@eu.citrix.com> > Cc: Stefano Stabellini <sstabellini@kernel.org> > Cc: Julien Grall <julien.grall@arm.com> > Cc: Jan Beulich <jbeulich@suse.com> > Cc: Andrew Cooper <andrew.cooper3@citrix.com> > Cc: Jennifer Herbert <jennifer.herbert@citrix.com> > --- > xen/Kconfig.debug | 11 ++++- > xen/common/spinlock.c | 16 +++++-- > xen/common/trace.c | 10 +++- > xen/include/asm-arm/arm32/system.h | 12 +++++ > xen/include/asm-arm/arm64/system.h | 12 +++++ > xen/include/asm-x86/system.h | 85 ++++++++++++++++++++++++++++++++++-- > xen/include/public/trace.h | 2 + > xen/include/xen/rwlock.h | 33 +++++++++++--- > 8 files changed, 161 insertions(+), 20 deletions(-) > > diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug > index 374c1c0..81910c9 100644 > --- a/xen/Kconfig.debug > +++ b/xen/Kconfig.debug > @@ -98,7 +98,7 @@ config PERF_ARRAYS > ---help--- > Enables software performance counter array histograms. > > -config TRACING > +menuconfig TRACING > bool "Tracing" > default y > ---help--- > @@ -106,6 +106,15 @@ config TRACING > in per-CPU ring buffers. The 'xentrace' tool can be used to read > the buffers and dump the content on the disk. > > +config TRACE_IRQSOFF > + bool "Trace when IRQs are disabled and (re)enabled" if EXPERT = "y" > + default n > + depends on TRACING > + ---help--- > + Makes it possible to generate events _every_ time IRQs are disabled > + and (re)enabled, with also an indication of where that happened. > + Note that this comes with high overead and produces huge amount of > + tracing data. *overhead (yours is missing an 'h') I think I might emphasize that the overhead will be present even when tracing is off. What about something like this? "Note that this comes with an overhead even when tracing is disabled; and has a high overhead and produces a large amount of tracing data when enabled." With that change: Acked-by: George Dunlap <george.dunlap@citrix.com>
>>> On 02.06.17 at 01:42, <dario.faggioli@citrix.com> wrote: > On Thu, 2017-06-01 at 20:08 +0100, Andrew Cooper wrote: >> By writing the trace record while interrupts are disabled, you do >> prevent nesting in the general case (but not in NMIs/MCEs or the >> irqsave() variants), >> > Forgive the ignorance, what's special about NMIs/MCAs that is relevant > for this? NMI/#MC can nest nevertheless, so preventing nesting in the common case doesn't mean you won't see nesting at all. >> Does the logic cope with the fact that interrupt gates automatically >> disable interrupts? >> > Ah, right. No, it does not. I probably should mention this in the > changelog. Any ideas of how to deal with that? If yes, I'm more than > happy to fix this... Well, respective entry points will need to update tracking state. See how Linux has placed certain macros to that effect on various entry paths. Jan
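What Jan suggests would amount to something along these lines (a purely hypothetical sketch: the per-CPU counter and the two hook names are invented here for illustration, and nothing equivalent exists in the patch). An interrupt entry path would record the hardware-initiated disable, and the exit path the corresponding re-enable:

    /* Interrupt gates clear EFLAGS.IF on entry, which the software-side
     * wrappers never see; entry/exit paths would have to account for it. */
    static DEFINE_PER_CPU(unsigned int, hw_irqoff_depth);

    /* To be called early on an interrupt/exception entry path. */
    static inline void trace_gate_irq_disable(void)
    {
        if ( this_cpu(hw_irqoff_depth)++ == 0 && unlikely(tb_init_done) )
            trace_irq_disable(TRACE_LOCAL_ADDR); /* IF cleared by the gate */
    }

    /* To be called on the exit path, before the iret that restores IF. */
    static inline void trace_gate_irq_enable(void)
    {
        if ( --this_cpu(hw_irqoff_depth) == 0 && unlikely(tb_init_done) )
            trace_irq_enable(TRACE_LOCAL_ADDR);
    }

This roughly mirrors the irqflags-tracing hooks Linux places on its entry paths, per Jan's pointer.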
On Thu, 2017-06-08 at 17:01 +0100, George Dunlap wrote: > On 01/06/17 18:34, Dario Faggioli wrote: > > diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug > > index 374c1c0..81910c9 100644 > > --- a/xen/Kconfig.debug > > +++ b/xen/Kconfig.debug > > @@ -98,7 +98,7 @@ config PERF_ARRAYS > > ---help--- > > Enables software performance counter array histograms. > > > > -config TRACING > > +menuconfig TRACING > > bool "Tracing" > > default y > > ---help--- > > @@ -106,6 +106,15 @@ config TRACING > > in per-CPU ring buffers. The 'xentrace' tool can be used > > to read > > the buffers and dump the content on the disk. > > > > +config TRACE_IRQSOFF > > + bool "Trace when IRQs are disabled and (re)enabled" if > > EXPERT = "y" > > + default n > > + depends on TRACING > > + ---help--- > > + Makes it possible to generate events _every_ time IRQs > > are disabled > > + and (re)enabled, with also an indication of where that > > happened. > > + Note that this comes with high overead and produces huge > > amount of > > + tracing data. > > I think I might emphasize that the overhead well be present even when > tracing is off. What about something like this? > Yes, good point. > "Note that this comes with an overhead even when tracing is disabled; > and has a high overhead and produces a large amount of tracing data > when > enabled." > I like it, I'll go for it. > With that change: > > Acked-by: George Dunlap <george.dunlap@citrix.com> > Thanks and Regards, Dario
>>> On 01.06.17 at 19:34, <dario.faggioli@citrix.com> wrote: > @@ -106,6 +106,15 @@ config TRACING > in per-CPU ring buffers. The 'xentrace' tool can be used to read > the buffers and dump the content on the disk. > > +config TRACE_IRQSOFF > + bool "Trace when IRQs are disabled and (re)enabled" if EXPERT = "y" > + default n I'm not aware of any behavioral difference between this and not specifying a default at all, so I'd like to ask to follow suit (see surrounding code) and omit this line. > --- a/xen/common/spinlock.c > +++ b/xen/common/spinlock.c > @@ -150,7 +150,9 @@ void _spin_lock(spinlock_t *lock) > void _spin_lock_irq(spinlock_t *lock) > { > ASSERT(local_irq_is_enabled()); > - local_irq_disable(); > + _local_irq_disable(); Hmm, looks like you're introducing new non-static, underscore-prefixed symbols. Please don't, use e.g. an "arch_" prefix or a "_notrace" suffix instead. > --- a/xen/include/asm-x86/system.h > +++ b/xen/include/asm-x86/system.h > @@ -5,6 +5,8 @@ > #include <xen/bitops.h> > #include <asm/processor.h> > > +#include <xen/trace.h> Why? > @@ -214,6 +216,79 @@ static always_inline unsigned long __xadd( > "ri" ( (x) & X86_EFLAGS_IF ) ); \ > }) > > +#ifdef CONFIG_TRACE_IRQSOFF > + > +#define TRACE_LOCAL_ADDR ((uint64_t) current_text_addr()) > +#define TRACE_RET_ADDR ((uint64_t) __builtin_return_address(0)) > + > +#define trace_irq_disable(_a) \ > +({ \ > + uint64_t addr = _a; \ > + __trace_var(TRC_HW_IRQ_DISABLE, 1, sizeof(addr), &addr); \ > +}) > +#define trace_irq_enable(_a) \ > +({ \ > + uint64_t addr = _a; \ > + __trace_var(TRC_HW_IRQ_ENABLE, 1, sizeof(addr), &addr); \ > +}) > +#define trace_irq_save(_x, _a) \ > +({ \ > + uint64_t addr = _a; \ > + if ( _x & X86_EFLAGS_IF ) \ > + __trace_var(TRC_HW_IRQ_DISABLE, 1, sizeof(addr), &addr); \ > +}) > +#define trace_irq_restore(_x, _a) \ > +({ \ > + uint64_t addr = _a; \ > + if ( _x & X86_EFLAGS_IF ) \ > + __trace_var(TRC_HW_IRQ_ENABLE, 1, sizeof(addr), &addr); \ > +}) > + > +#define trace_irq_disable_ret() trace_irq_disable(TRACE_RET_ADDR) > +#define trace_irq_enable_ret() trace_irq_enable(TRACE_RET_ADDR) > +#define trace_irq_save_ret(_x) trace_irq_save(_x, TRACE_RET_ADDR) > +#define trace_irq_restore_ret(_x) trace_irq_restore(_x, TRACE_RET_ADDR) > + > +#define local_irq_disable() \ > +({ \ > + bool_t irqon = local_irq_is_enabled(); \ > + _local_irq_disable(); \ > + if ( unlikely(tb_init_done && irqon) ) \ > + trace_irq_disable(TRACE_LOCAL_ADDR); \ > +}) > + > +#define local_irq_enable() \ > +({ \ > + if ( unlikely(tb_init_done) ) \ > + trace_irq_enable(TRACE_LOCAL_ADDR); \ > + _local_irq_enable(); \ > +}) > + > +#define local_irq_save(_x) \ > +({ \ > + local_save_flags(_x); \ > + _local_irq_disable(); \ > + if ( unlikely(tb_init_done) ) \ > + trace_irq_save(_x, TRACE_LOCAL_ADDR); \ > +}) > + > +#define local_irq_restore(_x) \ > +({ \ > + if ( unlikely(tb_init_done) ) \ > + trace_irq_restore(_x, TRACE_LOCAL_ADDR); \ > + _local_irq_restore(_x); \ > +}) > +#else /* !TRACE_IRQSOFF */ > +#define trace_irq_disable_ret() do { } while ( 0 ) > +#define trace_irq_enable_ret() do { } while ( 0 ) > +#define trace_irq_save_ret(_x) do { } while ( 0 ) > +#define trace_irq_restore_ret(_x) do { } while ( 0 ) > +#define local_irq_disable() _local_irq_disable() > +#define local_irq_enable() _local_irq_enable() > +#define local_irq_save(_x) _local_irq_save(_x) > +#define local_irq_restore(_x) _local_irq_restore(_x) > +#endif /* TRACE_IRQSOFF */ None of this looks to be arch-specific. Jan
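Jan's naming and placement remarks, taken together, would lead to something like the following (a sketch under those assumptions; the common header location and the arch_ names are illustrative, not existing code):

    /* e.g. in a common header such as xen/include/xen/irq-trace.h,
     * with each arch providing only the raw, untraced primitives
     * arch_local_irq_disable() / arch_local_irq_enable(): */
    #ifdef CONFIG_TRACE_IRQSOFF

    #define local_irq_disable()                        \
    ({                                                 \
        bool_t irqon = local_irq_is_enabled();         \
        arch_local_irq_disable();                      \
        if ( unlikely(tb_init_done && irqon) )         \
            trace_irq_disable(TRACE_LOCAL_ADDR);       \
    })

    #define local_irq_enable()                         \
    ({                                                 \
        if ( unlikely(tb_init_done) )                  \
            trace_irq_enable(TRACE_LOCAL_ADDR);        \
        arch_local_irq_enable();                       \
    })

    #else /* !CONFIG_TRACE_IRQSOFF */

    #define local_irq_disable() arch_local_irq_disable()
    #define local_irq_enable()  arch_local_irq_enable()

    #endif /* CONFIG_TRACE_IRQSOFF */

The save/restore variants would follow the same pattern.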
On 07/06/17 16:22, Dario Faggioli wrote: > On Wed, 2017-06-07 at 12:16 +0100, Julien Grall wrote: >> Hi Dario, >> > Hi, > >> On 01/06/17 18:34, Dario Faggioli wrote: >>> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c >>> index 2a06406..33b903e 100644 >>> --- a/xen/common/spinlock.c >>> +++ b/xen/common/spinlock.c >>> @@ -150,7 +150,9 @@ void _spin_lock(spinlock_t *lock) >>> void _spin_lock_irq(spinlock_t *lock) >>> { >>> ASSERT(local_irq_is_enabled()); >>> - local_irq_disable(); >>> + _local_irq_disable(); >>> + if ( unlikely(tb_init_done) ) >> >> __trace_var already contain a "if ( !tb_init_done )". It sounds >> pointless to do it twice, and also I think it is not obvious to >> understand the meaning of the check (i.e what is tb_init_done). >> >> Would it be possible to hide this check and avoid check tb_init_done >> twice? >> > I totally agree. And in fact, in another patch, I remove the > tb_init_done check in __trace_var(). Reason behind this is that all the > callers of __trace_var() (the main one being trace_var()), already > checks for tb_init_done themselves. > > In fact, the explicit check in the caller, is the (only) basis on which > one decides to call either trace_var() or __trace_var(). > > This here is one of the call sites where I think the check is better > done in the caller. I admit, it is a bit confusing to require the caller to check tb_init_done. It adds more code and is IMHO counterintuitive. > >> >>> diff --git a/xen/include/asm-arm/arm32/system.h b/xen/include/asm-arm/arm32/system.h >>> index c617b40..20871ad 100644 >>> --- a/xen/include/asm-arm/arm32/system.h >>> +++ b/xen/include/asm-arm/arm32/system.h >>> @@ -4,6 +4,8 @@ >>> >>> #include <asm/arm32/cmpxchg.h> >>> >>> +#include <xen/trace.h> >>> + >>> #define local_irq_disable() asm volatile ( "cpsid i @ >>> local_irq_disable\n" : : : "cc" ) >>> #define local_irq_enable() asm volatile ( "cpsie i @ >>> local_irq_enable\n" : : : "cc" ) >>> >>> @@ -41,6 +43,16 @@ static inline int local_irq_is_enabled(void) >>> #define local_abort_enable() __asm__("cpsie a @ __sta\n" : : : >>> "memory", "cc") >>> #define local_abort_disable() __asm__("cpsid a @ __sta\n" : : : >>> "memory", "cc") >>> >>> +/* We do not support tracing (at all) yet */ >> >> I know that some bits are missing for ARM, but the patch #5 >> contradicts >> this comment as you have CONFIG_TRACE=y by default. >> > No sure what you mean. Tracing is de facto on by default right now, > despite it not being implemented for ARM. So what I'm doing is > basically keeping things as they are. > > If you think it should be off by default, well, let's talk about it... > but I'm not sure this is really what you are saying/asking. I mean that CONFIG_TRACE=y by default gives the impression that tracing is supported on ARM. This is not the case, hence your comment. So I think it should be off by default until we get the support. BTW, it is only a couple of simple patches but they never got upstreamed :/ >>> +#define trace_irq_disable_ret() do { } while ( 0 ) >>> +#define trace_irq_enable_ret() do { } while ( 0 ) >>> +#define trace_irq_save_ret(_x) do { } while ( 0 ) >>> +#define trace_irq_restore_ret(_x) do { } while ( 0 ) >>> +#define _local_irq_disable() local_irq_disable() >>> +#define _local_irq_enable() local_irq_enable() >>> +#define _local_irq_save(_x) local_irq_save(_x) >>> +#define _local_irq_restore(_x) local_irq_restore(_x) >>> + >> >> This does not need to be duplicated in both asm-arm/arm{32,64}/system.h.
>> You could directly implement them in asm-arm/system.h. >> > Ok, thanks. > >> Also, I would prefer if you stay consistent with x86. I.e non-underscore >> versions are calling the underscore versions and not the invert. >> > Well, I know it is counterintuitive, but it's the easiest way of > getting over this, i.e., without the need of a lot of stubs, and > touching as few existing code as possible. Can you detail here? The only thing I was asking is to do: #define _local_irq_disable() ...... #define local_irq_disable() _local_irq_disable() The callers would still use local_irq_disable() and the code would be consistent with x86. Cheers,
On 09/06/17 11:51, Julien Grall wrote: > > > On 07/06/17 16:22, Dario Faggioli wrote: >> On Wed, 2017-06-07 at 12:16 +0100, Julien Grall wrote: >>> Hi Dario, >>> >> Hi, >> >>> On 01/06/17 18:34, Dario Faggioli wrote: >>>> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c >>>> index 2a06406..33b903e 100644 >>>> --- a/xen/common/spinlock.c >>>> +++ b/xen/common/spinlock.c >>>> @@ -150,7 +150,9 @@ void _spin_lock(spinlock_t *lock) >>>> void _spin_lock_irq(spinlock_t *lock) >>>> { >>>> ASSERT(local_irq_is_enabled()); >>>> - local_irq_disable(); >>>> + _local_irq_disable(); >>>> + if ( unlikely(tb_init_done) ) >>> >>> __trace_var already contain a "if ( !tb_init_done )". It sounds >>> pointless to do it twice, and also I think it is not obvious to >>> understand the meaning of the check (i.e what is tb_init_done). >>> >>> Would it be possible to hide this check and avoid check tb_init_done >>> twice? >>> >> I totally agree. And in fact, in another patch, I remove the >> tb_init_done check in __trace_var(). Reason behind this is that all the >> callers of __trace_var() (the main one being trace_var()), already >> checks for tb_init_done themselves. >> >> In fact, the explicit check in the caller, is the (only) basis on which >> one decides to call either trace_var() or __trace_var(). >> >> This here is one of the call sites where I think the check is better >> done in the caller. > > I admit, it is a bit confusing to require the caller to check > tb_init_done. It adds more code and IHMO counterintuitive. BTW, I am not responsible of this code, so if the REST maintainers are happy with that then fine. > >> >>>> diff --git a/xen/include/asm-arm/arm32/system.h b/xen/include/asm- >>>> arm/arm32/system.h >>>> index c617b40..20871ad 100644 >>>> --- a/xen/include/asm-arm/arm32/system.h >>>> +++ b/xen/include/asm-arm/arm32/system.h >>>> @@ -4,6 +4,8 @@ >>>> >>>> #include <asm/arm32/cmpxchg.h> >>>> >>>> +#include <xen/trace.h> >>>> + >>>> #define local_irq_disable() asm volatile ( "cpsid i @ >>>> local_irq_disable\n" : : : "cc" ) >>>> #define local_irq_enable() asm volatile ( "cpsie i @ >>>> local_irq_enable\n" : : : "cc" ) >>>> >>>> @@ -41,6 +43,16 @@ static inline int local_irq_is_enabled(void) >>>> #define local_abort_enable() __asm__("cpsie a @ __sta\n" : : : >>>> "memory", "cc") >>>> #define local_abort_disable() __asm__("cpsid a @ __sta\n" : : : >>>> "memory", "cc") >>>> >>>> +/* We do not support tracing (at all) yet */ >>> >>> I know that some bits are missing for ARM, but the patch #5 >>> contradicts >>> this comment as you have CONFIG_TRACE=y by default. >>> >> No sure what you mean. Tracing is de facto on by default right now, >> despite it not being implemented for ARM. So what I'm doing is >> basically keeping things as they are. >> >> If you think it should be off by default, well, let's talk about it... >> but I'm not sure this is really what you are saying/asking. > > I mean that CONFIG_TRACE=y by default gives the impression that tracing > is supported on ARM. This is not the case, hence your comment. So I > think it should be off my default until we get the support. 
BTW, it is > only a couple of simple patches but they never made upstreamed :/ > >>>> +#define trace_irq_disable_ret() do { } while ( 0 ) >>>> +#define trace_irq_enable_ret() do { } while ( 0 ) >>>> +#define trace_irq_save_ret(_x) do { } while ( 0 ) >>>> +#define trace_irq_restore_ret(_x) do { } while ( 0 ) >>>> +#define _local_irq_disable() local_irq_disable() >>>> +#define _local_irq_enable() local_irq_enable() >>>> +#define _local_irq_save(_x) local_irq_save(_x) >>>> +#define _local_irq_restore(_x) local_irq_restore(_x) >>>> + >>> >>> This does not need to be duplicated in both asm- >>> arm/arm{32,64}/system.h. >>> You could directly implement them in asm-arm/system.h. >>> >> Ok, thanks. >> >>> Also, I would prefer if you stay consistent with x86. I.e non- >>> underscore >>> versions are calling the underscore versions and not the invert. >>> >> Well, I know it is counterintuitive, but it's the easiest way of >> getting over this, i.e., without the need of a lot of stubs, and >> touching as few existing code as possible. > > Can you detail here? The only things I was asking is to do: > > #define _local_irq_disable() ...... > > #define local_irq_disable() _local_irq_disable. > > The callers would still use local_irq_disable() and the code would be > consistent with x86. > > Cheers, >
On 09/06/17 11:51, Julien Grall wrote: > > > On 07/06/17 16:22, Dario Faggioli wrote: >> On Wed, 2017-06-07 at 12:16 +0100, Julien Grall wrote: >>> Hi Dario, >>> >> Hi, >> >>> On 01/06/17 18:34, Dario Faggioli wrote: >>>> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c >>>> index 2a06406..33b903e 100644 >>>> --- a/xen/common/spinlock.c >>>> +++ b/xen/common/spinlock.c >>>> @@ -150,7 +150,9 @@ void _spin_lock(spinlock_t *lock) >>>> void _spin_lock_irq(spinlock_t *lock) >>>> { >>>> ASSERT(local_irq_is_enabled()); >>>> - local_irq_disable(); >>>> + _local_irq_disable(); >>>> + if ( unlikely(tb_init_done) ) >>> >>> __trace_var already contain a "if ( !tb_init_done )". It sounds >>> pointless to do it twice, and also I think it is not obvious to >>> understand the meaning of the check (i.e what is tb_init_done). >>> >>> Would it be possible to hide this check and avoid check tb_init_done >>> twice? >>> >> I totally agree. And in fact, in another patch, I remove the >> tb_init_done check in __trace_var(). Reason behind this is that all the >> callers of __trace_var() (the main one being trace_var()), already >> checks for tb_init_done themselves. >> >> In fact, the explicit check in the caller, is the (only) basis on which >> one decides to call either trace_var() or __trace_var(). >> >> This here is one of the call sites where I think the check is better >> done in the caller. > > I admit, it is a bit confusing to require the caller to check > tb_init_done. It adds more code and IHMO counterintuitive. Because a number of the traces include a certain amount of "marshalling" work ahead of time; there's no point in making space for a struct on the stack, carefully reading and sorting all kinds of things to put into it, only to have the data ignored because tracing isn't enabled (which is by far the most common case). -George
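I.e., the kind of call site George has in mind looks something like this (illustrative only; the function name and record layout are made up for the example):

    void trace_foo(const struct vcpu *v)
    {
        /* With the check in the caller, all of the marshalling below is
         * skipped entirely in the common, tracing-disabled case. */
        if ( unlikely(tb_init_done) )
        {
            struct {
                uint32_t dom, vcpu;
                uint64_t addr;
            } d = {
                .dom  = v->domain->domain_id,
                .vcpu = v->vcpu_id,
                .addr = (uint64_t)__builtin_return_address(0),
            };

            __trace_var(TRC_HW_IRQ_DISABLE, 1, sizeof(d), &d);
        }
    }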
On 09/06/17 11:55, George Dunlap wrote: > On 09/06/17 11:51, Julien Grall wrote: >> >> >> On 07/06/17 16:22, Dario Faggioli wrote: >>> On Wed, 2017-06-07 at 12:16 +0100, Julien Grall wrote: >>>> Hi Dario, >>>> >>> Hi, >>> >>>> On 01/06/17 18:34, Dario Faggioli wrote: >>>>> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c >>>>> index 2a06406..33b903e 100644 >>>>> --- a/xen/common/spinlock.c >>>>> +++ b/xen/common/spinlock.c >>>>> @@ -150,7 +150,9 @@ void _spin_lock(spinlock_t *lock) >>>>> void _spin_lock_irq(spinlock_t *lock) >>>>> { >>>>> ASSERT(local_irq_is_enabled()); >>>>> - local_irq_disable(); >>>>> + _local_irq_disable(); >>>>> + if ( unlikely(tb_init_done) ) >>>> >>>> __trace_var already contain a "if ( !tb_init_done )". It sounds >>>> pointless to do it twice, and also I think it is not obvious to >>>> understand the meaning of the check (i.e what is tb_init_done). >>>> >>>> Would it be possible to hide this check and avoid check tb_init_done >>>> twice? >>>> >>> I totally agree. And in fact, in another patch, I remove the >>> tb_init_done check in __trace_var(). Reason behind this is that all the >>> callers of __trace_var() (the main one being trace_var()), already >>> checks for tb_init_done themselves. >>> >>> In fact, the explicit check in the caller, is the (only) basis on which >>> one decides to call either trace_var() or __trace_var(). >>> >>> This here is one of the call sites where I think the check is better >>> done in the caller. >> >> I admit, it is a bit confusing to require the caller to check >> tb_init_done. It adds more code and IHMO counterintuitive. > > Because a number of the traces include a certain amount if "marshalling" > work ahead of time; there's no point in making space for a struct on the > stack, carefully reading and sorting all kinds if things to put into it, > only to have the data ignored because tracing isn't enabled (which is by > far the most common case). At the moment all the callers have the semantics: if ( unlikely(tb_init_done) ) trace_irq_* So why not hide that in the macro trace_irq_disable/trace_irq_enable...? Cheers,
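Concretely, what Julien is asking for would look something like this (a sketch based on the x86 definitions in the patch, not code from the series):

    /* Fold the tb_init_done check into the helper itself ... */
    #define trace_irq_disable_ret()                                     \
    ({                                                                  \
        if ( unlikely(tb_init_done) )                                   \
        {                                                               \
            uint64_t addr = (uint64_t)__builtin_return_address(0);      \
            __trace_var(TRC_HW_IRQ_DISABLE, 1, sizeof(addr), &addr);    \
        }                                                               \
    })

    /* ... so that call sites shrink back to a single line: */
    void _spin_lock_irq(spinlock_t *lock)
    {
        ASSERT(local_irq_is_enabled());
        _local_irq_disable();
        trace_irq_disable_ret();
        _spin_lock(lock);
    }

George's marshalling argument above still holds for the heavier trace records; for these single-word records the hidden check costs nothing extra.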
diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug index 374c1c0..81910c9 100644 --- a/xen/Kconfig.debug +++ b/xen/Kconfig.debug @@ -98,7 +98,7 @@ config PERF_ARRAYS ---help--- Enables software performance counter array histograms. -config TRACING +menuconfig TRACING bool "Tracing" default y ---help--- @@ -106,6 +106,15 @@ config TRACING in per-CPU ring buffers. The 'xentrace' tool can be used to read the buffers and dump the content on the disk. +config TRACE_IRQSOFF + bool "Trace when IRQs are disabled and (re)enabled" if EXPERT = "y" + default n + depends on TRACING + ---help--- + Makes it possible to generate events _every_ time IRQs are disabled + and (re)enabled, with also an indication of where that happened. + Note that this comes with high overead and produces huge amount of + tracing data. config VERBOSE_DEBUG bool "Verbose debug messages" diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c index 2a06406..33b903e 100644 --- a/xen/common/spinlock.c +++ b/xen/common/spinlock.c @@ -150,7 +150,9 @@ void _spin_lock(spinlock_t *lock) void _spin_lock_irq(spinlock_t *lock) { ASSERT(local_irq_is_enabled()); - local_irq_disable(); + _local_irq_disable(); + if ( unlikely(tb_init_done) ) + trace_irq_disable_ret(); _spin_lock(lock); } @@ -158,7 +160,9 @@ unsigned long _spin_lock_irqsave(spinlock_t *lock) { unsigned long flags; - local_irq_save(flags); + _local_irq_save(flags); + if ( unlikely(tb_init_done) ) + trace_irq_save_ret(flags); _spin_lock(lock); return flags; } @@ -175,13 +179,17 @@ void _spin_unlock(spinlock_t *lock) void _spin_unlock_irq(spinlock_t *lock) { _spin_unlock(lock); - local_irq_enable(); + if ( unlikely(tb_init_done) ) + trace_irq_enable_ret(); + _local_irq_enable(); } void _spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags) { _spin_unlock(lock); - local_irq_restore(flags); + if ( unlikely(tb_init_done) ) + trace_irq_restore_ret(flags); + _local_irq_restore(flags); } int _spin_is_locked(spinlock_t *lock) diff --git a/xen/common/trace.c b/xen/common/trace.c index 2c18462..71202df 100644 --- a/xen/common/trace.c +++ b/xen/common/trace.c @@ -722,7 +722,12 @@ void __trace_var(u32 event, bool_t cycles, unsigned int extra, /* Read tb_init_done /before/ t_bufs. */ smp_rmb(); - spin_lock_irqsave(&this_cpu(t_lock), flags); + /* + * spin_lock_irqsave() would call local_irq_save(), which (may) + * call __trace_var(). Open code it to avoid recursing. + */ + _local_irq_save(flags); + spin_lock(&this_cpu(t_lock)); buf = this_cpu(t_bufs); @@ -809,7 +814,8 @@ void __trace_var(u32 event, bool_t cycles, unsigned int extra, __insert_record(buf, event, extra, cycles, rec_size, extra_data); unlock: - spin_unlock_irqrestore(&this_cpu(t_lock), flags); + spin_unlock(&this_cpu(t_lock)); + _local_irq_restore(flags); /* Notify trace buffer consumer that we've crossed the high water mark. 
*/ if ( likely(buf!=NULL) diff --git a/xen/include/asm-arm/arm32/system.h b/xen/include/asm-arm/arm32/system.h index c617b40..20871ad 100644 --- a/xen/include/asm-arm/arm32/system.h +++ b/xen/include/asm-arm/arm32/system.h @@ -4,6 +4,8 @@ #include <asm/arm32/cmpxchg.h> +#include <xen/trace.h> + #define local_irq_disable() asm volatile ( "cpsid i @ local_irq_disable\n" : : : "cc" ) #define local_irq_enable() asm volatile ( "cpsie i @ local_irq_enable\n" : : : "cc" ) @@ -41,6 +43,16 @@ static inline int local_irq_is_enabled(void) #define local_abort_enable() __asm__("cpsie a @ __sta\n" : : : "memory", "cc") #define local_abort_disable() __asm__("cpsid a @ __sta\n" : : : "memory", "cc") +/* We do not support tracing (at all) yet */ +#define trace_irq_disable_ret() do { } while ( 0 ) +#define trace_irq_enable_ret() do { } while ( 0 ) +#define trace_irq_save_ret(_x) do { } while ( 0 ) +#define trace_irq_restore_ret(_x) do { } while ( 0 ) +#define _local_irq_disable() local_irq_disable() +#define _local_irq_enable() local_irq_enable() +#define _local_irq_save(_x) local_irq_save(_x) +#define _local_irq_restore(_x) local_irq_restore(_x) + static inline int local_fiq_is_enabled(void) { unsigned long flags; diff --git a/xen/include/asm-arm/arm64/system.h b/xen/include/asm-arm/arm64/system.h index 2e2ee21..6603b0c 100644 --- a/xen/include/asm-arm/arm64/system.h +++ b/xen/include/asm-arm/arm64/system.h @@ -4,6 +4,8 @@ #include <asm/arm64/cmpxchg.h> +#include <xen/trace.h> + /* Uses uimm4 as a bitmask to select the clearing of one or more of * the DAIF exception mask bits: * bit 3 selects the D mask, @@ -44,6 +46,16 @@ : "memory"); \ }) +/* We do not support tracing (at all) yet */ +#define trace_irq_disable_ret() do { } while ( 0 ) +#define trace_irq_enable_ret() do { } while ( 0 ) +#define trace_irq_save_ret(_x) do { } while ( 0 ) +#define trace_irq_restore_ret(_x) do { } while ( 0 ) +#define _local_irq_disable() local_irq_disable() +#define _local_irq_enable() local_irq_enable() +#define _local_irq_save(_x) local_irq_save(_x) +#define _local_irq_restore(_x) local_irq_restore(_x) + static inline int local_irq_is_enabled(void) { unsigned long flags; diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h index eb498f5..0e7bf01 100644 --- a/xen/include/asm-x86/system.h +++ b/xen/include/asm-x86/system.h @@ -5,6 +5,8 @@ #include <xen/bitops.h> #include <asm/processor.h> +#include <xen/trace.h> + #define read_sreg(name) \ ({ unsigned int __sel; \ asm volatile ( "mov %%" STR(name) ",%0" : "=r" (__sel) ); \ @@ -185,8 +187,8 @@ static always_inline unsigned long __xadd( #define set_mb(var, value) do { xchg(&var, value); } while (0) #define set_wmb(var, value) do { var = value; wmb(); } while (0) -#define local_irq_disable() asm volatile ( "cli" : : : "memory" ) -#define local_irq_enable() asm volatile ( "sti" : : : "memory" ) +#define _local_irq_disable() asm volatile ( "cli" : : : "memory" ) +#define _local_irq_enable() asm volatile ( "sti" : : : "memory" ) /* used in the idle loop; sti takes one instruction cycle to complete */ #define safe_halt() asm volatile ( "sti; hlt" : : : "memory" ) @@ -198,12 +200,12 @@ static always_inline unsigned long __xadd( BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ asm volatile ( "pushf" __OS " ; pop" __OS " %0" : "=g" (x)); \ }) -#define local_irq_save(x) \ +#define _local_irq_save(x) \ ({ \ local_save_flags(x); \ - local_irq_disable(); \ + _local_irq_disable(); \ }) -#define local_irq_restore(x) \ +#define _local_irq_restore(x) \ ({ \ 
BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ asm volatile ( "pushfq\n\t" \ @@ -214,6 +216,79 @@ static always_inline unsigned long __xadd( "ri" ( (x) & X86_EFLAGS_IF ) ); \ }) +#ifdef CONFIG_TRACE_IRQSOFF + +#define TRACE_LOCAL_ADDR ((uint64_t) current_text_addr()) +#define TRACE_RET_ADDR ((uint64_t) __builtin_return_address(0)) + +#define trace_irq_disable(_a) \ +({ \ + uint64_t addr = _a; \ + __trace_var(TRC_HW_IRQ_DISABLE, 1, sizeof(addr), &addr); \ +}) +#define trace_irq_enable(_a) \ +({ \ + uint64_t addr = _a; \ + __trace_var(TRC_HW_IRQ_ENABLE, 1, sizeof(addr), &addr); \ +}) +#define trace_irq_save(_x, _a) \ +({ \ + uint64_t addr = _a; \ + if ( _x & X86_EFLAGS_IF ) \ + __trace_var(TRC_HW_IRQ_DISABLE, 1, sizeof(addr), &addr); \ +}) +#define trace_irq_restore(_x, _a) \ +({ \ + uint64_t addr = _a; \ + if ( _x & X86_EFLAGS_IF ) \ + __trace_var(TRC_HW_IRQ_ENABLE, 1, sizeof(addr), &addr); \ +}) + +#define trace_irq_disable_ret() trace_irq_disable(TRACE_RET_ADDR) +#define trace_irq_enable_ret() trace_irq_enable(TRACE_RET_ADDR) +#define trace_irq_save_ret(_x) trace_irq_save(_x, TRACE_RET_ADDR) +#define trace_irq_restore_ret(_x) trace_irq_restore(_x, TRACE_RET_ADDR) + +#define local_irq_disable() \ +({ \ + bool_t irqon = local_irq_is_enabled(); \ + _local_irq_disable(); \ + if ( unlikely(tb_init_done && irqon) ) \ + trace_irq_disable(TRACE_LOCAL_ADDR); \ +}) + +#define local_irq_enable() \ +({ \ + if ( unlikely(tb_init_done) ) \ + trace_irq_enable(TRACE_LOCAL_ADDR); \ + _local_irq_enable(); \ +}) + +#define local_irq_save(_x) \ +({ \ + local_save_flags(_x); \ + _local_irq_disable(); \ + if ( unlikely(tb_init_done) ) \ + trace_irq_save(_x, TRACE_LOCAL_ADDR); \ +}) + +#define local_irq_restore(_x) \ +({ \ + if ( unlikely(tb_init_done) ) \ + trace_irq_restore(_x, TRACE_LOCAL_ADDR); \ + _local_irq_restore(_x); \ +}) +#else /* !TRACE_IRQSOFF */ +#define trace_irq_disable_ret() do { } while ( 0 ) +#define trace_irq_enable_ret() do { } while ( 0 ) +#define trace_irq_save_ret(_x) do { } while ( 0 ) +#define trace_irq_restore_ret(_x) do { } while ( 0 ) +#define local_irq_disable() _local_irq_disable() +#define local_irq_enable() _local_irq_enable() +#define local_irq_save(_x) _local_irq_save(_x) +#define local_irq_restore(_x) _local_irq_restore(_x) +#endif /* TRACE_IRQSOFF */ + static inline int local_irq_is_enabled(void) { unsigned long flags; diff --git a/xen/include/public/trace.h b/xen/include/public/trace.h index f66a7af..1692a79 100644 --- a/xen/include/public/trace.h +++ b/xen/include/public/trace.h @@ -275,6 +275,8 @@ #define TRC_HW_IRQ_ENTER (TRC_HW_IRQ + 0xA) #define TRC_HW_IRQ_GUEST (TRC_HW_IRQ + 0xB) #define TRC_HW_IRQ_EXIT (TRC_HW_IRQ + 0xC) +#define TRC_HW_IRQ_DISABLE (TRC_HW_IRQ + 0xD) +#define TRC_HW_IRQ_ENABLE (TRC_HW_IRQ + 0xE) /* * Event Flags diff --git a/xen/include/xen/rwlock.h b/xen/include/xen/rwlock.h index 35657c5..04f50e5 100644 --- a/xen/include/xen/rwlock.h +++ b/xen/include/xen/rwlock.h @@ -73,14 +73,19 @@ static inline void _read_lock(rwlock_t *lock) static inline void _read_lock_irq(rwlock_t *lock) { ASSERT(local_irq_is_enabled()); - local_irq_disable(); + _local_irq_disable(); + if ( unlikely(tb_init_done) ) + trace_irq_disable_ret(); _read_lock(lock); } static inline unsigned long _read_lock_irqsave(rwlock_t *lock) { unsigned long flags; - local_irq_save(flags); + + _local_irq_save(flags); + if ( unlikely(tb_init_done) ) + trace_irq_save_ret(flags); _read_lock(lock); return flags; } @@ -100,13 +105,17 @@ static inline void _read_unlock(rwlock_t *lock) static inline void 
_read_unlock_irq(rwlock_t *lock) { _read_unlock(lock); - local_irq_enable(); + if ( unlikely(tb_init_done) ) + trace_irq_enable_ret(); + _local_irq_enable(); } static inline void _read_unlock_irqrestore(rwlock_t *lock, unsigned long flags) { _read_unlock(lock); - local_irq_restore(flags); + if ( unlikely(tb_init_done) ) + trace_irq_restore_ret(flags); + _local_irq_restore(flags); } static inline int _rw_is_locked(rwlock_t *lock) @@ -130,7 +139,9 @@ static inline void _write_lock(rwlock_t *lock) static inline void _write_lock_irq(rwlock_t *lock) { ASSERT(local_irq_is_enabled()); - local_irq_disable(); + _local_irq_disable(); + if ( unlikely(tb_init_done) ) + trace_irq_disable_ret(); _write_lock(lock); } @@ -138,7 +149,9 @@ static inline unsigned long _write_lock_irqsave(rwlock_t *lock) { unsigned long flags; - local_irq_save(flags); + _local_irq_save(flags); + if ( unlikely(tb_init_done) ) + trace_irq_save_ret(flags); _write_lock(lock); return flags; } @@ -171,13 +184,17 @@ static inline void _write_unlock(rwlock_t *lock) static inline void _write_unlock_irq(rwlock_t *lock) { _write_unlock(lock); - local_irq_enable(); + if ( unlikely(tb_init_done) ) + trace_irq_enable_ret(); + _local_irq_enable(); } static inline void _write_unlock_irqrestore(rwlock_t *lock, unsigned long flags) { _write_unlock(lock); - local_irq_restore(flags); + if ( unlikely(tb_init_done) ) + trace_irq_restore_ret(flags); + _local_irq_restore(flags); } static inline int _rw_is_write_locked(rwlock_t *lock)
Trace when interrupts are disabled and (re)enabled. Basically, we replace the IRQ disabling and enabling functions with helpers that do the same, but also output the proper trace record. For putting in the record something that will let us identify _where_ in the code (i.e., in what function) the IRQ manipulation is happening, use either: - current_text_addr(), - or __builtin_return_address(0). In fact, depending on whether the disabling/enabling happens in macros (like for local_irq_disable() and local_irq_enable()) or in actual functions (like in spin_lock_irq*()), it is either: - the actual content of the instruction pointer when IRQs are disabled/enabled, - or the return address of the utility function where IRQs are disabled/enabled, that will tell us which piece of code is actually asking for the IRQ manipulation operation. Gate this with its specific Kconfig option, and keep it in disabled state by default (i.e., don't build it, if not explicitly specified), as the impact on performance may be non-negligible. Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> --- Cc: George Dunlap <george.dunlap@eu.citrix.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Julien Grall <julien.grall@arm.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Jennifer Herbert <jennifer.herbert@citrix.com> --- xen/Kconfig.debug | 11 ++++- xen/common/spinlock.c | 16 +++++-- xen/common/trace.c | 10 +++- xen/include/asm-arm/arm32/system.h | 12 +++++ xen/include/asm-arm/arm64/system.h | 12 +++++ xen/include/asm-x86/system.h | 85 ++++++++++++++++++++++++++++++++++-- xen/include/public/trace.h | 2 + xen/include/xen/rwlock.h | 33 +++++++++++--- 8 files changed, 161 insertions(+), 20 deletions(-)
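To see why the commit message distinguishes the two address sources, compare a macro with an out-of-line function (an illustration distilled from the patch; the irqon handling is elided here):

    /* In a macro, the toggle is expanded at the call site, so the
     * instruction pointer at that moment (current_text_addr()) already
     * identifies the function manipulating IRQs: */
    #define local_irq_disable()                      \
    ({                                               \
        _local_irq_disable();                        \
        if ( unlikely(tb_init_done) )                \
            trace_irq_disable(TRACE_LOCAL_ADDR);     \
    })

    /* In a shared function, current_text_addr() would always point at
     * the function itself, so the caller is identified via
     * __builtin_return_address(0) instead: */
    void _spin_lock_irq(spinlock_t *lock)
    {
        _local_irq_disable();
        if ( unlikely(tb_init_done) )
            trace_irq_disable_ret();   /* logs __builtin_return_address(0) */
        _spin_lock(lock);
    }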