
[1/2] genirq: Avoid summation loops for /proc/stat

Message ID 20190130123615.501708580@linutronix.de (mailing list archive)
State New, archived
Series genirq, proc: Speedup /proc/stat interrupt statistics

Commit Message

Thomas Gleixner Jan. 30, 2019, 12:31 p.m. UTC
Waiman reported that on large systems with a large number of interrupts
the readout of /proc/stat takes a long time to sum up the interrupt
statistics. In principle this is not a problem, but for unknown reasons
some enterprise-quality software reads /proc/stat with high frequency.

The reason for this is that interrupt statistics are accounted per CPU,
so the /proc/stat logic has to sum up the per-CPU counts for every
single interrupt.

This can be largely avoided for interrupts which are not marked as
'PER_CPU' interrupts by simply adding a per-interrupt summation counter
which is incremented along with the per-interrupt, per-CPU counter.

The PER_CPU interrupts need to avoid that and use only per-CPU
accounting, because they share the interrupt number and the interrupt
descriptor across CPUs, and concurrent updates of a shared summation
counter would conflict or require unwanted synchronization.

Reported-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

8<-------------

 include/linux/irqdesc.h |    3 ++-
 kernel/irq/chip.c       |   12 ++++++++++--
 kernel/irq/internals.h  |    8 +++++++-
 kernel/irq/irqdesc.c    |    7 ++++++-
 4 files changed, 25 insertions(+), 5 deletions(-)
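
(For illustration: a plain-C model, not the kernel code, of what every
read of /proc/stat had to do before this patch. The sizes and names
below are made up for the sketch.)

#include <stdio.h>

#define NR_IRQS 4096    /* illustrative */
#define NR_CPUS 256     /* illustrative */

/* per-interrupt, per-CPU counters, as in desc->kstat_irqs */
static unsigned int counts[NR_IRQS][NR_CPUS];

/* model of the old kstat_irqs(): one summation loop per interrupt */
static unsigned int kstat_irqs_model(unsigned int irq)
{
        unsigned int cpu, sum = 0;

        for (cpu = 0; cpu < NR_CPUS; cpu++)
                sum += counts[irq][cpu];
        return sum;
}

int main(void)
{
        unsigned long long total = 0;
        unsigned int irq;

        /* /proc/stat does the equivalent on every single read:
         * NR_IRQS * NR_CPUS additions, ~1M in this sketch. */
        for (irq = 0; irq < NR_IRQS; irq++)
                total += kstat_irqs_model(irq);

        printf("%d interrupts x %d CPUs summed: %llu\n",
               NR_IRQS, NR_CPUS, total);
        return 0;
}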

Comments

Waiman Long Jan. 30, 2019, 4 p.m. UTC | #1
On 01/30/2019 07:31 AM, Thomas Gleixner wrote:
> --- a/include/linux/irqdesc.h
> +++ b/include/linux/irqdesc.h
> @@ -65,9 +65,10 @@ struct irq_desc {
>  	unsigned int		core_internal_state__do_not_mess_with_it;
>  	unsigned int		depth;		/* nested irq disables */
>  	unsigned int		wake_depth;	/* nested wake enables */
> +	unsigned int		tot_count;
>  	unsigned int		irq_count;	/* For detecting broken IRQs */
> -	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
>  	unsigned int		irqs_unhandled;
> +	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
>  	atomic_t		threads_handled;
>  	int			threads_handled_last;
>  	raw_spinlock_t		lock;

Just one minor nit: why do you want to move last_unhandled down one
slot? There were five ints before it, so adding one more just fills the
padding hole. Moving last_unhandled down will probably leave 4-byte
holes both above and below it, assuming that raw_spinlock_t is 4 bytes.

Cheers,
Longman
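
(The padding argument can be checked with a standalone program. The
structs below are hypothetical stand-ins, not irq_desc itself; this
assumes a 64-bit ABI with a 4-byte int and an 8-byte long.)

#include <stdio.h>
#include <stddef.h>

/* models the pre-patch layout: five ints ahead of the long */
struct five_ints {
        unsigned int    ints[5];        /* 20 bytes, then a 4-byte hole */
        unsigned long   last_unhandled; /* aligned up to offset 24 */
};

/* models tot_count added as a sixth int: it fills the hole */
struct six_ints {
        unsigned int    ints[6];        /* 24 bytes, no hole */
        unsigned long   last_unhandled; /* still at offset 24 */
};

int main(void)
{
        printf("five ints: long at %zu, sizeof %zu\n",
               offsetof(struct five_ints, last_unhandled),
               sizeof(struct five_ints));
        printf("six ints:  long at %zu, sizeof %zu\n",
               offsetof(struct six_ints, last_unhandled),
               sizeof(struct six_ints));
        return 0;
}

/* Both print offset 24 and size 32: the sixth int is free, while
 * moving the long further down would open new alignment holes. */
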
Thomas Gleixner Jan. 30, 2019, 5:58 p.m. UTC | #2
On Wed, 30 Jan 2019, Waiman Long wrote:
> On 01/30/2019 07:31 AM, Thomas Gleixner wrote:
> > --- a/include/linux/irqdesc.h
> > +++ b/include/linux/irqdesc.h
> > @@ -65,9 +65,10 @@ struct irq_desc {
> >  	unsigned int		core_internal_state__do_not_mess_with_it;
> >  	unsigned int		depth;		/* nested irq disables */
> >  	unsigned int		wake_depth;	/* nested wake enables */
> > +	unsigned int		tot_count;
> >  	unsigned int		irq_count;	/* For detecting broken IRQs */
> > -	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
> >  	unsigned int		irqs_unhandled;
> > +	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
> >  	atomic_t		threads_handled;
> >  	int			threads_handled_last;
> >  	raw_spinlock_t		lock;
> 
> Just one minor nit: why do you want to move last_unhandled down one
> slot? There were five ints before it, so adding one more just fills the
> padding hole. Moving last_unhandled down will probably leave 4-byte
> holes both above and below it, assuming that raw_spinlock_t is 4 bytes.

Unintentional wreckage. Will undo. Thanks for spotting it.

Thanks,

	tglx

Patch

--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -65,9 +65,10 @@  struct irq_desc {
 	unsigned int		core_internal_state__do_not_mess_with_it;
 	unsigned int		depth;		/* nested irq disables */
 	unsigned int		wake_depth;	/* nested wake enables */
+	unsigned int		tot_count;
 	unsigned int		irq_count;	/* For detecting broken IRQs */
-	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
 	unsigned int		irqs_unhandled;
+	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
 	atomic_t		threads_handled;
 	int			threads_handled_last;
 	raw_spinlock_t		lock;
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -855,7 +855,11 @@  void handle_percpu_irq(struct irq_desc *
 {
 	struct irq_chip *chip = irq_desc_get_chip(desc);
 
-	kstat_incr_irqs_this_cpu(desc);
+	/*
+	 * PER CPU interrupts are not serialized. Do not touch
+	 * desc->tot_count.
+	 */
+	__kstat_incr_irqs_this_cpu(desc);
 
 	if (chip->irq_ack)
 		chip->irq_ack(&desc->irq_data);
@@ -884,7 +888,11 @@  void handle_percpu_devid_irq(struct irq_
 	unsigned int irq = irq_desc_get_irq(desc);
 	irqreturn_t res;
 
-	kstat_incr_irqs_this_cpu(desc);
+	/*
+	 * PER CPU interrupts are not serialized. Do not touch
+	 * desc->tot_count.
+	 */
+	__kstat_incr_irqs_this_cpu(desc);
 
 	if (chip->irq_ack)
 		chip->irq_ack(&desc->irq_data);
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -242,12 +242,18 @@  static inline void irq_state_set_masked(
 
 #undef __irqd_to_state
 
-static inline void kstat_incr_irqs_this_cpu(struct irq_desc *desc)
+static inline void __kstat_incr_irqs_this_cpu(struct irq_desc *desc)
 {
 	__this_cpu_inc(*desc->kstat_irqs);
 	__this_cpu_inc(kstat.irqs_sum);
 }
 
+static inline void kstat_incr_irqs_this_cpu(struct irq_desc *desc)
+{
+	__kstat_incr_irqs_this_cpu(desc);
+	desc->tot_count++;
+}
+
 static inline int irq_desc_get_node(struct irq_desc *desc)
 {
 	return irq_common_data_get_node(&desc->irq_common_data);
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -119,6 +119,7 @@  static void desc_set_defaults(unsigned i
 	desc->depth = 1;
 	desc->irq_count = 0;
 	desc->irqs_unhandled = 0;
+	desc->tot_count = 0;
 	desc->name = NULL;
 	desc->owner = owner;
 	for_each_possible_cpu(cpu)
@@ -919,11 +920,15 @@  unsigned int kstat_irqs_cpu(unsigned int
 unsigned int kstat_irqs(unsigned int irq)
 {
 	struct irq_desc *desc = irq_to_desc(irq);
-	int cpu;
 	unsigned int sum = 0;
+	int cpu;
 
 	if (!desc || !desc->kstat_irqs)
 		return 0;
+	if (!irq_settings_is_per_cpu_devid(desc) &&
+	    !irq_settings_is_per_cpu(desc))
+		return desc->tot_count;
+
 	for_each_possible_cpu(cpu)
 		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
 	return sum;
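
(A footnote to the "PER CPU interrupts are not serialized" comments in
the chip.c hunks above: a plain counter incremented concurrently from
several CPUs loses updates. A hypothetical userspace model using
pthreads, not kernel code; build with cc -pthread.)

#include <pthread.h>
#include <stdio.h>

#define NTHREADS        4               /* stand-ins for CPUs */
#define LOOPS           1000000

static unsigned int tot_count;          /* shared, unsynchronized */

/* models a PER_CPU handler wrongly bumping the shared counter */
static void *percpu_handler_model(void *arg)
{
        (void)arg;
        for (int i = 0; i < LOOPS; i++)
                tot_count++;            /* racy read-modify-write */
        return NULL;
}

int main(void)
{
        pthread_t t[NTHREADS];

        for (int i = 0; i < NTHREADS; i++)
                pthread_create(&t[i], NULL, percpu_handler_model, NULL);
        for (int i = 0; i < NTHREADS; i++)
                pthread_join(t[i], NULL);

        /* typically prints far less than 4000000 */
        printf("tot_count = %u (expected %d)\n",
               tot_count, NTHREADS * LOOPS);
        return 0;
}

(Ordinary interrupts can increment desc->tot_count without this problem
because a given interrupt's handler does not run on two CPUs at once;
PER_CPU interrupts do, hence the plain per-CPU path in the handlers.)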