diff mbox series

[v2] printk: ringbuffer: Improve prb_next_seq() performance

Message ID YXlddJxLh77DKfIO@alley (mailing list archive)
State New, archived
Headers show
Series [v2] printk: ringbuffer: Improve prb_next_seq() performance | expand

Commit Message

Petr Mladek Oct. 27, 2021, 2:08 p.m. UTC
prb_next_seq() always iterates from the first known sequence number.
In the worst case, it might loop 8k times for 256kB buffer,
15k times for 512kB buffer, and 64k times for 2MB buffer.

It was reported that pooling and reading using syslog interface
might occupy 50% of CPU.

Speedup the search by storing @id of the last finalized descriptor.

The loop is still needed because the @id is stored and read in the best
effort way. An atomic variable is used to keep the @id consistent.
But the stores and reads are not serialized against each other.
The descriptor could get reused in the meantime. The related sequence
number will be used only when it is still valid.

An invalid value should be read _only_ when there is a flood of messages
and the ringbuffer is rapidly reused. The performance is the least
problem in this case.

Reported-by: Chunlei Wang <chunlei.wang@mediatek.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
Changes against v1:

  + Read @seq by the last finalized @id directly in prb_next_seq() (John)

 kernel/printk/printk_ringbuffer.c | 49 ++++++++++++++++++++++++++++---
 kernel/printk/printk_ringbuffer.h |  2 ++
 2 files changed, 47 insertions(+), 4 deletions(-)

Comments

John Ogness Oct. 27, 2021, 3:48 p.m. UTC | #1
On 2021-10-27, Petr Mladek <pmladek@suse.com> wrote:
> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
> index 8a7b7362c0dd..24f47fbefbb5 100644
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -474,8 +474,10 @@ static enum desc_state desc_read(struct prb_desc_ring *desc_ring,
>  	 * state has been re-checked. A memcpy() for all of @desc
>  	 * cannot be used because of the atomic_t @state_var field.
>  	 */
> -	memcpy(&desc_out->text_blk_lpos, &desc->text_blk_lpos,
> -	       sizeof(desc_out->text_blk_lpos)); /* LMM(desc_read:C) */
> +	if (desc_out) {
> +		memcpy(&desc_out->text_blk_lpos, &desc->text_blk_lpos,
> +		       sizeof(desc_out->text_blk_lpos)); /* LMM(desc_read:C) */
> +	}
>  	if (seq_out)
>  		*seq_out = info->seq; /* also part of desc_read:C */
>  	if (caller_id_out)

You also need this hunk:

@@ -530,7 +530,8 @@ static enum desc_state desc_read(struct prb_desc_ring *desc_ring,
 	state_val = atomic_long_read(state_var); /* LMM(desc_read:E) */
 	d_state = get_desc_state(id, state_val);
 out:
-	atomic_long_set(&desc_out->state_var, state_val);
+	if (desc_out)
+		atomic_long_set(&desc_out->state_var, state_val);
 	return d_state;
 }

Sorry, I missed it in my last review. This time I build+boot tested the patch.

With this hunk added:

Reviewed-by: John Ogness <john.ogness@linutronix.de>

John
Petr Mladek Nov. 1, 2021, 10:06 a.m. UTC | #2
On Wed 2021-10-27 17:54:42, John Ogness wrote:
> On 2021-10-27, Petr Mladek <pmladek@suse.com> wrote:
> > diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
> > index 8a7b7362c0dd..24f47fbefbb5 100644
> > --- a/kernel/printk/printk_ringbuffer.c
> > +++ b/kernel/printk/printk_ringbuffer.c
> > @@ -474,8 +474,10 @@ static enum desc_state desc_read(struct prb_desc_ring *desc_ring,
> >  	 * state has been re-checked. A memcpy() for all of @desc
> >  	 * cannot be used because of the atomic_t @state_var field.
> >  	 */
> > -	memcpy(&desc_out->text_blk_lpos, &desc->text_blk_lpos,
> > -	       sizeof(desc_out->text_blk_lpos)); /* LMM(desc_read:C) */
> > +	if (desc_out) {
> > +		memcpy(&desc_out->text_blk_lpos, &desc->text_blk_lpos,
> > +		       sizeof(desc_out->text_blk_lpos)); /* LMM(desc_read:C) */
> > +	}
> >  	if (seq_out)
> >  		*seq_out = info->seq; /* also part of desc_read:C */
> >  	if (caller_id_out)
> 
> You also need this hunk:
> 
> @@ -530,7 +530,8 @@ static enum desc_state desc_read(struct prb_desc_ring *desc_ring,
>  	state_val = atomic_long_read(state_var); /* LMM(desc_read:E) */
>  	d_state = get_desc_state(id, state_val);
>  out:
> -	atomic_long_set(&desc_out->state_var, state_val);
> +	if (desc_out)
> +		atomic_long_set(&desc_out->state_var, state_val);
>  	return d_state;
>  }
> 
> Sorry, I missed it in my last review. This time I build+boot tested the patch.

Urgh. Great catch! It seems that my testing was not good enough. Or
that I was just lucky.

> With this hunk added:
> 
> Reviewed-by: John Ogness <john.ogness@linutronix.de>

I am going to queue it for 5.17 after the pull request for 5.16 is
accepted. I do not feel comfortable to rush this into 5.16, especially
after finding the bug. The merge window has just started...

Anyway, thanks a lot for review and catching the mistake.

Best Regards,
Petr
Saravana Kannan April 8, 2022, 12:48 a.m. UTC | #3
> Petr Mladek <pmladek@suse.com> wrote:
> 
> prb_next_seq() always iterates from the first known sequence number.
> In the worst case, it might loop 8k times for 256kB buffer,
> 15k times for 512kB buffer, and 64k times for 2MB buffer.
> 
> It was reported that pooling and reading using syslog interface
> might occupy 50% of CPU.
> 
> Speedup the search by storing @id of the last finalized descriptor.
> 
> The loop is still needed because the @id is stored and read in the best
> effort way. An atomic variable is used to keep the @id consistent.
> But the stores and reads are not serialized against each other.
> The descriptor could get reused in the meantime. The related sequence
> number will be used only when it is still valid.
> 
> An invalid value should be read _only_ when there is a flood of messages
> and the ringbuffer is rapidly reused. The performance is the least
> problem in this case.
> 
> Reported-by: Chunlei Wang <chunlei.wang@mediatek.com>
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> ---
> Changes against v1:
> 
>   + Read @seq by the last finalized @id directly in prb_next_seq() (John)
> 
>  kernel/printk/printk_ringbuffer.c | 49 ++++++++++++++++++++++++++++---
>  kernel/printk/printk_ringbuffer.h |  2 ++
>  2 files changed, 47 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
> index 8a7b7362c0dd..24f47fbefbb5 100644
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -474,8 +474,10 @@ static enum desc_state desc_read(struct prb_desc_ring *desc_ring,
>  	 * state has been re-checked. A memcpy() for all of @desc
>  	 * cannot be used because of the atomic_t @state_var field.
>  	 */
> -	memcpy(&desc_out->text_blk_lpos, &desc->text_blk_lpos,
> -	       sizeof(desc_out->text_blk_lpos)); /* LMM(desc_read:C) */
> +	if (desc_out) {
> +		memcpy(&desc_out->text_blk_lpos, &desc->text_blk_lpos,
> +		       sizeof(desc_out->text_blk_lpos)); /* LMM(desc_read:C) */
> +	}
>  	if (seq_out)
>  		*seq_out = info->seq; /* also part of desc_read:C */
>  	if (caller_id_out)
> @@ -1449,6 +1451,9 @@ static void desc_make_final(struct prb_desc_ring *desc_ring, unsigned long id)
>  
>  	atomic_long_cmpxchg_relaxed(&d->state_var, prev_state_val,
>  			DESC_SV(id, desc_finalized)); /* LMM(desc_make_final:A) */
> +
> +	/* Best effort to remember the last finalized @id. */
> +	atomic_long_set(&desc_ring->last_finalized_id, id);
>  }
>  
>  /**
> @@ -1657,7 +1662,12 @@ void prb_commit(struct prb_reserved_entry *e)
>   */
>  void prb_final_commit(struct prb_reserved_entry *e)
>  {
> +	struct prb_desc_ring *desc_ring = &e->rb->desc_ring;
> +
>  	_prb_commit(e, desc_finalized);
> +
> +	/* Best effort to remember the last finalized @id. */
> +	atomic_long_set(&desc_ring->last_finalized_id, e->id);
>  }
>  
>  /*
> @@ -2005,9 +2015,39 @@ u64 prb_first_valid_seq(struct printk_ringbuffer *rb)
>   */
>  u64 prb_next_seq(struct printk_ringbuffer *rb)
>  {
> -	u64 seq = 0;
> +	struct prb_desc_ring *desc_ring = &rb->desc_ring;
> +	enum desc_state d_state;
> +	unsigned long id;
> +	u64 seq;
> +
> +	/* Check if the cached @id still points to a valid @seq. */
> +	id = atomic_long_read(&desc_ring->last_finalized_id);
> +	d_state = desc_read(desc_ring, id, NULL, &seq, NULL);
>  
> -	/* Search forward from the oldest descriptor. */
> +	if (d_state == desc_finalized || d_state == desc_reusable) {
> +		/*
> +		 * Begin searching after the last finalized record.
> +		 *
> +		 * On 0, the search must begin at 0 because of hack#2
> +		 * of the bootstrapping phase it is not known if a
> +		 * record at index 0 exists.
> +		 */
> +		if (seq != 0)
> +			seq++;
> +	} else {
> +		/*
> +		 * The information about the last finalized sequence number
> +		 * has gone. It should happen only when there is a flood of
> +		 * new messages and the ringbuffer is rapidly recycled.
> +		 * Give up and start from the beginning.
> +		 */
> +		seq = 0;
> +	}
> +
> +	/*
> +	 * The information about the last finalized @seq might be inaccurate.
> +	 * Search forward to find the current one.
> +	 */
>  	while (_prb_read_valid(rb, &seq, NULL, NULL))
>  		seq++;
>  
> @@ -2044,6 +2084,7 @@ void prb_init(struct printk_ringbuffer *rb,
>  	rb->desc_ring.infos = infos;
>  	atomic_long_set(&rb->desc_ring.head_id, DESC0_ID(descbits));
>  	atomic_long_set(&rb->desc_ring.tail_id, DESC0_ID(descbits));
> +	atomic_long_set(&rb->desc_ring.last_finalized_id, DESC0_ID(descbits));
>  
>  	rb->text_data_ring.size_bits = textbits;
>  	rb->text_data_ring.data = text_buf;
> diff --git a/kernel/printk/printk_ringbuffer.h b/kernel/printk/printk_ringbuffer.h
> index 73cc80e01cef..18cd25e489b8 100644
> --- a/kernel/printk/printk_ringbuffer.h
> +++ b/kernel/printk/printk_ringbuffer.h
> @@ -75,6 +75,7 @@ struct prb_desc_ring {
>  	struct printk_info	*infos;
>  	atomic_long_t		head_id;
>  	atomic_long_t		tail_id;
> +	atomic_long_t		last_finalized_id;
>  };


I really know nothing about this code, but while looking around
kernel/printk/ I noticed kernel/printk/printk.c has these lines in
log_buf_vmcoreinfo_setup().

	VMCOREINFO_STRUCT_SIZE(prb_desc_ring);
	VMCOREINFO_OFFSET(prb_desc_ring, count_bits);
	VMCOREINFO_OFFSET(prb_desc_ring, descs);
	VMCOREINFO_OFFSET(prb_desc_ring, infos);
	VMCOREINFO_OFFSET(prb_desc_ring, head_id);
	VMCOREINFO_OFFSET(prb_desc_ring, tail_id);

Would this header file change also require a change to that location?
Something like

	VMCOREINFO_OFFSET(prb_desc_ring, head_id);
	VMCOREINFO_OFFSET(prb_desc_ring, tail_id);
+	VMCOREINFO_OFFSET(prb_desc_ring, last_finalized_id);

-Saravana

>  
>  /*
> @@ -258,6 +259,7 @@ static struct printk_ringbuffer name = {							\
>  		.infos		= &_##name##_infos[0],						\
>  		.head_id	= ATOMIC_INIT(DESC0_ID(descbits)),				\
>  		.tail_id	= ATOMIC_INIT(DESC0_ID(descbits)),				\
> +		.last_finalized_id = ATOMIC_INIT(DESC0_ID(descbits)),				\
>  	},											\
>  	.text_data_ring = {									\
>  		.size_bits	= (avgtextbits) + (descbits),					\
> -- 
> 2.26.2
>
Petr Mladek April 8, 2022, 7:15 a.m. UTC | #4
On Thu 2022-04-07 17:48:21, Saravana Kannan wrote:
> 
> > Petr Mladek <pmladek@suse.com> wrote:
> > 
> > prb_next_seq() always iterates from the first known sequence number.
> > In the worst case, it might loop 8k times for 256kB buffer,
> > 15k times for 512kB buffer, and 64k times for 2MB buffer.
> > 
> > It was reported that pooling and reading using syslog interface
> > might occupy 50% of CPU.
> > 
> > Speedup the search by storing @id of the last finalized descriptor.
> > 
> > The loop is still needed because the @id is stored and read in the best
> > effort way. An atomic variable is used to keep the @id consistent.
> > But the stores and reads are not serialized against each other.
> > The descriptor could get reused in the meantime. The related sequence
> > number will be used only when it is still valid.
> > 
> > An invalid value should be read _only_ when there is a flood of messages
> > and the ringbuffer is rapidly reused. The performance is the least
> > problem in this case.
> > 
> > --- a/kernel/printk/printk_ringbuffer.h
> > +++ b/kernel/printk/printk_ringbuffer.h
> > @@ -75,6 +75,7 @@ struct prb_desc_ring {
> >  	struct printk_info	*infos;
> >  	atomic_long_t		head_id;
> >  	atomic_long_t		tail_id;
> > +	atomic_long_t		last_finalized_id;
> >  };
> 
> 
> I really know nothing about this code, but while looking around
> kernel/printk/ I noticed kernel/printk/printk.c has these lines in
> log_buf_vmcoreinfo_setup().
> 
> 	VMCOREINFO_STRUCT_SIZE(prb_desc_ring);
> 	VMCOREINFO_OFFSET(prb_desc_ring, count_bits);
> 	VMCOREINFO_OFFSET(prb_desc_ring, descs);
> 	VMCOREINFO_OFFSET(prb_desc_ring, infos);
> 	VMCOREINFO_OFFSET(prb_desc_ring, head_id);
> 	VMCOREINFO_OFFSET(prb_desc_ring, tail_id);
> 
> Would this header file change also require a change to that location?
> Something like
> 
> 	VMCOREINFO_OFFSET(prb_desc_ring, head_id);
> 	VMCOREINFO_OFFSET(prb_desc_ring, tail_id);
> +	VMCOREINFO_OFFSET(prb_desc_ring, last_finalized_id);

It is actually not needed. VMCOREINFO_*() macros define some data
that are used by the "crash" tool when reading crashdumps. The field
"last_finalized_id" is used only at runtime. It is not needed
for reading the log in the crashdump.

Anyway, thanks for the question. It is great to know that there are
more people checking our changes.

Best Regards,
Petr
diff mbox series

Patch

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 8a7b7362c0dd..24f47fbefbb5 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -474,8 +474,10 @@  static enum desc_state desc_read(struct prb_desc_ring *desc_ring,
 	 * state has been re-checked. A memcpy() for all of @desc
 	 * cannot be used because of the atomic_t @state_var field.
 	 */
-	memcpy(&desc_out->text_blk_lpos, &desc->text_blk_lpos,
-	       sizeof(desc_out->text_blk_lpos)); /* LMM(desc_read:C) */
+	if (desc_out) {
+		memcpy(&desc_out->text_blk_lpos, &desc->text_blk_lpos,
+		       sizeof(desc_out->text_blk_lpos)); /* LMM(desc_read:C) */
+	}
 	if (seq_out)
 		*seq_out = info->seq; /* also part of desc_read:C */
 	if (caller_id_out)
@@ -1449,6 +1451,9 @@  static void desc_make_final(struct prb_desc_ring *desc_ring, unsigned long id)
 
 	atomic_long_cmpxchg_relaxed(&d->state_var, prev_state_val,
 			DESC_SV(id, desc_finalized)); /* LMM(desc_make_final:A) */
+
+	/* Best effort to remember the last finalized @id. */
+	atomic_long_set(&desc_ring->last_finalized_id, id);
 }
 
 /**
@@ -1657,7 +1662,12 @@  void prb_commit(struct prb_reserved_entry *e)
  */
 void prb_final_commit(struct prb_reserved_entry *e)
 {
+	struct prb_desc_ring *desc_ring = &e->rb->desc_ring;
+
 	_prb_commit(e, desc_finalized);
+
+	/* Best effort to remember the last finalized @id. */
+	atomic_long_set(&desc_ring->last_finalized_id, e->id);
 }
 
 /*
@@ -2005,9 +2015,39 @@  u64 prb_first_valid_seq(struct printk_ringbuffer *rb)
  */
 u64 prb_next_seq(struct printk_ringbuffer *rb)
 {
-	u64 seq = 0;
+	struct prb_desc_ring *desc_ring = &rb->desc_ring;
+	enum desc_state d_state;
+	unsigned long id;
+	u64 seq;
+
+	/* Check if the cached @id still points to a valid @seq. */
+	id = atomic_long_read(&desc_ring->last_finalized_id);
+	d_state = desc_read(desc_ring, id, NULL, &seq, NULL);
 
-	/* Search forward from the oldest descriptor. */
+	if (d_state == desc_finalized || d_state == desc_reusable) {
+		/*
+		 * Begin searching after the last finalized record.
+		 *
+		 * On 0, the search must begin at 0 because of hack#2
+		 * of the bootstrapping phase it is not known if a
+		 * record at index 0 exists.
+		 */
+		if (seq != 0)
+			seq++;
+	} else {
+		/*
+		 * The information about the last finalized sequence number
+		 * has gone. It should happen only when there is a flood of
+		 * new messages and the ringbuffer is rapidly recycled.
+		 * Give up and start from the beginning.
+		 */
+		seq = 0;
+	}
+
+	/*
+	 * The information about the last finalized @seq might be inaccurate.
+	 * Search forward to find the current one.
+	 */
 	while (_prb_read_valid(rb, &seq, NULL, NULL))
 		seq++;
 
@@ -2044,6 +2084,7 @@  void prb_init(struct printk_ringbuffer *rb,
 	rb->desc_ring.infos = infos;
 	atomic_long_set(&rb->desc_ring.head_id, DESC0_ID(descbits));
 	atomic_long_set(&rb->desc_ring.tail_id, DESC0_ID(descbits));
+	atomic_long_set(&rb->desc_ring.last_finalized_id, DESC0_ID(descbits));
 
 	rb->text_data_ring.size_bits = textbits;
 	rb->text_data_ring.data = text_buf;
diff --git a/kernel/printk/printk_ringbuffer.h b/kernel/printk/printk_ringbuffer.h
index 73cc80e01cef..18cd25e489b8 100644
--- a/kernel/printk/printk_ringbuffer.h
+++ b/kernel/printk/printk_ringbuffer.h
@@ -75,6 +75,7 @@  struct prb_desc_ring {
 	struct printk_info	*infos;
 	atomic_long_t		head_id;
 	atomic_long_t		tail_id;
+	atomic_long_t		last_finalized_id;
 };
 
 /*
@@ -258,6 +259,7 @@  static struct printk_ringbuffer name = {							\
 		.infos		= &_##name##_infos[0],						\
 		.head_id	= ATOMIC_INIT(DESC0_ID(descbits)),				\
 		.tail_id	= ATOMIC_INIT(DESC0_ID(descbits)),				\
+		.last_finalized_id = ATOMIC_INIT(DESC0_ID(descbits)),				\
 	},											\
 	.text_data_ring = {									\
 		.size_bits	= (avgtextbits) + (descbits),					\