
[v8,00/18] 8250-core based serial driver for OMAP + DMA

Message ID 20140908151501.GA22584@linutronix.de (mailing list archive)
State New, archived

Commit Message

Sebastian Andrzej Siewior Sept. 8, 2014, 3:15 p.m. UTC
* Frans Klaver | 2014-09-08 16:46:18 [+0200]:

>- ncurses based applications (vi, less) don't play nice for me on the
>  console with this series. less doesn't show me anything. vi doesn't
>  return to console properly.

Can you give a test case?

>- I seem to get stuck in a "serial8250: too much work for irq%d"
>  loop somewhat reliably. We have a rather demanding application with
>  typically somewhere between 600 and 1000 byte packets being sent at
>  240Hz (roughly somewhere between 1.5 and 2 Mb/s). We run at baudrate
>  3500k. I get into this "too much work" thing already when running at
>  300 bytes per packet.
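
For reference, the load quoted above can be put next to the line capacity
with a little arithmetic. This is only a sketch: it assumes 10 bits on the
wire per byte (8N1 framing, which the report does not state), and the
function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* On-wire bits per second for fixed-size packets sent at a fixed rate.
 * bits_per_frame is 10 for 8N1 (start + 8 data + stop); assumed here. */
uint64_t wire_bits_per_sec(uint32_t bytes_per_packet,
			   uint32_t packets_per_sec,
			   uint32_t bits_per_frame)
{
	return (uint64_t)bytes_per_packet * packets_per_sec * bits_per_frame;
}

/* Share of the line capacity in percent, rounded down. */
unsigned int line_load_percent(uint64_t bits_per_sec, uint32_t baud)
{
	assert(baud > 0);
	return (unsigned int)(bits_per_sec * 100 / baud);
}
```

With these assumptions, 600 to 1000 byte packets at 240Hz come out at 1.44
to 2.4 Mbit/s on the wire, and 300 byte packets at 3500k baud use roughly a
fifth of the line.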

Do you get this message also at lower baud rates, say 115200?

What I am trying to understand is why you are spinning in the handler.
_With_ DMA you should hardly get into the serial handler under normal
conditions. Running at 3.5 Mbaud (10 bits per byte on the wire) should
give one byte every ~2.86us and 48 bytes every ~137us. This looks like
plenty of time to get out of the handler. My *guess* is that
serial8250_handle_irq() often sees IIR set to timeout and you end up
fetching byte after byte.
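
The timing estimate above can be reproduced with a few lines of C. This is a
rough sketch, again assuming 10 bits per byte on the wire (8N1); the helper
name is hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds needed to shift nbytes characters over the wire,
 * rounded down. bits_per_frame is 10 for 8N1 (assumed). */
uint64_t uart_wire_time_ns(uint32_t baud, uint32_t bits_per_frame,
			   uint32_t nbytes)
{
	assert(baud > 0);
	return (uint64_t)nbytes * bits_per_frame * 1000000000ULL / baud;
}
```

uart_wire_time_ns(3500000, 10, 1) gives 2857 ns and
uart_wire_time_ns(3500000, 10, 48) gives 137142 ns, which matches the
numbers in the paragraph above.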

This patch should log when and why you got into the handler.


>I hope this is of some use to you. I'll do more testing later.

Which SoC do you use and do you have DMA enabled?

>Thanks,
>Frans

Sebastian

Comments

Sebastian Andrzej Siewior Sept. 8, 2014, 4:33 p.m. UTC | #1
* Sebastian Andrzej Siewior | 2014-09-08 17:15:01 [+0200]:

>* Frans Klaver | 2014-09-08 16:46:18 [+0200]:
>
>>- ncurses based applications (vi, less) don't play nice for me on the
>>  console with this series. less doesn't show me anything. vi doesn't
>>  return to console properly.
>
>Can you give a test case?

Okay. less. My am335x-evm freezes after a while for no obvious reason.
The data that hits the RX fifo is still received but the TX won't do
anything. The DMA request is pending, the FIFO level is @64 bytes and
the UART doesn't make any progress.
On beagle-board I see what you described: less on a file and nothing
happens.
Disabling DMA seems to fix the problem on omap3. Nothing changes for
am335x.

Sebastian
Frans Klaver Sept. 8, 2014, 6:25 p.m. UTC | #2
On Mon, Sep 08, 2014 at 06:33:13PM +0200, Sebastian Andrzej Siewior wrote:
> * Sebastian Andrzej Siewior | 2014-09-08 17:15:01 [+0200]:
> 
> >* Frans Klaver | 2014-09-08 16:46:18 [+0200]:
> >
> >>- ncurses based applications (vi, less) don't play nice for me on the
> >>  console with this series. less doesn't show me anything. vi doesn't
> >>  return to console properly.
> >
> >Can you give a test case?
> 
> Okay. less. My am335x-evm freezes after a while for no obvious reason.
> The data that hits the RX fifo is still received but the TX won't do
> anything. The DMA request is pending, the FIFO level is @64 bytes and
> the UART doesn't make any progress.
> On beagle-board I see what you described: less on a file and nothing
> happens.

Exactly that, yes.

Frans
Frans Klaver Sept. 8, 2014, 6:33 p.m. UTC | #3
On Mon, Sep 08, 2014 at 05:15:01PM +0200, Sebastian Andrzej Siewior wrote:
> * Frans Klaver | 2014-09-08 16:46:18 [+0200]:
> 
> >- I seem to get stuck in a "serial8250: too much work for irq%d"
> >  loop somewhat reliably. We have a rather demanding application with
> >  typically somewhere between 600 and 1000 byte packets being sent at
> >  240Hz (roughly somewhere between 1.5 and 2 Mb/s). We run at baudrate
> >  3500k. I get into this "too much work" thing already when running at
> >  300 bytes per packet.
> 
> Do you get this message also at lower baud rates, say 115200?

I don't get this message at lower data rates. Haven't tested lower baud
rates yet.

> What I am trying to understand is why you are spinning in the handler.
> _With_ DMA you should hardly get into the serial handler under normal
> conditions. Running at 3.5 Mbaud (10 bits per byte on the wire) should
> give one byte every ~2.86us and 48 bytes every ~137us. This looks like
> plenty of time to get out of the handler. My *guess* is that
> serial8250_handle_irq() often sees IIR set to timeout and you end up
> fetching byte after byte.
> 
> This patch should log when and why you got into the handler.
>
> diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
> index 7111b22de000..59852069e4a0 100644
> --- a/drivers/tty/serial/8250/8250_core.c
> +++ b/drivers/tty/serial/8250/8250_core.c
> @@ -1583,6 +1583,7 @@ int serial8250_handle_irq(struct uart_port *port, unsigned int iir)
>  	status = serial_port_in(port, UART_LSR);
>  
>  	DEBUG_INTR("status = %x...", status);
> +	trace_printk("l%d IIR %x LSR %x\n", port->line, iir, status);
>  
>  	if (status & (UART_LSR_DR | UART_LSR_BI)) {
>  		if (up->dma)
> @@ -1707,6 +1708,7 @@ static irqreturn_t serial8250_interrupt(int irq, void *dev_id)
>  
>  	spin_unlock(&i->lock);
>  
> +	trace_printk("%d e\n", irq);
>  	DEBUG_INTR("end.\n");
>  
>  	return IRQ_RETVAL(handled);
> 

Thanks. I'll give it a spin on Wednesday.


> >I hope this is of some use to you. I'll do more testing later.
> 
> Which SoC do you use and do you have DMA enabled?

am335x, DMA is enabled, unless I need to do something extra in the
device tree. We depend on am335x.dtsi, so I would think that would be
automatic if CONFIG_SERIAL_8250_DMA=y.

Thanks,
Frans
Sebastian Andrzej Siewior Sept. 9, 2014, 7:41 p.m. UTC | #4
On 09/08/2014 08:33 PM, Frans Klaver wrote:
> Thanks. I'll give it a spin on Wednesday.

Could you please pull the upcoming v9 first?

 git://git.breakpoint.cc/bigeasy/linux.git uart_v9_pre1

This solves a few of my am335x related issues.

The problem that the uart freezes on beagle board xm is still there.
From what I can tell, the DMA transfer is started but never completes,
and I can't reproduce it on dra7xx.

> Thanks,
> Frans
> 

Sebastian
Frans Klaver Sept. 10, 2014, 2:15 p.m. UTC | #5
On Tue, Sep 09, 2014 at 09:41:20PM +0200, Sebastian Andrzej Siewior wrote:
> On 09/08/2014 08:33 PM, Frans Klaver wrote:
> > Thanks. I'll give it a spin on Wednesday.
> 
> Could you please pull the upcoming v9 first?
> 
>  git://git.breakpoint.cc/bigeasy/linux.git uart_v9_pre1
> 
> This solves a few of my am335x related issues.

Using v9_pre1, I get a kernel panic in edma_dma_pause() on echan->edesc
being NULL. I do get data before this happens. This is at high speed,
high rate. No mention of the irq having too much to do, though. The more
data I transmit, the more likely this is to occur.

I don't currently have the setup to lower the baudrate. I would probably
need to reproduce this on a beaglebone, instead of our custom board.
I'll see if I can do that.

If you need more info, just let me know.

Thanks,
Frans
Sebastian Andrzej Siewior Sept. 10, 2014, 4:56 p.m. UTC | #6
On 09/10/2014 04:15 PM, Frans Klaver wrote:
> On Tue, Sep 09, 2014 at 09:41:20PM +0200, Sebastian Andrzej Siewior wrote:
>> On 09/08/2014 08:33 PM, Frans Klaver wrote:
>>> Thanks. I'll give it a spin on Wednesday.
>>
>> Could you please pull the upcoming v9 first?
>>
>>  git://git.breakpoint.cc/bigeasy/linux.git uart_v9_pre1
>>
>> This solves a few of my am335x related issues.
> 
> Using v9_pre1, I get a kernel panic in edma_dma_pause() on echan->edesc
> being NULL. I do get data before this happens. This is at high speed,
> high rate. No mention of the irq having too much to do, though. The more
> data I transmit, the more likely this is to occur.

Hmm. This shouldn't happen because if dma->rx_running is set, then
there should be a descriptor to pause.
Could you check how this could happen? (and which event tries to do so)

In the meantime, the lower part of [0] should fix the NULL bug.

[0] https://lkml.org/lkml/2014/7/29/506

> I don't currently have the setup to lower the baudrate. I would probably
> need to reproduce this on a beaglebone, instead of our custom board.
> I'll see if I can do that.
> 
> If you need more info, just let me know.

I just pushed uart_v9_pre2 and I will post it once I have rebased it on
top of Greg's latest.
There was a bug where THRE could stay active longer than needed, which
caused a few unnecessary interrupts until the FIFO was empty again.

My "less on file" test case did not fail anymore with this.

> Thanks,
> Frans

Sebastian

Patch

diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index 7111b22de000..59852069e4a0 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -1583,6 +1583,7 @@  int serial8250_handle_irq(struct uart_port *port, unsigned int iir)
 	status = serial_port_in(port, UART_LSR);
 
 	DEBUG_INTR("status = %x...", status);
+	trace_printk("l%d IIR %x LSR %x\n", port->line, iir, status);
 
 	if (status & (UART_LSR_DR | UART_LSR_BI)) {
 		if (up->dma)
@@ -1707,6 +1708,7 @@  static irqreturn_t serial8250_interrupt(int irq, void *dev_id)
 
 	spin_unlock(&i->lock);
 
+	trace_printk("%d e\n", irq);
 	DEBUG_INTR("end.\n");
 
 	return IRQ_RETVAL(handled);
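
To make the resulting trace easier to read, the IIR value from each
"l%d IIR %x LSR %x" line can be classified offline. The following is a
minimal userspace sketch, not driver code. It assumes the message part of
the trace line has already been cut out of the ftrace output, and it assumes
the OMAP-style interrupt ID field in IIR bits 5:1 (mask 0x3e); the enum and
function names are invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Interrupt causes encoded in the IIR; names are local to this sketch. */
enum iir_cause {
	IIR_NONE,        /* bit 0 set: no interrupt pending */
	IIR_MODEM,       /* 0x00: modem status change */
	IIR_THRE,        /* 0x02: transmit holding register empty */
	IIR_RX_DATA,     /* 0x04: RX FIFO trigger level reached */
	IIR_LINE_STATUS, /* 0x06: overrun, parity, framing or break */
	IIR_RX_TIMEOUT,  /* 0x0c: data sat in the RX FIFO without a refill */
	IIR_OTHER        /* anything else, e.g. vendor-specific causes */
};

/* Classify a raw IIR value. The 0x3e mask is an assumption (OMAP layout);
 * a plain 16550 only uses bits 3:1. */
enum iir_cause iir_classify(unsigned int iir)
{
	if (iir & 0x01)
		return IIR_NONE;

	switch (iir & 0x3e) {
	case 0x00: return IIR_MODEM;
	case 0x02: return IIR_THRE;
	case 0x04: return IIR_RX_DATA;
	case 0x06: return IIR_LINE_STATUS;
	case 0x0c: return IIR_RX_TIMEOUT;
	default:   return IIR_OTHER;
	}
}

/* Parse the message part of one trace line, "l<line> IIR <hex> LSR <hex>".
 * Returns 0 on success, -1 if the text does not match. */
int parse_irq_trace(const char *msg, int *line,
		    unsigned int *iir, unsigned int *lsr)
{
	return sscanf(msg, "l%d IIR %x LSR %x", line, iir, lsr) == 3 ? 0 : -1;
}
```

Counting how many consecutive entries classify as IIR_RX_TIMEOUT between
two "<irq> e" markers would show whether the handler is indeed fetching
byte after byte on timeout interrupts.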