Message ID | 20140717154300.GA16623@linutronix.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi, On Thu, Jul 17, 2014 at 05:43:00PM +0200, Sebastian Andrzej Siewior wrote: > * Peter Hurley | 2014-07-17 11:31:59 [-0400]: > > >On 07/16/2014 12:06 PM, Felipe Balbi wrote: > >>On Wed, Jul 16, 2014 at 05:54:56PM +0200, Sebastian Andrzej Siewior wrote: > >>>On 07/16/2014 05:16 PM, Felipe Balbi wrote: > > > >>>>I wonder if you should get_sync() on start_tx() and only > >>>>put_autosuspend() at stop_tx(). I guess the outcome would be > >>>>largely the same, no ? > >>> > >>>I just opened minicom on ttyS0 and gave a try. start_tx() was invoked > >>>each time I pressed a key (sent a character). I haven't seen stop_tx() > >>>even after after I closed minicom. I guess stop_tx() is invoked if you > >>>switch half-duplex communication. > >> > >>that's bad, I expected stop to be called also after each character. > > > >The 8250 core auto-stops tx when the tx ring buffer is empty (except > >in the case of dma, where stopping tx isn't necessary). > > This is correct. So this is what I have now for the non-dma case: > > diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c > index 2e4a93b..480a1c0 100644 > --- a/drivers/tty/serial/8250/8250_core.c > +++ b/drivers/tty/serial/8250/8250_core.c > @@ -1283,6 +1283,9 @@ static inline void __stop_tx(struct uart_8250_port *p) > if (p->ier & UART_IER_THRI) { > p->ier &= ~UART_IER_THRI; > serial_out(p, UART_IER, p->ier); > + > + pm_runtime_mark_last_busy(p->port.dev); > + pm_runtime_put_autosuspend(p->port.dev); > } > } > > @@ -1310,12 +1313,12 @@ static void serial8250_start_tx(struct uart_port *port) > struct uart_8250_port *up = > container_of(port, struct uart_8250_port, port); > > - pm_runtime_get_sync(port->dev); > if (up->dma && !serial8250_tx_dma(up)) { > goto out; > } else if (!(up->ier & UART_IER_THRI)) { > up->ier |= UART_IER_THRI; > + pm_runtime_get_sync(port->dev); > serial_port_out(port, UART_IER, up->ier); > > if (up->bugs & UART_BUG_TXEN) { > unsigned char lsr; this looks better. So we get on start_tx() and put on stop_tx(). > @@ -1500,9 +1503,10 @@ void serial8250_tx_chars(struct uart_8250_port *up) > uart_write_wakeup(port); > > DEBUG_INTR("THRE..."); > - > +#if 0 > if (uart_circ_empty(xmit)) > __stop_tx(up); > +#endif > } > EXPORT_SYMBOL_GPL(serial8250_tx_chars); is it so that start_tx() gets called one and stop_tx() might be called N times ? That looks unbalanced to me. If the calls are balanced, then you shouldn't need to care because pm_runtime will handle reference counting for you, right? > and now I need to come up with something that is not if (port != omap) > for that #if 0 block. The code disables the TX FIFO empty interrupt once > the transfer is complete. I want to call __stop_tx() once the tx fifo is > empty. > Felipe, Would a check for dev->power.use_autosuspend be the right thing > to do? probably not, as that's internal to the pm_runtime code. But I wonder if start/stop tx calls are balanced, if they are then we're good. Unless I'm missing something else.
On 07/17/2014 06:02 PM, Felipe Balbi wrote: >> diff --git a/drivers/tty/serial/8250/8250_core.c >> b/drivers/tty/serial/8250/8250_core.c index 2e4a93b..480a1c0 >> 100644 --- a/drivers/tty/serial/8250/8250_core.c +++ >> b/drivers/tty/serial/8250/8250_core.c @@ -1283,6 +1283,9 @@ >> static inline void __stop_tx(struct uart_8250_port *p) if (p->ier >> & UART_IER_THRI) { p->ier &= ~UART_IER_THRI; serial_out(p, >> UART_IER, p->ier); + + pm_runtime_mark_last_busy(p->port.dev); + >> pm_runtime_put_autosuspend(p->port.dev); } } >> >> @@ -1310,12 +1313,12 @@ static void serial8250_start_tx(struct >> uart_port *port) struct uart_8250_port *up = container_of(port, >> struct uart_8250_port, port); >> >> - pm_runtime_get_sync(port->dev); if (up->dma && >> !serial8250_tx_dma(up)) { goto out; } else if (!(up->ier & >> UART_IER_THRI)) { up->ier |= UART_IER_THRI; + >> pm_runtime_get_sync(port->dev); serial_port_out(port, UART_IER, >> up->ier); >> >> if (up->bugs & UART_BUG_TXEN) { unsigned char lsr; > > this looks better. So we get on start_tx() and put on stop_tx(). > >> @@ -1500,9 +1503,10 @@ void serial8250_tx_chars(struct >> uart_8250_port *up) uart_write_wakeup(port); >> >> DEBUG_INTR("THRE..."); - +#if 0 if (uart_circ_empty(xmit)) >> __stop_tx(up); +#endif } EXPORT_SYMBOL_GPL(serial8250_tx_chars); > > is it so that start_tx() gets called one and stop_tx() might be > called N times ? That looks unbalanced to me. If the calls are > balanced, then you shouldn't need to care because pm_runtime will > handle reference counting for you, right? No, this is okay. If you look, it checks for "up->ier & UART_IER_THRI". On the second invocation it will see that this bit is already set and therefore won't call get_sync() for the second time. That bit is removed in the _stop_tx() path. >> and now I need to come up with something that is not if (port != >> omap) for that #if 0 block. The code disables the TX FIFO empty >> interrupt once the transfer is complete. I want to call >> __stop_tx() once the tx fifo is empty. Felipe, Would a check for >> dev->power.use_autosuspend be the right thing to do? > > probably not, as that's internal to the pm_runtime code. But I > wonder if start/stop tx calls are balanced, if they are then we're > good. Unless I'm missing something else. Do you have other ideas? It doesn't look like this is exported at all. If we call _stop_tx() right away, then we have 64 bytes in the TX fifo in the worst case. They should be gone "soon" but the HW-flow control may delay it (in theory for a long time)). Sebastian
Hi, On Thu, Jul 17, 2014 at 06:06:59PM +0200, Sebastian Andrzej Siewior wrote: > On 07/17/2014 06:02 PM, Felipe Balbi wrote: > >> diff --git a/drivers/tty/serial/8250/8250_core.c > >> b/drivers/tty/serial/8250/8250_core.c index 2e4a93b..480a1c0 > >> 100644 --- a/drivers/tty/serial/8250/8250_core.c +++ > >> b/drivers/tty/serial/8250/8250_core.c @@ -1283,6 +1283,9 @@ > >> static inline void __stop_tx(struct uart_8250_port *p) if (p->ier > >> & UART_IER_THRI) { p->ier &= ~UART_IER_THRI; serial_out(p, > >> UART_IER, p->ier); + + pm_runtime_mark_last_busy(p->port.dev); + > >> pm_runtime_put_autosuspend(p->port.dev); } } > >> > >> @@ -1310,12 +1313,12 @@ static void serial8250_start_tx(struct > >> uart_port *port) struct uart_8250_port *up = container_of(port, > >> struct uart_8250_port, port); > >> > >> - pm_runtime_get_sync(port->dev); if (up->dma && > >> !serial8250_tx_dma(up)) { goto out; } else if (!(up->ier & > >> UART_IER_THRI)) { up->ier |= UART_IER_THRI; + > >> pm_runtime_get_sync(port->dev); serial_port_out(port, UART_IER, > >> up->ier); > >> > >> if (up->bugs & UART_BUG_TXEN) { unsigned char lsr; > > > > this looks better. So we get on start_tx() and put on stop_tx(). > > > >> @@ -1500,9 +1503,10 @@ void serial8250_tx_chars(struct > >> uart_8250_port *up) uart_write_wakeup(port); > >> > >> DEBUG_INTR("THRE..."); - +#if 0 if (uart_circ_empty(xmit)) > >> __stop_tx(up); +#endif } EXPORT_SYMBOL_GPL(serial8250_tx_chars); > > > > is it so that start_tx() gets called one and stop_tx() might be > > called N times ? That looks unbalanced to me. If the calls are > > balanced, then you shouldn't need to care because pm_runtime will > > handle reference counting for you, right? > > No, this is okay. If you look, it checks for "up->ier & > UART_IER_THRI". On the second invocation it will see that this bit is > already set and therefore won't call get_sync() for the second time. > That bit is removed in the _stop_tx() path. oh, right. But that's actually unnecessary. Calling pm_runtime_get() multiple times will just increment the usage counter multiple times, which means you can call __stop_tx() multiple times too and everything gets balanced, right ? > >> and now I need to come up with something that is not if (port != > >> omap) for that #if 0 block. The code disables the TX FIFO empty > >> interrupt once the transfer is complete. I want to call > >> __stop_tx() once the tx fifo is empty. Felipe, Would a check for > >> dev->power.use_autosuspend be the right thing to do? > > > > probably not, as that's internal to the pm_runtime code. But I > > wonder if start/stop tx calls are balanced, if they are then we're > > good. Unless I'm missing something else. > > Do you have other ideas? It doesn't look like this is exported at all. > If we call _stop_tx() right away, then we have 64 bytes in the TX fifo > in the worst case. They should be gone "soon" but the HW-flow control > may delay it (in theory for a long time)). this can be problematic, specially for OMAP which can go into OFF while idle. Whatever is in the FIFO would get lost. It seems like omap-serial solved this within transmit_chars(). See how transmit_chars() is called from within IRQ handler with clocks enabled then it conditionally calls serial_omap_stop_tx() which will pm_runtime_get_sync() -> do_the_harlem_shake() -> pm_runtime_put_autosuspend(). That leaves one unbalanced pm_runtime_get() which is balanced when we're exitting the IRQ handler. This seems work fine and dandy without DMA, but for DMA work, I think we need to make sure this IP stays powered until we get DMA completion callback. But that's future, I guess.
On 07/17/2014 06:18 PM, Felipe Balbi wrote: >> No, this is okay. If you look, it checks for "up->ier & >> UART_IER_THRI". On the second invocation it will see that this >> bit is already set and therefore won't call get_sync() for the >> second time. That bit is removed in the _stop_tx() path. > > oh, right. But that's actually unnecessary. Calling > pm_runtime_get() multiple times will just increment the usage > counter multiple times, which means you can call __stop_tx() > multiple times too and everything gets balanced, right ? No. start_tx() will be called multiple times but only the first invocation invoke pm_runtime_get(). Now I noticed that I forgot to remove pm_runtime_put_autosuspend() at the bottom of it. But you get the idea right? pm_get() on the while the UART_IER_THRI is not yet set. pm_put() once the fifo is completely empty. >> Do you have other ideas? It doesn't look like this is exported at >> all. If we call _stop_tx() right away, then we have 64 bytes in >> the TX fifo in the worst case. They should be gone "soon" but the >> HW-flow control may delay it (in theory for a long time)). > > this can be problematic, specially for OMAP which can go into OFF > while idle. Whatever is in the FIFO would get lost. It seems like > omap-serial solved this within transmit_chars(). No, it didn't. > See how transmit_chars() is called from within IRQ handler with > clocks enabled then it conditionally calls serial_omap_stop_tx() > which will pm_runtime_get_sync() -> do_the_harlem_shake() -> > pm_runtime_put_autosuspend(). That leaves one unbalanced > pm_runtime_get() which is balanced when we're exitting the IRQ > handler. omap-serial and the 8250 do the following on tx path: - start_tx() -> sets UART_IER_THRI. This will generate an interrupt once the FIFO is empty. - interrupt, notices the empty fifo, invokes serial8250_start_tx()/ transmit_chars(). Both have a while loop that fills the FIFO. This loop is left once the tty-buffer is empty (uart_circ_empty() is true) or the FIFO full. Lets say you filled 64 bytes into the FIFO and then left because your FIFO is full and tty-buffer is empty. That means you will invoke serial_omap_stop_tx() and remove UART_IER_THRI bit. This is okay because you are not interested in further FIFO empty interrupts because you don't have any TX-bytes to be sent. However, once you leave the transmit_chars() you leave serial_omap_irq() which does the last pm_put(). That means you have data in the TX FIFO that is about to be sent and the device is in auto-suspend. This is "fine" as long as the timeout is greater then the time required for the data be sent (plus assuming HW-float control does not stall it for too long) so nobody notices a thing. For that reason I added the hack / #if0 block in the 8250 driver. To ensure we do not disable the TX-FIFO-empty interrupt even if there is nothing to send. Instead we enter serial8250_tx_chars() once again with empty FIFO and empty tty-buffer and will invoke _stop_tx() which also finally does the pm_put(). That is the plan. The problem I have is how to figure out that the device is using auto-suspend. If I don't then I would have to remove the #if0 block and that would mean for everybody an extra interrupt (which I wanted to avoid). > This seems work fine and dandy without DMA, but for DMA work, I > think we need to make sure this IP stays powered until we get DMA > completion callback. But that's future, I guess. Yes, probably. That means one get at dma start, one put at dma complete callback. And I assume we get that callbacks once the DMA transfer is complete, not when the FIFO is empty :) So lets leave it to the future for now… Sebastian
On Fri, Jul 18, 2014 at 10:35:10AM +0200, Sebastian Andrzej Siewior wrote: > On 07/17/2014 06:18 PM, Felipe Balbi wrote: > > >> No, this is okay. If you look, it checks for "up->ier & > >> UART_IER_THRI". On the second invocation it will see that this > >> bit is already set and therefore won't call get_sync() for the > >> second time. That bit is removed in the _stop_tx() path. > > > > oh, right. But that's actually unnecessary. Calling > > pm_runtime_get() multiple times will just increment the usage > > counter multiple times, which means you can call __stop_tx() > > multiple times too and everything gets balanced, right ? > > No. start_tx() will be called multiple times but only the first > invocation invoke pm_runtime_get(). Now I noticed that I forgot to right, but that's unnecessary. You can pm_runtime_get() every time start_tx() is called. Just make sure to put everytime stop_tx() is called too. > remove pm_runtime_put_autosuspend() at the bottom of it. But you get > the idea right? > pm_get() on the while the UART_IER_THRI is not yet set. pm_put() once > the fifo is completely empty. > > >> Do you have other ideas? It doesn't look like this is exported at > >> all. If we call _stop_tx() right away, then we have 64 bytes in > >> the TX fifo in the worst case. They should be gone "soon" but the > >> HW-flow control may delay it (in theory for a long time)). > > > > this can be problematic, specially for OMAP which can go into OFF > > while idle. Whatever is in the FIFO would get lost. It seems like > > omap-serial solved this within transmit_chars(). > > No, it didn't. > > > See how transmit_chars() is called from within IRQ handler with > > clocks enabled then it conditionally calls serial_omap_stop_tx() > > which will pm_runtime_get_sync() -> do_the_harlem_shake() -> > > pm_runtime_put_autosuspend(). That leaves one unbalanced > > pm_runtime_get() which is balanced when we're exitting the IRQ > > handler. > > omap-serial and the 8250 do the following on tx path: > - start_tx() > -> sets UART_IER_THRI. This will generate an interrupt once the FIFO > is empty. > - interrupt, notices the empty fifo, invokes serial8250_start_tx()/ > transmit_chars(). > Both have a while loop that fills the FIFO. This loop is left once > the tty-buffer is empty (uart_circ_empty() is true) or the FIFO full. > > Lets say you filled 64 bytes into the FIFO and then left because your > FIFO is full and tty-buffer is empty. That means you will invoke > serial_omap_stop_tx() and remove UART_IER_THRI bit. > This is okay because you are not interested in further FIFO empty > interrupts because you don't have any TX-bytes to be sent. However, > once you leave the transmit_chars() you leave serial_omap_irq() which > does the last pm_put(). That means you have data in the TX FIFO that is > about to be sent and the device is in auto-suspend. > This is "fine" as long as the timeout is greater then the time required > for the data be sent (plus assuming HW-float control does not stall it > for too long) so nobody notices a thing. the time is set to -1 by default. I guess this only works because nobody has ever tested long transfers with slow baud rates :-p > For that reason I added the hack / #if0 block in the 8250 driver. To > ensure we do not disable the TX-FIFO-empty interrupt even if there is > nothing to send. Instead we enter serial8250_tx_chars() once again with > empty FIFO and empty tty-buffer and will invoke _stop_tx() which also > finally does the pm_put(). > That is the plan. The problem I have is how to figure out that the > device is using auto-suspend. If I don't then I would have to remove > the #if0 block and that would mean for everybody an extra interrupt > (which I wanted to avoid). looks like the closest you have is: if (pm_runtime_autosuspend_expiration(dev) > 0) foo(); Another possibility would be to implement the ->runtime_idle() callback and only return 0 if fifo is empty, otherwise return -EAGAIN ? then, if the autosuspend timer expires, ->runtime_idle gets called and you can do the right thing depending on fifo empty or not. Take a look at drivers/usb/core/driver.c::usb_runtime_{idle,resume,suspend} for examples. That seems to work pretty well. > > This seems work fine and dandy without DMA, but for DMA work, I > > think we need to make sure this IP stays powered until we get DMA > > completion callback. But that's future, I guess. > > Yes, probably. That means one get at dma start, one put at dma complete > callback. And I assume we get that callbacks once the DMA transfer is > complete, not when the FIFO is empty :) So lets leave it to the future > for now… k
On 07/18/2014 11:31 AM, Felipe Balbi wrote: > On Fri, Jul 18, 2014 at 10:35:10AM +0200, Sebastian Andrzej Siewior wrote: >> >On 07/17/2014 06:18 PM, Felipe Balbi wrote: >> > >>>> > >>No, this is okay. If you look, it checks for "up->ier & >>>> > >>UART_IER_THRI". On the second invocation it will see that this >>>> > >>bit is already set and therefore won't call get_sync() for the >>>> > >>second time. That bit is removed in the _stop_tx() path. >>> > > >>> > >oh, right. But that's actually unnecessary. Calling >>> > >pm_runtime_get() multiple times will just increment the usage >>> > >counter multiple times, which means you can call __stop_tx() >>> > >multiple times too and everything gets balanced, right ? >> > >> >No. start_tx() will be called multiple times but only the first >> >invocation invoke pm_runtime_get(). Now I noticed that I forgot to > right, but that's unnecessary. You can pm_runtime_get() every time > start_tx() is called. Just make sure to put everytime stop_tx() is > called too. The interface is asymmetric. start_tx() may be invoked multiple times for which only 1 interrupt will occur, and thus only invoke __stop_tx() once. Regards, Peter Hurley
On Fri, Jul 18, 2014 at 11:53:21AM -0400, Peter Hurley wrote: > On 07/18/2014 11:31 AM, Felipe Balbi wrote: > >On Fri, Jul 18, 2014 at 10:35:10AM +0200, Sebastian Andrzej Siewior wrote: > >>>On 07/17/2014 06:18 PM, Felipe Balbi wrote: > >>> > >>>>> >>No, this is okay. If you look, it checks for "up->ier & > >>>>> >>UART_IER_THRI". On the second invocation it will see that this > >>>>> >>bit is already set and therefore won't call get_sync() for the > >>>>> >>second time. That bit is removed in the _stop_tx() path. > >>>> > > >>>> >oh, right. But that's actually unnecessary. Calling > >>>> >pm_runtime_get() multiple times will just increment the usage > >>>> >counter multiple times, which means you can call __stop_tx() > >>>> >multiple times too and everything gets balanced, right ? > >>> > >>>No. start_tx() will be called multiple times but only the first > >>>invocation invoke pm_runtime_get(). Now I noticed that I forgot to > >right, but that's unnecessary. You can pm_runtime_get() every time > >start_tx() is called. Just make sure to put everytime stop_tx() is > >called too. > > The interface is asymmetric. > > start_tx() may be invoked multiple times for which only 1 interrupt > will occur, and thus only invoke __stop_tx() once. alright, thanks for the info.
diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c index 2e4a93b..480a1c0 100644 --- a/drivers/tty/serial/8250/8250_core.c +++ b/drivers/tty/serial/8250/8250_core.c @@ -1283,6 +1283,9 @@ static inline void __stop_tx(struct uart_8250_port *p) if (p->ier & UART_IER_THRI) { p->ier &= ~UART_IER_THRI; serial_out(p, UART_IER, p->ier); + + pm_runtime_mark_last_busy(p->port.dev); + pm_runtime_put_autosuspend(p->port.dev); } } @@ -1310,12 +1313,12 @@ static void serial8250_start_tx(struct uart_port *port) struct uart_8250_port *up = container_of(port, struct uart_8250_port, port); - pm_runtime_get_sync(port->dev); if (up->dma && !serial8250_tx_dma(up)) { goto out; } else if (!(up->ier & UART_IER_THRI)) { up->ier |= UART_IER_THRI; + pm_runtime_get_sync(port->dev); serial_port_out(port, UART_IER, up->ier); if (up->bugs & UART_BUG_TXEN) { unsigned char lsr; @@ -1500,9 +1503,10 @@ void serial8250_tx_chars(struct uart_8250_port *up) uart_write_wakeup(port); DEBUG_INTR("THRE..."); - +#if 0 if (uart_circ_empty(xmit)) __stop_tx(up); +#endif } EXPORT_SYMBOL_GPL(serial8250_tx_chars);