Message ID | 1438977619-15488-2-git-send-email-bigeasy@linutronix.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 08/07/2015 04:00 PM, Sebastian Andrzej Siewior wrote: > The 8250-omap driver requires the DMA-engine driver to support the pause > command in order to properly turn off programmed RX transfer before the > driver stars manually reading from the FIFO. > The lacking support of the requirement has been discovered recently. In > order to stay safe here we disable support for RX-DMA as soon as we > notice that it does not work. This should happen very early. > If the user does not want to see this backtrace he can either disable > DMA support (completely or RX-only) or backport the required patches for > edma / omap-dma once they hit mainline. > > Cc: <stable@vger.kernel.org> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > --- > drivers/tty/serial/8250/8250_omap.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c > index 0340ee6ba970..07a11e0935e4 100644 > --- a/drivers/tty/serial/8250/8250_omap.c > +++ b/drivers/tty/serial/8250/8250_omap.c > @@ -112,6 +112,7 @@ struct omap8250_priv { > struct work_struct qos_work; > struct uart_8250_dma omap8250_dma; > spinlock_t rx_dma_lock; > + bool rx_dma_broken; > }; > > static u32 uart_read(struct uart_8250_port *up, u32 reg) > @@ -761,6 +762,7 @@ static void omap_8250_rx_dma_flush(struct uart_8250_port *p) > struct omap8250_priv *priv = p->port.private_data; > struct uart_8250_dma *dma = p->dma; > unsigned long flags; > + int ret; > > spin_lock_irqsave(&priv->rx_dma_lock, flags); > > @@ -769,7 +771,9 @@ static void omap_8250_rx_dma_flush(struct uart_8250_port *p) > return; > } > > - dmaengine_pause(dma->rxchan); > + ret = dmaengine_pause(dma->rxchan); > + if (WARN_ON_ONCE(ret)) > + priv->rx_dma_broken = true; No offense, Sebastian, but it boggles my mind that anyone could defend this as solid api design. We're in the middle of an interrupt handler and the slave dma driver is /just/ telling us now that it doesn't implement this functionality?!!? The dmaengine api has _so much_ setup and none of it contemplates telling the consumer that critical functionality is missing? Even dma_get_slave_caps() returns _true_ for cmd_pause support; ok, that interface is pointless. Rather than losing /critical data/ here, the interrupt handler should just busy-wait until dmaengine_tx_status() returns DMA_COMPLETE for the rx_cookie. Regards, Peter Hurley > spin_unlock_irqrestore(&priv->rx_dma_lock, flags); > > @@ -813,6 +817,9 @@ static int omap_8250_rx_dma(struct uart_8250_port *p, unsigned int iir) > break; > } > > + if (priv->rx_dma_broken) > + return -EINVAL; > + > spin_lock_irqsave(&priv->rx_dma_lock, flags); > > if (dma->rx_running) > -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Aug 07, 2015 at 08:28:57PM -0400, Peter Hurley wrote: > Even dma_get_slave_caps() returns _true_ for cmd_pause support; ok, that > interface is pointless. How about reporting that as a bug then, because if you look back in the git history, as you are fully capable of, you will find that the slave capability stuff went in _after_ omap-dma, and *many* DMA engine drivers have not been updated. Here, let me do _your_ work for you: commit cb8cea513c80db1dfe2dce468d2d0772005bb9a1 Author: Maxime Ripard <maxime.ripard@free-electrons.com> Date: Mon Nov 17 14:42:04 2014 +0100 dmaengine: Create a generic dma_slave_caps callback commit 2dcdf570936168d488acf90be9b04a3d32dafce7 Author: Peter Ujfalusi <peter.ujfalusi@ti.com> Date: Fri Sep 14 15:05:45 2012 +0300 dmaengine: omap: Add support for pause/resume in cyclic dma mode Oh look, omap-dma pre-dates the creation of dma_slave_caps by over two years! However, it's *not* as trivial as you think: the dma_slave_caps() call has no knowledge whether a channel will be used in cyclic mode or not, or which direction it will be used. So, it really _can't_ report whether pause mode is supported or not. So actually, this is something that can't be fixed trivially, and was a detail missed when the slave caps was reviewed (I probably missed the review or missed the point.) So, how about you stop playing the blame game and trying to shovel your shit onto DMA engine.
On 08/08/2015 02:28 AM, Peter Hurley wrote: >> diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c >> index 0340ee6ba970..07a11e0935e4 100644 >> --- a/drivers/tty/serial/8250/8250_omap.c >> +++ b/drivers/tty/serial/8250/8250_omap.c >> @@ -769,7 +771,9 @@ static void omap_8250_rx_dma_flush(struct uart_8250_port *p) >> return; >> } >> >> - dmaengine_pause(dma->rxchan); >> + ret = dmaengine_pause(dma->rxchan); >> + if (WARN_ON_ONCE(ret)) >> + priv->rx_dma_broken = true; > > No offense, Sebastian, but it boggles my mind that anyone could defend this > as solid api design. We're in the middle of an interrupt handler and the > slave dma driver is /just/ telling us now that it doesn't implement this > functionality?!!? This is the best thing I could come up with. But to be fair: This happens once _very_ early in the process and is 100% reproducible. The WARN_ON should trigger user attention. > The dmaengine api has _so much_ setup and none of it contemplates telling the > consumer that critical functionality is missing? Lets say it is a missing feature which was not noticed / required until now. I have the missing functionality in patch #3. I don't see the point in adding a lookup API for this feature which has to be backported stable just to figure out that RX-DMA might not work. Again: the WARN_ON triggers on the first short RX read _or_ on shutdown of the port which is not after hours using the port. > Even dma_get_slave_caps() returns _true_ for cmd_pause support; ok, that > interface is pointless. > > Rather than losing /critical data/ here, the interrupt handler should just > busy-wait until dmaengine_tx_status() returns DMA_COMPLETE for the rx_cookie. This might not happen at all. At 115200 I *never* encouraged this. Once the FIFO is filled with less than RX-trigger size than the UART sends the time-out interrupt and the DMA *never* completes. Even if the FIFO reaches trigger size later. So you have to abort the DMA transfer and read data manually from the FIFO. That is why the omap enqueues the RX-transfer upfront (while the FIFO is still empty). It happens *sometimes* that the UART sends a timeout interrupt and the DMA transfer just is started and it has been only observed at 3mbaud (but there were no tests 115200 > x > 3mbaud that I know of). Therefor I think this is a fair fix without hiding anything. > Regards, > Peter Hurley Sebastian -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, Aug 08, 2015 at 11:32:05AM +0200, Sebastian Andrzej Siewior wrote: > This might not happen at all. At 115200 I *never* encouraged this. Once > the FIFO is filled with less than RX-trigger size than the UART sends > the time-out interrupt and the DMA *never* completes. Careful with statements like that, they have a habbit of backfiring :) It just needs the right timing - you need a character sequence which has a pause long enough to trigger the timeout, but short enough to get characters in between the UART deciding to report a timeout, and the DMA being terminated. You'll then be in the situation where the UART is receiving characters and using the DMA, but you've been interrupted to handle the RX timeout event. Whether the UART resets the IIR in that situation to indicate something other than a RX timeout, I'm not sure - if it doesn't, then there's a race there (which from your behaviour at faster rates seems likely.) The more loaded the system, the longer an IRQ may take to be serviced (especially if there are interrupt handlers which aren't fast) and the bigger the window is for that to happen. I've seen some long service times with USB with HID in older kernels (milliseconds), but that seems to have been fixed now - longest IRQ handling I'm now seeing is around 500us. Given 115200 baud, the character time is 87us, that's probably more than enough if you get the timing just right.
On Sat, Aug 08, 2015 at 11:32:05AM +0200, Sebastian Andrzej Siewior wrote: > On 08/08/2015 02:28 AM, Peter Hurley wrote: > >> diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c > >> index 0340ee6ba970..07a11e0935e4 100644 > >> --- a/drivers/tty/serial/8250/8250_omap.c > >> +++ b/drivers/tty/serial/8250/8250_omap.c > >> @@ -769,7 +771,9 @@ static void omap_8250_rx_dma_flush(struct uart_8250_port *p) > >> return; > >> } > >> > >> - dmaengine_pause(dma->rxchan); > >> + ret = dmaengine_pause(dma->rxchan); > >> + if (WARN_ON_ONCE(ret)) > >> + priv->rx_dma_broken = true; > > > > No offense, Sebastian, but it boggles my mind that anyone could defend this > > as solid api design. We're in the middle of an interrupt handler and the > > slave dma driver is /just/ telling us now that it doesn't implement this > > functionality?!!? > > This is the best thing I could come up with. But to be fair: This > happens once _very_ early in the process and is 100% reproducible. The > WARN_ON should trigger user attention. Never expect a _user_ to do anything with a WARN_ON except ignore it. WARN_ON is for developers when something really bad went wrong that you want to be notified so that you can fix the code to prevent that from ever happening again. So this isn't ok, sorry, a user would not know what to do with this at all except email me asking why the driver is broken because a kernel "oops" just happened, when in reality, their hardware is broken in a way that you already know about when you wrote the patch. You know better than this. greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/07/2015 11:00 PM, Sebastian Andrzej Siewior wrote: > The 8250-omap driver requires the DMA-engine driver to support the pause > command in order to properly turn off programmed RX transfer before the > driver stars manually reading from the FIFO. > The lacking support of the requirement has been discovered recently. In > order to stay safe here we disable support for RX-DMA as soon as we > notice that it does not work. This should happen very early. > If the user does not want to see this backtrace he can either disable > DMA support (completely or RX-only) or backport the required patches for > edma / omap-dma once they hit mainline. > > Cc: <stable@vger.kernel.org> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > --- > drivers/tty/serial/8250/8250_omap.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c > index 0340ee6ba970..07a11e0935e4 100644 > --- a/drivers/tty/serial/8250/8250_omap.c > +++ b/drivers/tty/serial/8250/8250_omap.c > @@ -112,6 +112,7 @@ struct omap8250_priv { > struct work_struct qos_work; > struct uart_8250_dma omap8250_dma; > spinlock_t rx_dma_lock; > + bool rx_dma_broken; > }; > > static u32 uart_read(struct uart_8250_port *up, u32 reg) > @@ -761,6 +762,7 @@ static void omap_8250_rx_dma_flush(struct uart_8250_port *p) > struct omap8250_priv *priv = p->port.private_data; > struct uart_8250_dma *dma = p->dma; > unsigned long flags; > + int ret; > > spin_lock_irqsave(&priv->rx_dma_lock, flags); > > @@ -769,7 +771,9 @@ static void omap_8250_rx_dma_flush(struct uart_8250_port *p) > return; > } > > - dmaengine_pause(dma->rxchan); > + ret = dmaengine_pause(dma->rxchan); > + if (WARN_ON_ONCE(ret)) > + priv->rx_dma_broken = true; I don't think this is good thing for the stable _and_ for the mainline at the same time: in stable the rx DMA should not be allowed since the stable kernels does not allow pause/resume with omap-dma, so there the rx DMA should be just disabled for UART. This change will cause regression since it introduce a WARN_ON_ONCE, which will be printed if the user tries to use non working feature. In mainline you will eventually going to have pause/resume support so this patch will make no sense there. > > spin_unlock_irqrestore(&priv->rx_dma_lock, flags); > > @@ -813,6 +817,9 @@ static int omap_8250_rx_dma(struct uart_8250_port *p, unsigned int iir) > break; > } > > + if (priv->rx_dma_broken) > + return -EINVAL; > + > spin_lock_irqsave(&priv->rx_dma_lock, flags); > > if (dma->rx_running) >
On 08/10/2015 01:54 PM, Peter Ujfalusi wrote: >> diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c >> index 0340ee6ba970..07a11e0935e4 100644 >> --- a/drivers/tty/serial/8250/8250_omap.c >> @@ -769,7 +771,9 @@ static void omap_8250_rx_dma_flush(struct uart_8250_port *p) >> return; >> } >> >> - dmaengine_pause(dma->rxchan); >> + ret = dmaengine_pause(dma->rxchan); >> + if (WARN_ON_ONCE(ret)) >> + priv->rx_dma_broken = true; > > I don't think this is good thing for the stable _and_ for the mainline at the > same time: > in stable the rx DMA should not be allowed since the stable kernels does not > allow pause/resume with omap-dma, so there the rx DMA should be just disabled > for UART. This change will cause regression since it introduce a WARN_ON_ONCE, > which will be printed if the user tries to use non working feature. Okay. We do have pause support in mainline for edma since v4.2-rc1. This driver can use edma or sdma depending on the configuration. But it is not yet released. So you suggest remove RX-DMA support completely from the 8250-omap, mark it stable, and revert that patch once we have it fixed in sdma? > In mainline you will eventually going to have pause/resume support so this > patch will make no sense there. The way this works is that it has to be fixed upstream before it can be backported stable. Also Russell made clear (for a good reason) that the RX problem has to be fixed upstream before he thinks about acking the SDMA patch. So if you prefer to instead remove RX-DMA until it is fixed, I am all yours. Sebastian -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/10/2015 07:54 AM, Peter Ujfalusi wrote: > On 08/07/2015 11:00 PM, Sebastian Andrzej Siewior wrote: > I don't think this is good thing for the stable _and_ for the mainline at the > same time: > in stable the rx DMA should not be allowed since the stable kernels does not > allow pause/resume with omap-dma, so there the rx DMA should be just disabled > for UART. This change will cause regression since it introduce a WARN_ON_ONCE, > which will be printed if the user tries to use non working feature. > > In mainline you will eventually going to have pause/resume support so this > patch will make no sense there. No, the whole point of this mess is that omap-dma does not provide pause/resume support (without data loss). omap-dma will only be suitable for pause/terminate dma. And adding pause support doesn't address the underlying problem that dmaengine is not providing a means of determining if suitable support is available for use by serial drivers, so this same problem is just waiting to happen again. I think the way forward is, For -stable, disable dma in the 8250_omap driver. Then for mainline, * extend the dma_get_slave_caps() api to differentiate the types of pause support * query dma_get_slave_caps() when setting up the slave channel in 8250_dma & 8250_omap and only enable dma if suitable pause support is available * add required dmaengine error checking in 8250_dma & 8250_omap _for unexpected errors_ (so _not_ WARNs) * do whatever with omap-dma. Not even sure it's worth trying to support dma with that; it still won't fully support tx dma which is forcing all kinds of goofy workarounds Russell seemed to think that the current dma operation was necessary information to differentiate types of pause support, but I don't think that's required. As Sebastian's omap-dma driver patch shows, partial pause support has more to do with how it's being used. Regards, Peter Hurley -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 10, 2015 at 09:00:29AM -0400, Peter Hurley wrote: > Russell seemed to think that the current dma operation was necessary information to > differentiate types of pause support, but I don't think that's required. > As Sebastian's omap-dma driver patch shows, partial pause support has more > to do with how it's being used. Do you think you can rewrite the first sentence above in gramatically correct English please, I'm failing to understand what you're saying.
On Sat, Aug 08, 2015 at 10:03:43AM +0100, Russell King - ARM Linux wrote: > On Fri, Aug 07, 2015 at 08:28:57PM -0400, Peter Hurley wrote: > > Even dma_get_slave_caps() returns _true_ for cmd_pause support; ok, that > > interface is pointless. > > How about reporting that as a bug then, because if you look back in the > git history, as you are fully capable of, you will find that the slave > capability stuff went in _after_ omap-dma, and *many* DMA engine drivers > have not been updated. Here, let me do _your_ work for you: > > commit cb8cea513c80db1dfe2dce468d2d0772005bb9a1 > Author: Maxime Ripard <maxime.ripard@free-electrons.com> > Date: Mon Nov 17 14:42:04 2014 +0100 > > dmaengine: Create a generic dma_slave_caps callback > > commit 2dcdf570936168d488acf90be9b04a3d32dafce7 > Author: Peter Ujfalusi <peter.ujfalusi@ti.com> > Date: Fri Sep 14 15:05:45 2012 +0300 > > dmaengine: omap: Add support for pause/resume in cyclic dma mode > > Oh look, omap-dma pre-dates the creation of dma_slave_caps by over two > years! > > However, it's *not* as trivial as you think: the dma_slave_caps() call > has no knowledge whether a channel will be used in cyclic mode or not, > or which direction it will be used. So, it really _can't_ report > whether pause mode is supported or not. So actually, this is something > that can't be fixed trivially, and was a detail missed when the slave > caps was reviewed (I probably missed the review or missed the point.) The API queries the capabilities for a channel. So a channel has been allocated. BUT it would not imply the direction as that is descriptor based, so should we query the capabilities for a descriptor and use those in clients ? Also the current dma_slave_caps() has been moved to framework and reports based on driver callbacks. Now we have a hardware which supports pause for one direction only so we should model it bit differently Thoughts... ??
diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c index 0340ee6ba970..07a11e0935e4 100644 --- a/drivers/tty/serial/8250/8250_omap.c +++ b/drivers/tty/serial/8250/8250_omap.c @@ -112,6 +112,7 @@ struct omap8250_priv { struct work_struct qos_work; struct uart_8250_dma omap8250_dma; spinlock_t rx_dma_lock; + bool rx_dma_broken; }; static u32 uart_read(struct uart_8250_port *up, u32 reg) @@ -761,6 +762,7 @@ static void omap_8250_rx_dma_flush(struct uart_8250_port *p) struct omap8250_priv *priv = p->port.private_data; struct uart_8250_dma *dma = p->dma; unsigned long flags; + int ret; spin_lock_irqsave(&priv->rx_dma_lock, flags); @@ -769,7 +771,9 @@ static void omap_8250_rx_dma_flush(struct uart_8250_port *p) return; } - dmaengine_pause(dma->rxchan); + ret = dmaengine_pause(dma->rxchan); + if (WARN_ON_ONCE(ret)) + priv->rx_dma_broken = true; spin_unlock_irqrestore(&priv->rx_dma_lock, flags); @@ -813,6 +817,9 @@ static int omap_8250_rx_dma(struct uart_8250_port *p, unsigned int iir) break; } + if (priv->rx_dma_broken) + return -EINVAL; + spin_lock_irqsave(&priv->rx_dma_lock, flags); if (dma->rx_running)
The 8250-omap driver requires the DMA-engine driver to support the pause command in order to properly turn off programmed RX transfer before the driver stars manually reading from the FIFO. The lacking support of the requirement has been discovered recently. In order to stay safe here we disable support for RX-DMA as soon as we notice that it does not work. This should happen very early. If the user does not want to see this backtrace he can either disable DMA support (completely or RX-only) or backport the required patches for edma / omap-dma once they hit mainline. Cc: <stable@vger.kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- drivers/tty/serial/8250/8250_omap.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)