Message ID | 0eb99caa68fa697af01b90e2711d30ea3545062f.1545462045.git.lukas@wunner.de (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | Raspberry Pi DMA fixes + cleanups | expand |
Hi Lukas, > Lukas Wunner <lukas@wunner.de> hat am 22. Dezember 2018 um 08:28 geschrieben: > > > ... > drivers/dma/bcm2835-dma.c | 33 +++------------------------------ > 1 file changed, 3 insertions(+), 30 deletions(-) > > diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c > index e94f41c56975..17bc7304db3a 100644 > --- a/drivers/dma/bcm2835-dma.c > +++ b/drivers/dma/bcm2835-dma.c > @@ -419,25 +419,11 @@ static int bcm2835_dma_abort(void __iomem *chan_base) > writel(0, chan_base + BCM2835_DMA_CS); > > /* Wait for any current AXI transfer to complete */ > - while ((cs & BCM2835_DMA_ISPAUSED) && --timeout) { > + while ((readl(chan_base + BCM2835_DMA_CS) & > + BCM2835_DMA_WAITING_FOR_WRITES) && --timeout) > cpu_relax(); > - cs = readl(chan_base + BCM2835_DMA_CS); > - } > - > - /* We'll un-pause when we set of our next DMA */ > - if (!timeout) > - return -ETIMEDOUT; i'm only sceptical about silently ignoring the timeout case. I prefer to have a comment explaining why we proceed with BCM2835_DMA_RESET in this case and some kind of error / debug message like below. > - > - if (!(cs & BCM2835_DMA_ACTIVE)) > - return 0; > - > - /* Terminate the control block chain */ > - writel(0, chan_base + BCM2835_DMA_NEXTCB); > - > - /* Abort the whole DMA */ > - writel(BCM2835_DMA_ABORT | BCM2835_DMA_ACTIVE, > - chan_base + BCM2835_DMA_CS); > > + writel(BCM2835_DMA_RESET, chan_base + BCM2835_DMA_CS); > return 0; > } > > @@ -787,7 +773,6 @@ static int bcm2835_dma_terminate_all(struct dma_chan *chan) > struct bcm2835_chan *c = to_bcm2835_dma_chan(chan); > struct bcm2835_dmadev *d = to_bcm2835_dma_dev(c->vc.chan.device); > unsigned long flags; > - int timeout = 10000; > LIST_HEAD(head); > > spin_lock_irqsave(&c->vc.lock, flags); > @@ -802,18 +787,6 @@ static int bcm2835_dma_terminate_all(struct dma_chan *chan) > vchan_terminate_vdesc(&c->desc->vd); > c->desc = NULL; > bcm2835_dma_abort(c->chan_base); > - > - /* Wait for stopping */ > - while (--timeout) { > - if (!(readl(c->chan_base + BCM2835_DMA_CS) & > - BCM2835_DMA_ACTIVE)) > - break; > - > - cpu_relax(); > - } > - > - if (!timeout) > - dev_err(d->ddev.dev, "DMA transfer could not be terminated\n"); > } > Stefan
On Fri, Dec 28, 2018 at 02:20:42PM +0100, Stefan Wahren wrote: > Lukas Wunner <lukas@wunner.de> hat am 22. Dezember 2018 um 08:28 geschrieben: > > drivers/dma/bcm2835-dma.c | 33 +++------------------------------ > > 1 file changed, 3 insertions(+), 30 deletions(-) > > > > diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c > > index e94f41c56975..17bc7304db3a 100644 > > --- a/drivers/dma/bcm2835-dma.c > > +++ b/drivers/dma/bcm2835-dma.c > > @@ -419,25 +419,11 @@ static int bcm2835_dma_abort(void __iomem *chan_base) > > writel(0, chan_base + BCM2835_DMA_CS); > > > > /* Wait for any current AXI transfer to complete */ > > - while ((cs & BCM2835_DMA_ISPAUSED) && --timeout) { > > + while ((readl(chan_base + BCM2835_DMA_CS) & > > + BCM2835_DMA_WAITING_FOR_WRITES) && --timeout) > > cpu_relax(); > > - cs = readl(chan_base + BCM2835_DMA_CS); > > - } > > - > > - /* We'll un-pause when we set of our next DMA */ > > - if (!timeout) > > - return -ETIMEDOUT; > > i'm only sceptical about silently ignoring the timeout case. I prefer > to have a comment explaining why we proceed with BCM2835_DMA_RESET in > this case and some kind of error / debug message like below. > > > - if (!timeout) > > - dev_err(d->ddev.dev, "DMA transfer could not be terminated\n"); The Foundation's downstream tree contains an additional DMA driver called drivers/dma/bcm2708-dmaengine.c: https://github.com/raspberrypi/linux/commit/5fa54ef4d495 There's a code comment in that driver which sheds some light on the abort algorithm's rationale: "This routine waits for the current AXI transfer to complete before terminating the current DMA. If the current transfer is hung on a DREQ used by an uncooperative peripheral the AXI transfer may never complete. In this case the routine times out and return a non-zero error code." The author's intention seems to have been that a peripheral may buffer a write if it has deasserted its DREQ signal and won't send an AXI write response until it can process the write. And if the peripheral is somehow stuck, that write response may never occur. I'm not sure if that can actually happen in the real world or if it's a purely theoretical issue. But the additional cost (in terms of LoC and performance) is small, so I left it in. There's not much we can do if the peripheral doesn't acknowledge writes, so I just carry on resetting the channel. I intend to amend the commit like below to address your objection, let me know if you do not consider this sufficient. Thanks. -- >8 -- diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c index ceec110..c3d9a26 100644 --- a/drivers/dma/bcm2835-dma.c +++ b/drivers/dma/bcm2835-dma.c @@ -417,8 +417,9 @@ static void bcm2835_dma_fill_cb_chain_with_sg( } } -static int bcm2835_dma_abort(void __iomem *chan_base) +static int bcm2835_dma_abort(struct bcm2835_chan *c) { + void __iomem *chan_base = c->chan_base; unsigned long cs; long int timeout = 10000; @@ -434,6 +435,11 @@ static int bcm2835_dma_abort(void __iomem *chan_base) BCM2835_DMA_WAITING_FOR_WRITES) && --timeout) cpu_relax(); + /* Uncooperative peripherals may not signal AXI write responses */ + if (!timeout) + dev_err(c->vc.chan.device->dev, + "failed to complete outstanding writes\n"); + writel(BCM2835_DMA_RESET, chan_base + BCM2835_DMA_CS); return 0; } @@ -797,7 +803,7 @@ static int bcm2835_dma_terminate_all(struct dma_chan *chan) if (c->desc) { vchan_terminate_vdesc(&c->desc->vd); c->desc = NULL; - bcm2835_dma_abort(c->chan_base); + bcm2835_dma_abort(c); } vchan_get_all_descriptors(&c->vc, &head);
diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c index e94f41c56975..17bc7304db3a 100644 --- a/drivers/dma/bcm2835-dma.c +++ b/drivers/dma/bcm2835-dma.c @@ -419,25 +419,11 @@ static int bcm2835_dma_abort(void __iomem *chan_base) writel(0, chan_base + BCM2835_DMA_CS); /* Wait for any current AXI transfer to complete */ - while ((cs & BCM2835_DMA_ISPAUSED) && --timeout) { + while ((readl(chan_base + BCM2835_DMA_CS) & + BCM2835_DMA_WAITING_FOR_WRITES) && --timeout) cpu_relax(); - cs = readl(chan_base + BCM2835_DMA_CS); - } - - /* We'll un-pause when we set of our next DMA */ - if (!timeout) - return -ETIMEDOUT; - - if (!(cs & BCM2835_DMA_ACTIVE)) - return 0; - - /* Terminate the control block chain */ - writel(0, chan_base + BCM2835_DMA_NEXTCB); - - /* Abort the whole DMA */ - writel(BCM2835_DMA_ABORT | BCM2835_DMA_ACTIVE, - chan_base + BCM2835_DMA_CS); + writel(BCM2835_DMA_RESET, chan_base + BCM2835_DMA_CS); return 0; } @@ -787,7 +773,6 @@ static int bcm2835_dma_terminate_all(struct dma_chan *chan) struct bcm2835_chan *c = to_bcm2835_dma_chan(chan); struct bcm2835_dmadev *d = to_bcm2835_dma_dev(c->vc.chan.device); unsigned long flags; - int timeout = 10000; LIST_HEAD(head); spin_lock_irqsave(&c->vc.lock, flags); @@ -802,18 +787,6 @@ static int bcm2835_dma_terminate_all(struct dma_chan *chan) vchan_terminate_vdesc(&c->desc->vd); c->desc = NULL; bcm2835_dma_abort(c->chan_base); - - /* Wait for stopping */ - while (--timeout) { - if (!(readl(c->chan_base + BCM2835_DMA_CS) & - BCM2835_DMA_ACTIVE)) - break; - - cpu_relax(); - } - - if (!timeout) - dev_err(d->ddev.dev, "DMA transfer could not be terminated\n"); } vchan_get_all_descriptors(&c->vc, &head);
There are multiple issues with bcm2835_dma_abort() (which is called on termination of a transaction): * The algorithm to abort the transaction first pauses the channel by clearing the ACTIVE flag in the CS register, then waits for the PAUSED flag to clear. Page 49 of the spec documents the latter as follows: "Indicates if the DMA is currently paused and not transferring data. This will occur if the active bit has been cleared [...]" https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf So the function is entering an infinite loop because it is waiting for PAUSED to clear which is always set due to the function having cleared the ACTIVE flag. The only thing that's saving it from itself is the upper bound of 10000 loop iterations. The code comment says that the intention is to "wait for any current AXI transfer to complete", so the author probably wanted to check the WAITING_FOR_OUTSTANDING_WRITES flag instead. Amend the function accordingly. * The CS register is only read at the beginning of the function. It needs to be read again after pausing the channel and before checking for outstanding writes, otherwise writes which were issued between the register read at the beginning of the function and pausing the channel may not be waited for. * The function seeks to abort the transfer by writing 0 to the NEXTCONBK register and setting the ABORT and ACTIVE flags. Thereby, the 0 in NEXTCONBK is sought to be loaded into the CONBLK_AD register. However experimentation has shown this approach to not work: The CONBLK_AD register remains the same as before and the CS register contains 0x00000030 (PAUSED | DREQ_STOPS_DMA). In other words, the control block is not aborted but merely paused and it will be resumed once the next DMA transaction is started. That is absolutely not the desired behavior. A simpler approach is to set the channel's RESET flag instead. This reliably zeroes the NEXTCONBK as well as the CS register. It requires less code and only a single MMIO write. This is also what popular user space DMA drivers do, e.g.: https://github.com/metachris/RPIO/blob/master/source/c_pwm/pwm.c Note that the spec is contradictory whether the NEXTCONBK register is writeable at all. On the one hand, page 41 claims: "The value loaded into the NEXTCONBK register can be overwritten so that the linked list of Control Block data structures can be dynamically altered. However it is only safe to do this when the DMA is paused." On the other hand, page 40 specifies: "Only three registers in each channel's register set are directly writeable (CS, CONBLK_AD and DEBUG). The other registers (TI, SOURCE_AD, DEST_AD, TXFR_LEN, STRIDE & NEXTCONBK), are automatically loaded from a Control Block data structure held in external memory." Fixes: 96286b576690 ("dmaengine: Add support for BCM2835") Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v3.14+ Cc: Frank Pavlic <f.pavlic@kunbus.de> Cc: Martin Sperl <kernel@martin.sperl.org> Cc: Florian Meier <florian.meier@koalo.de> --- drivers/dma/bcm2835-dma.c | 33 +++------------------------------ 1 file changed, 3 insertions(+), 30 deletions(-)