[1/3] soc: qcom: geni: More properly switch to DMA mode

Message ID 20201008155154.1.Ifdb1b69fa3367b81118e16e9e4e63299980ca798@changeid (mailing list archive)
State Superseded
Series i2c: i2c-qcom-geni: More properly fix the DMA race

Commit Message

Doug Anderson Oct. 8, 2020, 10:52 p.m. UTC
On geni-i2c transfers using DMA, it was seen that programming the
command (I2C_READ) before calling geni_se_rx_dma_prep() could cause
interrupts to fire.  If we get unlucky, these interrupts can just
keep firing (and not be handled), blocking further progress and
hanging the system.

In commit 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
we avoided that by making sure we didn't program the command until
after geni_se_rx_dma_prep() was called.  While that avoided the
problems, it also turns out to be invalid.  At least in the TX case we
started seeing sporadic corrupted transfers.  This is easily seen by
adding an msleep() between the DMA prep and the writing of the
command, which makes the problem worse.  That means we need to revert
that commit and find another way to fix the bogus IRQs.

Specifically, after reverting commit 02b9aec59243 ("i2c:
i2c-qcom-geni: Fix DMA transfer race"), I put some traces in.  I found
that when the interrupts were firing like crazy:
- "m_stat" had bits for M_RX_IRQ_EN, M_RX_FIFO_WATERMARK_EN set.
- "dma" was set.

Further debugging showed that I could make the problem happen more
reliably by adding an "msleep(1)" any time after geni_se_setup_m_cmd()
ran up until geni_se_rx_dma_prep() programmed the length.

A rather simple fix is to change geni_se_select_dma_mode() so it's a
true inverse of geni_se_select_fifo_mode() and disables all the FIFO
related interrupts.  Now the problematic interrupts can't fire and we
can program things in the correct order without worrying.
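
As a rough illustration of the "true inverse" idea, here is a minimal
userspace sketch in C.  It is not the kernel code: struct fake_se, the
fake_* helpers and the bit positions are hypothetical stand-ins, and it
assumes the FIFO-mode path enables exactly the bits that the DMA-mode
path clears.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions, for illustration only. */
#define M_CMD_DONE_EN          (UINT32_C(1) << 0)
#define M_RX_FIFO_WATERMARK_EN (UINT32_C(1) << 26)
#define M_RX_FIFO_LAST_EN      (UINT32_C(1) << 27)
#define M_TX_FIFO_WATERMARK_EN (UINT32_C(1) << 30)
#define S_CMD_DONE_EN          (UINT32_C(1) << 0)

#define M_FIFO_IRQS (M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN | \
                     M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN)

/* Hypothetical stand-in for the protocol read back from the engine. */
enum fake_proto { FAKE_I2C, FAKE_UART };

/* Hypothetical model of the three registers touched by the mode switch. */
struct fake_se {
    enum fake_proto proto;
    uint32_t m_irq_en;
    uint32_t s_irq_en;
    uint32_t dma_mode_en;
};

/* FIFO mode: enable the FIFO-related interrupts, turn DMA mode off. */
static void fake_select_fifo_mode(struct fake_se *se)
{
    if (se->proto != FAKE_UART) {
        se->m_irq_en |= M_FIFO_IRQS;
        se->s_irq_en |= S_CMD_DONE_EN;
    }
    se->dma_mode_en = 0;
}

/* DMA mode: undo exactly what FIFO mode enabled, turn DMA mode on. */
static void fake_select_dma_mode(struct fake_se *se)
{
    if (se->proto != FAKE_UART) {
        se->m_irq_en &= ~M_FIFO_IRQS;
        se->s_irq_en &= ~S_CMD_DONE_EN;
    }
    se->dma_mode_en = 1;
}

/* Round trip: after FIFO -> DMA no FIFO interrupt is left enabled. */
static void fake_round_trip_check(void)
{
    struct fake_se se = { FAKE_I2C, 0, 0, 0 };

    fake_select_fifo_mode(&se);
    assert(se.m_irq_en == M_FIFO_IRQS);

    fake_select_dma_mode(&se);
    assert((se.m_irq_en & M_FIFO_IRQS) == 0);
    assert((se.s_irq_en & S_CMD_DONE_EN) == 0);
    assert(se.dma_mode_en == 1);
}
```

In this model, an engine that was in FIFO mode earlier enters DMA mode
with none of the FIFO interrupts left enabled, so they cannot fire
while the DMA is being prepared.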

As part of this, let's also change the writel_relaxed() in the prepare
function to a writel() so that our DMA is guaranteed to be prepared
now that we can't rely on geni_se_setup_m_cmd()'s writel().

NOTE: the only current user of GENI_SE_DMA in mainline is i2c.

Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
Fixes: 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/soc/qcom/qcom-geni-se.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

Comments

Stephen Boyd Oct. 10, 2020, 12:39 a.m. UTC | #1
Quoting Douglas Anderson (2020-10-08 15:52:33)
> On geni-i2c transfers using DMA, it was seen that if you program the
> command (I2C_READ) before calling geni_se_rx_dma_prep() that it could
> cause interrupts to fire.  If we get unlucky, these interrupts can
> just keep firing (and not be handled) blocking further progress and
> hanging the system.
> 
> In commit 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
> we avoided that by making sure we didn't program the command until
> after geni_se_rx_dma_prep() was called.  While that avoided the
> problems, it also turns out to be invalid.  At least in the TX case we
> started seeing sporadic corrupted transfers.  This is easily seen by
> adding an msleep() between the DMA prep and the writing of the
> command, which makes the problem worse.  That means we need to revert
> that commit and find another way to fix the bogus IRQs.
> 
> Specifically, after reverting commit 02b9aec59243 ("i2c:
> i2c-qcom-geni: Fix DMA transfer race"), I put some traces in.  I found
> that the when the interrupts were firing like crazy:
> - "m_stat" had bits for M_RX_IRQ_EN, M_RX_FIFO_WATERMARK_EN set.
> - "dma" was set.
> 
> Further debugging showed that I could make the problem happen more
> reliably by adding an "msleep(1)" any time after geni_se_setup_m_cmd()
> ran up until geni_se_rx_dma_prep() programmed the length.
> 
> A rather simple fix is to change geni_se_select_dma_mode() so it's a
> true inverse of geni_se_select_fifo_mode() and disables all the FIFO
> related interrupts.  Now the problematic interrupts can't fire and we
> can program things in the correct order without worrying.
> 
> As part of this, let's also change the writel_relaxed() in the prepare
> function to a writel() so that our DMA is guaranteed to be prepared
> now that we can't rely on geni_se_setup_m_cmd()'s writel().
> 
> NOTE: the only current user of GENI_SE_DMA in mainline is i2c.
> 
> Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
> Fixes: 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---

Reviewed-by: Stephen Boyd <swboyd@chromium.org>

> 
>  drivers/soc/qcom/qcom-geni-se.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/soc/qcom/qcom-geni-se.c b/drivers/soc/qcom/qcom-geni-se.c
> index d0e4f520cff8..751a49f6534f 100644
> --- a/drivers/soc/qcom/qcom-geni-se.c
> +++ b/drivers/soc/qcom/qcom-geni-se.c
> @@ -289,10 +289,23 @@ static void geni_se_select_fifo_mode(struct geni_se *se)
>  
>  static void geni_se_select_dma_mode(struct geni_se *se)
>  {
> +       u32 proto = geni_se_read_proto(se);
>         u32 val;
>  
>         geni_se_irq_clear(se);
>  
> +       val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
> +       if (proto != GENI_SE_UART) {

Not a problem with this patch but it would be great if there was a
comment here (and probably in geni_se_select_fifo_mode() too) indicating
why GENI_SE_UART is special. Is it because GENI_SE_UART doesn't use the
main sequencer? I think that is the reason, but I forgot and reading
this code doesn't tell me that.

Splitting the driver in this way where the logic is in the geni wrapper
and in the engine driver leads to this confusion.

> +               val &= ~(M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN);
> +               val &= ~(M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN);
> +       }
> +       writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
> +
> +       val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
> +       if (proto != GENI_SE_UART)
> +               val &= ~S_CMD_DONE_EN;
> +       writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
> +
>         val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
>         val |= GENI_DMA_MODE_EN;
>         writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);

Akash Asthana Oct. 12, 2020, 8:09 a.m. UTC | #2
On 10/9/2020 4:22 AM, Douglas Anderson wrote:
> On geni-i2c transfers using DMA, it was seen that if you program the
> command (I2C_READ) before calling geni_se_rx_dma_prep() that it could
> cause interrupts to fire.  If we get unlucky, these interrupts can
> just keep firing (and not be handled) blocking further progress and
> hanging the system.
>
> In commit 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
> we avoided that by making sure we didn't program the command until
> after geni_se_rx_dma_prep() was called.  While that avoided the
> problems, it also turns out to be invalid.  At least in the TX case we
> started seeing sporadic corrupted transfers.  This is easily seen by
> adding an msleep() between the DMA prep and the writing of the
> command, which makes the problem worse.  That means we need to revert
> that commit and find another way to fix the bogus IRQs.
>
> Specifically, after reverting commit 02b9aec59243 ("i2c:
> i2c-qcom-geni: Fix DMA transfer race"), I put some traces in.  I found
> that the when the interrupts were firing like crazy:
> - "m_stat" had bits for M_RX_IRQ_EN, M_RX_FIFO_WATERMARK_EN set.
> - "dma" was set.
>
> Further debugging showed that I could make the problem happen more
> reliably by adding an "msleep(1)" any time after geni_se_setup_m_cmd()
> ran up until geni_se_rx_dma_prep() programmed the length.
>
> A rather simple fix is to change geni_se_select_dma_mode() so it's a
> true inverse of geni_se_select_fifo_mode() and disables all the FIFO
> related interrupts.  Now the problematic interrupts can't fire and we
> can program things in the correct order without worrying.
>
> As part of this, let's also change the writel_relaxed() in the prepare
> function to a writel() so that our DMA is guaranteed to be prepared
> now that we can't rely on geni_se_setup_m_cmd()'s writel().
>
> NOTE: the only current user of GENI_SE_DMA in mainline is i2c.
>
> Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
> Fixes: 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Akash Asthana <akashast@codeaurora.org>

Akash Asthana Oct. 12, 2020, 9:05 a.m. UTC | #3
Hi Stephen,


>>   
>>   static void geni_se_select_dma_mode(struct geni_se *se)
>>   {
>> +       u32 proto = geni_se_read_proto(se);
>>          u32 val;
>>   
>>          geni_se_irq_clear(se);
>>   
>> +       val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
>> +       if (proto != GENI_SE_UART) {
> Not a problem with this patch but it would be great if there was a
> comment here (and probably in geni_se_select_fifo_mode() too) indicating
> why GENI_SE_UART is special. Is it because GENI_SE_UART doesn't use the
> main sequencer? I think that is the reason, but I forgot and reading
> this code doesn't tell me that.
>
> Splitting the driver in this way where the logic is in the geni wrapper
> and in the engine driver leads to this confusion.

GENI_SE_UART uses the main sequencer for TX and the secondary
sequencer for RX transfers because it is asynchronous in nature.

That's why the RX-related bits (M_RX_FIFO_WATERMARK_EN |
M_RX_FIFO_LAST_EN) are not enabled in the main sequencer for UART.

The (M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN) bits are controlled by the
UART driver; they get enabled and disabled multiple times, from
start_tx and stop_tx respectively.
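
A small userspace sketch of that split of responsibility, for
illustration only.  The toy_* names, struct toy_se and the bit
positions are hypothetical; the only thing taken from the patch is the
"proto != UART" check.  Whether a UART engine would ever go through the
DMA-mode path is not stated here; the call below only exercises the
check.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions, for illustration only. */
#define M_CMD_DONE_EN          (UINT32_C(1) << 0)
#define M_RX_FIFO_WATERMARK_EN (UINT32_C(1) << 26)
#define M_RX_FIFO_LAST_EN      (UINT32_C(1) << 27)
#define M_TX_FIFO_WATERMARK_EN (UINT32_C(1) << 30)

#define M_TX_IRQS (M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN)
#define M_RX_IRQS (M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN)

enum toy_proto { TOY_I2C, TOY_UART };

/* Hypothetical model of the main sequencer's interrupt enable register. */
struct toy_se {
    enum toy_proto proto;
    uint32_t m_irq_en;
};

/* Wrapper side: leaves the UART's interrupt enables alone. */
static void toy_wrapper_select_dma_mode(struct toy_se *se)
{
    if (se->proto != TOY_UART)
        se->m_irq_en &= ~(M_TX_IRQS | M_RX_IRQS);
}

/* UART driver side: owns the TX bits and toggles them itself. */
static void toy_uart_start_tx(struct toy_se *se)
{
    se->m_irq_en |= M_TX_IRQS;
}

static void toy_uart_stop_tx(struct toy_se *se)
{
    se->m_irq_en &= ~M_TX_IRQS;
}

static void toy_uart_check(void)
{
    struct toy_se uart = { TOY_UART, 0 };

    toy_uart_start_tx(&uart);
    toy_wrapper_select_dma_mode(&uart);  /* no effect for UART */
    assert(uart.m_irq_en == M_TX_IRQS);  /* TX bits survive */
    assert((uart.m_irq_en & M_RX_IRQS) == 0);  /* RX bits never set */

    toy_uart_stop_tx(&uart);
    assert(uart.m_irq_en == 0);
}
```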


Regards,

Akash

>
>> +               val &= ~(M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN);
>> +               val &= ~(M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN);
>> +       }
>> +       writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
>> +
>> +       val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
>> +       if (proto != GENI_SE_UART)
>> +               val &= ~S_CMD_DONE_EN;
>> +       writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
>> +
>>          val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
>>          val |= GENI_DMA_MODE_EN;
>>          writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);

Doug Anderson Oct. 13, 2020, 9:35 p.m. UTC | #4
Hi,

On Mon, Oct 12, 2020 at 2:05 AM Akash Asthana <akashast@codeaurora.org> wrote:
>
> Hi Stephen,
>
>
> >>
> >>   static void geni_se_select_dma_mode(struct geni_se *se)
> >>   {
> >> +       u32 proto = geni_se_read_proto(se);
> >>          u32 val;
> >>
> >>          geni_se_irq_clear(se);
> >>
> >> +       val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
> >> +       if (proto != GENI_SE_UART) {
> > Not a problem with this patch but it would be great if there was a
> > comment here (and probably in geni_se_select_fifo_mode() too) indicating
> > why GENI_SE_UART is special. Is it because GENI_SE_UART doesn't use the
> > main sequencer? I think that is the reason, but I forgot and reading
> > this code doesn't tell me that.
> >
> > Splitting the driver in this way where the logic is in the geni wrapper
> > and in the engine driver leads to this confusion.
>
> GENI_SE_UART uses main sequencer for TX and secondary for RX transfers
> because it is asynchronous in nature.
>
> That's why  RX related bits (M_RX_FIFO_WATERMARK_EN |
> M_RX_FIFO_LAST_EN)  are not enable in main sequencer for UART.
>
> (M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN) bits are controlled from UART
> driver, it's gets enabled and disabled multiple times from start_tx
> ,stop_tx respectively.

For now I've "solved" this by adding some comments (in the 3rd patch)
basically summarizing what Akash said.  I didn't want to go further
than that for now because it felt more important to get the i2c bug
fixed sooner rather than later and re-organizing would be a big enough
change that it'd probably need a few spins.

Our bug trackers don't make it trivially easy to file a public bug
tracking this and assign it to Qualcomm, but I've filed a bug asking
folks at Qualcomm to help with re-organizing things after my patch
series lands.  This is internally tracked at Google as b:170766462
("Rejigger geni_se_select_fifo_mode() / geni_se_select_dma_mode() to
not manage interrupt enables").

-Doug
Patch

diff --git a/drivers/soc/qcom/qcom-geni-se.c b/drivers/soc/qcom/qcom-geni-se.c
index d0e4f520cff8..751a49f6534f 100644
--- a/drivers/soc/qcom/qcom-geni-se.c
+++ b/drivers/soc/qcom/qcom-geni-se.c
@@ -289,10 +289,23 @@  static void geni_se_select_fifo_mode(struct geni_se *se)
 
 static void geni_se_select_dma_mode(struct geni_se *se)
 {
+	u32 proto = geni_se_read_proto(se);
 	u32 val;
 
 	geni_se_irq_clear(se);
 
+	val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
+	if (proto != GENI_SE_UART) {
+		val &= ~(M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN);
+		val &= ~(M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN);
+	}
+	writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
+
+	val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
+	if (proto != GENI_SE_UART)
+		val &= ~S_CMD_DONE_EN;
+	writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
+
 	val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
 	val |= GENI_DMA_MODE_EN;
 	writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);
@@ -651,7 +664,7 @@  int geni_se_tx_dma_prep(struct geni_se *se, void *buf, size_t len,
 	writel_relaxed(lower_32_bits(*iova), se->base + SE_DMA_TX_PTR_L);
 	writel_relaxed(upper_32_bits(*iova), se->base + SE_DMA_TX_PTR_H);
 	writel_relaxed(GENI_SE_DMA_EOT_BUF, se->base + SE_DMA_TX_ATTR);
-	writel_relaxed(len, se->base + SE_DMA_TX_LEN);
+	writel(len, se->base + SE_DMA_TX_LEN);
 	return 0;
 }
 EXPORT_SYMBOL(geni_se_tx_dma_prep);
@@ -688,7 +701,7 @@  int geni_se_rx_dma_prep(struct geni_se *se, void *buf, size_t len,
 	writel_relaxed(upper_32_bits(*iova), se->base + SE_DMA_RX_PTR_H);
 	/* RX does not have EOT buffer type bit. So just reset RX_ATTR */
 	writel_relaxed(0, se->base + SE_DMA_RX_ATTR);
-	writel_relaxed(len, se->base + SE_DMA_RX_LEN);
+	writel(len, se->base + SE_DMA_RX_LEN);
 	return 0;
 }
 EXPORT_SYMBOL(geni_se_rx_dma_prep);