diff mbox

[PATCHv3] dmaengine: fsl-edma: fixup reg offset and hw S/G support in big-endian model

Message ID 1411550642-12577-1-git-send-email-jingchang.lu@freescale.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jingchang Lu Sept. 24, 2014, 9:24 a.m. UTC
The offset of all 8-/16-bit register in big-endian eDMA model are
swapped in a 32-bit size opposite those in the little-endian model.

The hardware Scatter/Gather requires the subsequent TCDs in memory
to be auto loaded should retain little endian independent of the
register endian model, the dma engine will do the swap if need.

This patch also use regular assignment for tcd variables r/w
instead of with io function previously that may not always be true.

Signed-off-by: Jingchang Lu <jingchang.lu@freescale.com>
---
changes in v3:
 use unsigned long instead of u32 in reg offset swap cast to avoid warning.

changes in v2:
 simplify register offset swap calculation.
 use regular assignment for tcd variables r/w instead of io function.

 drivers/dma/fsl-edma.c | 106 ++++++++++++++++++++++++++-----------------------
 1 file changed, 57 insertions(+), 49 deletions(-)

Comments

Lothar Waßmann Sept. 24, 2014, 11:05 a.m. UTC | #1
Hi,

Jingchang Lu wrote:
> The offset of all 8-/16-bit register in big-endian eDMA model are
s/register/registers/

> swapped in a 32-bit size opposite those in the little-endian model.
> 
> The hardware Scatter/Gather requires the subsequent TCDs in memory
> to be auto loaded should retain little endian independent of the
> register endian model, the dma engine will do the swap if need.
> 
> This patch also use regular assignment for tcd variables r/w
> instead of with io function previously that may not always be true.
> 
> Signed-off-by: Jingchang Lu <jingchang.lu@freescale.com>
> ---
> changes in v3:
>  use unsigned long instead of u32 in reg offset swap cast to avoid warning.
> 
> changes in v2:
>  simplify register offset swap calculation.
>  use regular assignment for tcd variables r/w instead of io function.
> 
>  drivers/dma/fsl-edma.c | 106 ++++++++++++++++++++++++++-----------------------
>  1 file changed, 57 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/dma/fsl-edma.c b/drivers/dma/fsl-edma.c
> index 3c5711d..9ca55ee 100644
> --- a/drivers/dma/fsl-edma.c
> +++ b/drivers/dma/fsl-edma.c
> @@ -177,16 +177,10 @@ struct fsl_edma_engine {
>  /*
>   * R/W functions for big- or little-endian registers
>   * the eDMA controller's endian is independent of the CPU core's endian.
> + * for the big-endian IP module, the offset for 8-bit or 16-bit register
s/for/For/

> + * should also be swapped oposite to that in little-endian IP.
s/oposite/opposite/

[...]
> @@ -459,20 +460,27 @@ static void fill_tcd_params(struct fsl_edma_engine *edma,
>  	u16 csr = 0;
>  
>  	/*
> -	 * eDMA hardware SGs require the TCD parameters stored in memory
> -	 * the same endian as the eDMA module so that they can be loaded
> -	 * automatically by the engine
> +	 * eDMA hardware SGs requires the TCDs to be auto loaded
> +	 * in the little endian whenver the register endian model,
"in little endian whatever the register endian model"

> +	 * So we put the value in little endian in memory, waitting
s/waitting/waiting/

> +	 * for fsl_set_tcd_params doing the swap.
fsl_set_tcd_params()



Lothar Waßmann
Russell King - ARM Linux Sept. 24, 2014, 12:03 p.m. UTC | #2
On Wed, Sep 24, 2014 at 01:05:10PM +0200, Lothar Waßmann wrote:
> Hi,
> 
> Jingchang Lu wrote:
> > +	 * eDMA hardware SGs requires the TCDs to be auto loaded
> > +	 * in the little endian whenver the register endian model,
> "in little endian whatever the register endian model"

Even that took several reads to work it out.

	eDMA hardware SGs require the TCDs to be stored in little endian
	format irrespective of the register endian model.

and I think that's all that needs to be said here.

However, as I realdy suggested, running this ruddy thing through sparse
would be a /very/ good idea, because it'll warn with:

+       tcd->daddr = cpu_to_le32(dst);

The reason it'll warn on that is because daddr is not declared with
__le32, etc - the types used in struct fsl_edma_hw_tcd tell sparse that
the values to be stored there are in _host_ endian format, but they're
being assigned little endian formatted data.

> > +	 * for fsl_set_tcd_params doing the swap.
> fsl_set_tcd_params()

That's the wrong function name anyway.  It's fsl_edma_set_tcd_params().

However, let's look at this a little more:

        fsl_edma_set_tcd_params(fsl_chan, tcd->saddr, tcd->daddr, tcd->attr,
                        tcd->soff, tcd->nbytes, tcd->slast, tcd->citer,
                        tcd->biter, tcd->doff, tcd->dlast_sga, tcd->csr);

Is it /really/ a good idea to read all that data out of the structure,
only to then have most of it saved into the stack, which
fsl_edma_set_tcd_params() then has to read back off the stack again?
With stuff like this, one has to wonder if there is much clue how to
write optimal C code in this driver.

This should be passing the tcd structure into fsl_edma_set_tcd_params().

Now, this function contains this comment:

        /*
         * TCD parameters have been swapped in fill_tcd_params(),
         * so just write them to registers in the cpu endian here
         */

Which is almost reasonable, but let's update it:

	TCD parameters are stored in struct fsl_edma_hw_tcd in little
	endian format.  However, we need to load the registers in
	<insert appropriate endianness - big|little|cpu> endian.

Now, remember that writel() and friends expect native CPU endian formatted
data (and yes, sparse checks this too) so that needs to be considered more.

Lastly, the driver seems to be a total mess of accessors.  In some places
it uses io{read,write}*, in other places, it uses plain {read,write}*.  It
should use either one set or the other set, and not mix these two.

I just spotted this badly formatted code too:

        for (i = 0; i < fsl_desc->n_tcds; i++)
                        dma_pool_free(fsl_desc->echan->tcd_pool,
                                        fsl_desc->tcd[i].vtcd,
                                        fsl_desc->tcd[i].ptcd);
...
                edma_writeb(fsl_chan->edma,
                                EDMAMUX_CHCFG_ENBL | EDMAMUX_CHCFG_SOURCE(slot),
                                muxaddr + ch_off);
Jingchang Lu Sept. 25, 2014, 6:42 a.m. UTC | #3
>-----Original Message-----

>From: Lothar Waßmann [mailto:LW@KARO-electronics.de]

>Sent: Wednesday, September 24, 2014 7:05 PM

>To: Lu Jingchang-B35083

>Cc: vinod.koul@intel.com; linux@arm.linux.org.uk; arnd@arndb.de; linux-

>kernel@vger.kernel.org; dmaengine@vger.kernel.org; linux-arm-

>kernel@lists.infradead.org

>Subject: Re: [PATCHv3] dmaengine: fsl-edma: fixup reg offset and hw S/G

>support in big-endian model

>

>Hi,

>

>Jingchang Lu wrote:

>> The offset of all 8-/16-bit register in big-endian eDMA model are

>s/register/registers/

>

>> swapped in a 32-bit size opposite those in the little-endian model.

>>

>> The hardware Scatter/Gather requires the subsequent TCDs in memory to

>> be auto loaded should retain little endian independent of the register

>> endian model, the dma engine will do the swap if need.

>>

>> This patch also use regular assignment for tcd variables r/w instead

>> of with io function previously that may not always be true.

>>

>> Signed-off-by: Jingchang Lu <jingchang.lu@freescale.com>

>> ---

>> changes in v3:

>>  use unsigned long instead of u32 in reg offset swap cast to avoid

>warning.

>>

>> changes in v2:

>>  simplify register offset swap calculation.

>>  use regular assignment for tcd variables r/w instead of io function.

>>

>>  drivers/dma/fsl-edma.c | 106

>> ++++++++++++++++++++++++++-----------------------

>>  1 file changed, 57 insertions(+), 49 deletions(-)

>>

>> diff --git a/drivers/dma/fsl-edma.c b/drivers/dma/fsl-edma.c index

>> 3c5711d..9ca55ee 100644

>> --- a/drivers/dma/fsl-edma.c

>> +++ b/drivers/dma/fsl-edma.c

>> @@ -177,16 +177,10 @@ struct fsl_edma_engine {

>>  /*

>>   * R/W functions for big- or little-endian registers

>>   * the eDMA controller's endian is independent of the CPU core's endian.

>> + * for the big-endian IP module, the offset for 8-bit or 16-bit

>> + register

>s/for/For/

>

>> + * should also be swapped oposite to that in little-endian IP.

>s/oposite/opposite/

>

>[...]

>> @@ -459,20 +460,27 @@ static void fill_tcd_params(struct fsl_edma_engine

>*edma,

>>  	u16 csr = 0;

>>

>>  	/*

>> -	 * eDMA hardware SGs require the TCD parameters stored in memory

>> -	 * the same endian as the eDMA module so that they can be loaded

>> -	 * automatically by the engine

>> +	 * eDMA hardware SGs requires the TCDs to be auto loaded

>> +	 * in the little endian whenver the register endian model,

>"in little endian whatever the register endian model"

>

>> +	 * So we put the value in little endian in memory, waitting

>s/waitting/waiting/

>

>> +	 * for fsl_set_tcd_params doing the swap.

>fsl_set_tcd_params()

>

>

>

>Lothar Waßmann

>--

Thanks, I will amend them in the next version patch.

Best Regards,
Jingchang
Jingchang Lu Sept. 25, 2014, 7:39 a.m. UTC | #4
>-----Original Message-----

>From: Russell King - ARM Linux [mailto:linux@arm.linux.org.uk]

>Sent: Wednesday, September 24, 2014 8:04 PM

>To: Lothar Waßmann

>Cc: Lu Jingchang-B35083; vinod.koul@intel.com; arnd@arndb.de; linux-

>kernel@vger.kernel.org; dmaengine@vger.kernel.org; linux-arm-

>kernel@lists.infradead.org

>Subject: Re: [PATCHv3] dmaengine: fsl-edma: fixup reg offset and hw S/G

>support in big-endian model

>

>On Wed, Sep 24, 2014 at 01:05:10PM +0200, Lothar Waßmann wrote:

>> Hi,

>>

>> Jingchang Lu wrote:

>> > +	 * eDMA hardware SGs requires the TCDs to be auto loaded

>> > +	 * in the little endian whenver the register endian model,

>> "in little endian whatever the register endian model"

>

>Even that took several reads to work it out.

>

>	eDMA hardware SGs require the TCDs to be stored in little endian

>	format irrespective of the register endian model.

>

>and I think that's all that needs to be said here.

>

>However, as I realdy suggested, running this ruddy thing through sparse

>would be a /very/ good idea, because it'll warn with:

>

>+       tcd->daddr = cpu_to_le32(dst);

>

>The reason it'll warn on that is because daddr is not declared with __le32,

>etc - the types used in struct fsl_edma_hw_tcd tell sparse that the values

>to be stored there are in _host_ endian format, but they're being assigned

>little endian formatted data.

>

I didn't realize the type warning, they indeed exist. I will use __le32 and
__le16 for fsl_edma_hw_tcd struct members as below:
struct fsl_edma_hw_tcd {
        __le32  saddr;
        __le16  soff;
        __le16  attr;
        __le32  nbytes;
        __le32  slast;
        __le32  daddr;
        __le16  doff;
        __le16  citer;
        __le32  dlast_sga;
        __le16  csr;
        __le16  biter;
};

>> > +	 * for fsl_set_tcd_params doing the swap.

>> fsl_set_tcd_params()

>

>That's the wrong function name anyway.  It's fsl_edma_set_tcd_params().

>

>However, let's look at this a little more:

>

>        fsl_edma_set_tcd_params(fsl_chan, tcd->saddr, tcd->daddr, tcd-

>>attr,

>                        tcd->soff, tcd->nbytes, tcd->slast, tcd->citer,

>                        tcd->biter, tcd->doff, tcd->dlast_sga, tcd->csr);

>

>Is it /really/ a good idea to read all that data out of the structure,

>only to then have most of it saved into the stack, which

>fsl_edma_set_tcd_params() then has to read back off the stack again?

>With stuff like this, one has to wonder if there is much clue how to write

>optimal C code in this driver.

>

>This should be passing the tcd structure into fsl_edma_set_tcd_params().

>

>Now, this function contains this comment:

>

>        /*

>         * TCD parameters have been swapped in fill_tcd_params(),

>         * so just write them to registers in the cpu endian here

>         */

>

>Which is almost reasonable, but let's update it:

>

>	TCD parameters are stored in struct fsl_edma_hw_tcd in little

>	endian format.  However, we need to load the registers in

>	<insert appropriate endianness - big|little|cpu> endian.

>

>Now, remember that writel() and friends expect native CPU endian formatted

>data (and yes, sparse checks this too) so that needs to be considered more.

>

>Lastly, the driver seems to be a total mess of accessors.  In some places

>it uses io{read,write}*, in other places, it uses plain {read,write}*.  It

>should use either one set or the other set, and not mix these two.

>

>I just spotted this badly formatted code too:

>

>        for (i = 0; i < fsl_desc->n_tcds; i++)

>                        dma_pool_free(fsl_desc->echan->tcd_pool,

>                                        fsl_desc->tcd[i].vtcd,

>                                        fsl_desc->tcd[i].ptcd); ...

>                edma_writeb(fsl_chan->edma,

>                                EDMAMUX_CHCFG_ENBL |

>EDMAMUX_CHCFG_SOURCE(slot),

>                                muxaddr + ch_off);

>

>--

Thanks, I will use the tcd pointer instead. And I will use one r/w set.

Best Regards,
Jingchang
diff mbox

Patch

diff --git a/drivers/dma/fsl-edma.c b/drivers/dma/fsl-edma.c
index 3c5711d..9ca55ee 100644
--- a/drivers/dma/fsl-edma.c
+++ b/drivers/dma/fsl-edma.c
@@ -177,16 +177,10 @@  struct fsl_edma_engine {
 /*
  * R/W functions for big- or little-endian registers
  * the eDMA controller's endian is independent of the CPU core's endian.
+ * for the big-endian IP module, the offset for 8-bit or 16-bit register
+ * should also be swapped oposite to that in little-endian IP.
  */
 
-static u16 edma_readw(struct fsl_edma_engine *edma, void __iomem *addr)
-{
-	if (edma->big_endian)
-		return ioread16be(addr);
-	else
-		return ioread16(addr);
-}
-
 static u32 edma_readl(struct fsl_edma_engine *edma, void __iomem *addr)
 {
 	if (edma->big_endian)
@@ -197,13 +191,18 @@  static u32 edma_readl(struct fsl_edma_engine *edma, void __iomem *addr)
 
 static void edma_writeb(struct fsl_edma_engine *edma, u8 val, void __iomem *addr)
 {
-	iowrite8(val, addr);
+	/* swap the reg offset for these in big-endian mode */
+	if (edma->big_endian)
+		iowrite8(val, (void __iomem *)((unsigned long)addr ^ 0x3));
+	else
+		iowrite8(val, addr);
 }
 
 static void edma_writew(struct fsl_edma_engine *edma, u16 val, void __iomem *addr)
 {
+	/* swap the reg offset for these in big-endian mode */
 	if (edma->big_endian)
-		iowrite16be(val, addr);
+		iowrite16be(val, (void __iomem *)((unsigned long)addr ^ 0x2));
 	else
 		iowrite16(val, addr);
 }
@@ -256,11 +255,10 @@  static void fsl_edma_chan_mux(struct fsl_edma_chan *fsl_chan,
 	muxaddr = fsl_chan->edma->muxbase[ch / chans_per_mux];
 
 	if (enable)
-		edma_writeb(fsl_chan->edma,
-				EDMAMUX_CHCFG_ENBL | EDMAMUX_CHCFG_SOURCE(slot),
+		writeb(EDMAMUX_CHCFG_ENBL | EDMAMUX_CHCFG_SOURCE(slot),
 				muxaddr + ch_off);
 	else
-		edma_writeb(fsl_chan->edma, EDMAMUX_CHCFG_DIS, muxaddr + ch_off);
+		writeb(EDMAMUX_CHCFG_DIS, muxaddr + ch_off);
 }
 
 static unsigned int fsl_edma_get_tcd_attr(enum dma_slave_buswidth addr_width)
@@ -363,8 +361,8 @@  static size_t fsl_edma_desc_residue(struct fsl_edma_chan *fsl_chan,
 
 	/* calculate the total size in this desc */
 	for (len = i = 0; i < fsl_chan->edesc->n_tcds; i++)
-		len += edma_readl(fsl_chan->edma, &(edesc->tcd[i].vtcd->nbytes))
-			* edma_readw(fsl_chan->edma, &(edesc->tcd[i].vtcd->biter));
+		len += le32_to_cpu(edesc->tcd[i].vtcd->nbytes)
+			* le16_to_cpu(edesc->tcd[i].vtcd->biter);
 
 	if (!in_progress)
 		return len;
@@ -376,14 +374,12 @@  static size_t fsl_edma_desc_residue(struct fsl_edma_chan *fsl_chan,
 
 	/* figure out the finished and calculate the residue */
 	for (i = 0; i < fsl_chan->edesc->n_tcds; i++) {
-		size = edma_readl(fsl_chan->edma, &(edesc->tcd[i].vtcd->nbytes))
-			* edma_readw(fsl_chan->edma, &(edesc->tcd[i].vtcd->biter));
+		size = le32_to_cpu(edesc->tcd[i].vtcd->nbytes)
+			* le16_to_cpu(edesc->tcd[i].vtcd->biter);
 		if (dir == DMA_MEM_TO_DEV)
-			dma_addr = edma_readl(fsl_chan->edma,
-					&(edesc->tcd[i].vtcd->saddr));
+			dma_addr = le32_to_cpu(edesc->tcd[i].vtcd->saddr);
 		else
-			dma_addr = edma_readl(fsl_chan->edma,
-					&(edesc->tcd[i].vtcd->daddr));
+			dma_addr = le32_to_cpu(edesc->tcd[i].vtcd->daddr);
 
 		len -= size;
 		if (cur_addr > dma_addr && cur_addr < dma_addr + size) {
@@ -433,21 +429,26 @@  static void fsl_edma_set_tcd_params(struct fsl_edma_chan *fsl_chan,
 	u32 ch = fsl_chan->vchan.chan.chan_id;
 
 	/*
-	 * TCD parameters have been swapped in fill_tcd_params(),
-	 * so just write them to registers in the cpu endian here
+	 * TCD parameters should be swapped according the eDMA
+	 * engine requirement.
 	 */
-	writew(0, addr + EDMA_TCD_CSR(ch));
-	writel(src, addr + EDMA_TCD_SADDR(ch));
-	writel(dst, addr + EDMA_TCD_DADDR(ch));
-	writew(attr, addr + EDMA_TCD_ATTR(ch));
-	writew(soff, addr + EDMA_TCD_SOFF(ch));
-	writel(nbytes, addr + EDMA_TCD_NBYTES(ch));
-	writel(slast, addr + EDMA_TCD_SLAST(ch));
-	writew(citer, addr + EDMA_TCD_CITER(ch));
-	writew(biter, addr + EDMA_TCD_BITER(ch));
-	writew(doff, addr + EDMA_TCD_DOFF(ch));
-	writel(dlast_sga, addr + EDMA_TCD_DLAST_SGA(ch));
-	writew(csr, addr + EDMA_TCD_CSR(ch));
+	edma_writew(fsl_chan->edma, 0, addr + EDMA_TCD_CSR(ch));
+	edma_writel(fsl_chan->edma, src, addr + EDMA_TCD_SADDR(ch));
+	edma_writel(fsl_chan->edma, dst, addr + EDMA_TCD_DADDR(ch));
+
+	edma_writew(fsl_chan->edma, attr, addr + EDMA_TCD_ATTR(ch));
+	edma_writew(fsl_chan->edma, soff, addr + EDMA_TCD_SOFF(ch));
+
+	edma_writel(fsl_chan->edma, nbytes, addr + EDMA_TCD_NBYTES(ch));
+	edma_writel(fsl_chan->edma, slast, addr + EDMA_TCD_SLAST(ch));
+
+	edma_writew(fsl_chan->edma, citer, addr + EDMA_TCD_CITER(ch));
+	edma_writew(fsl_chan->edma, biter, addr + EDMA_TCD_BITER(ch));
+	edma_writew(fsl_chan->edma, doff, addr + EDMA_TCD_DOFF(ch));
+
+	edma_writel(fsl_chan->edma, dlast_sga, addr + EDMA_TCD_DLAST_SGA(ch));
+
+	edma_writew(fsl_chan->edma, csr, addr + EDMA_TCD_CSR(ch));
 }
 
 static void fill_tcd_params(struct fsl_edma_engine *edma,
@@ -459,20 +460,27 @@  static void fill_tcd_params(struct fsl_edma_engine *edma,
 	u16 csr = 0;
 
 	/*
-	 * eDMA hardware SGs require the TCD parameters stored in memory
-	 * the same endian as the eDMA module so that they can be loaded
-	 * automatically by the engine
+	 * eDMA hardware SGs requires the TCDs to be auto loaded
+	 * in the little endian whenver the register endian model,
+	 * So we put the value in little endian in memory, waitting
+	 * for fsl_set_tcd_params doing the swap.
 	 */
-	edma_writel(edma, src, &(tcd->saddr));
-	edma_writel(edma, dst, &(tcd->daddr));
-	edma_writew(edma, attr, &(tcd->attr));
-	edma_writew(edma, EDMA_TCD_SOFF_SOFF(soff), &(tcd->soff));
-	edma_writel(edma, EDMA_TCD_NBYTES_NBYTES(nbytes), &(tcd->nbytes));
-	edma_writel(edma, EDMA_TCD_SLAST_SLAST(slast), &(tcd->slast));
-	edma_writew(edma, EDMA_TCD_CITER_CITER(citer), &(tcd->citer));
-	edma_writew(edma, EDMA_TCD_DOFF_DOFF(doff), &(tcd->doff));
-	edma_writel(edma, EDMA_TCD_DLAST_SGA_DLAST_SGA(dlast_sga), &(tcd->dlast_sga));
-	edma_writew(edma, EDMA_TCD_BITER_BITER(biter), &(tcd->biter));
+	tcd->saddr = cpu_to_le32(src);
+	tcd->daddr = cpu_to_le32(dst);
+
+	tcd->attr = cpu_to_le16(attr);
+
+	tcd->soff = cpu_to_le16(EDMA_TCD_SOFF_SOFF(soff));
+
+	tcd->nbytes = cpu_to_le32(EDMA_TCD_NBYTES_NBYTES(nbytes));
+	tcd->slast = cpu_to_le32(EDMA_TCD_SLAST_SLAST(slast));
+
+	tcd->citer = cpu_to_le16(EDMA_TCD_CITER_CITER(citer));
+	tcd->doff = cpu_to_le16(EDMA_TCD_DOFF_DOFF(doff));
+
+	tcd->dlast_sga = cpu_to_le32(EDMA_TCD_DLAST_SGA_DLAST_SGA(dlast_sga));
+
+	tcd->biter = cpu_to_le16(EDMA_TCD_BITER_BITER(biter));
 	if (major_int)
 		csr |= EDMA_TCD_CSR_INT_MAJOR;
 
@@ -482,7 +490,7 @@  static void fill_tcd_params(struct fsl_edma_engine *edma,
 	if (enable_sg)
 		csr |= EDMA_TCD_CSR_E_SG;
 
-	edma_writew(edma, csr, &(tcd->csr));
+	tcd->csr = cpu_to_le16(csr);
 }
 
 static struct fsl_edma_desc *fsl_edma_alloc_desc(struct fsl_edma_chan *fsl_chan,