Message ID | 1460392212-101116-1-git-send-email-andriy.shevchenko@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
On Mon, Apr 11, 2016 at 07:30:12PM +0300, Andy Shevchenko wrote:

> To optimize amount of bus writes on memory side set burst to be the same amount
> of data on both sides.

> +	txconf.src_maxburst = 4 * dws->dma_width;
>  	txconf.dst_maxburst = 16;

This doesn't seem to do what the subject says (at least not always,
it'll align for a dma_width of 4)?
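The arithmetic behind this question can be sketched as follows. This is only an illustration, and it assumes the dmaengine convention that `src_maxburst` and `dst_maxburst` count words of the matching `*_addr_width`, not bytes. The helper names (`burst_bytes`, `mem_burst_words`, `mem_burst_bytes`, `dev_burst_bytes`) are hypothetical and do not exist in the driver. Under that assumption the two word counts are equal only when dma_width is 4, while the byte counts per burst are 16 * dma_width on both sides.

```c
#include <stdio.h>

/* Hypothetical helper: bytes per burst = words per burst * bytes per word. */
unsigned int burst_bytes(unsigned int maxburst_words, unsigned int width_bytes)
{
	return maxburst_words * width_bytes;
}

/* Memory side as set by the patch: 4 * dma_width words per burst. */
unsigned int mem_burst_words(unsigned int dma_width)
{
	return 4 * dma_width;
}

/* Memory side in bytes: words are 4 bytes wide (DMA_SLAVE_BUSWIDTH_4_BYTES). */
unsigned int mem_burst_bytes(unsigned int dma_width)
{
	return burst_bytes(mem_burst_words(dma_width), 4);
}

/* Device side in bytes: 16 words per burst, each dma_width bytes wide. */
unsigned int dev_burst_bytes(unsigned int dma_width)
{
	return burst_bytes(16, dma_width);
}

/* Print word and byte counts for a few assumed dma_width values. */
void print_burst_table(void)
{
	unsigned int w;

	for (w = 1; w <= 4; w *= 2)
		printf("dma_width=%u: mem %u words / %u bytes, dev 16 words / %u bytes\n",
		       w, mem_burst_words(w), mem_burst_bytes(w), dev_burst_bytes(w));
}
```

Calling `print_burst_table()` from a small test program lists both readings side by side, which may help when deciding which unit the subject line refers to.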
On Tue, 2016-04-12 at 01:34 +0100, Mark Brown wrote:
> On Mon, Apr 11, 2016 at 07:30:12PM +0300, Andy Shevchenko wrote:
>

+Vinod and dmaengine@

> >
> > To optimize amount of bus writes on memory side set burst to be the
> > same amount of data on both sides.
> >
> > +	txconf.src_maxburst = 4 * dws->dma_width;
> >  	txconf.dst_maxburst = 16;
> This doesn't seem to do what the subject says (at least not always,
> it'll align for a dma_width of 4)?

Thanks for not applying the patch.

I think the approach itself is wrong.

The peripheral drivers usually have no idea and shouldn't know about DMA
engine memory side characteristics (bus width, bursts, etc).

This should be fixed in certain DMA engine drivers.

Also, as you may have noticed, when we get maximum length of the segment
we take into consideration what DMA device supports. Many of them report
something like 2^n - 1, which is apparently unaligned and thus in the
poorly written DMA driver leads to performance degradation.

Looks like all Intel related DMA drivers should be fixed (HSU, iDMA64,
dw_dmac).

Vinod, am I right?
On Tue, Apr 12, 2016 at 02:56:35PM +0300, Andy Shevchenko wrote:
> On Tue, 2016-04-12 at 01:34 +0100, Mark Brown wrote:
> > On Mon, Apr 11, 2016 at 07:30:12PM +0300, Andy Shevchenko wrote:
> >
>
> +Vinod and dmaengine@
>
> > >
> > > To optimize amount of bus writes on memory side set burst to be the
> > > same amount of data on both sides.
> > >
> > > +	txconf.src_maxburst = 4 * dws->dma_width;
> > >  	txconf.dst_maxburst = 16;
> > This doesn't seem to do what the subject says (at least not always,
> > it'll align for a dma_width of 4)?
>
> Thanks for not applying the patch.
>
> I think the approach itself is wrong.
>
> The peripheral drivers usually have no idea and shouldn't know about DMA
> engine memory side characteristics (bus width, bursts, etc).

These are typically your system characteristics, like 32 bit or 64 bit bus to
memory and rest (burst etc) should be maximum as the data will go from/to
dmaengine FIFO to/from memory, so you would want to push as fast as possible

That said, maximum burst with 32bit wide should be saner value in modern
systems.

>
> This should be fixed in certain DMA engine drivers.
>
> Also, as you may have noticed, when we get maximum length of the segment
> we take into consideration what DMA device supports. Many of them report
> something like 2^n - 1, which is apparently unaligned and thus in the
> poorly written DMA driver leads to performance degradation.

Which Intel controller supports 2^n - 1? AFAIK the dw and idma don't.

> Looks like all Intel related DMA drivers should be fixed (HSU, iDMA64,
> dw_dmac).
>
> Vinod, am I right?
>
> --
> Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Intel Finland Oy
>
On Tue, 2016-04-12 at 19:32 +0530, Vinod Koul wrote:
> On Tue, Apr 12, 2016 at 02:56:35PM +0300, Andy Shevchenko wrote:
> >
> > On Tue, 2016-04-12 at 01:34 +0100, Mark Brown wrote:
> > >
> > > On Mon, Apr 11, 2016 at 07:30:12PM +0300, Andy Shevchenko wrote:
> > > > To optimize amount of bus writes on memory side set burst to be
> > > > the same amount of data on both sides.
> > > >
> > > > +	txconf.src_maxburst = 4 * dws->dma_width;
> > > >  	txconf.dst_maxburst = 16;
> > > This doesn't seem to do what the subject says (at least not
> > > always, it'll align for a dma_width of 4)?
> > Thanks for not applying the patch.
> >
> > I think the approach itself is wrong.
> >
> > The peripheral drivers usually have no idea and shouldn't know about
> > DMA engine memory side characteristics (bus width, bursts, etc).
> These are typically your system characteristics, like 32 bit or 64 bit
> bus to memory and rest (burst etc) should be maximum as the data will go
> from/to dmaengine FIFO to/from memory, so you would want to push as fast
> as possible
>
> That said, maximum burst with 32bit wide should be saner value in
> modern systems.

My point is that peripheral driver does not and _should not_ care about
memory side of the transfer. This is property of DMAengine controller
and platform that has it installed.

Documentation says nothing about how clients should set up _memory side_ of
the transfer.

Thus, I propose to update documentation to tell that there are two sides
of the transfer in case of mem2dev, dev2mem, where one of them is
_memory side_, and it's DMAengine controller's responsibility to
rightfully set the transfer width and burst size.

I would make a patch if we would agree on this.

> >
> > This should be fixed in certain DMA engine drivers.
> >
> > Also, as you may have noticed, when we get maximum length of the
> > segment we take into consideration what DMA device supports. Many of
> > them report something like 2^n - 1, which is apparently unaligned and
> > thus in the poorly written DMA driver leads to performance degradation.
> Which Intel controller supports 2^n - 1? AFAIK the dw and idma don't.

All three mentioned below take block size as [0 .. 2 ^ number of bits
in the register - 1]. If transfer width is 1 byte (which is calculated
automatically now), the burst will be 1 byte on memory side!

> > Looks like all Intel related DMA drivers should be fixed (HSU,
> > iDMA64, dw_dmac).
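The concern about 2^n - 1 block sizes can be illustrated with a small sketch. It assumes a controller driver that derives the memory-side transfer width from the alignment of the buffer address and length. The function `pick_mem_width` is hypothetical and not taken from HSU, iDMA64 or dw_dmac, and whether those drivers really behave this way is exactly what is disputed in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from any driver: pick the widest transfer
 * width in bytes, up to max_width (assumed to be a power of two), such that
 * both the buffer address and the length are multiples of it.
 */
unsigned int pick_mem_width(uint64_t addr, size_t len, unsigned int max_width)
{
	unsigned int width = max_width;

	/* Halve the width until address and length are both aligned to it. */
	while (width > 1 && ((addr | len) & (width - 1)))
		width >>= 1;

	return width;
}
```

With these assumptions, an aligned buffer of 4096 bytes gets a 4-byte width, while a segment capped at 4095 bytes (2^12 - 1) drops to a 1-byte width, which is the degradation described above.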
On Tue, Apr 12, 2016 at 07:06:01PM +0300, Andy Shevchenko wrote:
> On Tue, 2016-04-12 at 19:32 +0530, Vinod Koul wrote:
> > On Tue, Apr 12, 2016 at 02:56:35PM +0300, Andy Shevchenko wrote:
> > >
> > > On Tue, 2016-04-12 at 01:34 +0100, Mark Brown wrote:
> > > >
> > > > On Mon, Apr 11, 2016 at 07:30:12PM +0300, Andy Shevchenko wrote:
> > > > > To optimize amount of bus writes on memory side set burst to be
> > > > > the same amount of data on both sides.
> > > > >
> > > > > +	txconf.src_maxburst = 4 * dws->dma_width;
> > > > >  	txconf.dst_maxburst = 16;
> > > > This doesn't seem to do what the subject says (at least not
> > > > always, it'll align for a dma_width of 4)?
> > > Thanks for not applying the patch.
> > >
> > > I think the approach itself is wrong.
> > >
> > > The peripheral drivers usually have no idea and shouldn't know about
> > > DMA engine memory side characteristics (bus width, bursts, etc).
> > These are typically your system characteristics, like 32 bit or 64 bit
> > bus to memory and rest (burst etc) should be maximum as the data will go
> > from/to dmaengine FIFO to/from memory, so you would want to push as fast
> > as possible
> >
> > That said, maximum burst with 32bit wide should be saner value in
> > modern systems.
>
> My point is that peripheral driver does not and _should not_ care about
> memory side of the transfer. This is property of DMAengine controller
> and platform that has it installed.
>
> Documentation says nothing about how clients should set up _memory side_ of
> the transfer.
>
> Thus, I propose to update documentation to tell that there are two sides
> of the transfer in case of mem2dev, dev2mem, where one of them is
> _memory side_, and it's DMAengine controller's responsibility to
> rightfully set the transfer width and burst size.

Well, a dmaengine may be operating on a 32bit or 64 bit bus. Doing 64bit will
not help in the former case. How do we guess?

I do agree that clients shouldn't be bothered with this

>
> I would make a patch if we would agree on this.
>
> > >
> > > This should be fixed in certain DMA engine drivers.
> > >
> > > Also, as you may have noticed, when we get maximum length of the
> > > segment we take into consideration what DMA device supports. Many of
> > > them report something like 2^n - 1, which is apparently unaligned and
> > > thus in the poorly written DMA driver leads to performance degradation.
> > Which Intel controller supports 2^n - 1? AFAIK the dw and idma don't.
>
> All three mentioned below take block size as [0 .. 2 ^ number of bits
> in the register - 1]. If transfer width is 1 byte (which is calculated
> automatically now), the burst will be 1 byte on memory side!

That is not correct :)

>
> > > Looks like all Intel related DMA drivers should be fixed (HSU,
> > > iDMA64, dw_dmac).
>
> --
> Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Intel Finland Oy
>
diff --git a/drivers/spi/spi-dw-mid.c b/drivers/spi/spi-dw-mid.c
index e31971f9..0e0c34e 100644
--- a/drivers/spi/spi-dw-mid.c
+++ b/drivers/spi/spi-dw-mid.c
@@ -157,6 +157,7 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_tx(struct dw_spi *dws,
 	txconf.direction = DMA_MEM_TO_DEV;
 	txconf.dst_addr = dws->dma_addr;
+	txconf.src_maxburst = 4 * dws->dma_width;
 	txconf.dst_maxburst = 16;
 	txconf.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
 	txconf.dst_addr_width = convert_dma_width(dws->dma_width);
@@ -203,6 +204,7 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
 	rxconf.direction = DMA_DEV_TO_MEM;
 	rxconf.src_addr = dws->dma_addr;
+	rxconf.dst_maxburst = 4 * dws->dma_width;
 	rxconf.src_maxburst = 16;
 	rxconf.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
 	rxconf.src_addr_width = convert_dma_width(dws->dma_width);
To optimize amount of bus writes on memory side set burst to be the same amount
of data on both sides.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 drivers/spi/spi-dw-mid.c | 2 ++
 1 file changed, 2 insertions(+)