Message ID | 1486740584-17875-4-git-send-email-fisaksen@baylibre.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Frode Isaksen <fisaksen@baylibre.com> writes: > Limit the transfer size to 20 scatter/gather pages if > DMA is enabled. > The eDMA DMA engine is limited to 20 SG entries in one DMA > transaction. If this number is exceeded, DMA receive fails. > This error occurs with large vmalloc'ed buffers. > > Signed-off-by: Frode Isaksen <fisaksen@baylibre.com> > --- > drivers/spi/spi-davinci.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c > index b7b2da1..f1b46f6 100644 > --- a/drivers/spi/spi-davinci.c > +++ b/drivers/spi/spi-davinci.c > @@ -485,6 +485,16 @@ static bool davinci_spi_can_dma(struct spi_master *master, > return __davinci_spi_can_dma(spi); > } > > +static size_t davinci_spi_max_transfer_size(struct spi_device *spi) > +{ > + /* > + * The eDMA DMA engine is limited to 20 SG entries in one DMA > + * transaction. If this number is exceeded, DMA receive fails. > + * An extra SG entry is needed when the buffer is not page aligned. > + */ > + return (__davinci_spi_can_dma(spi)) ? 19 * PAGE_SIZE : SIZE_MAX; The number here should be a #define. Also, comment says 20, code says 19. This should be clarified. Kevin -- To unsubscribe from this list: send the line "unsubscribe linux-spi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
+ Peter On Friday 10 February 2017 08:59 PM, Frode Isaksen wrote: > Limit the transfer size to 20 scatter/gather pages if > DMA is enabled. > The eDMA DMA engine is limited to 20 SG entries in one DMA > transaction. If this number is exceeded, DMA receive fails. > This error occurs with large vmalloc'ed buffers. This needs more explanation because there is support available in edma driver for long SG lists by breaking them down into transfers using 20 PaRAM entries at a time. If thats not working for you, that needs further debug. > > Signed-off-by: Frode Isaksen <fisaksen@baylibre.com> Thanks, Sekhar > --- > drivers/spi/spi-davinci.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c > index b7b2da1..f1b46f6 100644 > --- a/drivers/spi/spi-davinci.c > +++ b/drivers/spi/spi-davinci.c > @@ -485,6 +485,16 @@ static bool davinci_spi_can_dma(struct spi_master *master, > return __davinci_spi_can_dma(spi); > } > > +static size_t davinci_spi_max_transfer_size(struct spi_device *spi) > +{ > + /* > + * The eDMA DMA engine is limited to 20 SG entries in one DMA > + * transaction. If this number is exceeded, DMA receive fails. > + * An extra SG entry is needed when the buffer is not page aligned. > + */ > + return (__davinci_spi_can_dma(spi)) ? 19 * PAGE_SIZE : SIZE_MAX; > +} > + > static int davinci_spi_check_error(struct davinci_spi *dspi, int int_status) > { > struct device *sdev = dspi->bitbang.master->dev.parent; > @@ -1010,6 +1020,7 @@ static int davinci_spi_probe(struct platform_device *pdev) > master->setup = davinci_spi_setup; > master->cleanup = davinci_spi_cleanup; > master->can_dma = davinci_spi_can_dma; > + master->max_transfer_size = davinci_spi_max_transfer_size; > > dspi->bitbang.chipselect = davinci_spi_chipselect; > dspi->bitbang.setup_transfer = davinci_spi_setup_transfer; > -- To unsubscribe from this list: send the line "unsubscribe linux-spi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 13/02/2017 06:59, Sekhar Nori wrote: > + Peter > > On Friday 10 February 2017 08:59 PM, Frode Isaksen wrote: >> Limit the transfer size to 20 scatter/gather pages if >> DMA is enabled. >> The eDMA DMA engine is limited to 20 SG entries in one DMA >> transaction. If this number is exceeded, DMA receive fails. >> This error occurs with large vmalloc'ed buffers. > This needs more explanation because there is support available in edma > driver for long SG lists by breaking them down into transfers using 20 > PaRAM entries at a time. If thats not working for you, that needs > further debug. The SPI controller has a FIFO of only 1 word, so at 1Mbps, filling the FIFO will take only 8us. Handling the DMA interrupt and re-programming the PaRAM entries takes much longer than that. At 1Mbps, about 50 bytes is lost on Rx, @ 2Mbps 100 bytes and @ 5Mbps about 260 bytes hinting towards a time setting up a new DMA transfer > 400us. If the Tx and Rx buffers are identically aligned there are no errors, because the re-programming of the Tx and Rx PaRAM entries happens at the same time. I have also verified this with a scope. In the Tx direction, there is a pause in the transfer of 600us after the 20th SG entrey (when setting up new PaRAM entries). Since setting up Rx PaRAM is identical, this shows that breaking down the transfer is not working in the Rx direction for SPI caused by the relative high bit rate and small FIFO. Thanks, Frode >> Signed-off-by: Frode Isaksen <fisaksen@baylibre.com> > Thanks, > Sekhar > >> --- >> drivers/spi/spi-davinci.c | 11 +++++++++++ >> 1 file changed, 11 insertions(+) >> >> diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c >> index b7b2da1..f1b46f6 100644 >> --- a/drivers/spi/spi-davinci.c >> +++ b/drivers/spi/spi-davinci.c >> @@ -485,6 +485,16 @@ static bool davinci_spi_can_dma(struct spi_master *master, >> return __davinci_spi_can_dma(spi); >> } >> >> +static size_t davinci_spi_max_transfer_size(struct spi_device *spi) >> +{ >> + /* >> + * The eDMA DMA engine is limited to 20 SG entries in one DMA >> + * transaction. If this number is exceeded, DMA receive fails. >> + * An extra SG entry is needed when the buffer is not page aligned. >> + */ >> + return (__davinci_spi_can_dma(spi)) ? 19 * PAGE_SIZE : SIZE_MAX; >> +} >> + >> static int davinci_spi_check_error(struct davinci_spi *dspi, int int_status) >> { >> struct device *sdev = dspi->bitbang.master->dev.parent; >> @@ -1010,6 +1020,7 @@ static int davinci_spi_probe(struct platform_device *pdev) >> master->setup = davinci_spi_setup; >> master->cleanup = davinci_spi_cleanup; >> master->can_dma = davinci_spi_can_dma; >> + master->max_transfer_size = davinci_spi_max_transfer_size; >> >> dspi->bitbang.chipselect = davinci_spi_chipselect; >> dspi->bitbang.setup_transfer = davinci_spi_setup_transfer; >> -- To unsubscribe from this list: send the line "unsubscribe linux-spi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday 14 February 2017 04:57 PM, Frode Isaksen wrote: > > > On 13/02/2017 06:59, Sekhar Nori wrote: >> + Peter >> >> On Friday 10 February 2017 08:59 PM, Frode Isaksen wrote: >>> Limit the transfer size to 20 scatter/gather pages if >>> DMA is enabled. >>> The eDMA DMA engine is limited to 20 SG entries in one DMA >>> transaction. If this number is exceeded, DMA receive fails. >>> This error occurs with large vmalloc'ed buffers. >> This needs more explanation because there is support available in edma >> driver for long SG lists by breaking them down into transfers using 20 >> PaRAM entries at a time. If thats not working for you, that needs >> further debug. > The SPI controller has a FIFO of only 1 word, so at 1Mbps, filling the > FIFO will take only 8us. Handling the DMA interrupt and re-programming > the PaRAM entries takes much longer than that. At 1Mbps, about 50 bytes > is lost on Rx, @ 2Mbps 100 bytes and @ 5Mbps about 260 bytes hinting > towards a time setting up a new DMA transfer > 400us. If the Tx and Rx > buffers are identically aligned there are no errors, because the > re-programming of the Tx and Rx PaRAM entries happens at the same time. > I have also verified this with a scope. In the Tx direction, there is a > pause in the transfer of 600us after the 20th SG entrey (when setting up > new PaRAM entries). Since setting up Rx PaRAM is identical, this shows > that breaking down the transfer is not working in the Rx direction for > SPI caused by the relative high bit rate and small FIFO. SPI is synchronous transfer so it does not need a large FIFO. If DMA is not able to replenish the TX shift register, the master should hold the clock until the time it is ready with data. On the scope do you see master continuing to pulse the clock while it is waiting for data to arrive from DMA to its TX shift register ? That should not be happening. Thanks, Sekhar -- To unsubscribe from this list: send the line "unsubscribe linux-spi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 14/02/2017 16:29, Sekhar Nori wrote: > On Tuesday 14 February 2017 04:57 PM, Frode Isaksen wrote: >> >> On 13/02/2017 06:59, Sekhar Nori wrote: >>> + Peter >>> >>> On Friday 10 February 2017 08:59 PM, Frode Isaksen wrote: >>>> Limit the transfer size to 20 scatter/gather pages if >>>> DMA is enabled. >>>> The eDMA DMA engine is limited to 20 SG entries in one DMA >>>> transaction. If this number is exceeded, DMA receive fails. >>>> This error occurs with large vmalloc'ed buffers. >>> This needs more explanation because there is support available in edma >>> driver for long SG lists by breaking them down into transfers using 20 >>> PaRAM entries at a time. If thats not working for you, that needs >>> further debug. >> The SPI controller has a FIFO of only 1 word, so at 1Mbps, filling the >> FIFO will take only 8us. Handling the DMA interrupt and re-programming >> the PaRAM entries takes much longer than that. At 1Mbps, about 50 bytes >> is lost on Rx, @ 2Mbps 100 bytes and @ 5Mbps about 260 bytes hinting >> towards a time setting up a new DMA transfer > 400us. If the Tx and Rx >> buffers are identically aligned there are no errors, because the >> re-programming of the Tx and Rx PaRAM entries happens at the same time. >> I have also verified this with a scope. In the Tx direction, there is a >> pause in the transfer of 600us after the 20th SG entrey (when setting up >> new PaRAM entries). Since setting up Rx PaRAM is identical, this shows >> that breaking down the transfer is not working in the Rx direction for >> SPI caused by the relative high bit rate and small FIFO. > SPI is synchronous transfer so it does not need a large FIFO. If DMA is > not able to replenish the TX shift register, the master should hold the > clock until the time it is ready with data. On the scope do you see > master continuing to pulse the clock while it is waiting for data to > arrive from DMA to its TX shift register ? That should not be happening. Take the example of a long Rx-only transfer using vmalloc'ed buffer. The dummy Tx buffer will be a contiguous buffer (1 SG entry) and clock and dummy data will be continuous during the transfer with no pause. On the Rx side, the PaRAM entries needs to be reloaded after 20*4096 bytes Rx'ed. Handling the interrupt, reloading the PaRAM's and resuming DMA takes ~600 us. Since the slave is still transmitting, data will be lost, since the FIFO is only 1 word and 1 byte takes 8us @ 1MHz. The 600us came from timing a transmit with vmalloc'ed buffer. The time between the last byte in the 20th SG entry and the next byte is 600us measured with a scope. The assumption is that handling interrupt and reloading PaRAM's is more or less the same on Tx and Rx side. This time seems to be confirmed by the number of bytes lost on the Rx side as well which is approximately 50 x XMHz. Hope this clarifies.. Another solution to this would be to have the dummy Tx buffer and Rx buffer to be identical (same type of allocation, same length, same offset), but this requires changes in the SPI framework. Thanks, Frode Thanks, Frode > > Thanks, > Sekhar -- To unsubscribe from this list: send the line "unsubscribe linux-spi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 02/13/2017 07:59 AM, Sekhar Nori wrote: > + Peter > > On Friday 10 February 2017 08:59 PM, Frode Isaksen wrote: >> Limit the transfer size to 20 scatter/gather pages if >> DMA is enabled. >> The eDMA DMA engine is limited to 20 SG entries in one DMA >> transaction. If this number is exceeded, DMA receive fails. >> This error occurs with large vmalloc'ed buffers. > > This needs more explanation because there is support available in edma > driver for long SG lists by breaking them down into transfers using 20 > PaRAM entries at a time. If thats not working for you, that needs > further debug. While breaking down long SG list works in the eDMA driver, it is not w/o it's limitation. It might take (to) long to re-configure the paRAM slots for the next batch. We see missed events with MMC/SD in case of long SG list, so I would not be surprised if this is not causing issues in other places... > >> >> Signed-off-by: Frode Isaksen <fisaksen@baylibre.com> > > Thanks, > Sekhar > >> --- >> drivers/spi/spi-davinci.c | 11 +++++++++++ >> 1 file changed, 11 insertions(+) >> >> diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c >> index b7b2da1..f1b46f6 100644 >> --- a/drivers/spi/spi-davinci.c >> +++ b/drivers/spi/spi-davinci.c >> @@ -485,6 +485,16 @@ static bool davinci_spi_can_dma(struct spi_master *master, >> return __davinci_spi_can_dma(spi); >> } >> >> +static size_t davinci_spi_max_transfer_size(struct spi_device *spi) >> +{ >> + /* >> + * The eDMA DMA engine is limited to 20 SG entries in one DMA >> + * transaction. If this number is exceeded, DMA receive fails. >> + * An extra SG entry is needed when the buffer is not page aligned. >> + */ >> + return (__davinci_spi_can_dma(spi)) ? 19 * PAGE_SIZE : SIZE_MAX; >> +} >> + >> static int davinci_spi_check_error(struct davinci_spi *dspi, int int_status) >> { >> struct device *sdev = dspi->bitbang.master->dev.parent; >> @@ -1010,6 +1020,7 @@ static int davinci_spi_probe(struct platform_device *pdev) >> master->setup = davinci_spi_setup; >> master->cleanup = davinci_spi_cleanup; >> master->can_dma = davinci_spi_can_dma; >> + master->max_transfer_size = davinci_spi_max_transfer_size; >> >> dspi->bitbang.chipselect = davinci_spi_chipselect; >> dspi->bitbang.setup_transfer = davinci_spi_setup_transfer; >> > - Péter -- To unsubscribe from this list: send the line "unsubscribe linux-spi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 21/03/2017 23:02, Peter Ujfalusi wrote: > > > On 02/13/2017 07:59 AM, Sekhar Nori wrote: >> + Peter >> >> On Friday 10 February 2017 08:59 PM, Frode Isaksen wrote: >>> Limit the transfer size to 20 scatter/gather pages if >>> DMA is enabled. >>> The eDMA DMA engine is limited to 20 SG entries in one DMA >>> transaction. If this number is exceeded, DMA receive fails. >>> This error occurs with large vmalloc'ed buffers. >> >> This needs more explanation because there is support available in edma >> driver for long SG lists by breaking them down into transfers using 20 >> PaRAM entries at a time. If thats not working for you, that needs >> further debug. > > While breaking down long SG list works in the eDMA driver, it is not w/o it's > limitation. It might take (to) long to re-configure the paRAM slots for the > next batch. We see missed events with MMC/SD in case of long SG list, so I > would not be surprised if this is not causing issues in other places... > I checked the time to re-configure the paRAM slots with an oscillo and it takes about 160uS. Since the slave is still transmitting, the number of bytes lost is about 160/8=20. With the spi-loopback-test I saw about 30bytes lost when re-configuring. Thanks, Frode >> >>> >>> Signed-off-by: Frode Isaksen <fisaksen@baylibre.com> >> >> Thanks, >> Sekhar >> >>> --- >>> drivers/spi/spi-davinci.c | 11 +++++++++++ >>> 1 file changed, 11 insertions(+) >>> >>> diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c >>> index b7b2da1..f1b46f6 100644 >>> --- a/drivers/spi/spi-davinci.c >>> +++ b/drivers/spi/spi-davinci.c >>> @@ -485,6 +485,16 @@ static bool davinci_spi_can_dma(struct spi_master *master, >>> return __davinci_spi_can_dma(spi); >>> } >>> >>> +static size_t davinci_spi_max_transfer_size(struct spi_device *spi) >>> +{ >>> + /* >>> + * The eDMA DMA engine is limited to 20 SG entries in one DMA >>> + * transaction. If this number is exceeded, DMA receive fails. >>> + * An extra SG entry is needed when the buffer is not page aligned. >>> + */ >>> + return (__davinci_spi_can_dma(spi)) ? 19 * PAGE_SIZE : SIZE_MAX; >>> +} >>> + >>> static int davinci_spi_check_error(struct davinci_spi *dspi, int int_status) >>> { >>> struct device *sdev = dspi->bitbang.master->dev.parent; >>> @@ -1010,6 +1020,7 @@ static int davinci_spi_probe(struct platform_device >>> *pdev) >>> master->setup = davinci_spi_setup; >>> master->cleanup = davinci_spi_cleanup; >>> master->can_dma = davinci_spi_can_dma; >>> + master->max_transfer_size = davinci_spi_max_transfer_size; >>> >>> dspi->bitbang.chipselect = davinci_spi_chipselect; >>> dspi->bitbang.setup_transfer = davinci_spi_setup_transfer; >>> >> > > - Péter -- To unsubscribe from this list: send the line "unsubscribe linux-spi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday 22 March 2017 04:41 PM, Frode Isaksen wrote: > > > On 21/03/2017 23:02, Peter Ujfalusi wrote: >> >> >> On 02/13/2017 07:59 AM, Sekhar Nori wrote: >>> + Peter >>> >>> On Friday 10 February 2017 08:59 PM, Frode Isaksen wrote: >>>> Limit the transfer size to 20 scatter/gather pages if >>>> DMA is enabled. >>>> The eDMA DMA engine is limited to 20 SG entries in one DMA >>>> transaction. If this number is exceeded, DMA receive fails. >>>> This error occurs with large vmalloc'ed buffers. >>> >>> This needs more explanation because there is support available in edma >>> driver for long SG lists by breaking them down into transfers using 20 >>> PaRAM entries at a time. If thats not working for you, that needs >>> further debug. >> >> While breaking down long SG list works in the eDMA driver, it is not w/o it's >> limitation. It might take (to) long to re-configure the paRAM slots for the >> next batch. We see missed events with MMC/SD in case of long SG list, so I >> would not be surprised if this is not causing issues in other places... >> > I checked the time to re-configure the paRAM slots with an oscillo and it takes > about 160uS. Since the slave is still transmitting, the number of bytes lost is > about 160/8=20. With the spi-loopback-test I saw about 30bytes lost when > re-configuring. I guess this means that most IPs continue to get more missed sync events even after the first sync event is generated at end of MAX_NR_SG PaRAMs. And may be even that first sync event miss is a problem because the IP does not pause and ends up shifting data in the shift register anyway. If this is so broken, should edma driver not refuse to handle more than MAX_NR_SG for all transfers but mem-to-mem transfers? Thanks, Sekhar -- To unsubscribe from this list: send the line "unsubscribe linux-spi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/22/2017 01:35 PM, Sekhar Nori wrote: > On Wednesday 22 March 2017 04:41 PM, Frode Isaksen wrote: > I guess this means that most IPs continue to get more missed sync events > even after the first sync event is generated at end of MAX_NR_SG PaRAMs. > And may be even that first sync event miss is a problem because the IP > does not pause and ends up shifting data in the shift register anyway. > > If this is so broken, should edma driver not refuse to handle more than > MAX_NR_SG for all transfers but mem-to-mem transfers? DM355, OMAP-L138, etc have only 128 paRAM slots. If we lift the restriction on the number of paRAM slots a channel can take, then DMA will most likely going to stop working: If any peripherals request 128 long SG transfer (and MMC easily does that alone) then we will have no available paRAM slot for other clients. It might be possible to start the paRAM slot update for the next batch already when we have 1 or 2 slots to be processed, but that is going to be tricky and not even reliable. Note: audio also sets limit on the number of periods to avoid this reconfiguration of paRAM slots.
On 22/03/2017 13:20, Peter Ujfalusi wrote: > On 03/22/2017 01:35 PM, Sekhar Nori wrote: >> On Wednesday 22 March 2017 04:41 PM, Frode Isaksen wrote: >> I guess this means that most IPs continue to get more missed sync events >> even after the first sync event is generated at end of MAX_NR_SG PaRAMs. >> And may be even that first sync event miss is a problem because the IP >> does not pause and ends up shifting data in the shift register anyway. >> >> If this is so broken, should edma driver not refuse to handle more than >> MAX_NR_SG for all transfers but mem-to-mem transfers? > > DM355, OMAP-L138, etc have only 128 paRAM slots. If we lift the restriction on > the number of paRAM slots a channel can take, then DMA will most likely going > to stop working: If any peripherals request 128 long SG transfer (and MMC > easily does that alone) then we will have no available paRAM slot for other > clients. > > It might be possible to start the paRAM slot update for the next batch already > when we have 1 or 2 slots to be processed, but that is going to be tricky and > not even reliable. Note that with SPI, a workaround for this issue has been found by using the rx buffer as the dummy tx buffer when doing read-only transfers (this is the case that fails). With this, the rx and tx paRAM slots are reconfigured at the same time. Thanks, Frode > > Note: audio also sets limit on the number of periods to avoid this > reconfiguration of paRAM slots. > -- To unsubscribe from this list: send the line "unsubscribe linux-spi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c index b7b2da1..f1b46f6 100644 --- a/drivers/spi/spi-davinci.c +++ b/drivers/spi/spi-davinci.c @@ -485,6 +485,16 @@ static bool davinci_spi_can_dma(struct spi_master *master, return __davinci_spi_can_dma(spi); } +static size_t davinci_spi_max_transfer_size(struct spi_device *spi) +{ + /* + * The eDMA DMA engine is limited to 20 SG entries in one DMA + * transaction. If this number is exceeded, DMA receive fails. + * An extra SG entry is needed when the buffer is not page aligned. + */ + return (__davinci_spi_can_dma(spi)) ? 19 * PAGE_SIZE : SIZE_MAX; +} + static int davinci_spi_check_error(struct davinci_spi *dspi, int int_status) { struct device *sdev = dspi->bitbang.master->dev.parent; @@ -1010,6 +1020,7 @@ static int davinci_spi_probe(struct platform_device *pdev) master->setup = davinci_spi_setup; master->cleanup = davinci_spi_cleanup; master->can_dma = davinci_spi_can_dma; + master->max_transfer_size = davinci_spi_max_transfer_size; dspi->bitbang.chipselect = davinci_spi_chipselect; dspi->bitbang.setup_transfer = davinci_spi_setup_transfer;
Limit the transfer size to 20 scatter/gather pages if DMA is enabled. The eDMA DMA engine is limited to 20 SG entries in one DMA transaction. If this number is exceeded, DMA receive fails. This error occurs with large vmalloc'ed buffers. Signed-off-by: Frode Isaksen <fisaksen@baylibre.com> --- drivers/spi/spi-davinci.c | 11 +++++++++++ 1 file changed, 11 insertions(+)