Message ID | 20171121090904.6901-1-ricardo.ribalda@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 5a1314fa697fc65cefaba64cd4699bfc3e6882a6 |
Headers | show |
On Tue, Nov 21, 2017 at 10:09:02AM +0100, Ricardo Ribalda Delgado wrote: > + stalled = 10; > while (rx_words) { > + if (rx_words == n_words && !(stalled--) && > + !(sr & XSPI_SR_TX_EMPTY_MASK) && > + (sr & XSPI_SR_RX_EMPTY_MASK)) { 10 seems like a small number for what is essentially just a busy spin - are we sure that we won't reasonably hit a case where the CPU is sufficiently fast and the bus sufficiently slow we falsely detect a stall? Where did this number come from?
Hi Mark On Wed, Nov 22, 2017 at 12:12 PM, Mark Brown <broonie@kernel.org> wrote: > On Tue, Nov 21, 2017 at 10:09:02AM +0100, Ricardo Ribalda Delgado wrote: > >> + stalled = 10; >> while (rx_words) { >> + if (rx_words == n_words && !(stalled--) && >> + !(sr & XSPI_SR_TX_EMPTY_MASK) && >> + (sr & XSPI_SR_RX_EMPTY_MASK)) { > > 10 seems like a small number for what is essentially just a busy spin - > are we sure that we won't reasonably hit a case where the CPU is > sufficiently fast and the bus sufficiently slow we falsely detect a > stall? Where did this number come from? This code is executed after all the data has been pushed to the out fifo. Since we are not inhibitng tx, the moment the empty mask is evaluated, at least one byte should be out. After 1 day of use I have been able to locate one event when stalled was two. So 10 is very conservative number. In other words: 10 is one order of magnitude bigger than the worst measured case. Thanks for your review :)
Hi again On Wed, Nov 22, 2017 at 8:25 PM, Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> wrote: > Hi Mark > > > On Wed, Nov 22, 2017 at 12:12 PM, Mark Brown <broonie@kernel.org> wrote: >> On Tue, Nov 21, 2017 at 10:09:02AM +0100, Ricardo Ribalda Delgado wrote: >> >>> + stalled = 10; >>> while (rx_words) { >>> + if (rx_words == n_words && !(stalled--) && >>> + !(sr & XSPI_SR_TX_EMPTY_MASK) && >>> + (sr & XSPI_SR_RX_EMPTY_MASK)) { >> >> 10 seems like a small number for what is essentially just a busy spin - >> are we sure that we won't reasonably hit a case where the CPU is >> sufficiently fast and the bus sufficiently slow we falsely detect a >> stall? Where did this number come from? > > This code is executed after all the data has been pushed to the out > fifo. Since we are not inhibitng tx, the moment the empty mask is > evaluated, at least one byte should be out. > > After 1 day of use I have been able to locate one event when stalled > was two. So 10 is very conservative number. > > In other words: 10 is one order of magnitude bigger than the worst > measured case. If it makes you more comfortable I could add something like if (n_tries < 8) msleep(10); Regards! > > Thanks for your review :) > > > -- > Ricardo Ribalda
diff --git a/drivers/spi/spi-xilinx.c b/drivers/spi/spi-xilinx.c index bc7100b93dfc..e0b9fe1d0e37 100644 --- a/drivers/spi/spi-xilinx.c +++ b/drivers/spi/spi-xilinx.c @@ -271,6 +271,7 @@ static int xilinx_spi_txrx_bufs(struct spi_device *spi, struct spi_transfer *t) while (remaining_words) { int n_words, tx_words, rx_words; u32 sr; + int stalled; n_words = min(remaining_words, xspi->buffer_size); @@ -299,7 +300,17 @@ static int xilinx_spi_txrx_bufs(struct spi_device *spi, struct spi_transfer *t) /* Read out all the data from the Rx FIFO */ rx_words = n_words; + stalled = 10; while (rx_words) { + if (rx_words == n_words && !(stalled--) && + !(sr & XSPI_SR_TX_EMPTY_MASK) && + (sr & XSPI_SR_RX_EMPTY_MASK)) { + dev_err(&spi->dev, + "Detected stall. Check C_SPI_MODE and C_SPI_MEMORY\n"); + xspi_init_hw(xspi); + return -EIO; + } + if ((sr & XSPI_SR_TX_EMPTY_MASK) && (rx_words > 1)) { xilinx_spi_rx(xspi); rx_words--;
When the core is configured in C_SPI_MODE > 0, it integrates a lookup table that automatically configures the core in dual or quad mode based on the command (first byte on the tx fifo). Unfortunately, that list mode_?_memoy_*.mif does not contain all the supported commands by the flash. Since 4.14 spi-nor automatically tries to probe the flash using SFDP (command 0x5a), and that command is not part of the list_mode table. Whit the right combination of C_SPI_MODE and C_SPI_MEMORY this leads into a stall that can only be recovered with a soft rest. This patch detects this kind of stall and returns -EIO to the caller on those commands. spi-nor can handle this error properly: m25p80 spi0.0: Detected stall. Check C_SPI_MODE and C_SPI_MEMORY. 0x21 0x2404 m25p80 spi0.0: SPI transfer failed: -5 spi_master spi0: failed to transfer one message from queue m25p80 spi0.0: s25sl064p (8192 Kbytes) Cc: Mark Brown <broonie@kernel.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> --- drivers/spi/spi-xilinx.c | 11 +++++++++++ 1 file changed, 11 insertions(+)