[PATCHv5] mmc: block: Check for errors after write on SPI

Message ID 76f6f5d2b35543bab3dfe438f268609c@hyperstone.com (mailing list archive)
State New, archived

Commit Message

Christian Loehle March 24, 2022, 2:18 p.m. UTC
Introduce a SEND_STATUS check for writes through SPI so that an
unsuccessful write is not marked as successful.

Since SPI SD/MMC does not have states, after a write the card will
just hold the line LOW until it is ready again. The driver therefore
marks the write as completed as soon as it reads something other than
all zeroes.
The driver does not distinguish between a card no longer signalling busy
and a card being disconnected (with the line pulled up by the host).
This led to writes being marked as successful when a busy card was
disconnected.
Now an additional CMD13 ensures that the card is still connected,
just like non-SPI is ensured to go back to TRAN state.

While at it, and since we already poll for the post-write status anyway,
we might as well check for SPI's error bits (any of them).

The disconnecting card problem is reproducible for me after continuous
write activity and randomly disconnecting, around every 20-50 tries
on SPI DS for some card.

Fixes: 7213d175e3b6f ("MMC/SD card driver learns SPI")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Loehle <cloehle@hyperstone.com>
---
v2:
  - Reorder err and status check for err to take precedence and look cleaner
v3:
  - Move the logic into its own function
v4:
  - Move block layer handling out of the spi-specific function
v5:
  - reorder err and status check  

 drivers/mmc/core/block.c | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)
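
The failure mode described in the commit message can be sketched in plain userspace C. This is only an illustrative model, not kernel code: `struct demo_card` and every `demo_*` function are hypothetical names invented here, and returning `-ETIMEDOUT` for a missing card is an assumption about what an unanswered CMD13 would look like. The point it tries to show is that a released line and a host pull-up read the same, so only a follow-up status command can tell them apart.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a card on an SPI bus; not a kernel structure. */
struct demo_card {
	bool connected;        /* false once the card has been pulled */
	int busy_polls_left;   /* reads during which the card holds the line LOW */
	uint32_t status;       /* R1/R2 bits the card would report for CMD13 */
};

/* One read of the data line: 0x00 while a connected card is busy. */
static uint8_t demo_read_line(struct demo_card *c)
{
	if (c->connected && c->busy_polls_left > 0) {
		c->busy_polls_left--;
		return 0x00;
	}
	/* Either the card released the line or we read the host pull-up. */
	return 0xFF;
}

/* Pre-patch view: the write counts as done once the line is non-zero. */
static void demo_wait_not_busy(struct demo_card *c)
{
	while (demo_read_line(c) == 0x00)
		;
}

/* Stand-in for CMD13 (assumption: a missing card yields -ETIMEDOUT). */
static int demo_send_status(struct demo_card *c, uint32_t *status)
{
	if (!c->connected)
		return -ETIMEDOUT;
	*status = c->status;
	return 0;
}

/* Same shape as mmc_spi_err_check() in the patch. */
static int demo_spi_err_check(struct demo_card *c)
{
	uint32_t status = 0;
	int err;

	err = demo_send_status(c, &status);
	if (err)
		return err;
	/* Any set bit is treated as an error after a write. */
	if (status)
		return -EIO;
	return 0;
}
```

In this model, `demo_wait_not_busy()` returns both for a card that finished its write and for one that was pulled while busy; `demo_spi_err_check()` returns 0 only in the first case.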

Comments

Andy Shevchenko March 25, 2022, 12:42 p.m. UTC | #1
On Thu, Mar 24, 2022 at 02:18:41PM +0000, Christian Löhle wrote:
> Introduce a SEND_STATUS check for writes through SPI so that an
> unsuccessful write is not marked as successful.
> 
> Since SPI SD/MMC does not have states, after a write the card will
> just hold the line LOW until it is ready again. The driver therefore
> marks the write as completed as soon as it reads something other than
> all zeroes.
> The driver does not distinguish between a card no longer signalling busy
> and a card being disconnected (with the line pulled up by the host).
> This led to writes being marked as successful when a busy card was
> disconnected.
> Now an additional CMD13 ensures that the card is still connected,
> just like non-SPI is ensured to go back to TRAN state.
> 
> While at it, and since we already poll for the post-write status anyway,
> we might as well check for SPI's error bits (any of them).
> 
> The disconnecting card problem is reproducible for me after continuous
> write activity and randomly disconnecting, around every 20-50 tries
> on SPI DS for some card.

LGTM, FWIW,
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Ulf Hansson March 28, 2022, 2:05 p.m. UTC | #2
On Thu, 24 Mar 2022 at 15:18, Christian Löhle <CLoehle@hyperstone.com> wrote:
>
> Introduce a SEND_STATUS check for writes through SPI so that an
> unsuccessful write is not marked as successful.
>
> Since SPI SD/MMC does not have states, after a write the card will
> just hold the line LOW until it is ready again. The driver therefore
> marks the write as completed as soon as it reads something other than
> all zeroes.
> The driver does not distinguish between a card no longer signalling busy
> and a card being disconnected (with the line pulled up by the host).
> This led to writes being marked as successful when a busy card was
> disconnected.
> Now an additional CMD13 ensures that the card is still connected,
> just like non-SPI is ensured to go back to TRAN state.
>
> While at it, and since we already poll for the post-write status anyway,
> we might as well check for SPI's error bits (any of them).
>
> The disconnecting card problem is reproducible for me after continuous
> write activity and randomly disconnecting, around every 20-50 tries
> on SPI DS for some card.
>
> Fixes: 7213d175e3b6f ("MMC/SD card driver learns SPI")
> Cc: stable@vger.kernel.org
> Signed-off-by: Christian Loehle <cloehle@hyperstone.com>

Applied for fixes, thanks!

Kind regards
Uffe


Patch

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 4e67c1403cc9..be2078684417 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1880,6 +1880,31 @@  static inline bool mmc_blk_rq_error(struct mmc_blk_request *brq)
 	       brq->data.error || brq->cmd.resp[0] & CMD_ERRORS;
 }
 
+static int mmc_spi_err_check(struct mmc_card *card)
+{
+	u32 status = 0;
+	int err;
+
+	/*
+	 * SPI does not have a TRAN state we have to wait on, instead the
+	 * card is ready again when it no longer holds the line LOW.
+	 * We still have to ensure two things here before we know the write
+	 * was successful:
+	 * 1. The card has not disconnected during busy and we actually read our
+	 * own pull-up, thinking it was still connected, so ensure it
+	 * still responds.
+	 * 2. Check for any error bits, in particular R1_SPI_IDLE to catch a
+	 * just reconnected card after being disconnected during busy.
+	 */
+	err = __mmc_send_status(card, &status, 0);
+	if (err)
+		return err;
+	/* All R1 and R2 bits of SPI are errors in our case */
+	if (status)
+		return -EIO;
+	return 0;
+}
+
 static int mmc_blk_busy_cb(void *cb_data, bool *busy)
 {
 	struct mmc_blk_busy_data *data = cb_data;
@@ -1903,9 +1928,16 @@  static int mmc_blk_card_busy(struct mmc_card *card, struct request *req)
 	struct mmc_blk_busy_data cb_data;
 	int err;
 
-	if (mmc_host_is_spi(card->host) || rq_data_dir(req) == READ)
+	if (rq_data_dir(req) == READ)
 		return 0;
 
+	if (mmc_host_is_spi(card->host)) {
+		err = mmc_spi_err_check(card);
+		if (err)
+			mqrq->brq.data.bytes_xfered = 0;
+		return err;
+	}
+
 	cb_data.card = card;
 	cb_data.status = 0;
 	err = __mmc_poll_for_busy(card->host, 0, MMC_BLK_TIMEOUT_MS,
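
The comment in `mmc_spi_err_check()` singles out R1_SPI_IDLE as the bit that catches a card reconnected during busy, while the code rejects any non-zero status. As a rough userspace illustration of that policy, the sketch below names a few R1 bits by their position in the SD SPI-mode R1 response byte. The `DEMO_*` macros, the table and `demo_describe()` are hypothetical and are not taken from the kernel headers; only a subset of bits is listed, and everything else falls through to a generic label.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative subset of R1 bits (hypothetical names, not kernel macros). */
#define DEMO_R1_IDLE            (1u << 0)
#define DEMO_R1_ILLEGAL_COMMAND (1u << 2)
#define DEMO_R1_COM_CRC         (1u << 3)

static const struct {
	uint32_t bit;
	const char *name;
} demo_bits[] = {
	{ DEMO_R1_IDLE,            "card in idle state" },
	{ DEMO_R1_ILLEGAL_COMMAND, "illegal command" },
	{ DEMO_R1_COM_CRC,         "command CRC error" },
};

/* Describe the lowest listed bit that is set; zero status means success. */
static const char *demo_describe(uint32_t status)
{
	size_t i;

	if (!status)
		return "ok";
	for (i = 0; i < sizeof(demo_bits) / sizeof(demo_bits[0]); i++)
		if (status & demo_bits[i].bit)
			return demo_bits[i].name;
	/* The patch treats every remaining bit as an error as well. */
	return "other error bit";
}
```

A freshly re-inserted card would be expected to report the idle bit, which this sketch maps to "card in idle state"; the patch itself does not distinguish between bits and returns -EIO for all of them.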