diff mbox

[1/6] mmc: block: Resume multi-block reads after transient read errors.

Message ID 1303520502-32171-2-git-send-email-john.stultz@linaro.org (mailing list archive)
State New, archived
Headers show

Commit Message

John Stultz April 23, 2011, 1:01 a.m. UTC
From: David Ding <david.j.ding@motorola.com>

CC: Chris Ball <cjb@laptop.org>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Dima Zavin <dima@android.com>
Signed-off-by: Bentao Zou <bzou1@motorola.com>
Signed-off-by: David Ding <david.j.ding@motorola.com>
Signed-off-by: San Mehat <san@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 drivers/mmc/card/block.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

Comments

Arnd Bergmann April 26, 2011, 1:19 p.m. UTC | #1
On Saturday 23 April 2011, John Stultz wrote:
> From: David Ding <david.j.ding@motorola.com>
> 
> CC: Chris Ball <cjb@laptop.org>
> CC: Arnd Bergmann <arnd@arndb.de>
> CC: Dima Zavin <dima@android.com>
> Signed-off-by: Bentao Zou <bzou1@motorola.com>
> Signed-off-by: David Ding <david.j.ding@motorola.com>
> Signed-off-by: San Mehat <san@google.com>
> Signed-off-by: John Stultz <john.stultz@linaro.org>

The disable_multi logic was introduced in 6a79e391 "mmc_block: ensure
all sectors that do not have errors are read" by Adrian Hunter. Maybe
he can comment on this.

My impression is that the code makes more sense without this
add-on patch, because the flag is set for exactly one request
and makes mmc_blk_issue_rw_rq issue all blocks in that request
one by one up to the first error, while after the patch,
we would read one sector, then read the remaining request at
once, fail, and read the next sector, and so on.

If the problem is transient read errors, a better solution might
be to retry the entire request before doing it one sector at a time.

>  drivers/mmc/card/block.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
> index 61d233a..edac9ac 100644
> --- a/drivers/mmc/card/block.c
> +++ b/drivers/mmc/card/block.c
> @@ -440,6 +440,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
>  				continue;
>  			}
>  			status = get_card_status(card, req);
> +		} else if (disable_multi == 1) {
> +			disable_multi = 0;
>  		}
>  
>  		if (brq.cmd.error) {
> -- 
> 1.7.3.2.146.gca209
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Adrian Hunter April 27, 2011, 8:16 a.m. UTC | #2
On 26/04/11 16:19, Arnd Bergmann wrote:
> On Saturday 23 April 2011, John Stultz wrote:
>> From: David Ding<david.j.ding@motorola.com>
>>
>> CC: Chris Ball<cjb@laptop.org>
>> CC: Arnd Bergmann<arnd@arndb.de>
>> CC: Dima Zavin<dima@android.com>
>> Signed-off-by: Bentao Zou<bzou1@motorola.com>
>> Signed-off-by: David Ding<david.j.ding@motorola.com>
>> Signed-off-by: San Mehat<san@google.com>
>> Signed-off-by: John Stultz<john.stultz@linaro.org>
>
> The disable_multi logic was introduced in 6a79e391 "mmc_block: ensure
> all sectors that do not have errors are read" by Adrian Hunter. Maybe
> he can comment on this.
>
> My impression is that the code makes more sense without this
> add-on patch, because the flag is set for exactly one request
> and makes mmc_blk_issue_rw_rq issue all blocks in that request
> one by one up to the first error, while after the patch,
> we would read one sector, then read the remaining request at
> once, fail, and read the next sector, and so on.
>
> If the problem is transient read errors, a better solution might
> be to retry the entire request before doing it one sector at a time.

Having to guess what the patch is for is pretty sad.  Is writing
commit messages so hard.

The word "transient" suggests the errors might disappear, in which case
the entire read should be retried some number of times.

The big problem with reading 1 sector at a time is that it
can be quite slow (orders of magnitude slower).  Possibly
this patch is aimed at that.  Somewhere there is a patch to
change the retries from 1 sector at a time to 1 segment
at a time, which helps a bit.  Possibly it would help to bisect
the number sectors/segments. i.e.

	read error
		read half of the number of sectors/segments
		that were read last time (but always read at
		least 1)
	no read error
		read all the remaining sectors/segments

This patch does the second half of that approach.

>
>>   drivers/mmc/card/block.c |    2 ++
>>   1 files changed, 2 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
>> index 61d233a..edac9ac 100644
>> --- a/drivers/mmc/card/block.c
>> +++ b/drivers/mmc/card/block.c
>> @@ -440,6 +440,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
>>   				continue;
>>   			}
>>   			status = get_card_status(card, req);
>> +		} else if (disable_multi == 1) {
>> +			disable_multi = 0;
>>   		}
>>
>>   		if (brq.cmd.error) {
>> --
>> 1.7.3.2.146.gca209
>>
>>
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrei Warkentin April 27, 2011, 8:42 a.m. UTC | #3
On Wed, Apr 27, 2011 at 3:16 AM, Adrian Hunter <adrian.hunter@nokia.com> wrote:
> On 26/04/11 16:19, Arnd Bergmann wrote:
>>
>> On Saturday 23 April 2011, John Stultz wrote:
>>>
>>> From: David Ding<david.j.ding@motorola.com>
>>>
>>> CC: Chris Ball<cjb@laptop.org>
>>> CC: Arnd Bergmann<arnd@arndb.de>
>>> CC: Dima Zavin<dima@android.com>
>>> Signed-off-by: Bentao Zou<bzou1@motorola.com>
>>> Signed-off-by: David Ding<david.j.ding@motorola.com>
>>> Signed-off-by: San Mehat<san@google.com>
>>> Signed-off-by: John Stultz<john.stultz@linaro.org>
>>
>> The disable_multi logic was introduced in 6a79e391 "mmc_block: ensure
>> all sectors that do not have errors are read" by Adrian Hunter. Maybe
>> he can comment on this.
>>
>> My impression is that the code makes more sense without this
>> add-on patch, because the flag is set for exactly one request
>> and makes mmc_blk_issue_rw_rq issue all blocks in that request
>> one by one up to the first error, while after the patch,
>> we would read one sector, then read the remaining request at
>> once, fail, and read the next sector, and so on.
>>

You could definitely interpret the retry path as being:
- do a probe sector-sized read/write
- if it succeeds, pretend everything is fine and retry with disable_multi=0
- else the probed sector must the point of failure

...it would mean more retrying in the case of an actual card failure
mode (at which point you're not doing much anyway), and drastic
improvement in the case of some one-time (or not so one-time) host
glitches...

The other thought I have is that it's not ideal to overcomplicate the
logic inside issue_rw_rq (i.e. bisection stuff). Until Per's
refactoring as part of async issue it was already too big :-).

What do you think?

Anyway I hope the patch owner weighs in.

A
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 61d233a..edac9ac 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -440,6 +440,8 @@  static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 				continue;
 			}
 			status = get_card_status(card, req);
+		} else if (disable_multi == 1) {
+			disable_multi = 0;
 		}
 
 		if (brq.cmd.error) {